talk-data.com

Event

Data + AI Summit 2025

2025-06-09 – 2025-06-13 Databricks Summit

Activities tracked

715

Sessions & talks

Scaling Smarter: Technical Dive Into How Databricks Optimizes Model Serving

2025-06-12 Watch
talk
Asfandyar Qureshi (Databricks)

Learn from the experts on how Databricks’ Mosaic AI Model Serving delivers unparalleled speed and scalability for deploying AI models. This session delves into the architecture and innovations that showcase the impressive improvements in throughput for the AI-serving infrastructure that powers Mosaic AI.

Securely Deploying AI/BI to All Users in Your Enterprise

2025-06-12 Watch
talk
Austin Green (Databricks) , Keegan Dubbs (Databricks)

Bringing AI/BI to every business user starts with getting security, access and governance right. In this session, we’ll walk through the latest best practices for configuring Databricks accounts, setting up workspaces, and managing authentication protocols to enable secure and scalable onboarding. Whether you're supporting a small team or an entire enterprise, you'll gain practical insights to protect your data while ensuring seamless and governed access to AI/BI tools.

Sponsored by: Galileo Technologies Inc. | Taming Rogue AI Agents with Observability-Driven Evaluation

2025-06-12
talk
Atindriyo Sanyal (Galileo)

LLM agents often drift into failure when prompts, retrieval, external data, and policies interact in unpredictable ways. This technical session introduces a repeatable, metric-driven framework for detecting, diagnosing, and correcting these undesirable behaviors in agentic systems at production scale. We demonstrate how to instrument the agent loop with fine-grained signals (tool-selection quality, error rates, action progression, latency, and domain-specific metrics) and send them into an evaluation layer (e.g. Galileo). This telemetry enables a virtuous cycle of system improvement. We present a practical example of a stock-trading system and show how brittle retrieval and faulty business logic cause undesirable behavior. We then refactor prompts and adjust the retrieval pipeline, verifying recovery through improved metrics. Attendees will learn how to add observability with minimal code change, pinpoint root causes via tracing, and drive continuous, metric-validated improvement.
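
As a rough illustration of what instrumenting an agent loop with such signals could look like, here is a minimal Python sketch. It is not Galileo's API or the speaker's implementation: `AgentTelemetry`, `run_step`, and the `expected_tool` label (a ground-truth tool name assumed to be available during evaluation) are all hypothetical names introduced for this example.

```python
import time
from dataclasses import dataclass, field


@dataclass
class AgentTelemetry:
    """Collects per-step signals from an agent loop (hypothetical helper)."""
    records: list = field(default_factory=list)

    def record_step(self, tool, expected_tool, error, latency_s):
        # One record per tool call: which tool ran, whether it matched the
        # assumed ground-truth label, whether it failed, and how long it took.
        self.records.append({
            "tool": tool,
            "tool_correct": tool == expected_tool,
            "error": error,
            "latency_s": latency_s,
        })

    def summary(self):
        """Aggregate the recorded steps into simple rate metrics."""
        n = len(self.records)
        if n == 0:
            return {"steps": 0}
        return {
            "steps": n,
            "tool_selection_accuracy": sum(r["tool_correct"] for r in self.records) / n,
            "error_rate": sum(r["error"] for r in self.records) / n,
            "mean_latency_s": sum(r["latency_s"] for r in self.records) / n,
        }


def run_step(telemetry, tool_fn, tool_name, expected_tool):
    """Run one tool call, recording its outcome and latency."""
    start = time.perf_counter()
    error = False
    result = None
    try:
        result = tool_fn()
    except Exception:
        error = True  # the failure is counted, not re-raised, in this sketch
    telemetry.record_step(tool_name, expected_tool, error, time.perf_counter() - start)
    return result
```

In a real system the summary would be exported to an evaluation layer rather than computed in process; the point here is only that a thin wrapper around each tool call is enough to capture these signals.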

Sponsored by: Twilio | From Data to Impact: Scaling AI with Unified Customer Intelligence

2025-06-12 Watch
talk
Srinivas Yadlapati (StockX) , Neerav Vyas (Capgemini)

In a landscape where customer expectations are evolving faster than ever, the ability to activate real-time, first-party data is becoming the difference between reactive and intelligent businesses. This fireside chat brings together experts from Capgemini, Twilio Segment, and leading marketplace StockX to explore how organizations are building future-proof data foundations that power scalable, responsible AI.

Supercharging Sales Intelligence: Processing Billions of Events via Structured Streaming

2025-06-12 Watch
talk
Anurag Bharati (DigiCert) , Nikita Raje (DigiCert)

DigiCert is a digital security company that provides digital certificates, encryption and authentication services. It serves 88% of the Fortune 500 and secures over 28 billion web connections daily. Our project aggregates and analyzes certificate transparency logs via public APIs to provide comprehensive market and competitive intelligence. Instead of relying on third-party providers with limited data, our project gives us full control, deeper insights and automation. Databricks has helped us poll public APIs reliably and at scale, fetching millions of events daily, deduplicating them and storing them in our Delta tables. Specifically, we use Spark for parallel processing, Structured Streaming for real-time ingestion and deduplication, Delta tables for data reliability, and pools and jobs to keep our costs optimized. These technologies help us keep our data fresh, accurate and cost-effective. This data has helped our sales team with real-time intelligence, ensuring DigiCert's success.
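
The deduplication step can be pictured with a small, plain-Python sketch. This is a stand-in for the idea of keyed deduplication over a stream, not Spark Structured Streaming code and not DigiCert's pipeline. The `cert_id` field name and the bounded window of remembered keys are assumptions made for illustration.

```python
from collections import OrderedDict


class Deduplicator:
    """Drops events whose key was already seen within a bounded window."""

    def __init__(self, max_keys=100_000):
        self.max_keys = max_keys
        self._seen = OrderedDict()  # insertion-ordered set of recent keys

    def is_new(self, key):
        if key in self._seen:
            return False
        self._seen[key] = None
        if len(self._seen) > self.max_keys:
            self._seen.popitem(last=False)  # evict the oldest remembered key
        return True


def deduplicate(events, key_field="cert_id", max_keys=100_000):
    """Return events in arrival order, keeping the first event per key."""
    dedup = Deduplicator(max_keys)
    return [event for event in events if dedup.is_new(event[key_field])]
```

Bounding the remembered keys keeps memory flat on an endless stream, at the cost of letting a duplicate through if it arrives after its key has been evicted. Streaming engines make a similar trade-off when state is expired.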

The New Competitive Edge: Building Resilient Supply Chains With Data + AI

2025-06-12 Watch
talk
Dee Fitzgerald (Danone) , Andy Hancock (SAP) , Usman Zubair (Databricks)

Consumer-facing industries are evolving faster than ever — and in today’s competitive landscape, it’s supply chains, not companies, that are truly competing. While data and AI offer huge potential for optimization, many organizations struggle to turn use cases into real business impact. In this session, leaders from retail, consumer goods, travel and hospitality will share how they’re building strong data foundations to unlock AI-driven supply chain optimization. Learn how they're using generative AI to boost productivity, streamline operations and improve insights through better data collaboration.

Tracking Data and AI Lineage: Ensuring Transparency and Compliance

2025-06-12 Watch
talk
Prithvi Kannan (Databricks) , Murt Neemuchwala (Databricks)

As AI becomes more deeply integrated into data platforms, understanding where data comes from — and where it goes — is essential for ensuring transparency, compliance and trust. In this session, we’ll explore the newest advancements in data and AI lineage across the Databricks Platform, including during model training, evaluation and inference. You’ll also learn how lineage system tables can be used for impact analysis and to gain usage insights across your data estate. We’ll cover newly released capabilities — such as Bring Your Own Lineage — that enable an end-to-end view of your data and AI assets in Unity Catalog. Plus, get a sneak peek at what’s coming next on the lineage roadmap!
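
To make "impact analysis" concrete: given lineage as a list of (source, target) edges, the assets affected by a change are everything reachable downstream of the changed asset. The sketch below assumes the edges have already been read into Python. The table names are hypothetical and nothing here reflects the actual schema of the Databricks lineage system tables.

```python
from collections import defaultdict, deque


def downstream_assets(edges, start):
    """Return every asset reachable from `start` by following lineage edges."""
    graph = defaultdict(list)
    for source, target in edges:
        graph[source].append(target)

    seen = set()
    queue = deque([start])
    while queue:  # breadth-first traversal of the lineage graph
        node = queue.popleft()
        for child in graph[node]:
            if child not in seen and child != start:
                seen.add(child)
                queue.append(child)
    return sorted(seen)


# Hypothetical lineage edges: (upstream asset, downstream asset)
edges = [
    ("raw.events", "silver.events"),
    ("silver.events", "gold.daily"),
    ("silver.events", "ml.features"),
    ("other.a", "other.b"),
]
impacted = downstream_assets(edges, "raw.events")
```

Here `impacted` lists the three assets that depend, directly or indirectly, on `raw.events`, while the unrelated `other.*` assets are left out.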

Unlocking the Databricks Marketplace: A Hands-On Guide for Data Consumers and Providers

2025-06-12 Watch
talk
Tia Chang (Databricks)

Curious about how to get real value from the Databricks Marketplace—whether you're consuming data or sharing it? This demo-heavy session answers the top 10 questions we hear from both data consumers and providers, with real examples you can put into practice right away. We’ll show consumers how to find the right product listing whether that's tables, files, AI models, solution accelerators, or Partner Connect integrations, try them out using sample notebooks, and access them with ease. You’ll also see how the Private Marketplace helps teams work more efficiently with a curated catalog of approved data. For providers, learn how to list your product in a way that stands out, use notebooks and documentation to help users get started, reach new audiences, and securely share data across your company or with trusted partners using the Private Marketplace. If you’ve ever asked, “How do I get started?” or “How do I make my data available internally or externally?”—this session has the answers, with demos to match.

What’s New in Databricks SQL: Latest Features and Live Demos

2025-06-12 Watch
talk
Gaurav Saraf (Databricks) , Kent Marten (Databricks)

Databricks SQL has added significant features in the last year at a fast pace. This session will share the most impactful features and the customer use cases that inspired them. We will highlight the new SQL editor, SQL coding features, streaming tables and materialized views, BI integrations, cost management features, system tables and observability features, and more. We will also share AI-powered performance optimizations.

Kill Bill-ing? Revenge is a Dish Best Served Optimized with GenAI

2025-06-12 Watch
lightning_talk
Abdul Furkhan (Sportsbet)

In an era where cloud costs can spiral out of control, Sportsbet achieved a remarkable 49% reduction in Total Cost of Ownership (TCO) through an innovative AI-powered solution called 'Kill Bill.' This presentation reveals how we transformed Databricks' consumption-based pricing model from a challenge into a strategic advantage through intelligent automation and optimization. In this session you will: understand how to use GenAI to reduce Databricks TCO; see how generative AI within Databricks solutions enables automated analysis of cluster logs, resource consumption, configurations and codebases to provide Spark optimization suggestions; create AI agentic workflows by integrating Databricks' AI tools and Databricks Data Engineering tools; and review a case study demonstrating how Total Cost of Ownership was reduced in practice. Attendees will leave with a clear understanding of how to implement AI within Databricks solutions to address similar cost challenges in their environments.

Rust and Lakehouse Format — Ask Us Anything

2025-06-12
lightning_talk
Denny Lee (Databricks) , Robert Pack (Databricks) , Tyler Croy (Scribd, Inc.)

Join us for an in-depth Ask Me Anything (AMA) on how Rust is revolutionizing Lakehouse formats like Delta Lake and Apache Iceberg through projects like delta-rs and iceberg-rs! Discover how Rust’s memory safety, zero-cost abstractions and fearless concurrency unlock faster development and higher-performance data operations. Whether you’re a data engineer, Rustacean or Lakehouse enthusiast, bring your questions on how Rust is shaping the future of open table formats!

Solving Exclusive Data Access With Role-Based Access Control

2025-06-12 Watch
lightning_talk
Siddharth Bhai (Databricks) , Stefania Leone (Databricks)

Do you have users who wear multiple hats over the course of a day? Perhaps they work with data from various customers, and you hope they don’t inadvertently aggregate it. Or perhaps they work on sensitive datasets, such as clinical trials, that should not be combined, or on datasets that are subject to regulations. We have a solution! In this session, we will present a new capability that allows users wearing multiple hats to switch roles in the Databricks workspace to work exclusively on a dedicated project or on the data of a particular client or clinical trial. When a user switches to a particular role, the workspace adapts so that only the workspace objects and UC data of that role are accessible. We will also showcase the administrative experience of setting up exclusive access using groups and UC permissions.

Solving Health AI’s Data Problem

2025-06-12 Watch
lightning_talk
Alex Aitken (Datavant)

AI in healthcare has a data problem. Fragmented data remains one of the biggest challenges, and bottlenecks the development and deployment of AI solutions across life sciences, payers, and providers. Legacy paper-driven workflows and fragmented technology perpetuate silos, making it difficult to create a comprehensive, real-time picture of patient health. Datavant is leveraging Databricks and AWS technology to solve this problem at scale. Through our partnership with Databricks, we are centralizing storage of clinical data from what is arguably the largest health data network so that we can transform it into structured, AI-ready data – and shave off 80 percent of the work of deploying a new AI use case. Learn how we are handling the complexity of this effort while preserving the integrity of source data. We’ll also share early use cases now available to our healthcare customers.

Sponsored by: Dagster Labs | The Age of AI is Changing Data Engineering for Good

2025-06-12 Watch
lightning_talk
Pedram Navid (Dagster Labs)

The last major shift in data engineering came during the rise of the cloud, transforming how we store, manage, and analyze data. Today, we stand at the cusp of the next revolution: AI-driven data engineering. This shift promises not just faster pipelines, but a fundamental change in the way data systems are designed and maintained. AI will redefine who builds data infrastructure, automating routine tasks, enabling more teams to contribute to data platforms, and (if done right) freeing up engineers to focus on higher-value work. However, this transformation also brings heightened pressure around governance, risk, and data security, requiring new approaches to control and oversight. For those prepared, this is a moment of immense opportunity – a chance to embrace a future of smarter, faster, and more responsive data systems.

Sponsored by: DataHub | Beyond the Lakehouse: Supercharging Databricks with Contextual Intelligence

2025-06-12 Watch
lightning_talk
Gabriel Lyons (DataHub)

While Databricks powers your data lakehouse, DataHub delivers the critical context layer connecting your entire ecosystem. We'll demonstrate how DataHub extends Unity Catalog to provide comprehensive metadata intelligence across platforms. With DataHub's real-time platform you can: cut AI model time-to-market with our unified REST and GraphQL APIs that ensure models train on reliable and compliant data from across platforms, with complete lineage tracking; decrease data incidents by 60% using our event-driven architecture that instantly propagates changes across systems; and transform data discovery from days to minutes with AI-powered search and natural language interfaces. Leaders use DataHub to transform Databricks data into integrated insights that drive business value. See our demo of syncback technology, which detects sensitive data and enforces Databricks access controls automatically, plus our AI assistant that enhances LLMs with cross-platform metadata.

Sponsored by: definity | How You Could Be Saving 50% of Your Spark Costs

2025-06-12 Watch
lightning_talk
Roy Daniel (definity)

Enterprise lakehouse platforms are rapidly scaling – and so are complexity and cost. After monitoring over 1B vCore-hours across Databricks and other Apache Spark™ environments, we consistently saw resource waste, preventable data incidents, and painful troubleshooting. Join this session to discover how definity’s unique full-stack observability provides job-level visibility in-motion, unifying infrastructure performance, pipeline execution, and data behavior, and see how enterprise teams use definity to easily optimize jobs and save millions – while proactively ensuring SLAs, preventing issues, and simplifying RCA.

Sponsored by: e6data, Inc. | Hybrid Lakehouses with Unity Governance, Local Execution and Egress Control

2025-06-12 Watch
lightning_talk
Vishnu Vasanth (e6data)

Data residency laws and legal mandates are driving the need for lakehouses across public and private clouds. This sprawl threatens centralized governance and compliance, while impacting cost, performance, and analytics/AI functionality. This session shows how e6data extends Unity Catalog across hybrid environments for consistent policy enforcement and query execution—regardless of data location—with guarantees around network egress, entitlements, performance, scalability, and cost. Learn how e6data’s “zero-data movement” philosophy powers a cost- and latency-optimized, location-aware architecture. We’ll cover onboarding strategies for hybrid fleets that enforce data movement restrictions and stay close to the data for better performance and lower cost. Discover how a location-aware compute strategy enables hybrid lakehouses with four key value metrics: cross-platform functionality, governed access, low latency, and total cost of ownership.

Sponsored by: Redpanda | IoT for Fun & Prophet: Scaling IoT and predicting the future with Redpanda, Iceberg & Prophet

2025-06-12 Watch
lightning_talk
Bryan Wood (Redpanda Data)

In this talk, we’ll walk through a complete real-time IoT architecture—from an economical, high-powered ESP32 microcontroller publishing environmental sensor data to AWS IoT, through Redpanda Connect into a Redpanda BYOC cluster, and finally into Apache Iceberg for long-term analytical storage. Once the data lands, we’ll query it using Python and perform linear regression with Prophet to forecast future trends. Along the way, we’ll explore the design of a scalable, cloud-native pipeline for streaming IoT data. Whether you're tracking the weather or building the future, this session will help you architect with confidence—and maybe even predict it.
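
For readers unfamiliar with the forecasting step, the sketch below fits a straight-line trend to sensor readings and extrapolates it. It uses ordinary least squares written by hand in plain Python. It does not call Prophet, whose model and API are richer than this, and the function names and sample readings are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx  # assumes the xs are not all identical
    intercept = mean_y - slope * mean_x
    return slope, intercept


def forecast(xs, ys, future_xs):
    """Extrapolate the fitted trend to future x values."""
    slope, intercept = fit_line(xs, ys)
    return [slope * x + intercept for x in future_xs]


# Hypothetical hourly temperature readings (hour index, degrees Celsius)
hours = [0, 1, 2, 3]
temps = [20.0, 20.5, 21.0, 21.5]
predicted = forecast(hours, temps, [4, 5])
```

A straight line ignores the daily and seasonal cycles that environmental data usually shows, which is one reason a dedicated forecasting library is used in practice.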

Sponsored by: Retool | Retooling Intelligence: Build Scalable, Secure AI Agents for the Enterprise with Databricks + Retool

2025-06-12 Watch
lightning_talk
Tom Konewka (Retool)

Enterprises need AI agents that are both powerful and production-ready while being scalable and secure. In this lightning session, you’ll learn how to leverage Retool’s platform and Databricks to design, deploy, and manage intelligent agents that automate complex workflows. We’ll cover best practices for integrating real-time Databricks data, enforcing governance, and ensuring scalability all while avoiding common pitfalls. Whether you’re automating internal ops or customer-facing tasks, walk away with a blueprint for shipping AI agents that actually work in the real world.

Summit Live: AI/BI Genie & Dashboards - Talk With Your Data With GenAI Powered Business Intelligence

2025-06-12 Watch
talk
Richard Tomlinson (Databricks) , Tim Riddle (Premier Inc)

AI/BI Genie lets anyone simply talk with their own data using natural language, fully secured through UC, to get accurate answers within the context of your organization. AI/BI Dashboards go beyond traditional BI tools, letting everyone self-serve immediate, interactive visuals on their own secured data. Hear from a customer and Databricks experts on the latest developments.

Accelerating Growth in Capital Markets: Data-Driven Strategies for Success

2025-06-12 Watch
talk
Bobby Grubert (RBC Capital Markets) , Antoine Amend (Databricks) , Raul Chavarria (B3 - Bolsa, Brasil e Balcão) , Jimmy Kozlow (Northern Trust)

Growth in capital markets thrives on innovation, agility and real-time insights. This session highlights how leading firms use Databricks’ Data Intelligence Platform to uncover opportunities, optimize trading strategies and deliver personalized client experiences. Learn how advanced analytics and AI help organizations expand their reach, improve decision-making and unlock new revenue streams. Industry leaders share how unified data platforms break down silos, deepen insights and drive success in a fast-changing market. Key takeaways: predictive analytics and machine learning strategies for growth; real-world examples of optimized trading and enhanced client engagement; and tools to innovate while ensuring operational efficiency. Discover how data intelligence empowers capital markets firms to thrive in today’s competitive landscape!

Agentic Architectures to Create Realistic Conversations: Using GenAI to Teach Empathy in Healthcare

2025-06-12 Watch
talk
Alex Ralevski (Tegria Consulting/Providence Healthcare)

Medical providers often receive less than 15 minutes of instruction in how to interact with patients during emotionally charged end-of-life interactions. Continuing education for clinicians is critical to hone these skills, but it is difficult to scale traditional approaches that require professional patients and instructors. Here, we describe a custom chatbot that plays the role of patient and coach to provide a scalable learning experience. A critical challenge was how to mitigate the persistently cheerful and helpful tone in the Patient Persona AI that results from standard pretraining. We accomplished this by implementing a multi-agent architecture based upon a graphical model of the conversation. System prompts reflecting the patient’s cognitive state are dynamically updated as the conversation progresses. Future extensions of the work are intended to focus on additional custom model fine-tuning in the Mosaic AI platform to further improve the realism of the conversation.

AI-Powered Data Discovery and Curation With Unity Catalog

2025-06-12 Watch
talk
Peter Wang (Databricks) , Hongyi Zhang (Databricks)

This session is repeated. In today’s data landscape, the challenge isn’t just storing or processing data — it’s enabling every user, from data stewards to analysts, to find and trust the right data, fast. This session explores how Databricks is reimagining data discovery with the new Discover Page Experience — an intuitive, curated interface showcasing key data and workspace assets. We’ll dive into AI-assisted governance and AI-powered discovery features like AI-generated metadata, AI-assisted lineage and natural language data exploration in Unity Catalog. Plus, see how new certifications and deprecations bring clarity to complex data environments. Whether you’re a data steward highlighting trusted assets or an analyst navigating data without deep schema knowledge, this session will show how Databricks is making data discovery seamless for everyone.

Cooking With SQL: From Ingredients to Insights With Minimal Prep

2025-06-12 Watch
talk
Fabien Contaminard (Databricks) , Serge Rielau (Databricks)
SQL

In this session we’ll dive into the SQL kitchen and use a combination of SQL staples and nouvelle cuisine such as recursive queries, temporary tables, and stored procedures. We’ll leave you with well-scripted recipes to execute immediately or store for later consumption in your Unity Catalog. Think of this session as building your go-to cookbook of SQL techniques. Bon appétit!
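
As a taste of one of the techniques named above, here is a recursive query run from Python against an in-memory SQLite database. SQLite is used only because it ships with Python. The session itself concerns Databricks SQL, whose syntax and limits may differ, and the `ingredients` table is a hypothetical example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ingredients (item TEXT, parent TEXT)")
conn.executemany(
    "INSERT INTO ingredients VALUES (?, ?)",
    [
        ("pizza", None),
        ("dough", "pizza"),
        ("sauce", "pizza"),
        ("flour", "dough"),
        ("tomato", "sauce"),
    ],
)

# The anchor member selects the root dish; the recursive member repeatedly
# joins child ingredients onto the rows found so far, tracking their depth.
RECURSIVE_SQL = """
WITH RECURSIVE parts(item, depth) AS (
    SELECT item, 0 FROM ingredients WHERE parent IS NULL
    UNION ALL
    SELECT i.item, p.depth + 1
    FROM ingredients AS i
    JOIN parts AS p ON i.parent = p.item
)
SELECT item, depth FROM parts ORDER BY depth, item
"""

rows = conn.execute(RECURSIVE_SQL).fetchall()
```

The query walks the hierarchy level by level, so `rows` holds the dish at depth 0, its direct ingredients at depth 1, and their components at depth 2.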

Cutting Costs, Not Performance: Optimizing Databricks at Scale

2025-06-12 Watch
talk
Pedro Ferreira (NTT DATA)

As Databricks transforms data processing, analytics and machine learning, managing platform costs has become crucial for organizations aiming to maximize value while staying within budget. While Databricks offers unmatched scalability and performance, inefficient usage can lead to unexpected cost overruns. This presentation will explore common challenges organizations face in controlling Databricks costs and provide actionable best practices for optimizing resource allocation, preventing over-provisioning and eliminating underutilization. Drawing from NTT DATA’s experience, I'll share how we reduced Databricks costs by up to 50% through strategies like choosing the right compute resource, leveraging managed tables and using Unity Catalog features, such as system tables, to monitor consumption. Join this session to gain practical insights and tools that will empower your team to optimize Databricks without overspending.