talk-data.com

Event

Data + AI Summit 2025

2025-06-09 – 2025-06-13 Databricks Summit Visit website ↗

Activities tracked

715

Sessions & talks

Showing 201–225 of 715 · Newest first

Summit Live: Industry-by-Industry - Customers, Solution Accelerators, and Use Cases Across Popular Industries

2025-06-11 Watch
talk
Michael Mylrea (Arctic Wolf), Adam Crown (Databricks)

Each industry has unique challenges, and Databricks provides solutions and use cases tailored to core AI-driven industries. Hear from some of our key customers about solution accelerators and use cases across popular industries.

Summit Live: Databricks Apps - Empowering data and AI teams to build and deploy applications with ease

2025-06-11 Watch
talk
Justin DeBrabant (Databricks)

Databricks Apps empowers data and AI teams to build and deploy applications with ease, in just minutes, not months! It’s the fastest and most secure way to deliver impactful solutions, with built-in governance based on Unity Catalog and the Databricks Data Intelligence Platform. See demos of the latest and greatest, and learn how you can get started right away.

Amplifying Human-to-Human Connection in the Face of Mental Health Crisis Using AI

2025-06-11 Watch
talk
Mateo Garcia Pepin (Crisis Text Line), Margaret Meagher (Crisis Text Line)

Crisis Text Line has been innovating in text-based mental health crisis intervention for ten years and is now leading the next wave of GenAI use cases in the space. With over 300 million messages exchanged since 2013 and a decade of expertise, Crisis Text Line is unlocking the potential of AI to amplify human connection at a global scale. We will discuss how we leveraged our bedrock application to co-navigate crisis care through a set of early AI agent workflows. First, a simulator that reproduces texter behavior to train responders on conversations of varying difficulty, up to those where the texter is at imminent risk of suicide or self-harm. Second, a tool that automatically monitors the clinical quality of conversations. Third, predictive summarization to capture key context before conversations are transferred. Through the power of suggestion, this compound system aims to reduce burden and drive efficiency, so that our responders can focus on what they do best: supporting people in need.

Crypto at Scale: Building a High-Performance Platform for Real-Time Blockchain Data

2025-06-11 Watch
talk
Matthew Moorcroft (Databricks), Ferran Cabezas Castellvi (Elliptic)

In today’s fast-evolving crypto landscape, organizations require fast, reliable intelligence to manage risk, investigate financial crime and stay ahead of evolving threats. In this session, we will show how Elliptic built a scalable, high-performance Data Intelligence Platform that delivers real-time, actionable blockchain insights to their customers. We’ll walk you through some of the key components of the Elliptic platform, including the Elliptic Entity Graph and our User-Facing Analytics. Our focus will be on the evolution of our User-Facing Analytics capabilities, and specifically how components from the Databricks ecosystem such as Structured Streaming, Delta Lake and SQL Warehouse have played a vital role. We’ll also share some of the optimizations we’ve made to our streaming jobs to maximize performance and ensure data completeness. Whether you’re looking to enhance your streaming capabilities, expand your knowledge of how crypto analytics works or simply discover novel approaches to data processing at scale, this session will provide concrete strategies and valuable lessons learned.

Databricks Observability: Using System Tables to Monitor and Manage Your Databricks Instance

2025-06-11 Watch
talk
Greg Kroleski (Databricks), Michael Postiglione (Databricks)

The session will cover how to use Unity Catalog governed system tables to understand what is happening in Databricks. We will touch on key scenarios for FinOps, DevOps and SecOps to ensure you have a well-observed Data Intelligence Platform. Learn about new developments in system tables and other features that will help you observe your Databricks instance.
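
The FinOps scenario above reduces to aggregating usage records per workspace. Here is a pure-Python sketch over mock rows shaped loosely like the `system.billing.usage` system table; the row values are illustrative, and in a workspace you would query the real table with SQL rather than build rows by hand:

```python
from collections import defaultdict

# Mock rows shaped loosely like system.billing.usage
# (workspace_id, sku_name, usage_quantity); values are invented.
usage_rows = [
    {"workspace_id": "ws-1", "sku_name": "PREMIUM_JOBS_COMPUTE", "usage_quantity": 12.5},
    {"workspace_id": "ws-1", "sku_name": "PREMIUM_SQL_COMPUTE", "usage_quantity": 4.0},
    {"workspace_id": "ws-2", "sku_name": "PREMIUM_JOBS_COMPUTE", "usage_quantity": 7.25},
]

def dbus_by_workspace(rows):
    """Sum DBU usage per workspace: the core of a FinOps chargeback report."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["workspace_id"]] += row["usage_quantity"]
    return dict(totals)

print(dbus_by_workspace(usage_rows))  # {'ws-1': 16.5, 'ws-2': 7.25}
```

The same grouping, joined against list prices and filtered by date, is the usual starting point for cost dashboards built on system tables.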

Data Intelligence for Marketing Breakout: Agentic Systems for Bayesian MMM and Consumer Testing

2025-06-11 Watch
talk
Dan Morris (Databricks), Luca Fiaschi (PyMC Labs)

This talk dives into leveraging GenAI to scale sophisticated decision intelligence. Learn how an AI copilot interface simplifies running complex Bayesian probabilistic models, accelerating insight generation and enabling accurate decision-making at the enterprise level. We walk through techniques for deploying AI agents at scale to simulate market dynamics or product feature impacts, providing robust, data-driven foresight for high-stakes innovation and strategy directly within your Databricks environment. For marketing teams, this approach will help you leverage autonomous AI agents to dynamically manage media channel allocation while simulating real-world consumer behavior through synthetic testing environments.
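
Bayesian MMMs typically rest on two standard media transforms: adstock (carryover of past spend) and saturation (diminishing returns). A minimal sketch of both, assuming geometric adstock and a simple Hill-style curve rather than PyMC Labs’ actual model code:

```python
def adstock(spend, decay=0.5):
    """Geometric adstock: each period carries over a fraction of the
    previous period's accumulated media effect."""
    carried, out = 0.0, []
    for s in spend:
        carried = s + decay * carried
        out.append(carried)
    return out

def saturate(x, half_sat=100.0):
    """Hill-style saturation: response approaches 1 with diminishing
    returns; half_sat is the spend level yielding half the max effect."""
    return x / (x + half_sat)

# A one-off spend of 100 decays over the following periods.
print(adstock([100, 0, 0]))  # [100.0, 50.0, 25.0]
```

In a full MMM, `decay` and `half_sat` are parameters with priors, inferred per channel; here they are fixed constants purely for illustration.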

Delivering Sub-Second Latency for Operational Workloads on Databricks

2025-06-11 Watch
talk
Karthikeyan Ramasamy (Databricks), Jerry Peng (Databricks)

As enterprise streaming adoption accelerates, more teams are turning to real-time processing to support operational workloads that require sub-second response times. To address this need, Databricks introduced Project Lightspeed in 2022, which recently delivered Real-Time Mode in Apache Spark™ Structured Streaming. This new mode achieves consistent p99 latencies under 300ms for a wide range of stateless and stateful streaming queries. In this session, we’ll define what constitutes an operational use case, outline typical latency requirements and walk through how to meet those SLAs using Real-Time Mode in Structured Streaming.
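
As a concrete check of the kind of SLA described above, p99 latency can be computed with the nearest-rank method. This toy sketch uses synthetic numbers, not a Databricks benchmark:

```python
import math

def p99(latencies_ms):
    """Nearest-rank p99: the latency under which 99% of requests complete."""
    ordered = sorted(latencies_ms)
    rank = math.ceil(0.99 * len(ordered))
    return ordered[rank - 1]

# 1,000 synthetic samples: 990 fast requests plus 10 slow outliers.
samples = [50] * 990 + [400] * 10
meets_sla = p99(samples) < 300  # compare against a 300ms operational SLA
```

Note that p99 tolerates up to 1% of outliers: the ten 400ms samples above do not break the 300ms target.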

Developing the Dreamers of Data + AI’s Future: How 84.51˚ builds upskilling to accelerate adoption

2025-06-11 Watch
talk
Michael Carrico (84.51)

“Once an idea has taken hold of the brain it's almost impossible to eradicate. An idea that is fully formed — fully understood — that sticks, right in there somewhere.” The Data Scientists and Engineers at 84.51˚ utilize the Databricks Lakehouse for a wide array of tasks, including data exploration, analysis, machine learning operations, orchestration, automated deployments and collaboration. In this talk, 84.51˚’s Data Science Learning Lead, Michael Carrico, will share their approach to upskilling a diverse workforce to support the company’s strategic initiatives. This approach includes creating tailored learning experiences for a variety of personas using content curated in partnership with Databricks’ educational offerings. Then he will demonstrate how he puts his 11 years of data science and engineering experience to work by using the Databricks Lakehouse not just as a subject, but also as a tool to create impactful training experiences and a learning culture at 84.51˚.

Empowering the Warfighter With AI

2025-06-11 Watch
talk
Teneika Askew (Navy)

The new Budget Execution Validation process has transformed how the Navy reviews unspent funds. Powered by Databricks Workflows, MLflow, Delta Lake and Apache Spark™, this data-driven model predicts which financial transactions are most likely to have errors, streamlining reviews and increasing accuracy. In FY24, it helped review $40 billion, freeing $1.1 billion for other priorities, including $260 million from active projects. By reducing reviews by 80%, cutting job runtime by over 50% and lowering costs by 60%, it saved 218,000 work hours and $6.7 million in labor costs. With automated workflows and robust data management, this system exemplifies how advanced tools can improve financial decision-making, save resources and ensure efficient use of taxpayer dollars.
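
The core of such a model, prioritizing the transactions most likely to contain errors, can be sketched as a threshold filter over predicted error probabilities. The data and threshold below are illustrative, not the Navy’s actual model:

```python
def flag_for_review(transactions, threshold=0.8):
    """Keep only transactions whose predicted error probability crosses the
    threshold, so reviewers focus on likely errors instead of everything."""
    return [t for t in transactions if t["error_prob"] >= threshold]

# Hypothetical model scores for five transactions.
txns = [
    {"id": "T1", "error_prob": 0.95},
    {"id": "T2", "error_prob": 0.10},
    {"id": "T3", "error_prob": 0.85},
    {"id": "T4", "error_prob": 0.05},
    {"id": "T5", "error_prob": 0.30},
]
flagged = flag_for_review(txns)  # only 2 of 5 go to manual review
```

Tuning the threshold trades review workload against the risk of missing errors, which is how a reduction like the 80% cited above is achieved in practice.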

Healthcare and Life Sciences: Getting Started with AI Agents

2025-06-11 Watch
talk
William Smith (Databricks), James McCall (Databricks)

Healthcare and life sciences organizations are exploring AI agents, driving transformation from intelligent supply chains to virtual assistants that elevate the patient experience. This session explores how you can get started with AI agents, powered by Databricks and robust data governance, tapping into the full potential of all your data. You’ll learn practical steps for getting started: unifying data with Databricks, ensuring compliance with Unity Catalog, and rapidly deploying AI agents to drive operational efficiency, improve care and foster innovation across healthcare and life sciences.

How to Migrate From Oracle to Databricks SQL

2025-06-11 Watch
talk
Laurent Léturgez (Databricks)

Migrating your legacy Oracle data warehouse to the Databricks Data Intelligence Platform can accelerate your data modernization journey. In this session, learn the top strategies for completing this data migration. We will cover data type conversion, basic to complex code conversions, and validation and reconciliation best practices. Discover the pros and cons of exporting to CSV files and loading with PySpark versus using pipelines to land data directly in Databricks tables. See before-and-after architectures of customers who have migrated, and learn about the benefits they realized.
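
To illustrate the data type conversion step, here is a sketch of an Oracle-to-Databricks SQL type map. The mappings shown are common choices for illustration only, not an exhaustive or official matrix; real migrations also need precision and scale rules per column:

```python
# Illustrative subset of an Oracle -> Databricks SQL type mapping.
ORACLE_TO_DBSQL = {
    "VARCHAR2": "STRING",
    "NVARCHAR2": "STRING",
    "NUMBER": "DECIMAL(38,18)",  # real migrations map NUMBER(p,s) -> DECIMAL(p,s)
    "DATE": "TIMESTAMP",         # Oracle DATE carries a time component
    "CLOB": "STRING",
    "RAW": "BINARY",
}

def convert_column(name, oracle_type):
    """Render one column of a target DDL, defaulting unknown types to STRING."""
    target = ORACLE_TO_DBSQL.get(oracle_type.upper(), "STRING")
    return f"{name} {target}"

print(convert_column("customer_name", "VARCHAR2"))  # customer_name STRING
```

Driving such a map over `ALL_TAB_COLUMNS` output is a typical first pass at schema conversion, before code conversion and reconciliation begin.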

Iceberg Table Format Adoption and Unified Metadata Catalog Implementation in Lakehouse Platform

2025-06-11 Watch
talk
Ruotian Wang (DoorDash), Sergey Zavgorodni (DoorDash)

The DoorDash Data organization is actively adopting the lakehouse paradigm. This presentation describes a methodology for migrating classic data warehouse and data lake platforms to a unified lakehouse solution. The objectives of this effort include: elimination of excessive data movement; seamless integration and consolidation of the query engine layers, including Snowflake, Databricks, EMR and Trino; query performance optimization; abstracting away the complexity of underlying storage layers and table formats; and a strategic, justified decision on the unified metadata catalog used across various compute platforms.

LLMOps at Intermountain Health: A Case Study on AI Inventory Agents

2025-06-11 Watch
talk
Mark Nielsen (Intermountain Healthcare)

In this session, we will delve into the creation of an infrastructure, CI/CD processes and monitoring systems that facilitate the responsible and efficient deployment of Large Language Models (LLMs) at Intermountain Healthcare. Using the "AI Inventory Agents" project as a case study, we will showcase how an LLM Agent can assist in effort and impact estimates, as well as provide insights into various AI products, both custom-built and third-party hosted. This includes their responsible AI certification status, development status and monitoring status (lights on, performance, drift, etc.). Attendees will learn how to build and customize their own LLMOps infrastructure to ensure seamless deployment and monitoring of LLMs, adhering to responsible AI practices.
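
One common way to quantify the drift that monitoring systems like these track is the Population Stability Index (PSI). A minimal sketch, assuming the baseline and current score distributions have already been bucketed into matching histograms:

```python
import math

def psi(expected, actual):
    """Population Stability Index over matching histogram buckets.
    Rule of thumb: < 0.1 stable, 0.1-0.25 moderate drift, > 0.25 significant."""
    score = 0.0
    for e, a in zip(expected, actual):
        e, a = max(e, 1e-6), max(a, 1e-6)  # guard against empty buckets
        score += (a - e) * math.log(a / e)
    return score

baseline = [0.25, 0.25, 0.25, 0.25]       # training-time score distribution
drifted = psi(baseline, [0.10, 0.20, 0.30, 0.40])  # production distribution
```

A scheduled job computing this over each model’s recent outputs is one simple way to light the “drift” indicator the abstract mentions.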

No More Fragile Pipelines: Kafka and Iceberg the Declarative Way

2025-06-11 Watch
talk
Adi Polak (Confluent)

Moving data between operational systems and analytics platforms is often painful. Traditional pipelines become complex, brittle and expensive to maintain. Take Kafka and Iceberg: batching on Kafka causes ingestion bottlenecks, while streaming-style writes to Iceberg create too many small Parquet files, cluttering metadata, degrading queries and increasing maintenance overhead. Frequent updates further strain background table operations, causing retries, even before dealing with schema evolution. But much of this complexity is avoidable. What if Kafka topics and Iceberg tables were treated as two sides of the same coin? By establishing a transparent equivalence, we can rethink pipeline design entirely. This session introduces Tableflow, a new approach to bridging streaming and table-based systems. It shifts complexity away from pipelines and into a unified layer, enabling simpler, declarative workflows. We’ll cover schema evolution, compaction, topic-to-table mapping, and how to continuously materialize and optimize thousands of topics as Iceberg tables. Whether modernizing or starting fresh, you’ll leave with practical insights for building resilient, scalable and future-proof data architectures.
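
The small-file compaction problem above amounts to grouping many small Parquet files into larger rewrite targets. A toy greedy bin-packing sketch with illustrative file sizes (not Tableflow internals):

```python
def plan_compaction(file_sizes_mb, target_mb=512):
    """Greedy bin-packing: group small files into rewrite groups of roughly
    target_mb each, the core move behind table compaction jobs."""
    groups, current, current_size = [], [], 0
    for size in sorted(file_sizes_mb, reverse=True):
        if current and current_size + size > target_mb:
            groups.append(current)
            current, current_size = [], 0
        current.append(size)
        current_size += size
    if current:
        groups.append(current)
    return groups

# Eight 125 MB files compact into two ~500 MB rewrite groups.
plan = plan_compaction([125] * 8)
```

Fewer, larger files mean fewer metadata entries and fewer file opens per query, which is exactly the overhead streaming-style writes inflate.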

Optimizing Analytics Infrastructure: Lessons from Migrating Snowflake to Databricks

2025-06-11 Watch
talk
Amit Rustagi (DeeplearningAPI)

This session explores the strategic migration from Snowflake to Databricks, focusing on the journey of transforming a data lake to leverage Databricks’ advanced capabilities. It outlines the assessment of key architectural differences, performance benchmarks, and cost implications driving the decision. Attendees will gain insights into planning and execution, including data ingestion pipelines, schema conversion and metadata migration. Challenges such as maintaining data quality, optimizing compute resources and minimizing downtime are discussed, alongside solutions implemented to ensure a seamless transition. The session highlights the benefits of unified analytics and enhanced scalability achieved through Databricks, delivering actionable takeaways for similar migrations.

Scaling Identity Graph Ingestion to 1M Events/Sec with Spark Streaming & Delta Lake

2025-06-11 Watch
talk
Akanksha Nagpal (Adobe), Jianmei Ye (Adobe, Inc.)

Adobe’s Real-Time Customer Data Platform relies on the identity graph to connect over 70 billion identities and deliver personalized experiences. This session will showcase how the platform leverages Databricks, Spark Streaming and Delta Lake, along with 25+ Databricks deployments across multiple regions and clouds — Azure & AWS — to process terabytes of data daily and handle over a million records per second. The talk will highlight the platform’s ability to scale, demonstrating a 10x increase in ingestion pipeline capacity to accommodate peak traffic during events like the Super Bowl. Attendees will learn about the technical strategies employed, including migrating from Flink to Spark Streaming, optimizing data deduplication, and implementing robust monitoring and anomaly detection. Discover how these optimizations enable Adobe to deliver real-time identity resolution at scale while ensuring compliance and privacy.
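
The deduplication optimization mentioned above can be mimicked in miniature: drop repeated event ids while expiring state beyond a watermark so memory stays bounded. This is a pure-Python stand-in for watermark-bounded deduplication, not Adobe’s actual Spark code:

```python
def dedupe_stream(events, watermark_keep=2):
    """Drop duplicate event ids, remembering only ids seen within the last
    `watermark_keep` time buckets (a watermark keeps state bounded)."""
    seen = {}   # event_id -> time bucket last seen
    out = []
    for bucket, event_id in events:
        # Expire state older than the watermark before checking membership.
        seen = {k: b for k, b in seen.items() if bucket - b < watermark_keep}
        if event_id not in seen:
            out.append((bucket, event_id))
        seen[event_id] = bucket
    return out

# "a" repeats within the watermark (dropped) and again after it (kept).
stream = [(1, "a"), (1, "b"), (1, "a"), (2, "a"), (4, "a")]
deduped = dedupe_stream(stream)
```

In Structured Streaming the same trade-off appears as watermark-bounded `dropDuplicates`: a longer watermark catches more duplicates but holds more state per executor.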

Securing Capital Markets: AI-Powered Risk Management for Resilience

2025-06-11 Watch
talk
Luis Amador (Moody's Analytics), Kim Hatton (Databricks), Eric Suss (Morgan Stanley), Venkatesh Ganesan (State Street)

In capital markets, mitigating risk is critical to protecting the firm’s reputation, assets and clients. This session highlights how firms use technology to enhance risk management, ensure compliance and safeguard operations from emerging threats. Learn how advanced analytics and machine learning models are helping firms detect anomalies, prevent fraud and manage regulatory complexities with greater precision. Hear from industry leaders who have successfully implemented proactive risk strategies that balance security with operational efficiency. Key takeaways: techniques for identifying risks early using AI-powered anomaly detection; best practices for achieving compliance across complex regulatory environments; and insights into building resilient operations that protect assets without compromising growth potential. Don’t miss this session to discover how data intelligence is transforming risk management in capital markets, helping firms secure their future while driving success.
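
A baseline version of AI-powered anomaly detection is a z-score filter over transaction values. This minimal sketch uses synthetic numbers; production systems use far richer features and models:

```python
import statistics

def zscore_anomalies(values, threshold=2.5):
    """Flag observations more than `threshold` population standard deviations
    from the mean: a classic first-pass anomaly detector."""
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)
    return [v for v in values if stdev and abs(v - mean) / stdev > threshold]

# Nine ordinary trade notionals and one extreme outlier (synthetic data).
trades = [100, 102, 98, 101, 99, 100, 97, 103, 100, 990]
suspicious = zscore_anomalies(trades)
```

Real fraud systems replace the single z-score with multivariate and learned models, but the pattern of scoring deviations from an expected baseline and routing outliers to review is the same.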

Semiconductor AI Success: Marvell’s Data + AI Governance

2025-06-11 Watch
talk
Ram Kaushik Pochiraju (Marvell Semiconductors Inc), Vinod Chakravarthy (Databricks)

Marvell’s AI-driven solutions, powered by Databricks’ Data Intelligence Platform, provide a robust framework for secure, compliant and transparent Data and AI workflows leveraging Data & AI Governance through Unity Catalog. Marvell ensures centralized management of data and AI assets with quality, security, lineage and governance guardrails. With Databricks Unity Catalog, Marvell achieves comprehensive oversight of structured and unstructured data, AI models and notebooks. Automated governance policies, fine-grained access controls and lineage tracking help enforce regulatory compliance while streamlining AI development. This governance framework enhances trust and reliability in AI-powered decision-making, enabling Marvell to scale AI innovation efficiently while minimizing risks. By integrating data security, auditability and compliance standards, Marvell is driving the future of responsible AI adoption with Databricks.

Smart Data, Smarter Vehicles: Building the Foundation for the Future of Transportation

2025-06-11 Watch
talk
Jon Brown (Boeing), David Rogers (Databricks)

Join industry pioneers Boeing and CARIAD (Volkswagen Group) as they showcase how advanced data platforms are revolutionizing mobility across air and ground transportation. Boeing's Jeppesen Smart NOTAMs system demonstrates the power of compound AI in aviation safety, processing over 4.5M critical flight notices annually and serving 75% of commercial aviation through an innovative combination of MLflow, GenAI, and Delta Sharing technologies. CARIAD follows with insights into their groundbreaking Unified Data Ecosystem (UDE), the singular data platform powering Volkswagen Group's global mobility transformation across all brands and markets. Together, these leaders illustrate how smart data architecture is building the foundation for the future of transportation, from the skies to the streets.

Summit Live: Spark Talk - Everything Spark, Lakeflow Declarative Pipelines, and Open Source

2025-06-11 Watch
talk
Michael Armbrust (Databricks)

Databricks co-founders created Spark, the wildly popular open source foundation of Databricks, way back in 2009. Learn from Michael Armbrust, creator of Spark SQL and leader of Databricks Delta, about the latest happenings in Spark, Lakeflow Declarative Pipelines, and open source.

Supercharge Your Enterprise BI: A Practitioner’s Guide for Migrating to AI/BI

2025-06-11 Watch
talk
RK Aduri (Databricks), Aman Gupta (Databricks)

Are you striving to build a data-driven culture while managing costs and reducing reporting latency? Are your BI operations bogged down by complex data movements rather than delivering insights? Databricks IT faced these challenges in 2024 and embarked on an ambitious journey to make Databricks AI/BI our enterprise-wide reporting platform. In just two quarters, we migrated 2,000 dashboards from a traditional BI tool — without disrupting business operations. We’ll share how we executed this large-scale transition cost-effectively, ensuring seamless change management and empowering non-technical users to leverage AI/BI. You’ll gain insights into key migration strategies that minimized disruption and optimized efficiency; best practices for user adoption and training to drive self-service analytics; and how we measured success with clear adoption metrics and business impact. Join us to learn how your organization can achieve the same transformation with AI-powered enterprise reporting.

Transforming Customer Processes and Gaining Productivity With Lakeflow Declarative Pipelines

2025-06-11 Watch
talk
Marcos Abrantes Gomes (Bradesco Bank), Ademir Francisquini Junior (Banco Bradesco S.A.)

Bradesco Bank is one of the largest private banks in Latin America, with over 75 million customers and over 80 years of presence in FSI. In the digital business, velocity in reacting to customer interactions is crucial to success. In the legacy landscape, acquiring data points on interactions across digital and marketing channels was complex, costly and lacking in integrity due to the typical fragmentation of tools. With the new in-house Customer Data Platform powered by the Databricks Data Intelligence Platform, it was possible to completely transform the data strategy around customer data. Using key components such as UniForm and Lakeflow Declarative Pipelines, it was possible to increase data integrity, reduce latency and processing time and, most importantly, boost personal productivity and business agility. Months of reprocessing, weeks of human labor, and cumbersome and complex data integrations were dramatically simplified, achieving significant operational efficiency.

Unifying GTM Analytics: The Strategic Shift to Native Analytics and AI/BI Dashboards at Databricks

2025-06-11 Watch
talk
Abhinav Bhatnagar (Databricks), David Gojo (Databricks)

The GTM team at Databricks recently launched the GTM Analytics Hub—a native AI/BI platform designed to centralize reporting, streamline insights, and deliver personalized dashboards based on user roles and business needs. Databricks Apps also played a crucial role in this integration by embedding AI/BI Dashboards directly into internal tools and applications, streamlining access to insights without disrupting workflows. This seamless embedding capability allows users to interact with dashboards within their existing platforms, enhancing productivity and collaboration. Furthermore, AI/BI Dashboards leverage Databricks' unified data and governance framework. Join us to learn how we’re using Databricks to build for Databricks—transforming GTM analytics with AI/BI Dashboards, and what it takes to drive scalable, user-centric analytics adoption across the business.

Unlock the Potential of Your Enterprise Data With Zero-Copy Data Sharing, featuring SAP and Salesforce

2025-06-11 Watch
talk
Akram Chetibi (Databricks), Senthil Krishnapillai (SAP Labs), Rajkumar Irudayaraj (Salesforce)

Tired of data silos and the constant need to move copies of your data across different systems? Imagine a world where all your enterprise data is readily available in Databricks without the cost and complexity of duplication and ingestion. Our vision is to break down these silos by enabling seamless, zero-copy data sharing across platforms, clouds and regions. This unlocks the true potential of your data for analytics and AI, empowering you to make faster, more informed decisions leveraging your most important enterprise data sets. In this session, you will hear from Databricks, SAP and Salesforce product leaders on how zero-copy data sharing can unlock the value of enterprise data. Explore how Delta Sharing makes this vision a reality, providing secure, zero-copy data access for enterprises. SAP Business Data Cloud: see Delta Sharing in action to unlock operational reporting, supply chain optimization and financial planning. Salesforce Data Cloud: enable customer analytics, churn prediction and personalized marketing.

Fueling Efficiency: How Pilot Uses Vector Stores, Data Quality, and GenAI to Deliver Business Value

2025-06-11 Watch
lightning_talk
Travis Lawrence (Pilot Travel Centers)

In the complex world of logistics, efficiency and accuracy are paramount. At Pilot, the largest travel center network in North America, managing fuel delivery operations was a time-intensive and error-prone process. Tasks like processing delivery records and validating fuel transaction data posed significant challenges due to the diverse formats and handwritten elements involved. After several attempts to use robotic process automation failed, the team turned to Generative AI to automate and streamline this critical business process. In this session, discover how Pilot leverages GenAI, powered by advanced text and vision models, to revolutionize bill of lading (BOL) processing. By implementing few-shot learning and vectorized examples, the data team at Pilot was able to increase document parsing accuracy from 70% to 95%, enabling real-time validation against truck driver inputs, which has resulted in millions in savings from accelerated credit reconciliation and improved financial operations.
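
The vectorized-examples approach described amounts to retrieving the labeled documents nearest to a new one in embedding space and using them as few-shot prompt examples. A toy sketch with hypothetical 3-dimensional embeddings; real systems use model-generated vectors and a vector store:

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def nearest_examples(query_vec, store, k=2):
    """Return the k labeled examples closest to the query embedding;
    these become the few-shot examples in the extraction prompt."""
    return sorted(store, key=lambda ex: cosine(query_vec, ex["vec"]), reverse=True)[:k]

# Hypothetical embeddings of previously parsed documents.
store = [
    {"doc": "handwritten BOL, supplier A", "vec": [1.0, 0.0, 0.1]},
    {"doc": "typed BOL, supplier B", "vec": [0.0, 1.0, 0.0]},
    {"doc": "handwritten BOL, supplier C", "vec": [0.9, 0.1, 0.2]},
]
shots = nearest_examples([1.0, 0.0, 0.0], store, k=2)
```

Retrieving examples that resemble the incoming document (same supplier, same layout) is what lets a few-shot prompt handle the format diversity the abstract describes.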