talk-data.com

Event

Data + AI Summit 2025

2025-06-09 – 2025-06-13 · Databricks Summit

Activities tracked

715

Sessions & talks

Showing 276–300 of 715 · Newest first

Sponsored by: Deloitte | Transforming Nestlé USA’s (NUSA) data platform to unlock new analytics and GenAI capabilities

2025-06-11 Watch
lightning_talk
Rosana Rodrigues (Nestle Global HQ)

Nestlé USA, a division of the world’s largest food and beverage company, Nestlé S.A., has embarked on a transformative journey to unlock GenAI capabilities on their data platform. Deloitte, Databricks, and Nestlé have collaborated on a data platform modernization program to address gaps associated with Nestlé’s existing data platform. This joint effort introduces new possibilities and capabilities, ranging from developing advanced machine learning models to implementing Unity Catalog and adopting Lakehouse Federation, all while adhering to confidentiality protocols. With help from Deloitte and Databricks, Nestlé USA is now able to meet its advanced enterprise analytics and AI needs with the Databricks Data Intelligence Platform.

Sponsored by: Domo, Inc | Enabling AI-Powered Business Solutions w/Databricks & Domo

2025-06-11 Watch
lightning_talk
Laura Qualey (Domo)

Domo's Databricks integration seamlessly connects business users to both Delta Lake data and AI/ML models, eliminating technical barriers while maximizing performance. Domo's Cloud Amplifier optimizes data processing through pushdown SQL, while the Domo AI Services layer enables anyone to leverage both traditional ML and large language models directly from Domo. During this session, we’ll explore an AI solution around fraud detection to demonstrate the power of leveraging Domo on Databricks.

Sponsored by: EY | Unlocking Value Through AI at Takeda Pharmaceuticals

2025-06-11 Watch
lightning_talk
Naveed Afzal (Takeda)

In the rapidly evolving landscape of pharmaceuticals, the integration of AI and GenAI is transforming how organizations operate and deliver value. We will explore the profound impact of the AI program at Takeda Pharmaceuticals and the central role of Databricks. We will delve into eight pivotal AI/GenAI use cases that enhance operational efficiency across commercial, R&D, manufacturing, and back-office functions, including these capabilities:
- Responsible AI Guardrails: Scanners that validate and enforce responsible AI controls on GenAI solutions
- Reusable Databricks Native Vectorization Pipeline: A scalable solution enhancing data processing with quality and governance
- One-Click Deployable RAG Pattern: Simplifying deployment for AI applications, enabling rapid experimentation and innovation
- AI Asset Registry: A repository for foundational models, vector stores, and APIs, promoting reuse and collaboration

Sponsored by: Sigma | Trading Spreadsheets for Speed: TradeStation’s Self-Service Revolution

2025-06-11 Watch
lightning_talk
Nico Gonzalez (TradeStation)

To meet the growing internal demand for accessible, reliable data, TradeStation migrated from fragmented, spreadsheet-driven workflows to a scalable, self-service analytics framework powered by Sigma on Databricks. This transition enabled business and technical users alike to interact with governed data models directly on the lakehouse, eliminating data silos and manual reporting overhead. In brokerage trading operations, the integration supports robust risk management, automates key operational workflows, and centralizes collaboration across teams. By leveraging Sigma’s intuitive interface on top of Databricks’ scalable compute and unified data architecture, TradeStation has accelerated time-to-insight, improved reporting consistency, and empowered teams to operationalize data-driven decisions at scale.

Transforming Data at Rheem: From Silos to Scalable Data Lakehouse With Databricks and Unity Catalog

2025-06-11 Watch
lightning_talk
Elise Haskins (EY), Joseph Palomba (Rheem)

Rheem's journey from a fragmented data landscape to a robust, scalable data platform powered by Databricks showcases the power of data modernization. In just 1.5 years, Rheem evolved from siloed reporting to 30+ certified data products, integrated with 20+ source systems, including MDM. This transformation has unlocked significant business value across sales, procurement, service and operations, enhancing decision-making and operational efficiency. This session will delve into Rheem's implementation of Databricks, highlighting how it has become the cornerstone of rapid data product development and efficient data sharing across the organization. We will also explore upcoming enhancements with Unity Catalog, including the full migration from HMS to UC. Attendees will gain insights into best practices for building a centralized data platform, enhancing developer experience, and improving governance capabilities, as well as tips and tricks for a successful UC migration and enablement.

Unifying Customer Data to Drive a New Automotive Experience With Lakeflow Connect

2025-06-11 Watch
lightning_talk
Andreas Maier (Porsche Informatik GmbH)

The Databricks Data Intelligence Platform and Lakeflow Connect have transformed how Porsche manages and uses its customer data. By opting to use Lakeflow Connect instead of building a custom solution, the company has reaped the benefits of both operational efficiency and cost management. Internally, teams at Porsche now spend less time managing data integration processes. “Lakeflow Connect has enabled our dedicated CRM and Data Science teams to be more productive as they can now focus on their core work to help innovate, instead of spending valuable time on the data ingestion integration with Salesforce,” says Gruber. This shift in focus is aligned with broader industry trends, where automotive companies are redirecting significant portions of their IT budgets toward customer experience innovations and digital transformation initiatives. This story was also shared as part of a Databricks Success Story — Elise Georis, Giselle Goicochea.

Unity Catalog Implementation & Evolution at Edward Jones

2025-06-11 Watch
lightning_talk
Dattatri Rao (Edward Jones)

This presentation outlines the evolution of Databricks and its integration with cloud analytics at Edward Jones. It focuses on the transition from Cloud V1.x to Cloud V2.0, highlighting the challenges faced with the initial setup and Unity Catalog implementation, and the improvements planned for the future, particularly in terms of data cataloging, architecture and disaster recovery. Highlights:
- Cloud Analytics Journey
- Current Setup (Cloud V1.x): utilizes a Medallion architecture customized to Edward Jones' needs; challenges and limitations identified with integration, limited catalogs, disaster recovery, etc.
- Cloud V2.0 Enhancements: modifications to storage and compute in the Medallion layers, next-level integration with enterprise suites, disaster recovery readiness
- Future outlook

Authoring Data Pipelines With the New Lakeflow Declarative Pipelines Editor

2025-06-11 Watch
talk
Adriana Ispas (Databricks), Camiel Steenstra (Databricks)

We’re introducing a new developer experience for Lakeflow Declarative Pipelines designed for data practitioners who prefer a code-first approach and expect robust developer tooling. The new multi-file editor brings an IDE-like environment to declarative pipeline development, making it easy to structure transformation logic, configure pipelines throughout the development lifecycle and iterate efficiently. Features like contextual data previews and selective table updates enable step-by-step development. UI-driven tools, such as DAG previews and DAG-based actions, enhance productivity for experienced users and provide a bridge for those transitioning to declarative workflows. In this session, we’ll showcase the new editor in action, highlighting how these enhancements simplify declarative coding and improve development for production-ready data pipelines. Whether you’re an experienced developer or new to declarative data engineering, join us to see how Lakeflow Declarative Pipelines can enhance your data practice.
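
For readers new to the declarative style the editor targets, a minimal pipeline definition might look like the sketch below, written against the dlt Python module that Lakeflow Declarative Pipelines builds on; the table names, columns and source path are illustrative, not taken from the session.

```python
# Minimal declarative pipeline sketch. Assumes it runs inside a Databricks
# pipeline, where `spark` and the `dlt` module are provided; table names,
# columns and the source path below are placeholders.
import dlt
from pyspark.sql import functions as F

@dlt.table(comment="Raw orders ingested incrementally from cloud storage.")
def orders_raw():
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("/Volumes/demo/raw/orders")  # placeholder source path
    )

@dlt.table(comment="Cleaned orders with a basic data quality expectation.")
@dlt.expect_or_drop("valid_amount", "amount > 0")
def orders_clean():
    return (
        dlt.read_stream("orders_raw")
        .withColumn("order_date", F.to_date("order_ts"))
        .select("order_id", "customer_id", "amount", "order_date")
    )
```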

Breaking Barriers: Building Custom Spark 4.0 Data Connectors with Python

2025-06-11 Watch
talk
Sourav Gulati (Databricks), Ashish Saraswat (Databricks)

Building a custom Spark data source connector once required Java or Scala expertise, making it complex and limiting. This left many proprietary data sources without public SDKs disconnected from Spark. Additionally, data sources with Python SDKs couldn't harness Spark’s distributed power. Spark 4.0 changes this with a new Python API for data source connectors, allowing developers to build fully functional connectors without Java or Scala. This unlocks new possibilities, from integrating proprietary systems to leveraging untapped data sources. Supporting both batch and streaming, this API makes data ingestion more flexible than ever. In this talk, we’ll demonstrate how to build a Spark connector for Excel using Python, showcasing schema inference, data reads/writes and streaming support. Whether you're a data engineer or Spark enthusiast, you’ll gain the knowledge to integrate Spark with any data source — entirely in Python.
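
To make the shape of the new API concrete, here is a hedged sketch of a toy batch connector built on the PySpark 4.0 Python Data Source API. It yields hard-coded rows rather than reading Excel files, so it only illustrates registration, schema declaration and the read path.

```python
from pyspark.sql import SparkSession
from pyspark.sql.datasource import DataSource, DataSourceReader

class DemoDataSource(DataSource):
    """Toy batch data source written against the Spark 4.0 Python Data Source API."""

    @classmethod
    def name(cls):
        return "demo"  # short name used in spark.read.format("demo")

    def schema(self):
        return "name string, age int"

    def reader(self, schema):
        return DemoReader()

class DemoReader(DataSourceReader):
    def read(self, partition):
        # Yield rows as plain tuples matching the declared schema.
        yield ("Alice", 30)
        yield ("Bob", 25)

spark = SparkSession.builder.appName("python-datasource-demo").getOrCreate()
spark.dataSource.register(DemoDataSource)   # register the connector class
spark.read.format("demo").load().show()     # read through it like any built-in source
```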

Breaking Silos: Using SAP Business Data Cloud and Delta Sharing for Seamless Access to SAP Data in Databricks

2025-06-11 Watch
talk
Darshana Sivakumar (Databricks), Benjamin Mathew (Databricks), Leonardo Litz (Natura &Co)

We’re excited to share with you how SAP Business Data Cloud supports Delta Sharing to share SAP data securely and seamlessly with Databricks—no complex ETL or data duplication required. This enables organizations to securely share SAP data for analytics and AI in Databricks while also supporting bidirectional data sharing back to SAP. In this session, we’ll demonstrate the integration in action, followed by a discussion of how the global beauty group, Natura, will leverage this solution. Whether you’re looking to bring SAP data into Databricks for advanced analytics or build AI models on top of trusted SAP datasets, this session will show you how to get started — securely and efficiently.
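
For context on the consumption side, the sketch below reads a shared table with the open-source delta-sharing Python connector; the profile file and the share/schema/table names are placeholders rather than the actual SAP Business Data Cloud share.

```python
# Sketch of the consumer side of Delta Sharing using the open-source
# `delta-sharing` connector (pip install delta-sharing). Names are placeholders.
import delta_sharing

profile = "/path/to/config.share"                 # credentials file issued by the provider
table_url = profile + "#sap_share.sales.orders"   # format: <profile>#<share>.<schema>.<table>

# Load the shared table into pandas for quick exploration.
df = delta_sharing.load_as_pandas(table_url)
print(df.head())
```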

Busting Data Modeling Myths: Truths and Best Practices for Data Modeling in the Lakehouse

2025-06-11 Watch
talk
Kyle Hale (Databricks), Shannon Barrow (Databricks)

Unlock the truth behind data modeling in Databricks. This session will tackle the top 10 myths surrounding relational and dimensional data modeling. Attendees will gain a clear understanding of what the Databricks Lakehouse truly supports today, including how to leverage primary and foreign keys, identity columns for surrogate keys, column-level data quality constraints and much more. The session frames the discussion through the lens of the medallion architecture, explaining how to implement data models across bronze, silver, and gold tables. Whether you’re migrating from a legacy warehouse or building new analytics solutions, you’ll leave equipped to fully leverage Databricks’ capabilities and design scalable, high-performance data models for enterprise analytics.
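
As a concrete reference point for several of those myths, here is a minimal sketch of a dimension/fact pair in Databricks SQL that uses an identity column for the surrogate key plus primary and foreign key constraints; all table and column names are invented for illustration, and it assumes a Databricks context where `spark` is already defined.

```python
# Sketch of a star-schema pair with an identity surrogate key and PK/FK constraints.
# Requires Unity Catalog; PK/FK constraints are informational (not enforced).
spark.sql("""
CREATE TABLE IF NOT EXISTS gold.dim_customer (
  customer_sk   BIGINT GENERATED ALWAYS AS IDENTITY NOT NULL,
  customer_id   STRING NOT NULL,
  customer_name STRING,
  CONSTRAINT pk_dim_customer PRIMARY KEY (customer_sk)
)
""")

spark.sql("""
CREATE TABLE IF NOT EXISTS gold.fct_sales (
  sale_id     BIGINT NOT NULL,
  customer_sk BIGINT,
  amount      DECIMAL(18, 2),
  CONSTRAINT pk_fct_sales PRIMARY KEY (sale_id),
  CONSTRAINT fk_sales_customer FOREIGN KEY (customer_sk)
    REFERENCES gold.dim_customer (customer_sk)
)
""")
```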

Cross-Cloud Data Mesh with Delta Sharing and UniForm in Mercedes-Benz

2025-06-11 Watch
talk
Aleksandar Dragojevic (Databricks), Alexander Summa (Mercedes-Benz Group AG)

In this presentation, we'll show how we achieved a unified development experience for teams working on Mercedes-Benz Data Platforms in AWS and Azure. We will demonstrate how we implemented Azure to AWS and AWS to Azure data product sharing (using Delta Sharing and Cloud Tokens), integration with AWS Glue Iceberg tables through UniForm and automation to drive everything using Azure DevOps Pipelines and DABs. We will also show how to monitor and track cloud egress costs and how we present a consolidated view of all the data products and relevant cost information. The end goal is to show how customers can offer the same user experience to their engineers and not have to worry about which cloud or region the Data Product lives in. Instead, they can enroll in the data product through self-service and have it available to them in minutes, regardless of where it originates.
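
As a rough sketch of the two building blocks mentioned above, the snippet below enables Iceberg reads on a Delta table via UniForm and publishes it through a Delta Sharing share; the catalog, table and share names are placeholders, not Mercedes-Benz's actual data products, and `spark` is assumed from a Databricks environment.

```python
# Hypothetical example: expose a Delta table to Iceberg readers (UniForm)
# and publish it via Delta Sharing. All names are placeholders.
spark.sql("""
CREATE TABLE IF NOT EXISTS analytics.vehicles.telemetry (
  vin         STRING,
  recorded_at TIMESTAMP,
  speed_kmh   DOUBLE
)
TBLPROPERTIES (
  'delta.enableIcebergCompatV2' = 'true',
  'delta.universalFormat.enabledFormats' = 'iceberg'
)
""")

spark.sql("CREATE SHARE IF NOT EXISTS vehicle_telemetry_share")
spark.sql("ALTER SHARE vehicle_telemetry_share ADD TABLE analytics.vehicles.telemetry")
```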

Data Intelligence on Unity Catalog Managed Tables Powered by Predictive Optimization

2025-06-11 Watch
talk
Naga Raju Bhanoori (Databricks), Cindy Jiang (Databricks)

In this session, we’ll explore the data intelligence capabilities within Databricks, focusing on Predictive Optimization. This feature enhances the performance of Unity Catalog managed tables by automatically optimizing data layouts, resulting in improved query performance and reduced storage costs. You’ll learn how Predictive Optimization works and see real-world examples of customers using it to fully automate data layout management. We’ll also share a preview of the exciting features and enhancements coming down the road.
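
For orientation, enabling the feature and checking what it has done can look roughly like the sketch below; the schema name is a placeholder, `spark` is assumed from a Databricks environment, and availability of the system table depends on your workspace configuration.

```python
# Sketch: enable Predictive Optimization for a schema of managed tables and
# inspect its operation history via the documented system table.
spark.sql("ALTER SCHEMA main.sales ENABLE PREDICTIVE OPTIMIZATION")  # placeholder schema

spark.sql("""
SELECT *
FROM system.storage.predictive_optimization_operations_history
LIMIT 20
""").show(truncate=False)
```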

Delta-Kernel-RS: Unparalleled Interoperability Across Query Engines

2025-06-11 Watch
talk
Robert Pack (Databricks), Zach Schuermann (Databricks)

Join us as we introduce Delta-Kernel-RS, a new Rust implementation of the Delta Lake protocol designed for unparalleled interoperability across query engines. In this session, we will explore how maintaining a native implementation of the Delta specification — with native C and C++ FFI support — can deliver consistent benefits across diverse data processing systems, eliminating the need for repetitive, engine-specific reimplementations. We will dive deep into a real-world case study where a query engine harnessed Delta-Kernel-RS to unlock significant data skipping improvements — enhancements achieved “for free” by leveraging the kernel. Attendees will gain insights into the architectural decisions, interoperability strategies and the practical impact of this innovation on performance and development efficiency in modern data ecosystems.

GenAI for SQL & ETL: Build Multimodal AI Workflows at Scale

2025-06-11 Watch
talk
Ahmed Bilal (Databricks), Colton Peltier (Databricks)

Enterprises generate massive amounts of unstructured data — from support tickets and PDFs to emails and product images. But extracting insight from that data has traditionally required brittle pipelines and complex tools. Databricks AI Functions make this simpler. In this session, you’ll learn how to apply powerful language and vision models directly within your SQL and ETL workflows — no endpoints, no infrastructure, no rewrites. We’ll explore practical use cases and best practices for analyzing complex documents, classifying issues, translating content, and inspecting images — all in a way that’s scalable, declarative, and secure. What you’ll learn:
- How to run state-of-the-art LLMs like GPT-4, Claude Sonnet 4, and Llama 4 on your data
- How to build scalable, multimodal ETL workflows for text and images
- Best practices for prompts, cost, and error handling in production
- Real-world examples of GenAI use cases powered by AI Functions
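
To illustrate how such calls sit inside an ordinary SQL step, here is a hedged sketch using a few AI Functions; the table, columns and the model serving endpoint name are assumptions for illustration only, and `spark` is assumed from a Databricks environment.

```python
# Sketch: applying Databricks AI Functions from a SQL/ETL step.
# Table, column, and model-endpoint names below are illustrative.
spark.sql("""
SELECT
  ticket_id,
  ai_classify(body, ARRAY('billing', 'shipping', 'fraud', 'other')) AS category,
  ai_translate(body, 'en')                                          AS body_en,
  ai_query(
    'databricks-meta-llama-3-3-70b-instruct',   -- assumed serving endpoint name
    CONCAT('Summarize this support ticket in one sentence: ', body)
  ) AS summary
FROM support.tickets
LIMIT 100
""").show(truncate=False)
```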

How Adobe is Leveraging Agentic AI to Power Their Data Supply Chain

2025-06-11 Watch
talk
Noah Levine (Databricks), Sharad Narang (Adobe)

Discover how Adobe is redefining its Data Supply Chain through an AI-first, agentic solution that transforms the entire Data Development and Delivery Lifecycle (DDLC). This next-generation engineering workbench empowers data engineers, analysts, and practitioners with intelligent automation, context-aware assistance, and seamless collaboration to accelerate and streamline every phase of the data supply chain — from capturing business intent and sourcing data to building pipelines, validating quality, and delivering trusted, actionable insights.

How Arctic Wolf Modernizes Cloud Security and Enhances Threat Detection with Databricks

2025-06-11 Watch
talk
Amisha Singh (Databricks), Justin Lai (Arctic Wolf)

In this session, you’ll gain actionable insights to modernize your security operations and strengthen cyber resilience. Arctic Wolf will highlight how they eliminated data silos & enhanced their MDR pipeline to investigate suspicious threat actors for customers using Databricks.

How Blue Origin Accelerates Innovation With Databricks and AWS GovCloud

2025-06-11 Watch
talk
Seths Sethuraman (Blue Origin), Filippo Seracini (Databricks)

Blue Origin is revolutionizing space exploration with a mission-critical data strategy powered by Databricks on AWS GovCloud. Learn how they leverage Databricks to meet ITAR and FedRAMP High compliance, streamline manufacturing and accelerate their vision of a 24/7 factory. Key use cases include predictive maintenance, real-time IoT insights and AI-driven tools that transform CAD designs into factory instructions. Discover how Delta Lake, Structured Streaming and advanced Databricks functionalities like Unity Catalog enable real-time analytics and future-ready infrastructure, helping Blue Origin stay ahead in the race to adopt generative AI and serverless solutions.

How Feastables Partners With Engine to Leverage Advanced Data Models and AI for Smarter BI

2025-06-11 Watch
talk
Daniel Palmer (Engine), Mary Beth Pittman (Feastables, Inc.)

Feastables, founded by YouTube sensation MrBeast, partnered with Engine to build a modern, AI-enabled BI ecosystem that transforms complex, disparate data into actionable insights, driving smarter decision-making across the organization. In this session, learn how Engine, a Built-On Databricks Partner, brought expertise combined with strategic partnerships that enabled Feastables to rapidly stand up a secure, modern data estate to unify complex internal and external data sources into a single, permissioned analytics platform. Feastables unlocked the power of cross-functional collaboration by democratizing data access throughout their enterprise and seamlessly integrating financial, retailer, supply chain, syndicated, merchandising and e-commerce data. Discover how a scalable analytics framework combined with advanced AI models and tools empower teams with Smarter BI across sales, marketing, supply chain, finance and executive leadership to enable real-time decision-making at scale.

How FedEx Achieved Self-Serve Analytics and Data Democratization on Databricks

2025-06-11 Watch
talk
Patrick Brown (FedEx)

FedEx, a global leader in transportation and logistics, faced a common challenge in the era of big data: how to democratize data and foster data-driven decision making with thousands of data practitioners at FedEx wanting to build models, get real-time insights, explore enterprise data, and build enterprise-grade solutions to run the business. This breakout session will highlight how FedEx overcame challenges in data governance and security using Unity Catalog, ensuring that sensitive information remains protected while still allowing appropriate access across the organization. We'll share their approach to building intuitive self-service interfaces, including the use of natural-language processing to enable non-technical users to query data effortlessly. The tangible outcomes of this initiative are numerous, but chiefly: increased data literacy across the company, faster time-to-insight for business decisions, and significant cost-savings through improved operational efficiency.

How to Migrate from Teradata to Databricks SQL

2025-06-11 Watch
talk
Fabien Contaminard (Databricks), Mehran Golestaneh (Databricks)

Storage and processing costs of your legacy Teradata data warehouses impact your ability to deliver. Migrating your legacy Teradata data warehouse to the Databricks Data Intelligence Platform can accelerate your data modernization journey. In this session, learn the top strategies for completing this data migration. We will cover data type conversion, basic to complex code conversions, and validation and reconciliation best practices, as well as how to use Databricks natively hosted LLMs to assist with migration activities. See before-and-after architectures of customers who have migrated, and learn about the benefits they realized.
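
As one example of the reconciliation step, a simple post-migration check might compare row counts and aggregate checksums between the staged legacy extract and the migrated table, along the lines of the sketch below; table and column names are illustrative, and `spark` is assumed from a Databricks environment.

```python
# Sketch of a post-migration reconciliation check: compare row counts, a simple
# column checksum, and a key aggregate between the legacy extract landed in the
# lakehouse and the migrated Databricks table. Names are illustrative.
from pyspark.sql import functions as F

legacy = spark.table("staging.td_extract_orders")   # landed copy of the Teradata table
migrated = spark.table("gold.orders")

def profile(df):
    return df.agg(
        F.count("*").alias("row_count"),
        F.sum(F.hash("order_id", "amount")).alias("checksum"),
        F.round(F.sum("amount"), 2).alias("total_amount"),
    ).collect()[0]

legacy_p, migrated_p = profile(legacy), profile(migrated)
for metric in ("row_count", "checksum", "total_amount"):
    match = legacy_p[metric] == migrated_p[metric]
    print(f"{metric}: legacy={legacy_p[metric]} migrated={migrated_p[metric]} match={match}")
```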

How We Turned 200+ Business Users Into Analysts With AI/BI Genie

2025-06-11 Watch
talk
Thomas Russell (Databricks)

AI/BI Genie has transformed self-service analytics for the Databricks Marketing team. This user-friendly conversational AI tool empowers marketers to perform advanced data analysis using natural language — no SQL required. By reducing reliance on data teams, Genie increases productivity and enables faster, data-driven decisions across the organization. But realizing Genie’s full potential takes more than just turning it on. In this session, we’ll share the end-to-end journey of implementing Genie for over 200 marketing users, including lessons learned, best practices and the real business impact of this Databricks-on-Databricks solution. Learn how Genie democratizes data access, enhances insight generation and streamlines decision-making at scale.

Intelligent Document Processing: Building AI, BI, and Analytics Systems on Unstructured Data

2025-06-11 Watch
talk
Adam Gurary (Databricks), Jason Ping (Databricks)

Most enterprise data is trapped in unstructured formats — documents, PDFs, scanned images and tables — making it difficult to access, analyze and use. This session shows how to unlock that hidden value by building intelligent document processing workflows on the Databricks Data Intelligence Platform. You’ll learn how to ingest unstructured content using Lakeflow Connect, extract structured data with AI Parse — even from complex tables and scanned documents — and apply analytics or AI to this newly structured data. What you’ll learn:
- How to build scalable pipelines that transform unstructured documents into structured tables
- Techniques for automating document workflows with Databricks tools
- Strategies for maintaining quality and governance with Unity Catalog
- Real-world examples of AI applications built with intelligent document processing
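
A hedged sketch of that flow is shown below: PDFs are read from a Unity Catalog volume with read_files and passed to a document-parsing function. The ai_parse_document function name and its output shape are assumptions based on the "AI Parse" capability this abstract refers to; the volume path is a placeholder and `spark` is assumed from a Databricks environment.

```python
# Hedged sketch: land PDFs from a Unity Catalog volume and parse them.
# The ai_parse_document function name/output is an assumption; path is a placeholder.
spark.sql("""
SELECT
  path,
  ai_parse_document(content) AS parsed
FROM read_files(
  '/Volumes/main/docs/contracts/',
  format => 'binaryFile'
)
""").show(truncate=False)
```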

Introducing AI Builder: Building High-Quality or Low-Cost Agents for Your Domain

2025-06-11
talk
Archika Dogra (Databricks), Jun Choi (Databricks)

Ever struggled with getting AI to work effectively for your specific domain needs? Join us to discover how establishing the right data foundation transforms the way you build and deploy specialized AI agents. This session demonstrates how to prepare and structure your information assets to enable more powerful, efficient AI applications. Learn proven approaches for extracting value from both structured and unstructured data, creating knowledge bases that serve as the backbone for domain-specific agents, and implementing optimization techniques that balance quality and resource constraints. Key takeaways:
- Preparing diverse data sources for AI consumption and knowledge extraction
- Building scalable data foundations that naturally facilitate agent development
- Optimizing domain-specific agents for performance and cost efficiency
- Implementing practical governance frameworks for AI systems built on your data

Introducing Lakeflow: The Future of Data Engineering on Databricks

2025-06-11 Watch
talk
Michael Armbrust (Databricks), Bilal Aslam (Databricks)

Join us to explore Lakeflow, Databricks' end-to-end solution for simplifying and unifying the most complex data engineering workflows. This session builds on keynote announcements, offering an accessible introduction for newcomers while emphasizing the transformative value Lakeflow delivers. We’ll cover:
- What is Lakeflow? – A cohesive overview of its components: Lakeflow Connect, Lakeflow Declarative Pipelines, and Lakeflow Jobs.
- Core Capabilities in Action – Live demos showcasing no-code data ingestion, code-optional declarative pipelines, and unified, end-to-end orchestration.
- Vision for the Future – Unveiling the roadmap, including no-code and open-source initiatives.
Discover how Lakeflow equips data teams with a seamless experience for ingestion, transformation, and orchestration, reducing complexity and driving productivity. By unifying these capabilities, Lakeflow lays the groundwork for scalable, reliable, efficient data pipelines in a governed and high-performing environment.