Data + AI Summit 2025

TAO and Reinforcement Learning: Building AI With the Data You Have

2025-06-12 Watch

talk

Brandon Cui (Databricks), Jonathan Frankle (Databricks)

AI/ML

Curious about the cutting-edge technology that's revolutionizing AI model performance? Join us for an in-depth exploration of TAO and discover how this innovative approach is transforming the capabilities of modern AI systems. This research-focused session peels back the layers of theoretical foundations, implementation challenges, and breakthrough applications that make TAO one of the most promising advancements in AI development.
Key takeaways:
- Understanding the fundamental principles behind TAO and how it differs from conventional optimization techniques
- Examining the quantifiable improvements in model accuracy, efficiency, and generalization capabilities
- Exploring real-world case studies where TAO has solved previously intractable AI challenges
- Analyzing current research directions and future potential for further enhancements
Whether you're a research scientist, AI engineer, or technical leader, this session will equip you with valuable insights into how TAO can be leveraged to push your AI models beyond current limitations.

Tech Industry Session: Building Collaborative Ecosystems With Openness and Portability

2025-06-12 Watch

talk

Matthew Houser (Tealium), Bob Pisani (Addepar), Adrian Bolosan (Databricks), Davis Matson (Health Catalyst)

AI/ML Analytics Databricks Delta

Join us to discover how leading tech companies accelerate growth using open ecosystems and built-on solutions to foster collaboration, accelerate innovation and create scalable data products. This session will explore how organizations use Databricks to securely share data, integrate with partners and enable teams to build impactful applications powered by AI and analytics.
Topics include:
- Using Delta Sharing for secure, real-time data collaboration across teams and partners
- Embedding analytics and creating marketplaces to extend product capabilities
- Building with open standards and governance frameworks to ensure compliance without sacrificing agility
Hear real-world examples of how open ecosystems empower organizations to widen the aperture on collaboration, driving better business outcomes. Walk away with insights into how open data sharing and built-on solutions can help your teams innovate faster at scale.

Telecom Innovation Exchange: Demos and Dialogues

2025-06-12 Watch

talk

Steve Jones (Capgemini), Randeep Raghu (Wipro), Prakash Trivedi (Accenture), Nevash Pillay (Databricks)

AI/ML Databricks

Join us for an interactive breakout session designed to explore scalable, real-world solutions powered by Partners with Databricks. In this high-energy session, you'll hear from three of our leading partners — Accenture, Capgemini and Wipro — as they each deliver rapid-fire, 5-minute demos of their most impactful, production-grade solutions built for the telecom industry. From network intelligence to customer experience to AI-driven automation, these solutions are already driving tangible outcomes at scale. After the demos, you’ll have the unique opportunity to engage directly with each partner in a “speed dating” style format. Dive deep into the solutions, ask your questions and explore how these approaches can be tailored to your organization’s needs. Whether you're solving for churn, fraud, network ops or enterprise AI use cases, this session is your chance to connect, collaborate and walk away with practical ideas you can take back to your teams.

The Future of Open Table Formats: Delta Lake, Iceberg, and More

2025-06-12 Watch

talk

Daniel Weeks (Databricks), Ryan Blue (Databricks)

Delta Iceberg

Open table formats are evolving quickly. In this session, we’ll explore the latest features of Delta Lake and Apache Iceberg™, including a look at the emerging Iceberg v3 specification. Join us to learn about what’s driving format innovation, how interoperability is becoming real, and what it means for the future of data architecture.

What's New and What's Next: Building Impactful AI/BI Dashboards

2025-06-12 Watch

talk

Eason Gao (Databricks), Rory Jacobs (Databricks)

AI/ML Analytics BI Databricks

Ready to take your AI/BI dashboards to the next level? This session dives into the latest capabilities in Databricks AI/BI Dashboards and how to maximize impact across your organization. Learn how data authors can tailor visualizations for different audiences, optimize performance and seamlessly integrate with Genie for a unified analytics experience. We’ll also share practical tips on how business users and data teams can better collaborate — ensuring insights are accessible, actionable and aligned to business goals.

What’s New in Apache Spark™ 4.0?

2025-06-12 Watch

talk

Wenchen Fan (Databricks), Daniel Tenedorio (Databricks)

AI/ML API Java Python Scala Spark

Join this session for a concise tour of Apache Spark™ 4.0’s most notable enhancements:
- SQL features: ANSI by default, scripting, SQL pipe syntax, SQL UDF, session variable, view schema evolution, etc.
- Data type: VARIANT type, string collation
- Python features: Python data source, plotting API, etc.
- Streaming improvements: State store data source, state store checkpoint v2, arbitrary state v2, etc.
- Spark Connect improvements: More API coverage, thin client, unified Scala interface, etc.
- Infrastructure: Better error message, structured logging, new Java/Scala version support, etc.
Whether you’re a seasoned Spark user or new to the ecosystem, this talk will prepare you to leverage Spark 4.0’s latest innovations for modern data and AI pipelines.

What’s New in Unity Catalog With Live Demos

2025-06-12 Watch

talk

Paul Roome (Databricks), Murt Neemuchwala (Databricks)

Data Governance

Join the Unity Catalog product team for an exclusive deep dive into the latest innovations and upcoming features of Unity Catalog! Explore cutting-edge advancements in access control, discovery, lineage and monitoring — plus get a sneak peek at what’s coming next. Packed with live demos, expert insights and best practices from thousands of customers running Unity Catalog in production, this session is also your chance to engage directly with product experts and get answers to your most pressing questions. Don’t miss this opportunity to stay ahead of the curve and elevate your data governance strategy!

AI-Assisted BI: Everything You Need to Know

2025-06-12 Watch

lightning_talk

Chung Wu (Databricks), Alex Lichen (Databricks)

AI/ML Analytics BI Data Analytics Databricks

Explore how AI is transforming business intelligence and data analytics across the Databricks platform. This session offers a comprehensive overview of AI-assisted capabilities, from generating dashboards and visualizations to integrating Genie on dashboards for conversational analytics. Whether you’re a data engineer, analyst or BI developer, this session will equip you to leverage AI with BI for better, smarter decisions.

A No-Code ML Forecasting Platform for Retail and CPG Companies

2025-06-12 Watch

lightning_talk

Moez Ali (Zebra Technologies)

AI/ML

Retail and CPG companies face growing pressure to better forecast demand, optimize pricing and manage inventory — yet traditional approaches take months to deploy and often require extensive engineering support. In this session, we will showcase Workcloud Modeling Studio, a low-code/no-code ML platform designed for data scientists working in retail and CPG. Learn how this tool improves forecasting accuracy and accelerates time-to-value from months to hours. We will walk through a real-world use case of demand forecasting for a retailer using Zebra's Modeling Studio. This talk will demonstrate how to build, train and deploy an ML forecasting pipeline — without reinventing the wheel.

A Practical Roadmap to Becoming an Expert Databricks Data Engineer

2025-06-12 Watch

lightning_talk

Derar Alhussein (Acadford)

Data Engineering Databricks

The demand for skilled Databricks data engineers continues to rise as enterprises accelerate their adoption of the Databricks platform. However, navigating the complex ecosystem of data engineering tools, frameworks and best practices can be overwhelming. This session provides a structured roadmap to becoming an expert Databricks data engineer, offering a clear progression from foundational skills to advanced capabilities. Acadford, a leading training provider, has successfully trained thousands of data engineers on Databricks, equipping them with the skills needed to excel in their careers and obtain professional certifications. Drawing on this experience, we will guide attendees through the most in-demand skills and knowledge areas through a combination of structured learning and practical insights.
Key takeaways:
- Understand the core tech stack in Databricks
- Explore real-world code examples and live demonstrations
- Receive an actionable learning path with recommended resources

Automating Engineering with AI - LLMs in Metadata Driven Frameworks

2025-06-12 Watch

lightning_talk

Simon Whiteley (Advancing Analytics)

AI/ML Data Engineering Data Quality LLM

The demand for data engineering keeps growing, but data teams are bored by repetitive tasks, stumped by growing complexity and endlessly harassed by an unrelenting need for speed. What if AI could take the heavy lifting off your hands? What if we make the move away from code-generation and into config-generation — how much more could we achieve? In this session, we’ll explore how AI is revolutionizing data engineering, turning pain points into innovation. Whether you’re grappling with manual schema generation or struggling to ensure data quality, this session offers practical solutions to help you work smarter, not harder. You’ll walk away with a good idea of where AI is going to disrupt the data engineering workload, some good tips around how to accelerate your own workflows and an impending sense of doom around the future of the industry!

Sponsored by: Airbyte | How Data Movement Powers GenAI

Founder discussion: Matei on UC, Data Intelligence and AI Governance

2025-06-12 Watch

talk

Matei Zaharia (Databricks)

AI/ML Databricks Delta LLM Spark

Matei is a legend of open source: he started the Apache Spark project in 2009, co-founded Databricks, and worked on other widely used data and AI software, including MLflow, Delta Lake, and Dolly. His most recent research is about combining large language models (LLMs) with external data sources, such as search systems, and improving their efficiency and result quality. This will be a conversation covering the latest and greatest of UC, Data Intelligence, AI Governance, and more.

Summit Live: Data Sharing and Collaboration

2025-06-12 Watch

talk

Zaheera Valani (Databricks)

AI/ML Analytics Databricks Delta

Hear more on the latest in data collaboration, which is paramount to unlocking business success. Delta Sharing is an open-source approach to share and govern data, AI models, dashboards, and notebooks across clouds and platforms - without the costly need for replication. Databricks Clean Rooms provide safe hosting environments for data collaboration across companies, also without the costly duplication of data. And the Databricks Marketplace is the open marketplace for all your data, analytics, and AI needs.

Summit Live: Data Strategy - Democratizing Consumption and Growing ROI

2025-06-12 Watch

talk

Robin Sutara (Databricks)

With organizations collecting more and more data every day, the accuracy, trustworthiness, and accessibility of data must be prioritized if businesses want to unlock its value and grow ROI. That's where data democratization and data strategy can help. Hear from Robin Sutara, Field CDO, on top insights from her global experiences.

Thursday Keynote

2025-06-12

keynote

Michael Armbrust (Databricks), Miranda Luna (Databricks), Arsalan Tavakoli-Shiraji (Databricks), Ken Wong (Databricks), Ali Ghodsi (Databricks), Michael Flynn (Rivian), Bilal Aslam (Databricks), Keegan Dubbs (Databricks), Matei Zaharia (Databricks), Michelle Leon (Databricks), Michael Piatek (Databricks)

Discover the latest advances on the Data Intelligence Platform and hear from the companies who are already enjoying success.

Wednesday Keynote (Virtual Replay)

2025-06-12

keynote

AI/ML Databricks

Be first to witness the latest breakthroughs from Databricks and share the success of innovative data and AI companies.

Data After Hours

2025-06-12

talk

Kick back at Data After Hours for drinks and dialogue with new (and old) friends. Enjoy live music, drinks, food and good company!

AI/BI Genie: A Look Under the Hood of Everyone's Friendly, Neighborhood GenAI Product

2025-06-12 Watch

talk

Amir Hormati (Databricks), Alnur Ali (Databricks)

AI/ML BI GenAI LLM RAG

Go beyond the user interface and explore the cutting-edge technology driving AI/BI Genie. This session breaks down the AI/BI Genie architecture, showcasing how LLMs, retrieval-augmented generation (RAG) and finely tuned knowledge bases work together to deliver fast, accurate responses. We’ll also explore how AI agents orchestrate workflows, optimize query performance and continuously refine their understanding. Ideal for those who want to geek out about the tech stack behind Genie, this session offers a rare look at the magic under the hood.

AI-Powered Profits: Smarter Order and Inventory Management

2025-06-12 Watch

talk

Anders Poirel (Joby Aviation), David Rogers (Databricks), Samuel Ceriale (Xylem)

AI/ML Cloud Computing Databricks ERP

Join this session to hear from two incredible companies, Xylem and Joby Aviation. Xylem shares their successful journey from fragmented legacy systems to a unified Enterprise Data Platform, demonstrating how they integrated complex ERP data across four business segments to achieve breakthrough improvements in parts management and operational efficiency. Following Xylem's story, learn how Joby Aviation leveraged Databricks to automate and accelerate flight test data checks, cutting processing times from over two hours to under thirty minutes. This session highlights how advanced cloud tools empower engineers to quickly build and run custom data checks, improving both speed and safety in flight test operations.

Better Together: Change Data Feed in a Streaming Data Flow

2025-06-12 Watch

talk

Mattias Moser (84.51 LLC), Scott Gordon (84.51˚)

Delta Marketing Data Streaming

Traditional streaming works great when your data source is append-only, but what if your data source includes updates and deletes? At 84.51 we used Lakeflow Declarative Pipelines and Delta Lake to build a streaming data flow that consumes inserts, updates and deletes while still taking advantage of streaming checkpoints. We combined this flow with a materialized view and Enzyme incremental refresh for a low-code, efficient and robust end-to-end data flow. We process around 8 million sales transactions each day with 80 million items purchased. This flow not only handles new transactions but also handles updates to previous transactions. Join us to learn how 84.51 combined change data feed, data streaming and materialized views to deliver a “better together” solution. 84.51 is a retail insights, media & marketing company. We use first-party retail data from 60 million households sourced through a loyalty card program to drive Kroger’s customer-centric journey.
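
To make the idea of consuming inserts, updates and deletes concrete, here is a minimal pure-Python sketch of applying a batch of change rows to a keyed snapshot. It is an illustration only, not the speakers' pipeline and not the Lakeflow or Delta Lake API. The `_change_type` values and the `_commit_version` column follow the Delta Lake change data feed convention; the function name `apply_changes` and the columns `transaction_id` and `items` are hypothetical.

```python
from typing import Dict, Iterable

# Change types as they appear in the `_change_type` column of a
# Delta Lake change data feed.
INSERT = "insert"
UPDATE_PRE = "update_preimage"
UPDATE_POST = "update_postimage"
DELETE = "delete"


def apply_changes(
    state: Dict[str, dict],
    changes: Iterable[dict],
    key: str = "transaction_id",  # hypothetical key column
) -> Dict[str, dict]:
    """Apply change rows to a keyed snapshot, oldest commit first."""
    # Stable sort: rows from the same commit keep their original order.
    ordered = sorted(changes, key=lambda row: row["_commit_version"])
    for row in ordered:
        change_type = row["_change_type"]
        if change_type == UPDATE_PRE:
            # The pre-image only records the old value; nothing to apply.
            continue
        record_key = row[key]
        if change_type == DELETE:
            state.pop(record_key, None)
        elif change_type in (INSERT, UPDATE_POST):
            # Keep data columns only; drop the feed's metadata columns.
            state[record_key] = {
                name: value
                for name, value in row.items()
                if not name.startswith("_")
            }
        else:
            raise ValueError(f"unknown change type: {change_type!r}")
    return state


# Hypothetical change rows: two inserts, one update, one delete.
changes = [
    {"transaction_id": "t1", "items": 3, "_change_type": INSERT, "_commit_version": 1},
    {"transaction_id": "t2", "items": 1, "_change_type": INSERT, "_commit_version": 1},
    {"transaction_id": "t1", "items": 3, "_change_type": UPDATE_PRE, "_commit_version": 2},
    {"transaction_id": "t1", "items": 5, "_change_type": UPDATE_POST, "_commit_version": 2},
    {"transaction_id": "t2", "items": 1, "_change_type": DELETE, "_commit_version": 3},
]
snapshot = apply_changes({}, changes)
print(snapshot)
```

After the batch, only `t1` remains, carrying its updated item count. An append-only consumer would instead have kept both inserts and missed the update and the delete, which is the gap the session describes.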

Sponsored by: Anomalo | Reconciling IoT, Policy, and Insurer Data to Deliver Better Customer Discounts

Sponsored by: IBM | How to leverage unstructured data to build more accurate, trustworthy AI agents

Sponsored by: MathCo | Powering Contextualized Intelligence with NucliOS, MathCo’s Databricks-Native Platform

Sponsored by: Soda Data Inc. | Clean Energy, Clean Data: How Data Quality Powers Decarbonization
