talk-data.com

Event

Data + AI Summit 2025

2025-06-09 – 2025-06-13 Databricks Summit Visit website ↗

Activities tracked

425

Filtering by: AI/ML

Sessions & talks

Showing 376–400 of 425 · Newest first

Unlocking AI Value: Build AI Agents on SAP Data in Databricks

2025-06-10 Watch
talk
Qi Su (Databricks)

Discover how enterprises are turning SAP data into intelligent AI. By tapping into contextual SAP data through Delta Sharing on Databricks - no messy ETL needed - they’re accelerating AI innovation and business insights. Learn how they:
- Build domain-specific AI that can reason on private SAP data
- Deliver data intelligence to power insights for business leaders
- Govern and secure their new unified data estate

AI Agents for Marketing: Leveraging Mosaic AI to Create a Multi-Purpose Agentic Marketing Assistant

2025-06-10 Watch
talk

Marketing professionals build campaigns, create content and use effective copywriting to tell a good story to promote a product or offer. All of this requires a thorough and meticulous process for every individual campaign. To assist marketing professionals at 7-Eleven, we built a multi-purpose assistant that could:
- Use campaign briefs to generate campaign ideas and taglines
- Do copywriting for marketing content
- Verify images for messaging accuracy
- Answer general questions and browse the web as a generic assistant
We will walk you through how we created multiple agents as different personas with LangGraph and Mosaic AI to create a chat assistant that assumes a different persona based on the user query. We will also explain our evaluation methodology for choosing models and prompts, and how we implemented guardrails for high reliability with sensitive marketing content. This assistant by 7-Eleven was showcased at the Databricks booth at NRF earlier this year.

AI/BI Driving Speed to Value in Supply Chain

2025-06-10 Watch
talk
Adrian McClure (Conagra Brands) , Heather Cooley (Conagra Brands)

Conagra is a global food manufacturer with $12.2B in revenue, 18K+ employees and 45+ plants in the US, Canada and Mexico. Conagra's Supply Chain organization is heavily focused on delivering results in productivity, waste reduction, inventory rationalization, safety and customer service levels. By migrating the Supply Chain reporting suite to Databricks over the past 2 years, Conagra's Supply Chain Analytics & Data Science team has been able to deliver new AI solutions which complement traditional BI platforms and lay the foundation for additional AI/ML applications in the future. With Databricks Genie integrated within traditional BI reports, Conagra Supply Chain users can now go from insight to action faster and with fewer clicks, enabling speed to value in a complex Supply Chain. The Databricks platform also allows the team to curate data products to be consumed by traditional BI applications today, as well as the ability to rapidly scale for the AI/ML applications of tomorrow.

Best Practices for Building User-Facing AI Systems on Databricks

2025-06-10 Watch
talk
Jyotsna Bharadwaj (Databricks) , Arthur Dooner (Databricks)

This session is repeated. Integrating AI agents into business systems requires tailored approaches for different maturity levels (crawl-walk-run) that balance scalability, accuracy and usability. This session addresses the critical challenge of making AI agents accessible to business users. We will explore four key integration methods:
- Databricks Apps: the fastest way to build and run applications that leverage your data, with the full security and governance of Databricks
- Genie: a tool enabling non-technical users to gain insights on structured data through natural language queries
- Chatbots: combine real-time data retrieval with generative AI for contextual responses and process automation
- Batch inference: scalable, asynchronous processing for large-scale AI tasks, optimizing efficiency and cost
We'll compare these approaches, discussing their strengths, challenges and ideal use cases to help businesses select the most suitable integration strategy for their specific needs.

Building Responsible and Resilient AI: The Databricks AI Governance Framework

2025-06-10 Watch
talk
Abhi Arikapudi (Databricks) , David Wells (Databricks)

GenAI & machine learning are reshaping industries, driving innovation and redefining business strategies. As organizations embrace these technologies, they face significant challenges in managing AI initiatives effectively, such as balancing innovation with ethical integrity, operational resilience and regulatory compliance. This presentation introduces the Databricks AI Governance Framework (DAGF), a practical framework designed to empower organizations to navigate the complexities of AI. It provides strategies for building scalable, responsible AI programs that deliver measurable value, foster innovation and achieve long-term success. By examining the framework's five foundational pillars — AI organization; legal and regulatory compliance; ethics, transparency and interpretability; AI operations and infrastructure; and AI security — this session highlights how AI governance aligns programs with the organization's strategic goals, mitigates risks and builds trust across stakeholders.

Creating LLM Judges to Measure Domain-Specific Agent Quality

2025-06-10 Watch
talk
Samraj Moorjani (Databricks) , Nikhil Thorat (Databricks)

This session is repeated. Measuring the effectiveness of domain-specific AI agents requires specialized evaluation frameworks that go beyond standard LLM benchmarks. This session explores methodologies for assessing agent quality across specialized knowledge domains, tailored workflows and task-specific objectives. We'll demonstrate practical approaches to designing robust LLM judges that align with your business goals and provide meaningful insights into agent capabilities and limitations. Key session takeaways include:
- Tools for creating domain-relevant evaluation datasets and benchmarks that accurately reflect real-world use cases
- Approaches for creating LLM judges to measure domain-specific metrics
- Strategies for interpreting those results to drive iterative improvement in agent performance
Join us to learn how proper evaluation methodologies can transform your domain-specific agents from experimental tools to trusted enterprise solutions with measurable business value.
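The judge-based approach described above reduces to a small scoring loop: send each (question, answer) pair to a judge model with a rubric, parse out a score, and aggregate. Below is a minimal, self-contained sketch — the rubric text, function names, and the stubbed judge are all hypothetical; a real system would call a hosted model endpoint:

```python
import re
from statistics import mean

# Hypothetical rubric; a real one would encode your domain's quality criteria.
RUBRIC = (
    "Rate the answer from 1-5 for domain accuracy. "
    "Reply with 'Score: N' and one sentence of justification."
)

def judge_score(call_llm, question, answer):
    """Ask a judge model to grade one answer and parse out the 1-5 score."""
    reply = call_llm(f"{RUBRIC}\n\nQuestion: {question}\nAnswer: {answer}")
    match = re.search(r"Score:\s*([1-5])", reply)
    if not match:
        raise ValueError(f"Unparseable judge reply: {reply!r}")
    return int(match.group(1))

def evaluate(call_llm, dataset):
    """Average judge scores over (question, answer) pairs."""
    return mean(judge_score(call_llm, q, a) for q, a in dataset)

# Stub judge for illustration; in practice call_llm wraps a model endpoint.
stub_judge = lambda prompt: "Score: 4 - mostly correct terminology."
print(evaluate(stub_judge, [("What is EBITDA?", "Earnings before ...")]))
```

The parsing step is where domain-specific judges most often fail silently, which is why the sketch raises on an unparseable reply rather than defaulting to a score.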

GPU Accelerated Spark Connect

2025-06-10 Watch
talk
Gera Shegalov (NVIDIA), Erik Ordentlich (NVIDIA)

Spark Connect, first included for the SQL/DataFrame API in Apache Spark 3.4 and recently extended to MLlib in 4.0, introduced a new way to run Spark applications over a gRPC protocol. This has many benefits, including easier adoption for non-JVM clients, version independence from applications and increased stability and security of the associated Spark clusters. The recent Spark Connect extension for ML also included a plugin interface to configure enhanced server-side implementations of the MLlib algorithms when launching the server. In this talk, we shall demonstrate how this new interface, together with Spark SQL’s existing plugin interface, can be used with NVIDIA GPU-accelerated plugins for ML and SQL to enable no-code-change, end-to-end GPU acceleration of Spark ETL and ML applications over Spark Connect, with speedups of up to 9x and cost reductions of up to 80% compared to CPU baselines.
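For readers new to Spark Connect: the client reaches the server through a `sc://` gRPC connection string instead of launching a local JVM. A hedged sketch of what that looks like — the host is a placeholder, and the session-creation call is shown in comments because it assumes `pyspark>=3.4` and a reachable Spark Connect server:

```python
def connect_url(host, port=15002, token=None):
    """Build a Spark Connect connection string (sc:// scheme over gRPC)."""
    url = f"sc://{host}:{port}"
    if token:
        url += f"/;token={token}"  # parameters follow '/;' in the URL
    return url

# With pyspark>=3.4 installed and a server running, a remote session is:
#   from pyspark.sql import SparkSession
#   spark = SparkSession.builder.remote(
#       connect_url("my-cluster.example.com")).getOrCreate()
#   spark.range(10).count()  # executes on the server; results stream back

print(connect_url("my-cluster.example.com"))
```

The same client code then runs unchanged whether the server side executes on CPUs or on GPU-accelerated plugins, which is what makes the "no-code-change" claim above possible.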

How an Open, Scalable and Secure Data Platform is Powering Quick Commerce Swiggy's AI

2025-06-10 Watch
talk
Vasan Vembu Srini (Databricks) , Akash Agarwal (Swiggy)

Swiggy, India's leading quick commerce platform, serves ~13 million users across 653 cities, with 196,000 restaurant partners and 17,000 SKUs. To handle this scale, Swiggy developed a secure, scalable AI platform processing millions of predictions per second. The tech stack includes Apache Kafka for real-time streaming, Apache Spark on Databricks for analytics and ML, and Apache Flink for stream processing. The Lakehouse architecture on Delta ensures data reliability, while Unity Catalog enables centralized access control and auditing. These technologies power critical AI applications like demand forecasting, route optimization, personalized recommendations, predictive delivery SLAs, and generative AI use cases. Key takeaway: This session explores building a data platform at scale, focusing on cost efficiency, simplicity and speed, empowering Swiggy to seamlessly support millions of users and AI use cases.

How to Get the Most Out of Your BI Tools on Databricks

2025-06-10 Watch
talk
Kyle Hale (Databricks)

Unlock the full potential of your BI tools with Databricks. This session explores how features like Photon, Databricks SQL, Liquid Clustering, AI/BI Genie and Publish to Power BI enhance performance, scalability and user experience. Learn how Databricks accelerates query performance, optimizes data layouts and integrates seamlessly with BI tools. Gain actionable insights and best practices to improve analytics efficiency, reduce latency and drive better decision-making. Whether migrating from a data warehouse or optimizing an existing setup, this talk provides the strategies to elevate your BI capabilities.

Lakeflow Connect: The Game-Changer for Complex Event-Driven Architectures

2025-06-10 Watch
talk
Giancarlo Costa (European Food Safety Authority) , Jeroen De Clercq (delaware) , Tim Bal (delaware)

In 2020, Delaware implemented a state-of-the-art, event-driven architecture for EFSA, enabling a highly decoupled system landscape, presented at the Data + AI Summit 2021. By centrally brokering events in near real time, consumer applications react instantly to events from producer applications as they occur. Event producers are decoupled from consumers via a publisher/subscriber mechanism. Over the past years, we noticed some drawbacks: the custom events, aimed primarily at process integration, didn't cover all edge cases; data quality was not always optimal due to missing events; and we needed complex logic for SCD2 tables. Lakeflow Connect allows us to extract data directly from the source without the complex architecture in between, avoiding data loss and thus data quality issues, and with some simple adjustments an SCD2 table is created automatically. Lakeflow Connect allows us to create more efficient and intelligent data provisioning.
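To make the SCD2 pain point above concrete, here is a toy sketch of the bookkeeping a type-2 slowly changing dimension requires when maintained by hand — field names and the single-key update are simplifications of what the session says Lakeflow Connect automates:

```python
from datetime import date

def apply_scd2(dim_rows, change, effective_date):
    """Apply one change record to a type-2 slowly changing dimension:
    close the current row for the key, then append a new current row."""
    key = change["id"]
    for row in dim_rows:
        if row["id"] == key and row["end_date"] is None:
            row["end_date"] = effective_date  # close the open row
    dim_rows.append({
        "id": key,
        "value": change["value"],
        "start_date": effective_date,
        "end_date": None,  # None marks the current row
    })
    return dim_rows

dim = [{"id": 1, "value": "a", "start_date": date(2024, 1, 1), "end_date": None}]
dim = apply_scd2(dim, {"id": 1, "value": "b"}, date(2025, 6, 10))
print(dim)  # history row closed, new row current
```

Doing this correctly across late-arriving and missing events is exactly the "complex logic" the abstract describes; a declarative CDC tool pushes that bookkeeping into the platform.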

Laying Data and AI Foundations for the Agentic Future at P&G

2025-06-10
talk
Alfredo Colas (Procter & Gamble)

In today's rapidly evolving digital landscape, organizations must prioritize robust data architectures and AI strategies to remain competitive. In this session, we will explore how Procter & Gamble (P&G) has embarked on a transformative journey to digitize its operations via scalable data, analytics and AI platforms, establishing a strong foundation for data-driven decision-making and the emergence of agentic AI. Join us as we delve into the comprehensive architecture and platform initiatives undertaken at P&G to create scalable and agile data platforms unleashing BI/AI value. We will discuss our approach to implementing data governance and semantics, ensuring data integrity and accessibility across the organization. By leveraging advanced analytics and Business Intelligence (BI) tools, we will illustrate how P&G harnesses data to generate actionable insights at scale, all while maintaining security and speed.

Let the LLM Write the Prompts: An Intro to DSPy in Compound AI Pipelines

2025-06-10 Watch
talk
Drew Breunig (Overture Maps Foundation)

Large Language Models (LLMs) excel at understanding messy, real-world data, but integrating them into production systems remains challenging. Prompts can be unruly to write, vary by model and can be difficult to manage in the large context of a pipeline. In this session, we'll demonstrate incorporating LLMs into a geospatial conflation pipeline, using DSPy. We'll discuss how DSPy works under the covers and highlight the benefits it provides pipeline creators and managers.
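To ground the idea, here is a deliberately simplified re-implementation of DSPy's core move: declare a signature for the task and let the framework generate the prompt text, rather than hand-writing prompts per model. This is a conceptual sketch only, not DSPy's real API (which uses `dspy.Signature` classes, modules like `dspy.ChainOfThought`, and automatic optimizers):

```python
def make_predictor(signature, call_llm):
    """Compile a declarative signature ("inputs -> output") into a callable.

    The prompt text is generated from the signature rather than hand-written,
    which is the core idea DSPy builds on (and then optimizes automatically).
    """
    inputs, output = [part.strip() for part in signature.split("->")]
    input_fields = [f.strip() for f in inputs.split(",")]

    def predict(**kwargs):
        # Render the prompt from the declared fields.
        lines = [f"{field}: {kwargs[field]}" for field in input_fields]
        prompt = "\n".join(lines) + f"\n{output}:"
        return call_llm(prompt)

    return predict

# Echo "LLM" stub so the generated prompt is visible.
echo = lambda prompt: prompt
match_places = make_predictor("name_a, name_b -> same_place", echo)
print(match_places(name_a="Cafe Roma", name_b="Caffe Roma"))
```

Because the prompt is derived from the signature, swapping models or tuning instructions becomes a framework concern instead of a find-and-replace across hand-written prompt strings — the property that makes prompts manageable inside a large conflation pipeline.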

Marketing Data + AI Leaders Forum

2025-06-10 Watch
talk
Dan Morris (Databricks) , Calen Holbrooks (Airtable) , Elizabeth Dobbs (Databricks) , David Geisinger (Deloitte) , Kristen Brophy (ThredUp) , Joyce Hwang (Dropbox) , Zeynep Inanoglu Ozdemir (Atlassian Pty Ltd.) , Bryan Saftler (Databricks) , Alex Dean (Snowplow) , Derek Slager (Amperity) , Rick Schultz (Databricks) , Bryce Peake (Domino's) , Julie Foley Long (Grammarly)

Join us Tuesday, June 10th, 9:10 AM–12:10 PM PT. Hosted by Databricks CMO Rick Schultz, this forum features executives and speakers from PetSmart, Valentino, Domino’s, Airtable, Dropbox, ThredUp, Grammarly, Deloitte and more. Come for actionable strategies and real-world examples: hear from marketing experts on how to build data- and AI-driven marketing organizations, and learn how Databricks Marketing supercharges impact using the Data Intelligence Platform, scaling personalization, building more efficient campaigns and empowering marketers to self-serve insights.

Responsible AI at Scale: Balancing Democratization and Regulation in the Financial Sector

2025-06-10 Watch
talk
Aman Thind (State Street)

We partnered with Databricks to pioneer a new standard in the financial sector's enterprise AI, balancing rapid AI democratization with strict regulatory and security requirements. At the core is our Responsible AI Gateway, enforcing jailbreak prevention and compliance on every LLM query. Real-time observability, powered by Databricks, calculates risk and accuracy metrics, detecting issues before escalation. Leveraging Databricks' model hosting ensures scalable LLM access, fortifying security and efficiency. We built frameworks to democratize AI without compromising guardrails. Operating in a regulated environment, we showcase how Databricks enables democratization and responsible AI at scale, offering best practices for financial organizations to harness AI safely and efficiently.

Simplifying Data Pipelines With Lakeflow Declarative Pipelines: A Beginner’s Guide

2025-06-10 Watch
talk
Matt Jones (Databricks) , Brad Turnbaugh (84.51)

As part of the new Lakeflow data engineering experience, Lakeflow Declarative Pipelines makes it easy to build and manage reliable data pipelines. It unifies batch and streaming, reduces operational complexity and ensures dependable data delivery at scale — from batch ETL to real-time processing. Lakeflow Declarative Pipelines excels at declarative change data capture, batch and streaming workloads, and efficient SQL-based pipelines. In this session, you’ll learn how we’ve reimagined data pipelining with Lakeflow Declarative Pipelines, including:
- A brand new pipeline editor that simplifies transformations
- Serverless compute modes to optimize for performance or cost
- Full Unity Catalog integration for governance and lineage
- Reading/writing data with Kafka and custom sources
- Monitoring and observability for operational excellence
- “Real-time Mode” for ultra-low-latency streaming
Join us to see how Lakeflow Declarative Pipelines powers better analytics and AI with reliable, unified pipelines.

Sponsored by: Atlan | How Fox & Atlan are Partnering to Make Metadata a Common System of Trust, Context, and Governance

2025-06-10 Watch
talk
Prukalpa Sankar (Atlan) , Oliver Gomes (Fox Corporation)

With hundreds of millions viewing broadcasts from news to sports, Fox relies on a sophisticated and trusted architecture ingesting 100+ data sources, carefully governed to improve UX across products, drive sales and marketing, and ensure KPI tracking. Join Oliver Gomes, VP of Enterprise and Data Platform at Fox, and Prukalpa Sankar of Atlan to learn how true partnership helps their team navigate opportunities from Governance to AI. To govern and democratize their multi-cloud data platform, Fox chose Atlan to make data accessible and understandable for more users than ever before. Their team then used a data product approach to create a shared language using context from sources like Unity Catalog at a single point of access, no matter the underlying technology. Now, Fox is defining an ambitious future for Metadata. With Atlan and Iceberg driving interoperability, their team prepares to build a “control plane”, creating a common system of trust and governance.

Sponsored by: dbt Labs | Empowering the Enterprise for the Next Era of AI and BI

2025-06-10 Watch
talk
Elias DeFaria (dbt Labs)

The next era of data transformation has arrived. AI is enhancing developer workflows, enabling downstream teams to collaborate effectively through governed self-service. Additionally, SQL comprehension is producing detailed metadata that boosts developer efficiency while ensuring data quality and cost optimization. Experience this firsthand with dbt’s data control plane, a centralized platform that provides organizations with repeatable, scalable, and governed methods to succeed with Databricks in the modern age.

Sponsored by: EY | Navigating the Future: Knowledge-Powered Insights on AI, Information Governance, Real-Time Analytics

2025-06-10 Watch
talk
Linh Nguyen (Edward Jones), Felix Chang (EY)

In an era where data drives strategic decision-making, organizations must adapt to the evolving landscape of business analytics. This session will focus on three pivotal themes shaping the future of data management and analytics in 2025. Join our panel of experts, including a Business Analytics Leader, Head of Information Governance, and Data Science Leader, as they explore:
- Knowledge-Powered AI: Discover trends in Knowledge-Powered AI and how these initiatives can revolutionize business analytics, with real-world examples of successful implementations.
- Information Governance: Explore the role of information governance in ensuring data integrity and compliance. Our experts will discuss strategies for establishing robust frameworks that protect organizational assets.
- Real-Time Analytics: Understand the importance of real-time analytics in today’s fast-paced environment. The panel will highlight how organizations can leverage real-time data for agile decision-making.

Unity Catalog Managed Tables: Faster Queries, Lower Costs, Effortless Data Management

2025-06-10 Watch
talk
Elizabeth Bowman (Databricks) , Sirui Sun (Databricks)

What if you could simplify data management, boost performance and cut costs, all at once? Join us to discover how Unity Catalog managed tables can slash your storage costs, supercharge query speeds and automate optimizations with AI on the Data Intelligence Platform. Experience seamless interoperability with third-party clients, and be among the first to preview our new game-changing tool that makes moving to UC managed tables effortless. Don’t miss this exciting session that will redefine your data strategy!

AI-Powered Marketing Data Management: Solving the Dirty Data Problem with Databricks

2025-06-10 Watch
talk
Steven Kostrzewski (Acxiom) , Ankur Jain (Acxiom)

Marketing teams struggle with ‘dirty data’ — incomplete, inconsistent, and inaccurate information that limits campaign effectiveness and reduces the accuracy of AI agents. Our AI-powered marketing data management platform, built on Databricks, solves this with anomaly detection, ML-driven transformations and the built-in Acxiom Referential Real ID Graph with Data Hygiene. We’ll showcase how Delta Lake, Unity Catalog and Lakeflow Declarative Pipelines power our multi-tenant architecture, enabling secure governance and 75% faster data processing. Our privacy-first design ensures compliance with GDPR, CCPA and HIPAA through role-based access, encryption key management and fine-grained data controls. Join us for a live demo and Q&A, where we’ll share real-world results and lessons learned in building a scalable, AI-driven marketing data solution with Databricks.
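As a toy illustration of the anomaly-detection idea above — the z-score rule, the threshold, and the spend figures are all hypothetical; the platform described in the session uses ML-driven detection rather than this simple statistic:

```python
from statistics import mean, stdev

def flag_anomalies(values, threshold=2.0):
    """Return indices of points whose z-score exceeds the threshold.

    Note: an extreme outlier inflates the stdev itself, so small samples
    need a modest threshold; production systems would use robust statistics
    or learned models instead of a plain z-score.
    """
    mu, sigma = mean(values), stdev(values)
    if sigma == 0:
        return []
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]

# Daily campaign spend with one obviously corrupt record at index 5.
spend = [120.0, 115.0, 130.0, 125.0, 118.0, 9999.0, 122.0]
print(flag_anomalies(spend))  # [5]
```

Flagging records like this before they reach downstream models is the point: one corrupt spend figure can skew every aggregate and AI agent fed from it.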

Boosting Data Science and AI Productivity With Databricks Notebooks

2025-06-10 Watch
talk
Vijay Raghavan (Thumbtack) , Jason Cui (Databricks)

This session is repeated. Want to accelerate your team's data science workflow? This session reveals how Databricks Notebooks can transform your productivity through an optimized environment designed specifically for data science and AI work. Discover how notebooks serve as a central collaboration hub where code, visualizations, documentation and results coexist seamlessly, enabling faster iteration and development. Key takeaways:
- Leveraging interactive coding features including multi-language support, command-mode shortcuts and magic commands
- Implementing version control best practices through Git integration and notebook revision history
- Maximizing collaboration through commenting, sharing and real-time co-editing capabilities
- Streamlining ML workflows with built-in MLflow tracking and experiment management
You'll leave with practical techniques to enhance your notebook-based workflow and deliver AI projects faster with higher-quality results.

Data Management and Governance With UC

2025-06-10
talk

In this course, you'll learn concepts and perform labs that showcase workflows using Unity Catalog - Databricks' unified and open governance solution for data and AI. We'll start off with a brief introduction to Unity Catalog, discuss fundamental data governance concepts, and then dive into a variety of topics including using Unity Catalog for data access control, managing external storage and tables, data segregation, and more.
Pre-requisites: Beginner familiarity with the Databricks Data Intelligence Platform (selecting clusters, navigating the Workspace, executing notebooks); cloud computing concepts (virtual machines, object storage, etc.); production experience working with data warehouses and data lakes; intermediate experience with basic SQL concepts (select, filter, group by, join, etc.); beginner programming experience with Python (syntax, conditionals, loops, functions); and beginner programming experience with the Spark DataFrame API (configuring DataFrameReader and DataFrameWriter to read and write data, expressing query transformations using DataFrame methods and Column expressions, etc.).
Labs: Yes
Certification Path: Databricks Certified Data Engineer Associate

Easy Ways to Optimize Your Databricks Costs

2025-06-10 Watch
talk
Youssef Mrini (Databricks) , Yassine Essawabi (Databricks)

In this session, we will explore effective strategies for optimizing costs on the Databricks platform, a leading solution for handling large-scale data workloads. Databricks, known for its open and unified approach, offers several tools and methodologies to ensure users can maximize their return on investment (ROI) while managing expenses efficiently. Key points:
- Understanding usage with AI/BI tools
- Organizing costs with tagging
- Setting up budgets
- Leveraging System Tables
By the end of this session, you will have a comprehensive understanding of how to leverage Databricks' built-in tools for cost optimization, ensuring that your data and AI projects not only deliver value but do so in a cost-effective manner. This session is ideal for data engineers, financial analysts and decision-makers looking to enhance their organization's efficiency and financial performance through strategic cost management on Databricks.
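The tagging point above boils down to a group-by over usage records. A toy sketch of per-team chargeback — the record shape, tag key, and rates are hypothetical; in practice this data would come from Databricks system tables such as billing usage:

```python
from collections import defaultdict

def cost_by_tag(usage_records, tag_key="team"):
    """Aggregate estimated cost per value of a cluster tag."""
    totals = defaultdict(float)
    for rec in usage_records:
        tag = rec["tags"].get(tag_key, "untagged")  # surface untagged spend
        totals[tag] += rec["dbus"] * rec["rate_usd"]
    return dict(totals)

usage = [
    {"tags": {"team": "marketing"}, "dbus": 120.0, "rate_usd": 0.5},
    {"tags": {"team": "finance"}, "dbus": 40.0, "rate_usd": 0.5},
    {"tags": {}, "dbus": 10.0, "rate_usd": 0.5},
]
print(cost_by_tag(usage))
# {'marketing': 60.0, 'finance': 20.0, 'untagged': 5.0}
```

The "untagged" bucket is the useful part of a chargeback report: it shows exactly how much spend your tagging policy has not yet attributed to an owner.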

From Code Completion to Autonomous Software Engineering Agents

2025-06-10 Watch
talk
Kilian Lieret (Princeton University)

As language models have advanced, they have moved beyond code completion and are beginning to tackle software engineering tasks in a more autonomous, agentic way. However, evaluating agentic capabilities is challenging. To address this, we first introduce SWE-bench, a benchmark built from real GitHub issues that has become the standard for assessing AI’s ability to resolve complex software tasks in large codebases. We will discuss the current state of the field, the limitations of today’s models, and how far we still are from truly autonomous AI developers. Next, we will explore the fundamentals of agents based on hands-on demonstrations with SWE-agent, a simple yet powerful agent framework designed for software engineering but adaptable to a variety of domains. By the end of this session, you will have a clear understanding of the current frontier of agentic AI in software engineering, the challenges ahead and how you can experiment with AI agents in your own workflows.

Gen AI Evaluation and Governance

2025-06-10
talk

This course introduces learners to evaluating and governing GenAI (generative artificial intelligence) systems. First, learners will explore the meaning behind and motivation for building evaluation and governance/security systems. Next, the course will connect evaluation and governance systems to the Databricks Data Intelligence Platform. Third, learners will be introduced to a variety of evaluation techniques for specific components and types of applications. Finally, the course will conclude with an analysis of evaluating entire AI systems with respect to performance and cost.
Pre-requisites: Familiarity with prompt engineering and experience with the Databricks Data Intelligence Platform, plus knowledge of retrieval-augmented generation (RAG) techniques including data preparation, embeddings, vectors and vector databases.
Labs: Yes
Certification Path: Databricks Certified Generative AI Engineer Associate