talk-data.com

Event

Data + AI Summit 2025

2025-06-09 – 2025-06-13 Databricks Summit

Activities tracked

715

Sessions & talks

Showing 326–350 of 715 · Newest first

Sponsored by: Atlan | Domain-driven Data Governance in the AI Era: A Conversation with General Motors and Atlan

2025-06-11 Watch
lightning_talk

Now the largest automaker in the United States, selling more than 2.7 million vehicles in 2024, General Motors is setting a bold vision for its future, with software-defined vehicles and AI as a driving force. With data as a crucial asset, a transformation of this scale calls for a modern approach to data governance. Join Sherri Adame, Enterprise Data Governance Leader at General Motors, to learn about GM’s novel governance approach, supported by technologies like Atlan and Databricks. Hear how Sherri and her team are shifting governance to the left with automation, implementing data contracts, and accelerating data product discovery across domains, creating a cultural shift that emphasizes data as a competitive advantage.

Sponsored by: Hexaware | Global Data at Scale: Powering Front Office Transformation with Databricks

2025-06-11 Watch
lightning_talk
Bindu Birur (KPMG)

Join KPMG for an engaging session on how we transformed our data platform and built a cutting-edge Global Data Store (GDS), a game-changing data hub for our Front Office Transformation (FOT). Discover how we seamlessly unified data from various member firms, turning it into a dynamic engine that lets our business leverage our Front Office ecosystem for smarter analytics and decision-making. Learn about our unique approach that rapidly integrates diverse datasets into the GDS and our hub-and-spoke model, which connects member firms’ data lakes and enables secure, high-speed collaboration via Delta Sharing. Hear how we are leveraging Unity Catalog to help ensure data governance, compliance, and straightforward data lineage. We’ll share strategies for risk management, security (fine-grained access, encryption), and scaling a cloud-based data ecosystem.

Sponsored by: Tiger Analytics | Data-Driven Transformation to Hypercharge Predictive and Diagnostic Supply Chain Intelligence

2025-06-11 Watch
lightning_talk
Vishal Puri (Tiger Analytics)

Manufacturers today need efficient, accurate, and flexible integrated planning across supply, demand, and finance. A leading industrial manufacturer is pursuing a competitive edge in Integrated Business Planning through data and AI. Their strategy: a connected, real-time data foundation with democratized access across silos. Using Databricks, we’re building business-centric data products to enable near real-time, collaborative decisions and scaled AI. Unity Catalog ensures data reliability and adoption. Increased data visibility is driving better on-time delivery, inventory optimization, and forecasting, resulting in measurable financial impact. In this session, we’ll share our journey to the north star of “driving from the windshield, not the rearview,” including key data, organization, and process challenges in enabling data democratization; architectural choices for Integrated Business Planning as a data product; and core capabilities delivered with Tiger’s Accelerator.

Summit Live: Fireside Chat with Arsalan Tavakoli, Databricks co-Founder

2025-06-11 Watch
fireside_chat
Arsalan Tavakoli-Shiraji (Databricks), Ari Kaplan (Databricks)

Arsalan Tavakoli-Shiraji co-founded Databricks, growing it into one of the most influential tech companies in history. Arsalan will expand on the mainstage keynotes and discuss how companies are implementing, monetizing, and scaling data intelligence.

Achieving AI Success with a Solid Data Foundation

2025-06-11 Watch
talk
Santosh Kudva (GE Vernova), Kevin Tollison (EY)

Join us for an insightful presentation on creating a robust data architecture to drive business outcomes in the age of Generative AI. Santosh Kudva, GE Vernova Chief Data Officer, and Kevin Tollison, EY AI Consulting Partner, will share their expertise on transforming data strategies to unleash the full potential of AI. Learn how GE Vernova, a dynamic enterprise born from the 2024 spin-off of GE, revamped its diverse landscape. They will provide a look into how they integrated the pre-spin-off Finance Data Platform into the GE Vernova Enterprise Data & Analytics ecosystem, utilizing Databricks to enable high-performance AI-led analytics. Key insights include: incorporating Generative AI into your overarching strategy; leveraging comprehensive analytics to enhance data quality; and building a resilient data framework adaptable to continuous evolution. Don't miss this opportunity to hear from industry leaders and gain valuable insights to elevate your data strategy and AI success.

Agent Bricks: Building Multi-Agent Systems for Structured and Unstructured Information

2025-06-11 Watch
talk
Elise Gonzales (Databricks)

Learn how to build sophisticated systems that enable natural language interactions with both your structured databases and unstructured document collections. This session explores advanced techniques for creating unified and governed AI systems that can seamlessly interpret questions, retrieve relevant information and generate accurate answers across your entire data ecosystem. Key takeaways include: strategies for combining vector search over unstructured documents with retrieval from structured databases; techniques for optimizing unstructured data processing through effective parsing, metadata enrichment and intelligent chunking; methods for integrating different retrieval mechanisms while ensuring consistent data governance and security; and practical approaches for evaluating and improving knowledge-base question answering (KBQA) system quality through automated and human feedback.

AI Agents Fundamentals Training

2025-06-11
talk

This course will introduce you to AI agents, their transformative impact on organizations, and how Databricks enables the creation of AI agent systems. We’ll begin by exploring what AI agents are, how they differ from traditional AI systems, and why they are becoming essential in today’s data-driven landscape. Next, we’ll examine how AI agents can be used to automate tasks, enhance decision-making, and unlock new efficiencies for businesses of all sizes. Finally, we’ll review real-world examples of AI agent systems on Databricks, showcasing practical applications across industries and sharing key considerations for successful adoption. On completion, you can take a short quiz to earn a badge that validates your learning.

Autonomous AI Agents in AI Infrastructure

2025-06-11 Watch
lightning_talk
Apurva Kumar (Walmart Global Tech)

Autonomous AI agents are transforming industries by enabling systems to perform tasks, make decisions and adapt in real time without human intervention. In this talk, I will delve into the architecture and design principles required to build these agents within scalable AI infrastructure. Key topics will include constructing modular, reusable frameworks, optimizing resource allocation and enabling interoperability between agents and data pipelines. Through practical use cases, attendees will learn how to leverage containerization and orchestration techniques to enhance the flexibility and performance of these agents while ensuring low-latency decision-making. The session will also cover challenges such as ensuring robustness and addressing ethical considerations, along with strategies for real-time feedback loops. Participants will gain actionable insights into building autonomous AI agents that drive efficiency, scalability and innovation in modern AI ecosystems.

Best Practices for Moving to Unity Catalog Managed Tables

2025-06-11 Watch
talk
Youssef Mrini (Databricks), Elizabeth Bowman (Databricks)

Are you ready to unlock the full power of Unity Catalog managed tables? This session delivers actionable insights for transitioning to UC managed tables. Learn why managed tables are the default for performance and ease of use, and how automatic feature upgrades future-proof your architecture. Whether you manage thousands of tables or want to streamline operations, you’ll gain the tools and strategies to thrive in the era of intelligent data management. Join us and discover how easy it is to move to UC managed tables!

Building a Self-Service Data Platform With a Small Data Team

2025-06-11 Watch
talk
Gleb Lesnikov (Dodo Brands), Evgenii Dobrynin (Dodo Brands)

Discover how Dodo Brands, a global pizza and coffee business with over 1,200 retail locations and 40k employees, revolutionized their analytics infrastructure by creating a self-service data platform. This session explores the approach to empowering analysts, data scientists and ML engineers to independently build analytical pipelines with minimal involvement from data engineers. By leveraging Databricks as the backbone of their platform, the team developed automated tools like a "job-generator" that uses Jinja templates to streamline the creation of data jobs. This approach minimized manual coding and enabled non-data engineers to create over 1,420 data jobs, 90% of which were auto-generated from user configurations. The platform supports thousands of weekly active users via tools like Apache Superset. This session provides actionable insights for organizations seeking to scale their analytics capabilities efficiently without expanding their data engineering teams.
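
As a rough illustration of the config-driven generation described in this abstract, the sketch below renders a job definition from a user configuration. It is not Dodo Brands' job-generator: it uses Python's standard-library string.Template as a stand-in for Jinja so that it runs without extra dependencies, and the template fields (job_name, schedule, source_table, target_table) and the render_job helper are hypothetical.

```python
from string import Template

# Hypothetical job template; a real platform would likely keep this in a file.
JOB_TEMPLATE = Template(
    "name: $job_name\n"
    "schedule: $schedule\n"
    "sql: INSERT INTO $target_table SELECT * FROM $source_table\n"
)


def render_job(config: dict) -> str:
    """Render one job definition from a user-supplied configuration."""
    # substitute() raises KeyError if the config omits a required field,
    # so incomplete configurations fail before any job is created.
    return JOB_TEMPLATE.substitute(config)


# Hypothetical configuration an analyst might submit.
user_config = {
    "job_name": "daily_orders",
    "schedule": "0 3 * * *",
    "source_table": "raw.orders",
    "target_table": "analytics.orders",
}

job_definition = render_job(user_config)
print(job_definition)
```

The design point is that users edit only the configuration, while the template, owned by the platform team, fixes the shape of every generated job.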

Building Intelligent AI Agents With Claude Models and Databricks Mosaic AI Framework

2025-06-11 Watch
talk
Sam Flamini (Anthropic)

This session is repeated. Explore how Anthropic's frontier models power AI agents in Databricks Mosaic AI Agent Framework. Learn to leverage Claude's state-of-the-art capabilities for complex agentic workflows while benefiting from Databricks' unified governance, credential management and evaluation tools. We'll demonstrate how Anthropic's models integrate seamlessly to create production-ready applications that combine Claude's reasoning with Databricks' data intelligence capabilities.

Comprehensive Guide to MLOps on Databricks

2025-06-11 Watch
talk
Arpit Jasapara (Databricks), Eric Golinko (Databricks)

This in-depth session explores advanced MLOps practices for implementing production-grade machine learning workflows on Databricks. We'll examine the complete MLOps journey from foundational principles to sophisticated implementation patterns, covering essential tools including MLflow, Unity Catalog, Feature Stores and version control with Git. Dive into Databricks' latest MLOps capabilities including MLflow 3.0, which enhances the entire ML lifecycle from development to deployment with particular focus on generative AI applications. Key session takeaways include: advanced MLflow 3.0 features for LLM management and deployment; enterprise-grade governance with Unity Catalog integration; robust promotion patterns across development, staging and production; CI/CD pipeline automation for continuous deployment; and GenAI application evaluation and streamlined deployment.

Databricks on Databricks: Transforming the Sales Experience using GenAI Agents at Scale

2025-06-11 Watch
talk
Manjeet Singh Chhabra (Databricks), Akhil Aggrawal (Databricks)

Databricks is transforming its sales experience with a GenAI agent, built and deployed entirely on Databricks, to automate tasks, streamline data retrieval, summarize content, and enable conversational AI for over 4,000 sellers. This agent leverages the AgentEval framework, AI Bricks, and Model Serving to process both structured and unstructured data within Databricks, unlocking deep sales insights. The agent securely integrates with multiple data sources, including Salesforce, Google Drive, and Glean, via OAuth. This session includes a live demonstration and explores the business impact, the architecture, and agent development and evaluation strategies, providing a blueprint for deploying secure, scalable GenAI agents in large enterprises.

Defending Revenue With GenAI

2025-06-11 Watch
lightning_talk
Garrison Nakanelua (Blueprint)

Defending revenue is critical to any business strategy, and predicting customer churn is difficult. Until now. In this session, Blueprint will share how their clients use GenAI on Databricks to reduce customer churn, grow average revenue per user, and create overall revenue growth. This presentation will demonstrate how they helped a customer take a GenAI-powered personalization engine from proof-of-concept to production to improve customer churn propensity, personalized retention, and customer satisfaction. Learn how to turn your lakehouse from a cost center into a profit center.

Enhancing Efficiency With Security: How Morgan Stanley is Adopting a Fully-Managed Lakehouse

2025-06-11
talk
Boris Dank (Morgan Stanley), Samrat Ray (Databricks)

Morgan Stanley, a highly regulated financial institution, needs to meet stringent security and regulatory requirements around data storage and processing. Traditionally, this has necessitated maintaining control over data and compute within their own accounts with the associated management overhead. In this session, we will cover how Morgan Stanley has partnered with Databricks on a fully-managed compute and storage solution that allows them to meet their regulatory obligations with significantly reduced effort. This innovative approach enables rapid onboarding of new projects onto the platform, improving operational efficiency while maintaining the highest levels of security and compliance.

Franchise IP and Data Governance at Krafton: Driving Cost Efficiency and Scalability

2025-06-11 Watch
lightning_talk
Hwaeium Yeom (KRAFTON)

Join us as we explore how KRAFTON optimized data governance for PUBG IP, enhancing cost efficiency and scalability. KRAFTON operates a massive data ecosystem, processing tens of terabytes daily. As real-time analytics demands increased, traditional batch-based processing faced scalability challenges. To address this, we redesigned data pipelines and governance models, improving performance while reducing costs. Specifically, we: transitioned to real-time pipelines (batch to streaming); optimized workload management (reducing all-purpose clusters, increasing Jobs usage); cut costs by tens of thousands monthly (up to 75%); enhanced data storage efficiency (lower S3 costs, Delta Tables); and improved pipeline stability (Medallion Architecture). Gain insights into how KRAFTON scaled data operations, leveraging real-time analytics and cost optimization for high-traffic games. Learn more: https://www.databricks.com/customers/krafton

From Prediction to Prevention: Transforming Risk Management in Insurance

2025-06-11 Watch
talk
Sebastien Gignac (Intact Financial Corp), Dylani Herath (Thrivent Financial), Marcela Granados (Databricks), Michael Ban (Nationwide)

Protecting insurers against emerging threats is critical. This session reveals how leading companies use Databricks’ Data Intelligence Platform to transform risk management, enhance fraud detection, and ensure compliance. Learn how advanced analytics, AI, and machine learning process vast data in real time to identify risks and mitigate threats. Industry leaders will share strategies for building resilient operations that protect against financial losses and reputational harm. Key takeaways: AI-powered fraud prevention using anomaly detection and predictive analytics; real-time risk assessment models integrating IoT, behavioral, and external data; and strategies for robust compliance and governance with operational efficiency. Discover how data intelligence is revolutionizing insurance risk management and safeguarding the industry’s future.

Hands-on Learning: AI-Powered Data Engineering with Lakeflow: Techniques for Modern Data Professionals

2025-06-11
talk
Frank Munz (Databricks)

This introductory workshop caters to data engineers seeking hands-on experience and data architects looking to deepen their knowledge. The workshop is structured to provide a solid understanding of the following data engineering and streaming concepts: an introduction to Lakeflow and the Data Intelligence Platform; getting started with Lakeflow Declarative Pipelines in SQL using Streaming Tables and Materialized Views; mastering Databricks Workflows with advanced control flow and triggers; understanding serverless compute; data governance and lineage with Unity Catalog; and generative AI for data engineers (Genie and Databricks Assistant). We believe you can only become an expert if you work on real problems and gain hands-on experience. Therefore, we will equip you with your own lab environment in this workshop and guide you through practical exercises like using GitHub, ingesting data from various sources, creating batch and streaming data pipelines, and more.

Hands-On Learning: Build Custom Data Intelligence Apps on Databricks

2025-06-11
talk
Justin DeBrabant (Databricks), Giran Moodley (Databricks), Ivan Trusov (Databricks)

Want to learn how to build your own custom data intelligence applications directly in Databricks? In this workshop, we’ll guide you through a hands-on tutorial for building a Streamlit web app that leverages many of the key products at Databricks as building blocks. You’ll integrate a live DB SQL warehouse, use Genie to ask questions in natural language, and embed AI/BI dashboards for interactive visualizations. In addition, we’ll discuss key concepts and best practices for building production-ready apps, including logging and observability, scalability, different authorization models, and deployment. By the end, you'll have a working AI app—and the skills to build more.

How Skyscanner Runs Real-Time AI at Scale with Databricks

2025-06-11 Watch
talk
Ahmed Bilal (Databricks), Michael Ewins (Skyscanner)

Deploying AI in production is getting more complex, with different model types, tighter timelines, and growing infrastructure demands. In this session, we’ll walk through how Mosaic AI Model Serving helps teams deploy and scale both traditional ML and generative AI models efficiently, with built-in monitoring and governance. We’ll also hear from Skyscanner on how they’ve integrated AI into their products, scaled to 100+ production endpoints, and built the processes and team structures to support AI at scale. Key takeaways: how Skyscanner ships and operates AI in real-world products; how to deploy and scale a variety of models with low latency and minimal overhead; building compound AI systems using models, feature stores, and vector search; and monitoring, debugging, and governing production workloads.

How to Build an Open Lakehouse: Best Practices for Interoperability

2025-06-11 Watch
talk
James Malone (Databricks), Aniruth Narayanan (Databricks)

Building an open data lakehouse? Start with the right blueprint. This session walks through common reference architectures for interoperable lakehouse deployments across AWS, Google Cloud, Azure and tools like Snowflake, BigQuery and Microsoft Fabric. Learn how to design for cross-platform data access, unify governance with Unity Catalog and ensure your stack is future-ready — no matter where your data lives.

Lakebase: Fully Managed Postgres for the Lakehouse

2025-06-11 Watch
talk
Abbey Russell (Databricks), Dave Nettleton (Databricks)

Lakebase is a new Postgres-compatible OLTP database designed to support intelligent applications. Lakebase eliminates custom ETL pipelines with built-in lakehouse table synchronization, supports sub-10ms latency for high-throughput workloads, and offers full Postgres compatibility, so you can build applications more quickly. In this session, you’ll learn how Lakebase enables faster development, production-level concurrency, and simpler operations for data engineers and application developers building modern, data-driven applications. We'll walk through key capabilities, example use cases, and how Lakebase simplifies infrastructure while unlocking new possibilities for AI and analytics.

Lakeflow Connect: Easy, Efficient Ingestion From Databases

2025-06-11 Watch
talk
Peter Pogorski (Databricks), Bret Grantham (Databricks)

Lakeflow Connect streamlines the ingestion of incremental data from popular databases like SQL Server and PostgreSQL. In this session, we’ll review best practices for networking, security, minimizing database load, monitoring and more — tailored to common industry scenarios. Join us to gain practical insights into Lakeflow Connect's functionality so that you’re ready to build your own pipelines. Whether you're looking to optimize data ingestion or enhance your database integrations, this session will provide you with a deep understanding of how Lakeflow Connect works with databases.

Lakeflow Connect: Seamless Data Ingestion From Enterprise Apps

2025-06-11
talk
Manish Dalwadi (Databricks), Andreas Maier (Porsche Informatik GmbH)

Lakeflow Connect enables you to easily and efficiently ingest data from enterprise applications like Salesforce, ServiceNow, Google Analytics, SharePoint, NetSuite, Dynamics 365 and more. In this session, we’ll dive deep into using connectors for the most popular SaaS applications, reviewing common use cases such as analyzing consumer behavior, predicting churn and centralizing HR analytics. You'll also hear from an early customer about how Lakeflow Connect helped unify their customer data to drive an improved automotive experience. We’ll wrap up with a Q&A so you have the opportunity to learn from our experts.

Leveraging GenAI for Synthetic Data Generation to Improve Spark Testing and Performance in Big Data

2025-06-11 Watch
lightning_talk
Satej Kumar Sahu (Zalando SE)

Testing Spark jobs in local environments is often difficult due to the lack of suitable datasets, especially under tight timelines. This creates challenges when jobs work in development clusters but fail in production, or when they run locally but encounter issues in staging clusters due to inadequate documentation or checks. In this session, we’ll discuss how these challenges can be overcome by leveraging Generative AI to create custom synthetic datasets for local testing. By incorporating variations and sampling, a testing framework can be introduced to solve some of these challenges, allowing for the generation of realistic data to aid in performance and load testing. We’ll show how this approach helps identify performance bottlenecks early, optimize job performance and recognize scalability issues while keeping costs low. This methodology fosters better deployment practices and enhances the reliability of Spark jobs across environments.
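
A minimal sketch of the "variations and sampling" idea, under stated assumptions: the session describes using Generative AI to produce the datasets, whereas this example swaps in a seeded random generator so it runs locally without any model. The schema (order_id, country, amount), the edge-case values, and the function names are hypothetical and not taken from the talk.

```python
import random


def synthetic_orders(n: int, seed: int = 42) -> list[dict]:
    """Generate n hypothetical order rows, mixing typical and edge-case values."""
    rng = random.Random(seed)  # seeded so test runs are reproducible
    countries = ["DE", "FR", "IT", None]  # None simulates missing data
    rows = []
    for order_id in range(n):
        # Variation: roughly 1 in 10 rows gets an edge-case amount.
        if rng.random() < 0.1:
            amount = rng.choice([0.0, -1.0, 1e9])
        else:
            amount = round(rng.uniform(5, 500), 2)
        rows.append(
            {"order_id": order_id, "country": rng.choice(countries), "amount": amount}
        )
    return rows


def sample_rows(rows: list[dict], fraction: float, seed: int = 42) -> list[dict]:
    """Draw a random sample without replacement for quick local runs."""
    k = max(1, int(len(rows) * fraction))
    return random.Random(seed).sample(rows, k)


full = synthetic_orders(1_000)   # larger set, e.g. for load-style tests
small = sample_rows(full, 0.05)  # small slice for fast local iteration
print(len(full), len(small))
```

In a Spark test, rows like these could then be loaded into a DataFrame; that step is omitted here to keep the example free of dependencies.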