talk-data.com

Event

Data + AI Summit 2025

2025-06-09 – 2025-06-13 Databricks Summit

Activities tracked

715

Sessions & talks

Data Triggers and Advanced Control Flow With Lakeflow Jobs

2025-06-11
talk

Lakeflow Jobs is the production-ready fully managed orchestrator for the entire Lakehouse with 99.95% uptime. Join us for a dive into how you can orchestrate your enterprise data operations, from triggering your jobs only when your data is ready to advanced control flow with conditionals, looping and job modularity, with demos! Attendees will gain practical insights into optimizing their data operations by orchestrating with Lakeflow Jobs:
- New task types: Publish AI/BI Dashboards, push to Power BI or ingest with Lakeflow Connect
- Advanced execution control: Reference SQL Task outputs, run partial DAGs and perform targeted backfills
- Repair runs: Re-run failed pipelines with surgical precision using task-level repair
- Control flow upgrades: Native for-each loops and conditional logic make DAGs more dynamic + expressive
- Smarter triggers: Kick off jobs based on file arrival or Delta table changes, enabling responsive workflows
- Code-first approach to pipeline orchestration

Delta Sharing in Action: Architecture and Best Practices

2025-06-11
talk
Darshana Sivakumar (Databricks) , Mengxi Chen (Databricks)

Delta Sharing is revolutionizing how enterprises share live data and AI assets securely, openly and at scale. As the industry’s first open data-sharing protocol, it empowers organizations to collaborate seamlessly across platforms and with any partner, whether inside or outside the Databricks ecosystem. In this deep-dive session, you’ll learn best practices and real-world use cases that show how Delta Sharing helps accelerate collaboration and fuel AI-driven innovation. We’ll also unveil the latest advancements, including:
- Managed network configurations for easier, secure setup
- OIDC identity federation for trusted, open sharing
- Expanded asset types including dynamic views, materialized views, federated tables, read clones and more
Whether you’re a data engineer, architect, or data leader, you’ll leave with practical strategies to future-proof your data-sharing architecture. Don’t miss the live demos, expert guidance and an exclusive look at what’s next in data collaboration.

Hands-on-Learning: Accelerating the Analytics Journey: Leveraging Fivetran, dbt Cloud, and Sigma on Databricks | Sponsored Session

2025-06-11
talk
Nina Anderson (dbt Labs) , Mitch Ertle (Sigma) , Pradeep Anandapu (Databricks) , David Hrncir (Fivetran)

This hands-on lab guides participants through the complete customer data analytics journey on Databricks, leveraging leading partner solutions: Fivetran, dbt Cloud, and Sigma. Attendees will learn how to:
- Seamlessly connect to Fivetran, dbt Cloud, and Sigma using Databricks Partner Connect
- Ingest data using Fivetran, transform and model data with dbt Cloud, and create interactive dashboards in Sigma, all on top of the Databricks Data Intelligence Platform
- Empower teams to make faster, data-driven decisions by streamlining the entire analytics workflow using an integrated, scalable, and user-friendly platform

Hands-on Learning: Databricks SQL in Action: Intelligent Data Warehousing, Analytics and BI Workshop

2025-06-11
workshop
Pearl Ubaru (Databricks)

Most organizations run complex cloud data architectures that silo applications, users and data. Join this interactive hands-on workshop to learn how Databricks SQL allows you to operate a multi-cloud lakehouse architecture that delivers data warehouse performance at data lake economics, with up to 12x better price/performance than traditional cloud data warehouses. Here’s what we’ll cover:
- How Databricks SQL fits in the Data Intelligence Platform, enabling you to operate a multicloud lakehouse architecture that delivers data warehouse performance at data lake economics
- How to manage and monitor compute resources, data access and users across your lakehouse infrastructure
- How to query directly on your data lake using your tools of choice or the built-in SQL editor and visualizations
- How to use AI to increase productivity when querying, completing code or building dashboards
Ask your questions during this hands-on lab, and the Databricks experts will guide you.

Learning from Goldman Sachs' Legend Lakehouse for Data Governance

2025-06-11
talk
George Wu (Goldman Sachs) , Abhishek Narang (Goldman Sachs)

Data is the backbone of modern decision-making, but centralizing it is only the tip of the iceberg. Entitlements, secure sharing and just-in-time availability are critical challenges to any large-scale platform. Join Goldman Sachs as we reveal how our Legend Lakehouse, coupled with Databricks, overcomes these hurdles to deliver high-quality, governed data at scale. By leveraging an open table format (Apache Iceberg) and open catalog format (Unity Catalog), we ensure platform interoperability and vendor neutrality. Databricks Unity Catalog then provides a robust entitlement system that aligns with our data contracts, ensuring consistent access control across producer and consumer workspaces. Finally, Legend functions, integrating with Databricks User Defined Functions (UDF), offer real-time data enrichment and secure transformations without exposing raw datasets. Discover how these components unite to streamline analytics, bolster governance and power innovation.

Lessons Learned: Building a Scalable Game Analytics Platform at Netflix

2025-06-11
talk
Michael Cuthbert (Netflix) , Bhargavi Reddy Dokuru (NETFLIX INC)

Over the past three years, Netflix has built a catalog of 100+ mobile and cloud games across TV, mobile and web platforms. With both internal and external studios contributing to this diverse ecosystem, building a robust game analytics platform became crucial for gaining insights into player behavior, optimizing game performance and driving member engagement. In this talk, we’ll share our journey of building Netflix’s Game Analytics platform from the ground up. We’ll highlight key decisions around data strategy, such as whether to develop an in-house solution or adopt an external service. We’ll discuss the challenges of balancing developer autonomy with data integrity and the complexities of managing data contracts for custom game telemetry, with an emphasis on self-service analytics. Attendees will learn how the Games Data team navigated these challenges, the lessons learned and the trade-offs involved in building a multi-tenant data ecosystem that supports diverse stakeholders.

Manufacturing Cleaner: How Data Intelligence Cuts Carbon, Not Profits

2025-06-11
talk
Jesse Grekowicz (Dow Inc.) , Tim Licquia (Dow Inc.) , Baptiste Andrieux (CGI providing services for Michelin)

Join industry leaders from Dow and Michelin as they reveal how data intelligence is revolutionizing sustainable manufacturing without compromising profitability. Dow demonstrates how their implementation of Databricks' Data Intelligence Platform has transformed their ability to track and reduce carbon footprints while driving operational efficiencies, resulting in significant cost savings through optimized maintenance and reduced downtime. Michelin follows with their ambitious strategy to achieve a 3% reduction in energy consumption by 2026, leveraging Databricks to turn this environmental challenge into operational excellence. Together, these manufacturing giants showcase how modern data architecture and AI are creating a new paradigm where sustainability and profitability go hand-in-hand.

Metadata-Driven Streaming Ingestion Using Lakeflow Declarative Pipelines, Azure Event Hubs and a Schema Registry

2025-06-11
talk
Vicky Avison (Plexure)

At Plexure, we ingest hundreds of millions of customer activities and transactions into our data platform every day, fuelling our personalisation engine and providing insights into the effectiveness of marketing campaigns. We're on a journey to transition from infrequent batch ingestion to near real-time streaming using Azure Event Hubs and Lakeflow Declarative Pipelines. This transformation will allow us to react to customer behaviour as it happens, rather than hours or even days later. It also enables us to move faster in other ways. By leveraging a Schema Registry, we've created a metadata-driven framework that allows data producers to:
- Evolve schemas with confidence, ensuring downstream processes continue running smoothly.
- Seamlessly publish new datasets into the data platform without requiring Data Engineering assistance.
Join us to learn more about our journey and see how we're implementing this with Lakeflow Declarative Pipelines meta-programming, including a live demo of the end-to-end process!

Payer Digital Transformation: The Impact of Data + AI

2025-06-11
talk
Neeraj Sharma (Fractal) , Aaron Zavora (Databricks) , Jagadish Venkataraman (UnitedHealth Group)

Payer organizations are rapidly embracing digital transformation, leveraging data and AI to drive operational efficiency, improve member experiences and enhance decision-making. This session explores how advanced analytics, robust data governance and AI-powered insights are enabling payers to streamline claims processing, personalize member engagement, manage pharmacy operations, and optimize care management. Thought leaders will share real-world examples of data-driven innovation, discuss strategies for overcoming interoperability and privacy challenges, and highlight the future potential of AI in reshaping the payer landscape.

PDF Document Ingestion Accelerator for GenAI Applications

2025-06-11
talk
Qian Yu (Databricks)

Databricks Financial Service customers in the GenAI space have a common use case: ingesting and processing unstructured documents (PDFs and images), then performing downstream GenAI tasks such as entity extraction and RAG-based knowledge Q&A. The pain points for customers with these types of use cases are:
- The quality of the PDF/image documents varies, since many older physical documents were scanned into electronic form
- The complexity of the PDF/image documents varies, and many contain tables (images with embedding information), which require slower Tesseract OCR
- They would like to streamline post-processing for downstream workloads
In this talk we will present an optimized structured streaming workflow for complex PDF ingestion. The key techniques include Apache Spark™ optimization, multi-threading, PDF object extraction, skew handling and auto-retry logic.

Reinvent Government in a Data Intelligence Era

2025-06-11
talk
Asim Qureshi (Databricks) , Ricky Arora (Met Council Environmental Services) , Eric Popowich (Databricks)

To dramatically transform the way citizen services are delivered, organizations must bring all data together — streaming, structured and unstructured — in a secure and governed platform.

Revolutionizing PepsiCo BI Capabilities: From Traditional BI to Next-Gen Analytics Powerhouse

2025-06-11
talk
John Abraham (PepsiCo) , Joshua Sayah Lee (PepsiCo Inc.)

This session will provide an in-depth overview of how PepsiCo, a global leader in food and beverage, transformed its outdated data platform into a modern, unified and centralized data and AI-enabled platform using the Databricks SQL serverless environment. Through three distinct implementations that transpired at PepsiCo in 2024, we will demonstrate how the PepsiCo Data Analytics & AI Group unlocked pivotal capabilities that facilitated the delivery of diverse data-driven insights to the business, reduced operational expenses and enhanced overall performance through the newly implemented platform.

Securing the Future: How Banks are Reducing Risk With Data and AI

2025-06-11
talk
Nitin Kulkarni (Nationwide Building Society) , Gordon Wilson (Sumitomo Mitsui Banking Corporation) , Thomas Sawyer (Sumitomo Mitsui Banking Corp.) , Cyril Cymbler (Databricks)

Today, executives are focused on managing regulatory scrutiny and emerging threats. Banks worldwide are leveraging the Databricks Data Intelligence Platform to enhance fraud prevention, ensure compliance and protect sensitive data while improving operational efficiency. This session will highlight how leading banks are implementing AI-driven risk management to identify vulnerabilities, streamline governance and enhance resilience. By utilizing unified data platforms, these institutions can effectively tackle threats and foster trust without hindering growth. Key takeaways:
- Fraud detection: Best practices for using machine learning to combat fraud
- Regulatory compliance: Insights on navigating complex regulations
- Secure operations: Strategies for scalable operations that protect assets and support growth
Join us to see how data intelligence is reshaping the banking industry and enabling success in uncertain times!

Sponsored by: Boomi, LP | From Pipelines to Agents: Manage Data and AI on One Platform for Maximum ROI

2025-06-11
talk

In the age of agentic AI, competitive advantage lies not only in AI models, but in the quality of the data agents reason on and the agility of the tools that feed them. To fully realize the ROI of agentic AI, organizations need a platform that enables high-quality data pipelines and provides scalable, enterprise-grade tools. In this session, discover how a unified platform for integration, data management, MCP server management, API management, and agent orchestration can help you to bring cohesion and control to how data and agents are used across your organization.

Sponsored by: Google Cloud | Unlock price-performance and efficiency on Google Cloud: Databricks & Axion in Action

2025-06-11
talk
Mo Farhat (Google Cloud)

Maximize the performance of your Databricks Platform with innovations on Google Cloud. Discover how Google's Arm-based Axion C4A virtual machines (VMs) deliver breakthrough price-performance and efficiency for Databricks, supercharging the Databricks Photon engine. Gain actionable strategies to optimize your Databricks deployments on Google Cloud.

Take it to the Limit: Art of the Possible in AI/BI

2025-06-11
talk
Noah Sommerfeld (Databricks) , Adam Levine (Databricks)

Think you know everything AI/BI can do? Think again. This session explores the art of the possible with Databricks AI/BI Dashboards and Genie, going beyond traditional analytics to unleash the full power of the lakehouse. From incorporating AI into dashboards to handling large-scale data with ease to delivering insights seamlessly to end users — we’ll showcase creative approaches that unlock insights and real business outcomes. Perfect for adventurous data professionals looking to push limits and think outside the box.

Transforming Bio-Pharma Manufacturing: Eli Lilly's Data-Driven Journey With Databricks

2025-06-11
talk
Abhijay Datta (Tredence) , SAUNAK DEBROY (Eli Lilly) , Wilfred Mascarenhas (Eli Lilly and Company)

Eli Lilly and Company, a leading bio-pharma company, is revolutionizing manufacturing with next-gen fully digital sites. Lilly and Tredence have partnered to establish a Databricks-powered Global Manufacturing Data Fabric (GMDF), laying the groundwork for transformative data products used by various personas at sites and globally. By integrating data from various manufacturing systems into a unified data model, GMDF has delivered actionable insights across several use cases such as batch release by exception, predictive maintenance, anomaly detection, process optimization and more. Our serverless architecture leverages Databricks Auto Loader for real-time data streaming, PySpark for automation and Unity Catalog for governance, ensuring seamless data processing and optimization. This platform is the foundation for data-driven processes, self-service analytics, AI and more. This session will provide details on the data architecture and strategy and share a few of the use cases delivered.

Transforming Data Pipeline Management With a Targeted Proof of Concept

2025-06-11
talk
Yi-Chen Tu (Capital One Financial) , Raghu Valluri (Capital One Financial)

At Capital One, data-driven decision making is paramount to our success. This session explores how a focused proof of concept (POC) accelerated a shift in our data pipeline management strategy, resulting in operational improvements and expanded analytical capabilities. We'll cover the business challenges that motivated POC initiation, including data latency, cost savings and scalability limitations, and real-world results. We'll also dive into an examination of the before-and-after architecture with highlights for key technological levers. This session offers insights for data engineering and machine learning practitioners seeking to optimize their data pipelines for improved performance, scalability and business value.

Unity Catalog Deep Dive: Practitioner's Guide to Best Practices and Patterns

2025-06-11
talk
JINLIN HE (Databricks) , Pamela Pettit (Databricks)

Join this deep dive session for practitioners on Unity Catalog, Databricks’ unified data governance solution, to explore its capabilities for managing data and AI assets across workflows. Unity Catalog provides fine-grained access control, automated lineage tracking, quality monitoring, policy enforcement and observability at scale. Whether your focus is data pipelines, analytics or machine learning and generative AI workflows, this session offers actionable insights on leveraging Unity Catalog’s open interoperability across tools and platforms to boost productivity and drive innovation. Learn governance best practices, including catalog configurations, access strategies for collaboration and controls for securing sensitive data. Additionally, discover how to design effective multi-cloud and multi-region deployments to ensure global compliance.

Unleashing Data Governance at iFood: Harnessing System Tables and Lineage for Dynamic Tag Propagation

2025-06-11
talk

With regulations like LGPD (Brazil's General Data Protection Law) and GDPR, managing sensitive data access is critical. This session demonstrates how to leverage Databricks Unity Catalog system tables and data lineage to dynamically propagate classification tags, empowering organizations to monitor governance and ensure compliance. The presentation covers practical steps, including system table usage, data normalization, ingestion with Lakeflow Declarative Pipelines and classification tag propagation to downstream tables. It also explores permission monitoring with alerts to proactively address governance risks. Designed for advanced audiences, this session offers actionable strategies to strengthen data governance, prevent breaches and avoid regulatory fines while building scalable frameworks for sensitive data management.

Summit Live: Data Intelligence for Marketing

2025-06-11
talk
Anoop Muraleedharan (Databricks)

Maximize the value of your company’s marketing efforts with Data Intelligence for Marketing. Databricks provides seamless, out-of-the-box integration with your ecosystem, empowering every marketer with self-serve insights. And with AI-driven CDP, you get a complete view of customers and campaigns.

Summit Live: Best Practices for Data Warehouse Migrations

2025-06-11
talk
Laurent Léturgez (Databricks)

Databricks SQL is the fastest-growing data warehouse on the market, with over 10k organizations thanks to its price performance and AI innovations. See the best practices and common architectural challenges of migrating your legacy DW, including reference architectures. Learn how to migrate easily with the recently acquired Lakebridge migration tool, and through our partners.

AI for BI without the BS

2025-06-11
lightning_talk

Stuck on a treadmill of endless report building requests? Wondering how you can ship reliable AI products to internal users and even customers? Omni is a BI and embedded analytics platform on Databricks that lets users answer their own data questions – sometimes with a little AI help. No magic, no miracles – just smart tooling that cuts through the noise and leverages well-known concepts (semantic layer, anyone?) to improve accuracy and delight users. This talk is your blueprint for getting reliable AI use cases into production and reaching the promised land of contagious self-service.

Delta Sharing Demystified: Options, Use Cases and How it Works

2025-06-11
lightning_talk
Julia Førde (Evidi)

Data sharing doesn’t have to be complicated. In this session, we’ll take a practical look at Delta Sharing in Databricks — what it is, how it works and how it fits into your organization’s data ecosystem. The focus will be on giving an overview of the different ways to share data using Databricks, from direct sharing setups to broader distribution via the Databricks Marketplace and more collaborative approaches like Clean Rooms. This talk is meant for anyone curious about modern, secure data sharing — whether you're just getting started or looking to expand your use of Databricks. Attendees will walk away with a clearer picture of what’s possible, what’s required to get started and how to choose the right sharing method for the right scenario.

Generating Laughter: Testing and Evaluating the Success of LLMs for Comedy

2025-06-11
lightning_talk
Erin Staples (Galileo)

Nondeterministic AI models, like large language models (LLMs), offer immense creative potential but require new approaches to testing and scalability. Drawing from her experience running New York Times-featured Generative AI comedy shows, Erin uncovers how traditional benchmarks may fall short and how embracing unpredictability can lead to innovative, laugh-inducing results. This talk will explore methods like multi-tiered feedback loops, chaos testing and exploratory user testing, where AI outputs are evaluated not by rigid accuracy standards but by their adaptability and resonance across different contexts — from comedy generation to functional applications. Erin will emphasize the importance of establishing a root source of truth — a reliable dataset or core principle — to manage consistency while embracing creativity. Whether you’re looking to generate a few laughs of your own or explore creative uses of Generative AI, this talk will inspire and delight enthusiasts of all levels.