Data + AI Summit 2025

Sponsored by: Domo | Orchestrating Fleet Intelligence with AI Agents and Real-Time IoT With Databricks + DOMO

AI-Driven Drug Discovery: Accelerating Molecular Insights With NVIDIA and Databricks

2025-06-10 Watch

talk

Karuna Nadadur (NVIDIA) , Srijit Chandrashekhar Nair (Databricks)

AI/ML Data Lakehouse Databricks Delta

This session is repeated. In the race to revolutionize healthcare and drug discovery, biopharma companies are turning to AI to streamline workflows and unlock new scientific insights. This session, we will explore how NVIDIA BioNeMo, combined with Databricks Delta Lakehouse, can be used for advancing drug discovery for critical applications like molecular structure modeling, protein folding and diagnostics. We’ll demonstrate how BioNeMo pre-trained models can run inference on data securely stored in Delta Lake, delivering actionable insights. By leveraging containerized solutions on Databricks’ ML Runtime with GPU acceleration, users can achieve significant performance gains compared to traditional CPU-based computation.

AI Powering Epsilon's Identity Strategy: Unified Marketing Platform on Databricks

2025-06-10 Watch

talk

Gairik Chakraborty (Epsilon Data Management) , Boaz Super (Epsilon Data Management)

AI/ML Cloud Computing Data Lakehouse Data Management Data Science Databricks

Join us to hear about how Epsilon Data Management migrated Epsilon’s unique, AI-powered marketing identity solution from multi-petabyte on-prem Hadoop and data warehouse systems to a unified Databricks Lakehouse platform. This transition enabled Epsilon to further scale its Decision Sciences solution and enable new cloud-based AI research capabilities on time and within budget, without being bottlenecked by the resource constraints of on-prem systems. Learn how Delta Lake, Unity Catalog, MLflow and LLM endpoints powered massive data volume, reduced data duplication, improved lineage visibility, accelerated Data Science and AI, and enabled new data to be immediately available for consumption by the entire Epsilon platform in a privacy-safe way. Using the Databricks platform as the base for AI and Data Science at global internet scale, Epsilon deploys marketing solutions across multiple cloud providers and multiple regions for many customers.

Cloud-to-Cloud Data Sharing by Walmart: Direct Access to Omni-Channel Sales Data With Delta Sharing

2025-06-10

talk

Roberto Robles Nacif (Walmart Data Ventures) , Ajay Bhonsule (Walmart Inc.)

Analytics API Cloud Computing Databricks Delta Omni

As first-party data becomes increasingly invaluable to organizations, Walmart Data Ventures is dedicated to bringing to life new applications of Walmart’s first-party data to better serve its customers. Through Scintilla, its integrated insights ecosystem, Walmart Data Ventures continues to expand its offerings to deliver insights and analytics that drive collaboration between our merchants, suppliers, and operators.Scintilla users can now access Walmart data using Cloud Feeds, based on Databricks Delta Sharing technologies. In the past, Walmart used API-based data sharing models, which required users to possess certain skills and technical attributes that weren’t always available. Now, with Cloud Feeds, Scintilla users can more easily access data without a dedicated technical team behind the scenes making it happen. Attendees will gain valuable insights into how Walmart has built its robust data sharing architecture and strategies to design scalable and collaborative data sharing architectures in their own organizations.

Demystifying Upgrading to Unity Catalog — Challenges, Design and Execution

2025-06-10 Watch

talk

Dipankar Kushari (Databricks) , Anirudh Kala (Celebal Technologies)

AI/ML Databricks

Databricks Unity Catalog (UC) is the industry’s only unified and open governance solution for data and AI, built into the Databricks Data Intelligence Platform. UC provides a single source of truth for organization’s data and AI, providing open connectivity to any data source, any format, lineage, monitoring and support for open sharing and collaboration. In this session we will discuss the challenges in upgrading to UC from your existing databricks Non-UC set up. We will discuss a few customer use cases and how we overcame difficulties and created a repeatable pattern and reusable assets to replicate the success of upgrading to UC across some of the largest databricks customers. It is co-presented with our partner Celebal Technologies.

Dusting off the Cobwebs — Moving off a 26-year-old Heritage Platform to Databricks [Teradata]

2025-06-10 Watch

talk

Joanna Gurry (National Australia Bank)

Agile/Scrum AI/ML Analytics Cloud Computing Data Analytics Databricks

Join us to hear about how National Australia Bank (NAB) successfully completed a significant milestone in its data strategy by decommissioning its 26-year-old Teradata environment and migrating to a new strategic data platform called 'Ada'. This transition marks a pivotal shift from legacy systems to a modern, cloud-based data and AI platform powered by Databricks. The migration process, which spanned two years, involved ingesting 16 data sources, transferring 456 use cases, and collaborating with hundreds of users across 12 business units. This strategic move positions NAB to leverage the full potential of cloud-native data analytics, enabling more agile and data-driven decision-making across the organization. The successful migration to Ada represents a significant step forward in NAB's ongoing efforts to modernize its data infrastructure and capitalize on emerging technologies in the rapidly evolving financial services landscape

Empowering Progress: Building a Personalized Training Goal Ecosystem with Databricks

2025-06-10 Watch

talk

Kristi Korsberg (Tonal) , Giuseppe Barbalinardo (Tonal)

Databricks Spark

Tonal is the ultimate strength training system, giving you the expertise of a personal trainer and a full gym in your home. Through user interviews and social media feedback, we identified a consistent challenge: members found it difficult to measure their progress in their fitness journey. To address this, we developed the Training Goal (TG) ecosystem, a four-part solution that introduced new preference options to capture users' fitness aspirations, implemented weekly metrics that accumulate as members complete workouts, defined personalized weekly targets to guide progress, and enhanced workout details to show how each session contributes toward individual goals.We present how we leveraged Spark, MLflow, and Workflows within the Databricks ecosystem to compute TG metrics, manage model development, and orchestrate data pipelines. These tools allowed us to launch the TG system on schedule, supporting scalability, reliability, and a more meaningful, personalized way for members to track their progress.

From Days to Seconds — Reducing Query Times on Large Geospatial Datasets by 99%

2025-06-10 Watch

talk

Chris Crawford (Databricks) , Hobson Bryan (Global Water Security Center)

CI/CD Data Lakehouse Databricks Git Cyber Security Spark

The Global Water Security Center translates environmental science into actionable insights for the U.S. Department of Defense. Prior to incorporating Databricks, responding to these requests required querying approximately five hundred thousand raster files representing over five hundred billion points. By leveraging lakehouse architecture, Databricks Auto Loader, Spark Streaming, Databricks Spatial SQL, H3 geospatial indexing and Databricks Liquid Clustering, we were able to drastically reduce our “time to analysis” from multiple business days to a matter of seconds. Now, our data scientists execute queries on pre-computed tables in Databricks, resulting in a “time to analysis” that is 99% faster, giving our teams more time for deeper analysis of the data. Additionally, we’ve incorporated Databricks Workflows, Databricks Asset Bundles, Git and Git Actions to support CI/CD across workspaces. We completed this work in close partnership with Databricks.

GenAI Observability in Customer Care

2025-06-10 Watch

talk

Matteo Ciccozzi (EarnIn) , Willem Dhaeseleer (EarnIn)

AI/ML BI Databricks GenAI Kafka

Customer support is going through the GenAI revolution, but how can we use AI to foster deeper empathy with our end users?To enable this, Earnin has built its GenAI observability platform on Databricks, leveraging Lakeflow Declarative Pipeliness, Kafka and Databricks AI/BI.This session covers how we use Lakeflow Declarative Pipelines to monitor our customer care chatbot in near real-time and how we leverage Databricks to better anticipate our customers' needs.

Genie for Engineering: Optimizing HVAC Design and Operational Insights With Data and AI

2025-06-10

talk

Mohamed Hanif Ansari (Lennox) , Sridhar Venkatesh (Lennox)

AI/ML Databricks Delta Spark

In this session, we will explore how Genie, an AI-driven platform transformed HVAC operational insights by leveraging Databricks offerings like Apache Spark, Delta Lake and the Databricks Data Intelligence Platform.Key contributions: Real-time data processing: Lakeflow Declarative Pipelines and Apache Spark™ for efficient data ingestion and real-time analysis. Workflow orchestration: Databricks Data Intelligence Platform to orchestrate complex workflows and integrate various data sources and analytical tools. Field Data Integration: Incorporating real-time field data into design and algorithm development, enabling engineers to make informed adjustments and optimize performance. By analyzing real-time data from HVAC installations, Genie identified discrepancies between design specs and field performance, allowing engineers to optimize algorithms, reduce inefficiencies and improve customer satisfaction. Discover how Genie revolutionized HVAC management and apply to your projects.

Geo-Powering Insights: The Art of Spatial Data Integration and Visualization

2025-06-10 Watch

talk

Mathieu Pelletier (Databricks)

BI Data Lakehouse Databricks Power BI SQL Tableau

In this presentation, we will explore how to leverage Databricks' SQL engine to efficiently ingest and transform geospatial data. We'll demonstrate the seamless process of connecting to external systems such as ArcGIS to retrieve datasets, showcasing the platform's versatility in handling diverse data sources. We'll then delve into the power of Databricks Apps, illustrating how you can create custom geospatial dashboards using various frameworks like Streamlit and Flask, or any framework of your choice. This flexibility allows you to tailor your visualizations to your specific needs and preferences. Furthermore, we'll highlight the Databricks Lakehouse's integration capabilities with popular dashboarding tools such as Tableau and Power BI. This integration enables you to combine the robust data processing power of Databricks with the advanced visualization features of these specialized tools.

High-Throughput ML: Mastering Efficient Model Serving at Enterprise Scale

2025-06-10 Watch

talk

Mingyang Ge (Databricks) , Yucheng Qian (Databricks)

AI/ML Databricks

Ever wondered how industry leaders handle thousands of ML predictions per second? This session reveals the architecture behind high-performance model serving systems on Databricks. We'll explore how to build inference pipelines that efficiently scale to handle massive request volumes while maintaining low latency. You'll learn how to leverage Feature Store for consistent, low-latency feature lookups and implement auto-scaling strategies that optimize both performance and cost. Key takeaways: Determining optimal compute capacity using the QPS × model execution time formula Configuring Feature Store for high-throughput, low-latency feature retrieval Managing cold starts and scaling strategies for latency-sensitive applications Implementing monitoring systems that provide visibility into inference performance Whether you're serving recommender systems or real-time fraud detection models, you'll gain practical strategies for building enterprise-grade ML serving systems.

How HP Is Optimizing the 3D Printing Supply Chain Using Delta Sharing

2025-06-10 Watch

talk

javier.lagares javier.lagares (HP)

AI/ML Analytics BI Databricks Delta Cyber Security

HP’s 3D Print division empowers manufacturers with telemetry data to optimize operations and streamline maintenance. Using Delta Sharing, Unity Catalog and AI/BI dashboards, HP provides a secure, scalable solution for data sharing and analytics. Delta Sharing D2O enables seamless data access, even for customers not on Databricks. Apigee masks private URLs, and Unity Catalog enhances security by managing data assets. Predictive maintenance with Mosaic AI boosts uptime by identifying issues early and alerting support teams. Custom dashboards and sample code let customers run analytics using any supported client, while Apigee simplifies access by abstracting complexity. Insights from A/BI dashboards help HP refines data strategy, aligning solutions with customer needs despite the complexity of diverse technologies, fragmented systems and customer-specific requirements. This fosters trust, drives innovation,and strengthens HP as a trusted partner for scalable, secure data solutions.

IQVIA's Analytics for Patient Support Services: Transforming Scalability, Performance and Governance

2025-06-10 Watch

talk

Dmytro Kobryn (Customertimes) , Sudha Ragothaman (IQVIA)

Analytics Data Governance Databricks Cyber Security

This presentation will explore the transformation of IQVIA's decade-old patient support platform through the implementation of Databricks Data Intelligence Platform. Facing scalability challenges, performance bottlenecks and rising costs, the existing platform required significant redesign to handle growing data volumes and complex analytics. Key issues included static metrics limiting workflow optimization, fragmented data governance and heightened compliance and security demands. By partnering with Customertimes (a Databricks Partner) and adopting Databricks' centralized, scalable analytics solution with enhanced self-service capabilities, IQVIA achieved improved query performance, cost efficiency and robust governance, ensuring operational effectiveness and regulatory compliance in an increasingly complex environment.

Managing Data and AI Security Risks With DASF 2.0 — and a Customer Story

2025-06-10 Watch

talk

Arun Pamulapati (Databricks) , Joseph Raetano (US AI)

AI/ML Databricks GenAI Cyber Security

The Databricks Security team led a broad working group that significantly evolved the Databricks AI Security Framework (DASF) to its 2.0 version since its first release by closely collaborating with the top cyber security researchers at industry organizations such as OWASP, Gartner, NIST, HITRUST, FAIR Institute and several Fortune 100 companies to address the evolving risks and associated controls of AI systems in enterprises. Join us to to learn how The CLEVER GenAI pipeline, an AI-driven innovation in healthcare, processes over 1.5 million clinical notes daily to classify social determinants impacting veteran care while adhering to robust security measures like NIST 800-53 controls and by leveraging Databricks AI Security Framework. We will discuss robust AI security guidelines to help data and AI teams understand how to deploy their AI applications securely. This session will give a security framework for security teams, AI practitioners, data engineers and governance teams.

Real-Time Market Insights — Powering Optiver’s Live Trading Dashboard with Databricks Apps and Dash

2025-06-10 Watch

talk

Huy Nguyen (Optiver)

Dashboard Databricks Data Streaming

In the fast-paced world of trading, real-time insights are critical for making informed decisions. This presentation explores how Optiver, a leading high-frequency trading firm, harnesses Databricks apps to power its live trading dashboards. The technology enables traders to analyze market data, detect patterns and respond instantly. In this talk, we will showcase how our system leverages Databricks’ scalable infrastructures such as Structured Streaming to efficiently handle vast streams of financial data while ensuring low-latency performance. In addition, we will show how the integration of Databricks apps with Dash has empowered traders to rapidly develop and deploy custom dashboards, minimizing dependency on developers. Attendees will gain insights into our architecture, data processing techniques and lessons learned in integrating Databricks apps with Dash in order to drive rapid, data-driven trading decisions.

ServiceNow ‘Walks the Talk’ With Databricks: Revolutionizing Go-To-Market With AI

2025-06-10 Watch

talk

Mili Merchant (ServiceNow) , Amulya Gupta (ServiceNow)

AI/ML Databricks GTM

At ServiceNow, we’re not just talking about AI innovation — we’re delivering it. By harnessing the power of Databricks, we’re reimagining Go-To-Market (GTM) strategies, seamlessly integrating AI at every stage of the deal journey — from identifying high-value leads to generating hyper-personalized outreach and pitch materials. In this session, learn how we’ve slashed data processing times by over 90%, reducing workflows from an entire day to just 30 minutes with Databricks. This unprecedented speed enables us to deploy AI-driven GTM initiatives faster, empowering our sellers with real-time insights that accelerate deal velocity and drive business growth. As Agentic AI becomes a game-changer in enterprise GTM, ServiceNow and Databricks are leading the charge — paving the way for a smarter, more efficient future in AI-powered sales.

Sponsored by: Deloitte | Advancing AI in Cybersecurity with Databricks & Deloitte: Data Management & Analytics

SQL-First ETL: Building Easy, Efficient Data Pipelines With Lakeflow Declarative Pipelines

2025-06-10 Watch

talk

Paul Lappas (Databricks) , Ritwik Yadav (Databricks) , Meixian Li (Databricks)

Databricks dbt ETL/ELT SQL Data Streaming

This session explores how SQL-based ETL can accelerate development, simplify maintenance and make data transformation more accessible to both engineers and analysts. We'll walk through how Databricks Lakeflow Declarative Pipelines and Databricks SQL warehouse support building production-grade pipelines using familiar SQL constructs.Topics include: Using streaming tables for real-time ingestion and processing Leveraging materialized views to deliver fast, pre-computed datasets Integrating with tools like dbt to manage batch and streaming workflows at scale By the end of the session, you’ll understand how SQL-first approaches can streamline ETL development and support both operational and analytical use cases.

Unifying Data Delivery: Using Databricks as Your Enterprise Serving Layer

2025-06-10 Watch

talk

Ivan Spiriev (The World Bank) , Ivan Donev (The World Bank)

Analytics Databricks Delta SQL

This session will take you on our journey of integrating Databricks as the core serving layer in a large enterprise, demonstrating how you can build a unified data platform that meets diverse business needs. We will walk through the steps for constructing a central serving layer by leveraging Databricks’ SQL Warehouse to efficiently deliver data to analytics tools and downstream applications. To tackle low latency requirements, we’ll show you how to incorporate an interim scalable relational database layer that delivers sub-second performance for hot data scenarios. Additionally, we’ll explore how Delta Sharing enables secure and cost-effective data distribution beyond your organization, eliminating silos and unnecessary duplication for a truly end-to-end centralized solution. This session is perfect for data architects, engineers and decision-makers looking to unlock the full potential of Databricks as a centralized serving hub.

Unifying Human-Curated Data Ingestion and Real-Time Updates with Databricks Lakeflow Declarative Pipelines, Protobuf and BSR

2025-06-10

talk

Dwight Whitlock (Clinician Nexus)

Data Governance Databricks Kafka Protobuf Data Streaming

Red Stapler is a streaming-native system on Databricks that merges file-based ingestion and real-time user edits into one Lakeflow Declarative Pipelines for near real-time feedback. Protobuf definitions, managed in the Buf Schema Registry (BSR), govern schema and data-quality rules, ensuring backward compatibility. All records — valid or not — are stored in an SCD Type 2 table, capturing every version for full history and immediate quarantine views of invalid data. This unified approach boosts data governance, simplifies auditing and streamlines error fixes.Running on Lakeflow Declarative Pipelines Serverless and the Kafka-compatible Bufstream keeps costs low by scaling down to zero when idle. Red Stapler’s configuration-driven Protobuf logic adapts easily to evolving survey definitions without risking production. The result is consistent validation, quick updates and a complete audit trail — all critical for trustworthy, flexible data pipelines.

Unity Catalog Upgrades Made Easy. Step-by-Step Guide for Databricks Labs UCX

2025-06-10 Watch

talk

Vuong ‎ (Databricks) , Liran Bareket (Databricks)

Dashboard Data Lakehouse Databricks

The Databricks labs project UCX aims to optimize the Unity Catalog (UC) upgrade process, ensuring a seamless transition for businesses. This session will delve into various aspects of the UCX project including the installation and configuration of UCX, the use of the UCX Assessment Dashboard to reduce upgrade risks and prepare effectively for a UC upgrade, and the automation of key components such as group, table and code migration. Attendees will gain comprehensive insights into leveraging UCX and Lakehouse Federation for a streamlined and efficient upgrade process. This session is aimed at customers new to UCX as well as veterans.

Using Catalogs for a Well-Governed and Efficient Data Ecosystem

2025-06-10 Watch

talk

Kajal Woods (Capital One Financial) , jim Lebonitte (Capital One)

Data Management Databricks SQL

The ability to enforce data management controls at scale and reduce the effort required to manage data pipelines is critical to operating efficiently. Capital One has scaled its data management capabilities and invested in platforms to help address this need. In the past couple of years, the role of “the catalog” in a data platform architecture has transitioned from just providing SQL to providing a full suite of capabilities that can help solve this problem at scale. This talk will give insight into how Capital One is thinking about leveraging Databricks Unity Catalog to help tackle these challenges.

Break the Ice: Your Guide to the AccuWeather Data Suite in Databricks

2025-06-10 Watch

lightning_talk

Crystal Camron (AccuWeather)

Databricks

AccuWeather harnesses cutting-edge technology, industry-leading weather data, and expert insights to empower businesses and individuals worldwide. In this session, we will explore how AccuWeather’s comprehensive datasets—ranging from historical and current conditions to forecasts and climate normals—can drive real-world impact across diverse industries. By showcasing scenario-based examples, we’ll demonstrate how AccuWeather’s hourly and daily weather data can address the unique needs of your organization, whether for operational planning, risk management, or strategic decision-making. This session is ideal for both newcomers to AccuWeather’s offerings and experienced users seeking to unlock the full potential of our weather data to optimize performance, improve efficiency, and boost overall success.

talk-data.com

Top Topics

Top Speakers

Sponsored by: Domo | Orchestrating Fleet Intelligence with AI Agents and Real-Time IoT With Databricks + DOMO

Sponsored by: Lovelytics | From SAP Silos to Supply Chain Superpower: How AI Is Reinventing Planning

AI-Driven Drug Discovery: Accelerating Molecular Insights With NVIDIA and Databricks

AI Powering Epsilon's Identity Strategy: Unified Marketing Platform on Databricks

Cloud-to-Cloud Data Sharing by Walmart: Direct Access to Omni-Channel Sales Data With Delta Sharing

Demystifying Upgrading to Unity Catalog — Challenges, Design and Execution

Dusting off the Cobwebs — Moving off a 26-year-old Heritage Platform to Databricks [Teradata]

Empowering Progress: Building a Personalized Training Goal Ecosystem with Databricks

From Days to Seconds — Reducing Query Times on Large Geospatial Datasets by 99%

GenAI Observability in Customer Care

Genie for Engineering: Optimizing HVAC Design and Operational Insights With Data and AI

Geo-Powering Insights: The Art of Spatial Data Integration and Visualization

High-Throughput ML: Mastering Efficient Model Serving at Enterprise Scale

How HP Is Optimizing the 3D Printing Supply Chain Using Delta Sharing

IQVIA's Analytics for Patient Support Services: Transforming Scalability, Performance and Governance

Managing Data and AI Security Risks With DASF 2.0 — and a Customer Story

Real-Time Market Insights — Powering Optiver’s Live Trading Dashboard with Databricks Apps and Dash

ServiceNow ‘Walks the Talk’ With Databricks: Revolutionizing Go-To-Market With AI

Sponsored by: Deloitte | Advancing AI in Cybersecurity with Databricks & Deloitte: Data Management & Analytics

SQL-First ETL: Building Easy, Efficient Data Pipelines With Lakeflow Declarative Pipelines

Unifying Data Delivery: Using Databricks as Your Enterprise Serving Layer

Unifying Human-Curated Data Ingestion and Real-Time Updates with Databricks Lakeflow Declarative Pipelines, Protobuf and BSR

Unity Catalog Upgrades Made Easy. Step-by-Step Guide for Databricks Labs UCX

Using Catalogs for a Well-Governed and Efficient Data Ecosystem

Break the Ice: Your Guide to the AccuWeather Data Suite in Databricks