talk-data.com

Event

Data + AI Summit 2025

2025-06-09 – 2025-06-13 Databricks Summit

Activities tracked

509

Filtering by: Databricks

Sessions & talks

Showing 351–375 of 509 · Newest first

Dealing With Sensitive Data on Databricks at Natura

2025-06-10 Watch
lightning_talk
Daniel Shimura (Natura)

Ensuring the protection of sensitive data within a Databricks environment requires robust mechanisms to prevent unauthorized access, even by high-privileged roles such as Databricks Administrators: Account Console Admins, Workspace Admins, and Unity Catalog Admins. To address this, a comprehensive data governance and access control strategy can be implemented, leveraging encryption, secret scopes, column masks, fine-grained access controls on tables, and auditing capabilities.
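As a sketch of the column-mask approach the abstract mentions: in Unity Catalog a column mask is a SQL UDF attached to a column (`ALTER TABLE ... ALTER COLUMN ... SET MASK`), and its core logic can be illustrated in plain Python. The group name and redaction rule below are hypothetical examples, not Natura's actual policy:

```python
# Illustrative column-mask logic. On Databricks, the equivalent would be a
# SQL UDF applied with ALTER TABLE ... ALTER COLUMN ... SET MASK <function>;
# the group name "pii_readers" and the keep-last-4 rule are made up here.

def mask_tax_id(tax_id: str, user_groups: set) -> str:
    """Return the raw value only to privileged readers; redact otherwise."""
    if "pii_readers" in user_groups:
        return tax_id
    # Keep the last 4 characters visible, redact the rest
    return "*" * (len(tax_id) - 4) + tax_id[-4:]
```

The point of pushing this logic into a governed mask function rather than into each query is that it applies uniformly, even to admins, which is exactly the threat model the session describes.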

Gaining Insight From Image Data in Databricks Using Multi-Modal Foundation Model API

2025-06-10 Watch
lightning_talk
Ankit Mathur (Databricks)

Unlock the hidden potential in your image data without specialized computer vision expertise! This session explores how to leverage Databricks' multi-modal Foundation Model APIs to analyze, classify, and extract insights from visual content. Learn how Databricks provides a unified API to understand images using powerful foundation models within your data workflows.

Key takeaways:
- Implementing efficient workflows for image data processing within your Databricks lakehouse
- Understanding multi-modal foundation models for image understanding
- Integrating image analysis with other data types for business insights
- Using OpenAI-compatible APIs to query multi-modal models
- Building end-to-end pipelines from image ingestion to model deployment

Whether analyzing product images, processing visual documents, or building content moderation systems, you'll discover how to extract valuable insights from your image data within the Databricks ecosystem.
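Since the session highlights OpenAI-compatible APIs, a request that attaches an image to a chat-completions call typically carries it as a base64 data URI inside the message content. This sketch only builds the JSON payload; the model name is a placeholder, and sending it to a serving endpoint is left out:

```python
import base64
import json

# Sketch of an OpenAI-compatible chat-completions payload with an inline
# image, in the shape multi-modal endpoints generally accept.
# The model name is a placeholder, not a specific Databricks endpoint.

def build_image_request(image_bytes: bytes, prompt: str, model: str) -> str:
    b64 = base64.b64encode(image_bytes).decode("ascii")
    payload = {
        "model": model,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{b64}"}},
            ],
        }],
    }
    return json.dumps(payload)
```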

Improving User Experience and Efficiency Using DBSQL

2025-06-10 Watch
lightning_talk
Renato Suarez (PicPay) , Gustavo Tadao Okida (PicPay)

To scale Databricks SQL to 2,000 users efficiently and cost-effectively, we adopted serverless, ensuring dynamic scalability and resource optimization. During peak times, resources scale up automatically; during low demand, they scale down, preventing waste. Additionally, we implemented a strong content governance model. We created continuous monitoring to assess query and dashboard performance, notifying users about adjustments and ensuring only relevant content remains active. If a query exceeds time or impact limits, access is reviewed and, if necessary, deactivated. This approach brought greater efficiency, cost reduction and an improved user experience, keeping the platform well-organized and high-performing.
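The continuous-monitoring model described above reduces to a simple policy check over query statistics. The thresholds, field names, and actions below are invented for illustration, not PicPay's actual limits:

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical governance rule in the spirit of the session: flag content
# whose queries run too long, and deactivate content nobody has used recently.

@dataclass
class QueryStats:
    name: str
    p95_runtime_s: float
    last_run: date

def review_actions(stats, today, max_runtime_s=300.0, max_idle_days=90):
    """Return (query_name, action) pairs for content needing attention."""
    actions = []
    for q in stats:
        if q.p95_runtime_s > max_runtime_s:
            actions.append((q.name, "notify-owner"))
        elif (today - q.last_run) > timedelta(days=max_idle_days):
            actions.append((q.name, "deactivate"))
    return actions
```

In practice the inputs would come from warehouse query history; the key design choice echoed here is notifying owners before deactivating, so governance stays visible to users.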

Powering Personalization at Scale with Data: How T-Mobile and Deep Sync Help Brands Connect with Consumers

2025-06-10 Watch
lightning_talk
Jeff Frantz (T-Mobile) , Pieter De Temmerman (Deep Sync)

Discover how T-Mobile and Deep Sync are redefining personalized marketing through the power of Databricks. Deep Sync, a leader in deterministic identity solutions, has brought its identity spine to the Databricks Lakehouse, which covers over 97% of U.S. households with the most current and accurate attribute data available. T-Mobile is bringing to market for the first time a new data services business that introduces privacy-compliant, consent-based consumer data. Together, T-Mobile and Deep Sync are transforming how brands engage with consumers—enabling bespoke, hyper-personalized workflows, identity-driven insights, and closed-loop measurement through Databricks’ Multi-Party Cleanrooms. Join this session to learn how data and identity are converging to solve today’s modern marketing challenges so consumers can rediscover what it feels like to be seen, not targeted.

Revolutionizing Counterparty Credit Risk (SACCR) – How Morgan Stanley Scaled With Databricks

2025-06-10 Watch
lightning_talk
Naeem Rehman (Databricks) , Alistair MacDonald (Morgan Stanley)

Learn how Morgan Stanley scaled one of their most significant regulatory calculators (SACCR) by leveraging Databricks for horizontal and vertical scaling. Discover how we harnessed Databricks to improve performance, calculation accuracy, regulatory compliance, and more.

Sponsored by: AWS | Deploying a GenAI Agent using Databricks Mosaic AI, Anthropic, LangGraph, and Amazon Bedrock

2025-06-10 Watch
lightning_talk

In this session, you’ll see how to build and deploy a GenAI agent and Model Context Protocol (MCP) server with Databricks, Anthropic, the Mosaic AI Gateway, and Amazon Bedrock. You will learn the architecture and best practices of using Databricks Mosaic AI, Anthropic’s Claude 3.7 Sonnet first-party frontier model, and LangGraph for custom workflow orchestration on the Databricks Data Intelligence Platform. You’ll also see how to use Databricks Mosaic AI for agent evaluation and monitoring. In addition, you will see how an inline agent uses MCP to provide tools and other resources, using Amazon Nova models with the Amazon Bedrock inline agent for deep research. This approach gives you the flexibility of LangGraph, the powerful managed agents offered by Amazon Bedrock, and Databricks Mosaic AI’s operational support for evaluation and monitoring.

Energy and Utilities Industry Forum | Sponsored by: Deloitte and AWS

2025-06-10 Watch
talk
Bryce Bartmann (Shell) , Ali Marzban (NOV) , Lou Martinez Sancho (Westinghouse Electric Company) , Shane Powell (Alabama Power) , Julien Debard (Databricks)

Join us for a compelling forum exploring how energy leaders are harnessing data and AI to build a more sustainable future. As the industry navigates the complex balance between rising global energy demands and ambitious decarbonization goals, innovative companies are discovering that intelligence-driven operations are the key to success. From optimizing renewable energy integration to revolutionizing grid management, learn how energy pioneers are using AI to transform traditional operations while accelerating the path to net zero. This session reveals how Databricks is empowering energy companies to turn their sustainability aspirations into reality, proving that the future of energy is both clean and intelligent.

Tech Industry Forum: Tip of the Spear With Data and AI | Sponsored by: Aimpoint Digital and AWS

2025-06-10 Watch
talk
Shant Hovsepian (Databricks) , Jason Reid (Databricks) , Madelyn Mullen (Databricks) , Dev Tagare (Robinhood) , Josh Clemm (Dropbox) , Dan DeMeyere (ThredUp Inc.) , Dan Wulin (Zillow)

Join us for the Tech Industry Forum, formerly known as the Tech Innovators Summit, now part of Databricks Industry Experience. This session will feature keynotes, panels, and expert talks led by top customer speakers and Databricks experts. Tech companies are pushing the boundaries of data and AI to accelerate innovation, optimize operations, and build collaborative ecosystems. In this session, we’ll explore how unified data platforms empower organizations to scale their impact, democratize analytics across teams, and foster openness for building tomorrow’s products.

Key topics include:
- Scaling data platforms to support real-time analytics and AI-driven decision-making
- Democratizing access to data while maintaining robust governance and security
- Harnessing openness and portability to enable seamless collaboration with partners and customers

After the session, connect with your peers during the exclusive Industry Forum Happy Hour. Reserve your seat today!

Automated Deployment with Databricks Asset Bundles

2025-06-10
talk

This course provides a comprehensive review of DevOps principles and their application to Databricks projects. It begins with an overview of core DevOps, DataOps, continuous integration (CI), continuous deployment (CD), and testing, and explores how these principles can be applied to data engineering pipelines. The course then focuses on continuous deployment within the CI/CD process, examining tools like the Databricks REST API, SDK, and CLI for project deployment. You will learn about Databricks Asset Bundles (DABs) and how they fit into the CI/CD process. You’ll dive into their key components, folder structure, and how they streamline deployment across various target environments in Databricks. You will also learn how to add variables, modify, validate, deploy, and execute Databricks Asset Bundles for multiple environments with different configurations using the Databricks CLI. Finally, the course introduces Visual Studio Code as an integrated development environment (IDE) for building, testing, and deploying Databricks Asset Bundles locally, optimizing your development process. The course concludes with an introduction to automating deployment pipelines using GitHub Actions to enhance the CI/CD workflow with Databricks Asset Bundles. By the end of this course, you will be equipped to automate Databricks project deployments with Databricks Asset Bundles, improving efficiency through DevOps practices.

Pre-requisites: Strong knowledge of the Databricks platform, including experience with Databricks Workspaces, Apache Spark, Delta Lake, the Medallion Architecture, Unity Catalog, Delta Live Tables, and Workflows. In particular, knowledge of leveraging Expectations with Lakeflow Declarative Pipelines.
Labs: Yes
Certification Path: Databricks Certified Data Engineer Professional
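For context on the multi-target deployment the course covers, a Databricks Asset Bundle is declared in a `databricks.yml` at the project root. A minimal sketch with dev and prod targets (bundle name and workspace hosts are placeholders) might look like:

```yaml
# Minimal databricks.yml sketch for a Databricks Asset Bundle.
# The bundle name and workspace hosts are placeholders.
bundle:
  name: my_project

targets:
  dev:
    mode: development
    default: true
    workspace:
      host: https://dev-workspace.cloud.databricks.com
  prod:
    mode: production
    workspace:
      host: https://prod-workspace.cloud.databricks.com
```

Each target is then validated and deployed with the Databricks CLI, e.g. `databricks bundle validate` followed by `databricks bundle deploy -t prod`.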

De-Risking Investment Decisions: QCG's Smarter Deal Evaluation Process Leveraging Databricks

2025-06-10 Watch
lightning_talk
Ian Brown (Quantum Capital Group)

Quantum Capital Group (QCG) screens hundreds of deals across the global Sustainable Energy Ecosystem, requiring deep technical due diligence. With over 1.5 billion records sourced from public, premium and proprietary datasets, their challenge was how to efficiently curate, analyze and share this data to drive smarter investment decisions. QCG partnered with Databricks & Tiger Analytics to modernize its data landscape. Using Delta tables, Spark SQL, and Unity Catalog, the team built a golden dataset that powers proprietary evaluation models and automates complex workflows. Data is now seamlessly curated, enriched and distributed — both internally and to external stakeholders — in a secure, governed and scalable way. This session explores how QCG’s investment in data intelligence has turned an overwhelming volume of information into a competitive advantage, transforming deal evaluation into a faster, more strategic process.

Gen AI Deployment and Monitoring

2025-06-10
talk

This course introduces learners to deploying, operationalizing, and monitoring generative artificial intelligence (AI) applications. First, learners will develop knowledge and skills in deploying generative AI applications using tools like Model Serving. Next, the course will discuss operationalizing generative AI applications following modern LLMOps best practices and recommended architectures. Finally, learners will be introduced to the idea of monitoring generative AI applications and their components using Lakehouse Monitoring.

Pre-requisites: Familiarity with prompt engineering and retrieval-augmented generation (RAG) techniques, including data preparation, embeddings, vectors, and vector databases. A foundational knowledge of Databricks Data Intelligence Platform tools for evaluation and governance (particularly Unity Catalog).
Labs: Yes
Certification Path: Databricks Certified Generative AI Engineer Associate
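As background on the Model Serving deployments the course covers: a serving endpoint is queried with a plain HTTPS POST to its `/invocations` path, and `dataframe_records` is one of the accepted input formats. This sketch only builds the request body; the endpoint URL in the comment and the feature name in the test are placeholders:

```python
import json

# Sketch of a Model Serving scoring request body. The feature names are
# illustrative, and no actual endpoint is contacted here.

def build_scoring_request(records: list) -> str:
    """Serialize rows into the dataframe_records input format."""
    return json.dumps({"dataframe_records": records})

# The resulting JSON string would be POSTed to
#   https://<workspace-host>/serving-endpoints/<endpoint-name>/invocations
# with an Authorization: Bearer <token> header.
```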

Machine Learning Operations

2025-06-10
talk

This course will guide participants through a comprehensive exploration of machine learning model operations, focusing on MLOps and model lifecycle management. The initial segment covers essential MLOps components and best practices, providing participants with a strong foundation for effectively operationalizing machine learning models. In the latter part of the course, we will delve into the basics of the model lifecycle, demonstrating how to navigate it seamlessly using the Model Registry in conjunction with Unity Catalog for efficient model management. By the course's conclusion, participants will have gained practical insights and a well-rounded understanding of MLOps principles, equipped with the skills needed to navigate the intricate landscape of machine learning model operations.

Pre-requisites: Familiarity with Databricks workspace and notebooks, familiarity with Delta Lake and the Lakehouse, and intermediate-level knowledge of Python (e.g., understanding of basic MLOps concepts and practices, as well as infrastructure and the importance of monitoring MLOps solutions).
Labs: Yes
Certification Path: Databricks Certified Machine Learning Associate

MLOps With Databricks

2025-06-10 Watch
lightning_talk
Maria Vechtomova (Marvelous MLOps)

Adopting MLOps is getting increasingly important with the rise of AI. Many different features are required to do MLOps in large organizations. In the past, you had to implement these features yourself. Luckily, the MLOps space is getting more mature, and end-to-end platforms like Databricks provide most of them. In this talk, I will walk through the MLOps components and how you can simplify your processes using Databricks. Audio for this session is delivered in the conference mobile app; you must bring your own headphones to listen.

Pushing the Limits of What Your Warehouse Can Do Using Python and Databricks

2025-06-10 Watch
lightning_talk
Jakob Mund (Databricks)

SQL warehouses in Databricks can run more than just SQL. Join this session to learn how to get more out of your SQL warehouses, and any tools built on top of them, by leveraging Python. After attending this session, you will be familiar with Python user-defined functions and how to bring in custom dependencies from PyPI or as a custom wheel, and even how to securely invoke cloud services with performance at scale.
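As a sketch of the technique described above: on Databricks, a Python UDF is registered in SQL with `CREATE FUNCTION ... LANGUAGE PYTHON AS $$ ... $$`, and the body is ordinary Python. Below, such a body is shown as a plain function so it can run anywhere; the function name and redaction rule are illustrative:

```python
# Body of a hypothetical Python UDF for a SQL warehouse. On Databricks it
# would be registered roughly as:
#   CREATE FUNCTION redact_email(email STRING) RETURNS STRING
#   LANGUAGE PYTHON AS $$ ...this body... $$
# after which SELECT redact_email(email) FROM users runs on the warehouse.

def redact_email(email: str) -> str:
    """Keep the domain visible, hide the local part."""
    local, _, domain = email.partition("@")
    if not domain:
        return "***"
    return local[0] + "***@" + domain
```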

ReguBIM AI – Transforming BIM, Engineering, and Code Compliance with Generative AI

2025-06-10 Watch
lightning_talk
Qi Qi Oh (Exyte Singapore Pte. Ltd.)

At Exyte, we design, engineer, and deliver ultra-clean and sustainable facilities for high-tech industries. One of the most complex tasks our engineers and designers face is ensuring that their building designs comply with constantly evolving codes and regulations – often a manual, error-prone process. To address this, we developed ReguBIM AI, a generative AI-powered assistant that helps our teams verify code compliance more efficiently and accurately by linking 3D Building Information Modeling (BIM) data with regulatory documents. Built on the Databricks Data Intelligence Platform, ReguBIM AI is part of our broader vision to apply AI meaningfully across engineering and design processes. We are proud to share that ReguBIM AI won the Grand Prize and EMEA Winner titles at the Databricks GenAI World Cup 2024 — a global hackathon that challenged over 1,500 data scientists and AI engineers from 18 countries to create innovative generative AI solutions for real-world problems.

Scaling GenAI Inference From Prototype to Production: Real-World Lessons in Speed & Cost

2025-06-10 Watch
lightning_talk
Anish Kumar (Scribd, Inc.)

This lightning talk dives into real-world GenAI projects that scaled from prototype to production using Databricks’ fully managed tools. Facing cost and time constraints, we leveraged four key Databricks features—Workflows, Model Serving, Serverless Compute, and Notebooks—to build an AI inference pipeline processing millions of documents (text and audiobooks). This approach enables rapid experimentation, easy tuning of GenAI prompts and compute settings, seamless data iteration and efficient quality testing—allowing Data Scientists and Engineers to collaborate effectively. Learn how to design modular, parameterized notebooks that run concurrently, manage dependencies and accelerate AI-driven insights. Whether you're optimizing AI inference, automating complex data workflows or architecting next-gen serverless AI systems, this session delivers actionable strategies to maximize performance while keeping costs low.

Sponsored by: EY | Xoople: Fueling enterprise AI with Earth data intelligence products

2025-06-10 Watch
lightning_talk

Xoople aims to provide its users with trusted AI-ready Earth data and accelerators that unlock new insights for enterprise AI. With access to scientific-grade Earth data that provides spatial intelligence on real-world changes, data scientists and BI analysts can increase forecast accuracy for their enterprise processes and models. These improvements drive smarter, data-driven business decisions across business functions, including supply chain, finance, and risk, across industries. Xoople, which has recently introduced its product, Enterprise AI-Ready Earth Data™, on the Databricks Marketplace, will have its CEO, Fabrizio Pirondini, discuss the importance of the Databricks Data Intelligence Platform in making Xoople’s product a reality for use in the enterprise.

Sponsored by: Impetus | Supercharge AI with automated migration to Databricks with Impetus

2025-06-10 Watch
lightning_talk
Sachneet Bains (Impetus Technologies Inc.)

Migrating legacy workloads to a modern, scalable platform like Databricks can be complex and resource-intensive. Impetus, an Elite Databricks Partner and the Databricks Migration Partner of the Year 2024, simplifies this journey with LeapLogic, an automated solution for data platform modernization and migration services. LeapLogic intelligently discovers, transforms, and optimizes workloads for Databricks, ensuring minimal risk and faster time-to-value. In this session, we’ll showcase real-world success stories of enterprises that have leveraged Impetus’ LeapLogic to modernize their data ecosystems efficiently. Join us to explore how you can accelerate your migration journey, unlock actionable insights, and future-proof your analytics with a seamless transition to Databricks.

Bridging Ontologies & Lakehouses: Palantir AIP + Databricks for Secure Autonomous AI

2025-06-10 Watch
talk
Siddhant Ekale (Palantir) , Ben Abood (Databricks)

AI is moving from pilots to production, but many organizations still struggle to connect boardroom ambitions with operational reality. Palantir’s Artificial Intelligence Platform (AIP) and the Databricks Data Intelligence Platform now form a single, open architecture that closes this gap by pairing Palantir’s operational, decision-empowering Ontology with Databricks’ industry-leading scale, governance, and Lakehouse economics. The result: real-time, AI-powered, autonomous workflows that are already powering mission-critical outcomes for the U.S. Department of Defense, bp, and other joint customers across the public and private sectors. In this technically grounded but business-focused session, you will see the new reference architecture in action. We will walk through how Unity Catalog and Palantir Virtual Tables provide governed, zero-copy access to Lakehouse data and back mission-critical operational workflows on top of Palantir’s semantic ontology and agentic AI capabilities. We will also explore how Palantir’s no-code and pro-code tooling integrates with Databricks compute to orchestrate builds and write tables to Unity Catalog. Come hear from customers currently using this architecture to drive critical business outcomes seamlessly across Databricks and Palantir.

Databricks Apps: Turning Data and AI Into Practical, User-Friendly Applications

2025-06-10 Watch
talk
Nic Heier (Databricks) , Justin DeBrabant (Databricks)

This session is repeated. In this session, we present an overview of the GA release of Databricks Apps, the new app hosting platform that integrates all the Databricks services necessary to build production-ready data and AI applications. With Apps, data and developer teams can build new interfaces into the data intelligence platform, further democratizing the transformative power of data and AI across the organization. We'll cover common use cases, including RAG chat apps, interactive visualizations and custom workflow builders, as well as look at several best practices and design patterns when building apps. Finally, we'll look ahead with the vision, strategy and roadmap for the year ahead.

Enabling Sleep Science Research With Databricks and Delta Sharing

2025-06-10 Watch
talk
Alexandr Rivlin (Sleep Number Labs) , Sajeev Mayandi (Sleep Number)

Leveraging Databricks as a platform, we facilitate the sharing of anonymized datasets across various Databricks workspaces and accounts, spanning multiple cloud environments such as AWS, Azure, and Google Cloud. This capability, powered by Delta Sharing, extends both within and outside Sleep Number, enabling accelerated insights while ensuring compliance with data security and privacy standards. In this session, we will showcase our architecture and implementation strategy for data sharing, highlighting the use of Databricks’ Unity Catalog and Delta Sharing, along with integration with platforms like Jira, Jenkins, and Terraform to streamline project management and system orchestration.
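For readers unfamiliar with how Delta Sharing reaches recipients outside Databricks: the open sharing protocol authenticates recipients with a small JSON profile file containing the sharing server endpoint and a bearer token. This sketch shows the file's documented shape and a minimal loader; the endpoint and token values are placeholders:

```python
import json

# A Delta Sharing profile file as defined by the open sharing protocol;
# the endpoint and bearerToken values here are placeholders.
PROFILE = """
{
  "shareCredentialsVersion": 1,
  "endpoint": "https://sharing.example.com/delta-sharing/",
  "bearerToken": "<redacted>"
}
"""

def load_profile(text: str) -> dict:
    """Parse and sanity-check a Delta Sharing profile."""
    profile = json.loads(text)
    # A recipient client (e.g., the delta-sharing Python library) would use
    # endpoint + bearerToken to list shares, schemas, and tables.
    if profile.get("shareCredentialsVersion") != 1:
        raise ValueError("unsupported profile version")
    return profile
```

Because the protocol is open, the same profile works from pandas, Spark, or BI tools on any cloud, which is what enables the cross-account sharing the abstract describes.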

From Datavault to Delta Lake: Streamlining Data Sync with Lakeflow Connect

2025-06-10 Watch
talk
Olivia Ren (Databricks) , Andrew Clarke (Australian Red Cross Lifeblood)

In this session, we will explore the Australian Red Cross Lifeblood's approach to synchronizing an Azure SQL Datavault 2.0 (DV2.0) implementation with Unity Catalog (UC) using Lakeflow Connect. Lifeblood's DV2.0 data warehouse, which includes raw vault (RV) and business vault (BV) tables, as well as information marts defined as views, required a multi-step process to achieve data/business logic sync with UC. This involved using Lakeflow Connect to ingest RV and BV data, followed by a custom process utilizing JDBC to ingest view definitions, and the automated/manual conversion of T-SQL to Databricks SQL views, with Lakehouse Monitoring for validation. In this talk, we will share our journey, the design decisions we made, and how the resulting solution now supports analytics workloads, analysts, and data scientists at Lifeblood.

How United Airlines Transforms SWIM Data into Real-Time Operational Insight and Faster Decision Making

2025-06-10 Watch
talk
Saurabh Agarwal (United Airlines) , Dheeraj Arora (United Airlines)

Discover how United Airlines, in collaboration with Databricks and Impetus Technologies, has built a next-generation data intelligence platform leveraging System Wide Information Management (SWIM) to deliver mission-critical, real-time insights for flight disruption prediction, situational analysis, and smarter, faster decision-making. In this session, United Airlines experts will share how their Databricks-based SWIM architecture enables near real-time operational awareness, enhances responsiveness during irregular operations (IRROPs), and drives proactive actions to minimize disruptions. They will also discuss how United efficiently processes and manages the large volume and variety of SWIM data, ensuring seamless integration and actionable intelligence across their operations.

Mastering Data Security and Compliance: CoorsTek's Journey With Databricks Unity Catalog

2025-06-10 Watch
talk
Anupam Wahi (Tredence) , David Tomlinson (CoorsTek)

Ensuring data security and meeting compliance requirements are critical priorities for businesses operating in regulated industries, where the stakes are high and the standards are stringent. We will showcase how CoorsTek, a global leader in technical ceramics manufacturing, partnered with Databricks to leverage the power of Unity Catalog (UC) for addressing regulatory challenges while achieving significant operational efficiency gains. We'll dive into the migration journey, highlighting the adoption of key features such as RBAC, comprehensive data lineage tracking, and robust auditing capabilities. Attendees will gain practical insights into the strategies and tools used to manage sensitive data, ensure compliance with industry standards, and optimize cloud data architectures. Additionally, we’ll share real-world lessons learned, best practices for integrating compliance into a modern data ecosystem, and actionable takeaways for leveraging Databricks as a catalyst for secure and compliant data innovation.

Patients Are Waiting...Accelerating Healthcare Innovation with Data, AI and Agents

2025-06-10 Watch
talk
Jonatan Selsing (Novo Nordisk) , Christian Sørensen (Novo Nordisk) , Thomas Larsen (Novo Nordisk A/S)

This session is repeated. In an era of exponential data growth, organizations across industries face common challenges in transforming raw data into actionable insights. This presentation showcases how Novo Nordisk is pioneering insights generation approaches to clinical data management and AI. Using our clinical trials platform FounData, built on Databricks, we demonstrate how proper data architecture enables advanced AI applications. We'll introduce a multi-agent AI framework that revolutionizes data interaction, combining specialized AI agents to guide users through complex datasets. While our focus is on clinical data, these principles apply across sectors – from manufacturing to financial services. Learn how democratizing access to data and AI capabilities can transform organizational efficiency while maintaining governance. Through this real-world implementation, participants will gain insights on building scalable data architectures and leveraging multi-agent AI frameworks for responsible innovation.