talk-data.com

Topic: Databricks

Tags: big_data, analytics, spark

509 activities tagged

Activity Trend: peak of 515 activities/quarter (2020-Q1 to 2026-Q1)

Activities

Showing filtered results

Filtering by: Data + AI Summit 2025
Sponsored by: Salesforce | From Data to Action: A Unified and Trusted Approach

Empower AI and agents with trusted data and metadata from an end-to-end unified system. Discover how Salesforce Data Cloud, Agentforce, and Databricks work together to fuel automation, AI, and analytics through a unified data strategy—driving real-time intelligence, enabling zero-copy data sharing, and unlocking scalable activation across the enterprise.

Sponsored by: Securiti | Safely Curating Data to Enable Enterprise AI with Databricks

This session will explore how developers can easily select, extract, filter, and control data pre-ingestion to accelerate safe AI. Learn how the Securiti and Databricks partnership empowers Databricks users by providing the critical foundation for unlocking scalability and accelerating trustworthy AI development and adoption. Key takeaways:
● Understand how to leverage data intelligence to establish a foundation for frameworks like the OWASP Top 10 for LLMs, NIST AI RMF and Gartner's TRiSM
● Learn how automated data curation and syncing address specific risks while accelerating AI development in Databricks
● Discover how leading organizations apply robust access controls across vast swaths of mostly unstructured data
● Learn how to maintain data provenance and control as data is moved and transformed through complex pipelines in the Databricks platform

Techcombank's Multi-Million Dollar Transformation Leveraging Cloud and Databricks

The migration to the Databricks Data Intelligence Platform has enabled Techcombank to more efficiently unify data from over 50 systems, improve governance, streamline daily operational analytics pipelines and use advanced analytics tools and AI to create more meaningful and personalized experiences for customers. With Databricks, Techcombank has also introduced key solutions that are reshaping its digital banking services:
● AI-driven lead management: Techcombank's internally developed AI program, the 'Lead Allocation Curated Engine' (LACE), optimizes lead management and provides relationship managers with enriched insights for smarter lead allocation to drive business growth.
● AI-powered digital banking inclusion for small businesses: GeoSense, an AI-powered program, assists frontline workers with analytics-driven insights about which small businesses and merchants to engage in the bank's digital ecosystem.
Further examples will be presented in the session.

Unlocking Cross-Organizational Collaboration to Protect the Environment With Databricks at DEFRA

Join us to learn how the UK's Department for Environment, Food & Rural Affairs (DEFRA) transformed data use with Databricks' Unity Catalog, enabling nationwide projects through secure, scalable analytics. DEFRA safeguards the UK's natural environment, but historical fragmentation of data, talent and tools across siloed platforms and organizations made it difficult to fully exploit the department's rich data. DEFRA launched its Data Analytics & Science Hub (DASH), powered by the Databricks Data Intelligence Platform, to unify its data ecosystem. DASH enables hundreds of users to access and share datasets securely. A flagship example demonstrates its power: using Databricks to process aerial photography and satellite data to identify peatlands in need of restoration, a complex task made possible through unified data governance, scalable compute and AI. Attendees will hear about DEFRA's journey and learn valuable lessons about building a platform that crosses organizational boundaries.

Using Delta-rs and Delta-Kernel-rs to Serve CDC Feeds

Change data feeds are a common tool for synchronizing changes between tables and performing data processing in a scalable fashion. Serverless architectures offer a compelling solution for organizations looking to avoid the complexity of managing infrastructure. But how can you bring CDFs into a serverless environment? In this session, we'll explore how to integrate change data feeds into serverless architectures using Delta-rs and Delta-kernel-rs, open-source projects that allow you to read Delta tables and their change data feeds in Rust or Python. We'll demonstrate how to use these tools with Lakestore's serverless platform to easily stream and process changes. You'll learn how to:
● Leverage Delta tables and CDFs in serverless environments
● Utilize Databricks and Unity Catalog without needing Apache Spark
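To make the approach concrete, here is a minimal sketch of reading a change data feed with the delta-rs Python bindings (the deltalake package). The table path and version range are hypothetical, and the snippet assumes a recent deltalake release with CDF support and a table written with change data feed enabled.

```python
# pip install deltalake pandas
from deltalake import DeltaTable

# Hypothetical table location; any local path or object-store URI readable
# by delta-rs works the same way, no Spark cluster required.
dt = DeltaTable("s3://my-bucket/tables/orders")

# Read the change data feed between two commit versions (assumes the table
# was created with delta.enableChangeDataFeed = true).
cdf_reader = dt.load_cdf(starting_version=10, ending_version=15)

# load_cdf returns an Arrow RecordBatchReader, so changes can be processed
# batch by batch, which suits memory-constrained serverless functions.
for batch in cdf_reader:
    df = batch.to_pandas()
    # _change_type distinguishes insert / update_preimage / update_postimage / delete
    print(df[["_change_type", "_commit_version"]].value_counts())
```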

Pacers Sports and Entertainment and Databricks
talk
by Ari Kaplan (Databricks), Jared Chavez (Pacers Sports & Entertainment), Rick Schultz (Databricks)

Pacers Sports & Entertainment has had an amazing year: the Indiana Pacers reached the NBA Finals for the first time in 25 years, and the Fever are setting attendance and viewership records with WNBA star Caitlin Clark. Hear how they have transformed their data and AI capabilities for marketing, fan behavior insights, season-ticket propensity models and data democratization for non-technical personas. They also achieved a 12,000x cost reduction, down to just $8 a year, by switching to Databricks.

Creating a Custom PySpark Stream Reader with PySpark 4.0

PySpark supports many data sources out of the box, such as Apache Kafka, JDBC, ODBC and Delta Lake. However, some older systems, such as those using the JMS protocol, are not supported by default and require considerable extra work for developers to read from. One such example is ActiveMQ for streaming: traditionally, ActiveMQ users have had to route messages through an intermediary in order to read the stream with Spark (such as writing to a MySQL database using Java code and reading that table with Spark JDBC). With PySpark 4.0's custom data sources (supported in DBR 15.3+), we can cut out that intermediary and consume the queues directly from PySpark, in batch or with Spark Structured Streaming, saving developers considerable time and complexity in landing source data in Delta Lake, governed by Unity Catalog and orchestrated with Databricks Workflows.
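As an illustration of the API the session builds on, here is a minimal sketch of a PySpark 4.0 custom streaming data source. The ActiveMQ details are elided: fetch_messages is a hypothetical helper standing in for whatever JMS/STOMP client would actually be used, and offsets are modeled as a simple message counter.

```python
from pyspark.sql.datasource import DataSource, DataSourceStreamReader, InputPartition


def fetch_messages(start: int, end: int) -> list[str]:
    """Hypothetical helper: pull messages [start, end) from an ActiveMQ queue."""
    return [f"message-{i}" for i in range(start, end)]


class ActiveMQStreamReader(DataSourceStreamReader):
    def initialOffset(self) -> dict:
        # Offsets are arbitrary JSON-serializable dicts; here, a message counter.
        return {"offset": 0}

    def latestOffset(self) -> dict:
        # A real reader would ask the broker how many messages are available.
        return {"offset": 100}

    def partitions(self, start: dict, end: dict) -> list[InputPartition]:
        # One partition covering the whole offset range, for simplicity.
        return [InputPartition((start["offset"], end["offset"]))]

    def read(self, partition: InputPartition):
        lo, hi = partition.value
        for msg in fetch_messages(lo, hi):
            yield (msg,)  # one row per message, matching the declared schema


class ActiveMQDataSource(DataSource):
    @classmethod
    def name(cls) -> str:
        return "activemq"

    def schema(self) -> str:
        return "value STRING"

    def streamReader(self, schema):
        return ActiveMQStreamReader()


# `spark` is the active SparkSession (predefined in Databricks notebooks).
# Register the source and consume the queue directly, no middleman required.
spark.dataSource.register(ActiveMQDataSource)
stream = spark.readStream.format("activemq").load()
```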

Disney's Foundational Medallion: A Journey Into Next-Generation Data Architecture

Step into the world of Disney Streaming as we unveil the creation of our Foundational Medallion, a cornerstone in our architecture that redefines how we manage data at scale. In this session, we'll explore how we tackled the multi-faceted challenges of building a consistent, self-service surrogate key architecture — a foundational dataset for every ingested stream powering Disney Streaming's data-driven decisions. Learn how we streamlined our architecture and unlocked new efficiencies by leveraging cutting-edge Databricks features such as liquid clustering, Photon with dynamic file pruning, Delta's identity column, Unity Catalog and more — transforming our implementation into a simpler, more scalable solution. Join us on this thrilling journey as we navigate the twists and turns of designing and implementing a new Medallion at scale — the very heartbeat of our streaming business!
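For context on the features named above, here is a minimal sketch, in Databricks SQL issued through PySpark, of a table that combines a Delta identity column for surrogate keys with liquid clustering. The schema, table and column names are hypothetical, not Disney's actual implementation.

```python
# Hypothetical silver-layer table: surrogate keys generated by a Delta
# identity column, with liquid clustering instead of static partitioning.
spark.sql("""
    CREATE TABLE IF NOT EXISTS silver.events (
        event_sk BIGINT GENERATED ALWAYS AS IDENTITY,
        event_id STRING,
        event_ts TIMESTAMP,
        payload  STRING
    )
    USING DELTA
    CLUSTER BY (event_id)
""")
```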

Next-Gen Data Science: How Posit and Databricks Are Transforming Analytics at Scale

Modern data science teams face the challenge of navigating complex landscapes of languages, tools and infrastructure. Positron, Posit’s next-generation IDE, offers a powerful environment tailored for data science, seamlessly integrating with Databricks to empower teams working in Python and R. Now integrated within Posit Workbench, Positron enables data scientists to efficiently develop, iterate and analyze data with Databricks — all while maintaining their preferred workflows. In this session, we’ll explore how Python and R users can develop, deploy and scale their data science workflows by combining Posit tools with Databricks. We’ll showcase how Positron simplifies development for both Python and R and how Posit Connect enables seamless deployment of applications, reports and APIs powered by Databricks. Join us to see how Posit + Databricks create a frictionless, scalable and collaborative data science experience — so your teams can focus on insights, not infrastructure.

No-Trust, All Value: Monetizing Analytics With Databricks Clean Rooms

In a world where data collaboration is essential but trust is scarce, Databricks Clean Rooms delivers a game-changing model: no data shared, all value gained. Discover how data providers can unlock new revenue streams by launching subscription-based analytics and “Built-on-Databricks” services that run on customer data — without exposing raw data or violating compliance. Clean Rooms integrates Unity Catalog’s governance, Delta Sharing’s secure exchange and serverless compute to enable true multi-party collaboration — without moving data. See how privacy-preserving models like fraud detection, clinical analytics and ad measurement become scalable, productizable and monetizable across industries. Walk away with a proven pattern to productize analytics, preserve compliance and turn trustless collaboration into recurring revenue.

Scaling Blockchain ML With Databricks: From Graph Analytics to Graph Machine Learning

Coinbase leverages Databricks to scale ML on blockchain data, turning vast transaction networks into actionable insights. This session explores how Databricks’ scalable infrastructure, powered by Delta Lake, enables real-time processing for ML applications like NFT floor price predictions. We’ll show how GraphFrames helps us analyze billion-node transaction graphs (e.g., Bitcoin) for clustering and fraud detection, uncovering structural patterns in blockchain data. But traditional graph analytics has limits. We’ll go further with Graph Neural Networks (GNNs) using Kumo AI, which learn from the transaction network itself rather than relying on hand-engineered features. By encoding relationships directly into the model, GNNs adapt to new fraud tactics, capturing subtle relationships that evolve over time. Join us to see how Coinbase is advancing blockchain ML with Databricks and deep learning on graphs.
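As a sketch of the graph-analytics half of this story, here is what clustering a transaction graph with GraphFrames might look like. The vertex and edge tables are hypothetical stand-ins for real blockchain datasets, and the snippet assumes the GraphFrames package is installed on the cluster.

```python
from graphframes import GraphFrame

# Hypothetical vertex/edge tables: addresses and the transfers between them.
vertices = spark.table("blockchain.addresses").withColumnRenamed("address", "id")
edges = (
    spark.table("blockchain.transfers")
    .selectExpr("sender AS src", "receiver AS dst", "amount")
)

g = GraphFrame(vertices, edges)

# connectedComponents is iterative and requires a checkpoint directory.
spark.sparkContext.setCheckpointDir("/tmp/graphframes-checkpoints")

# Cluster addresses into components; addresses sharing a component often
# belong to the same entity, a useful structural signal for fraud detection.
components = g.connectedComponents()
components.groupBy("component").count().orderBy("count", ascending=False).show(10)
```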

Sponsored by: Acceldata | Agentic Data Management: Trusted Data for Enterprise AI on Databricks

An intelligent, action-driven approach to bridging data engineering and AI/ML workflows, delivering continuous data trust through comprehensive monitoring, validation and remediation across the entire Databricks data lifecycle. Learn how Acceldata's Agentic Data Management (ADM) platform:
● Ensures end-to-end data reliability across Databricks, from ingestion and transformation to feature engineering and model deployment
● Bridges data engineering and AI teams by providing unified insights across Databricks jobs, notebooks and pipelines, with proactive data insights and actions
● Accelerates the delivery of trustworthy enterprise AI outcomes by detecting multivariate anomalies, monitoring feature drift and maintaining lineage within Databricks-native environments

Sponsored by: dbt Labs | Leveling Up Data Engineering at Riot: How We Rolled Out dbt and Transformed the Developer Experience

Riot Games reduced its Databricks compute spend and accelerated development cycles by transforming its data engineering workflows—migrating from bespoke Databricks notebooks and Spark pipelines to a scalable, testable, and developer-friendly dbt-based architecture. In this talk, members of the Developer Experience & Automation (DEA) team will walk through how they designed and operationalized dbt to support Riot’s evolving data needs.

Sponsored by: Google Cloud | Powering AI & Analytics: Innovations in Google Cloud Storage for Data Lakes

Enterprise customers need a powerful and adaptable data foundation to navigate the demands of AI and multi-cloud environments. This session dives into how Google Cloud Storage serves as a unified platform for modern analytics data lakes, together with Databricks. Discover key Google Cloud Storage innovations such as performance optimizations for Apache Iceberg, Anywhere Cache as the easiest way to colocate storage and compute, Rapid Storage for ultra-low-latency object reads and appends, and Storage Intelligence for vital data insights and recommendations. Learn how you can optimize your infrastructure to unlock the full value of your data for AI-driven success.

Sponsored by: Qubika | Agentic AI In Finance: How To Build Agents Using Databricks And LangGraph

Join us for this session on how to build AI finance agents with Databricks and LangGraph. This session introduces a powerful approach to building AI agents: a modular framework that integrates LangChain, retrieval-augmented generation (RAG) and Databricks' unified data platform to build intelligent, adaptable finance agents. We'll walk through the architecture and key components, including Databricks Unity Catalog, MLflow and Mosaic AI, involved in building a system tailored for complex financial tasks like portfolio analysis, reporting automation and real-time risk insights. We'll also showcase a demo of one such agent in action: a Financial Analyst Agent. This agent emulates the expertise of a seasoned data analyst, delivering in-depth analysis in seconds and eliminating the need to wait hours or days for manual reports. The solution provides organizations with 24/7 access to advanced data analysis, enabling faster, smarter decision-making.
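As a rough illustration of the pattern described, here is a minimal LangGraph tool-calling agent. The model identifier and the portfolio_summary tool are hypothetical placeholders; a production finance agent would back the tool with governed Databricks data rather than a stub.

```python
# pip install langgraph langchain
from langgraph.prebuilt import create_react_agent


def portfolio_summary(ticker: str) -> str:
    """Hypothetical tool: return a summary of holdings for a ticker.

    A real agent would query governed tables or a Unity Catalog function
    here instead of returning canned data.
    """
    return f"{ticker}: 1,200 shares, +4.2% QTD (stub data)"


# create_react_agent wires the model and tools into a ReAct-style loop;
# the model string is a placeholder for whatever provider you configure.
agent = create_react_agent("openai:gpt-4o", tools=[portfolio_summary])

result = agent.invoke(
    {"messages": [{"role": "user", "content": "How is our AAPL position doing?"}]}
)
print(result["messages"][-1].content)
```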

A Japanese Mega-Bank’s Journey to a Modern, GenAI-Powered, Governed Data Platform

SMBC, a major Japanese multinational financial services institution, has embarked on an initiative to build a GenAI-powered, modern and well-governed cloud data platform on Azure Databricks. This initiative aims to build an enterprise data foundation encompassing loans, deposits, securities, derivatives and other data domains. Its primary goals are:
● To decommission legacy data platforms and reduce data sprawl by migrating 20+ core banking systems to a multi-tenant Azure Databricks architecture
● To leverage Databricks' Delta Sharing capabilities to address SMBC's unique global footprint and data-sharing needs
● To govern data by design using Unity Catalog
● To achieve global adoption of the frameworks, accelerators, architecture and tool stack to support similar implementations across EMEA
Deloitte and SMBC leveraged the Brickbuilder asset "Data as a Service for Banking" to accelerate this highly strategic transformation.

American Airlines Flies to New Heights with Data Intelligence

American Airlines migrated from Hive Metastore to Unity Catalog using automated processes with Databricks APIs and GitHub Actions. This automation streamlined the migration for many applications within AA, ensuring consistency, efficiency and minimal disruption while enhancing data governance and disaster recovery capabilities.
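One building block commonly used in Hive Metastore-to-Unity-Catalog migrations like this is the SYNC command, which upgrades an existing HMS table in place. A minimal sketch of the kind of statement such an automated pipeline might script (catalog, schema and table names are hypothetical):

```python
# Upgrade an external Hive Metastore table into Unity Catalog.
# DRY RUN reports what would happen without making changes, which is
# handy when the migration is driven by CI (e.g., GitHub Actions).
spark.sql("""
    SYNC TABLE main.flight_ops.crew_schedules
    FROM hive_metastore.flight_ops.crew_schedules
    DRY RUN
""")
```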

Building and Scaling Production AI Systems With Mosaic AI

Ready to go beyond the basics of Mosaic AI? This session will walk you through how to architect and scale production-grade AI systems on the Databricks Data Intelligence Platform. We’ll cover practical techniques for building end-to-end AI pipelines — from processing structured and unstructured data to applying Mosaic AI tools and functions for model development, deployment and monitoring. You’ll learn how to integrate experiment tracking with MLflow, apply performance tuning and use built-in frameworks to manage the full AI lifecycle. By the end, you’ll be equipped to design, deploy and maintain AI systems that deliver measurable outcomes at enterprise scale.
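Experiment tracking with MLflow, one of the lifecycle pieces mentioned above, looks roughly like the following sketch; the experiment name, parameter and metric are hypothetical.

```python
import mlflow

# Hypothetical experiment; on Databricks this maps to a workspace path.
mlflow.set_experiment("/Shared/churn-model")

with mlflow.start_run(run_name="baseline"):
    # Log the knobs and results you want to compare across runs.
    mlflow.log_param("max_depth", 6)
    mlflow.log_metric("val_auc", 0.87)
    # mlflow.<flavor>.log_model(...) would capture the model artifact itself.
```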

Building Tool-Calling Agents With Databricks Agent Framework and MCP

Want to create AI agents that can do more than just generate text? Join us to explore how combining Databricks' Mosaic AI Agent Framework with the Model Context Protocol (MCP) unlocks powerful tool-calling capabilities. We'll show you how MCP provides a standardized way for AI agents to interact with external tools, data and APIs, solving the headache of fragmented integration approaches. Learn to build agents that can retrieve both structured and unstructured data, execute custom code and tackle real enterprise challenges. Key takeaways:
● Implementing MCP-enabled tool-calling in your AI agents
● Prototyping in AI Playground and exporting for deployment
● Integrating Unity Catalog functions as agent tools
● Ensuring governance and security for enterprise deployments
Whether you're building customer service bots or data analysis assistants, you'll leave with practical know-how to create powerful, governed AI agents; a sketch of a minimal MCP tool server follows below.
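To ground the protocol piece, here is a minimal MCP tool server using the mcp Python package's FastMCP helper. The lookup_customer tool is a hypothetical stand-in for something like a governed Unity Catalog function.

```python
# pip install mcp
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("enterprise-tools")


@mcp.tool()
def lookup_customer(customer_id: str) -> str:
    """Hypothetical tool: fetch a customer record for the agent.

    A production server would call a governed API or Unity Catalog
    function here instead of returning stub data.
    """
    return f"customer {customer_id}: active, tier=gold (stub data)"


if __name__ == "__main__":
    # Serve over stdio so any MCP-capable agent framework can call the tool.
    mcp.run()
```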

ClickHouse and Databricks for Real-Time Analytics

ClickHouse is a C++-based, column-oriented database built for real-time analytics. While it has its own internal storage format, the rise of open lakehouse architectures has created a growing need for seamless interoperability. In response, we have developed integrations across the lakehouse ecosystem to enhance compatibility, performance and governance. From integrating with Unity Catalog to embedding the Delta Kernel into ClickHouse, this session will explore the key design considerations behind these integrations, their benefits to the community, the lessons learned and future opportunities for improved compatibility and seamless integration.
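For a flavor of the interoperability discussed, here is a sketch of querying a Delta table from ClickHouse with its deltaLake table function, driven from Python via the clickhouse-connect client. The host and table URI are hypothetical, and credential handling is elided.

```python
# pip install clickhouse-connect
import clickhouse_connect

client = clickhouse_connect.get_client(host="localhost")

# ClickHouse's deltaLake table function reads Delta Lake tables directly
# from object storage (credentials omitted here for brevity).
result = client.query(
    "SELECT count() FROM deltaLake('https://bucket.s3.amazonaws.com/tables/orders')"
)
print(result.result_rows)
```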