talk-data.com

Event

Data + AI Summit 2025

2025-06-09 – 2025-06-13 Databricks Summit Visit website ↗

Activities tracked

715

Sessions & talks

Showing 151–175 of 715 · Newest first

Sponsored by: Qubika | Agentic AI In Finance: How To Build Agents Using Databricks And LangGraph

2025-06-11 Watch
lightning_talk
Sebastian Diaz (Qubika)

Join us for this session on building AI finance agents with Databricks and LangChain. This session introduces a modular framework that integrates LangChain, retrieval-augmented generation (RAG), and Databricks' unified data platform to build intelligent, adaptable finance agents. We'll walk through the architecture and key components involved in building a system tailored for complex financial tasks like portfolio analysis, reporting automation, and real-time risk insights, including Databricks Unity Catalog, MLflow, and Mosaic AI. We'll also showcase a demo of one such agent in action: a Financial Analyst Agent. This agent emulates the expertise of a seasoned data analyst, delivering in-depth analysis in seconds and eliminating the need to wait hours or days for manual reports. The solution provides organizations with 24/7 access to advanced data analysis, enabling faster, smarter decision-making.
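The retrieval step the abstract describes can be illustrated with a minimal keyword-overlap retriever — a toy stand-in for a real vector-search index such as Databricks Vector Search; the documents and scoring below are invented for illustration:

```python
def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Rank documents by word overlap with the query and return the top k."""
    q_words = set(query.lower().split())
    scored = sorted(
        documents,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )
    return scored[:k]

docs = [
    "Portfolio exposure to tech equities rose 12% in Q2.",
    "Office lease renewals are due in September.",
    "Q2 risk report: portfolio volatility remains within limits.",
]
# Retrieve the most relevant documents for a financial question;
# in a RAG agent this context would be passed to the LLM as grounding.
context = retrieve("portfolio risk in Q2", docs)
```

A production system would replace the word-overlap score with embedding similarity, but the retrieve-then-generate shape of the pipeline is the same.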

Sponsored by: West Monroe | Disruptive Forces: LLMs and the New Age of Data Engineering

2025-06-11 Watch
lightning_talk
Doug MacWilliams (West Monroe)

This session examines the seismic shift Large Language Models are unleashing on data engineering, challenging traditional workflows. LLMs obliterate inefficiencies and redefine productivity: these AI powerhouses automate complex tasks like documentation, code translation, and data model development with unprecedented speed and precision. Integrating LLMs into tools promises to reduce offshore dependency, fostering agile onshore innovation. Harnessing LLMs' full potential involves challenges, requiring deep dives into domain-specific data and strategic business alignment. The session will address deploying LLMs effectively, overcoming data management hurdles, and fostering collaboration between engineers and stakeholders. Join us to explore a future where LLMs redefine possibilities, inviting you to embrace AI-driven innovation and position your organization as a leader in data engineering.

AI in Motion: Build a Roadmap for Impact in Just 30 Minutes

2025-06-11 Watch
talk
Lexy Kassan (Databricks)

This high-velocity workshop is designed for data and AI leaders seeking to rapidly develop a comprehensive AI strategy tailored to their organization's needs. In just 30 minutes, participants will engage in a focused, interactive session that delivers actionable insights and a strategic framework for AI implementation. Key components of the workshop include:
Rapid assessment: Quickly evaluate your organization's AI readiness and potential impact areas
Strategic alignment: Align AI initiatives with core business objectives and value creation opportunities
Resource optimization: Identify critical resources, skills and technologies required for successful AI adoption
Risk mitigation: Address key challenges and ethical considerations in AI deployment
Priority areas: Where to focus first to get the best start
By the end of this intensive session, you will have the foundation of a robust AI strategy and guidance on roadmap execution.
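The "rapid assessment" step above amounts to scoring readiness dimensions and ranking them by gap-to-impact. A minimal sketch, with an invented rubric (dimensions, 1–5 maturity scores, and business-impact weights are all hypothetical):

```python
def prioritize(scores: dict[str, int], weights: dict[str, float]) -> list[str]:
    """Order dimensions by (maturity gap x business weight), largest first,
    so low-maturity, high-impact areas surface as the place to start."""
    gap = {dim: (5 - scores[dim]) * weights[dim] for dim in scores}
    return sorted(gap, key=gap.get, reverse=True)

scores = {"data quality": 2, "ml platform": 4, "governance": 3, "talent": 2}
weights = {"data quality": 0.4, "ml platform": 0.2, "governance": 0.3, "talent": 0.1}
order = prioritize(scores, weights)
# -> ['data quality', 'governance', 'talent', 'ml platform']
```

The real workshop uses facilitated discussion rather than a formula, but the ranking logic is a useful mental model for where to focus first.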

A Japanese Mega-Bank’s Journey to a Modern, GenAI-Powered, Governed Data Platform

2025-06-11 Watch
talk
Anshul Wadhawan (Deloitte Consulting LLP) , Gordon Wilson (Sumitomo Mitsui Banking Corporation)

SMBC, a major Japanese multinational financial services institution, has embarked on an initiative to build a GenAI-powered, modern and well-governed cloud data platform on Azure/Databricks. This initiative aims to build an enterprise data foundation encompassing loans, deposits, securities, derivatives, and other data domains. Its primary goals are:
To decommission legacy data platforms and reduce data sprawl by migrating 20+ core banking systems to a multi-tenant Azure Databricks architecture
To leverage Databricks' Delta Sharing capabilities to address SMBC's unique global footprint and data sharing needs
To govern data by design using Unity Catalog
To achieve global adoption of the frameworks, accelerators, architecture and tool stack to support similar implementations across EMEA
Deloitte and SMBC leveraged the Brickbuilder asset "Data as a Service for Banking" to accelerate this highly strategic transformation.

American Airlines Flies to New Heights with Data Intelligence

2025-06-11 Watch
talk
Saimahesh Chava (American Airlines) , Yash Joshi (American Airlines)

American Airlines migrated from Hive Metastore to Unity Catalog using automated processes with Databricks APIs and GitHub Actions. This automation streamlined the migration for many applications within AA, ensuring consistency, efficiency and minimal disruption while enhancing data governance and disaster recovery capabilities.
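A migration like the one described is typically scripted: enumerate Hive Metastore tables, emit an upgrade statement per table, and have CI (e.g. GitHub Actions) submit each one through the Databricks APIs. A minimal sketch of the statement-generation step, using the Unity Catalog `SYNC TABLE` command (catalog, schema, and table names here are invented):

```python
def sync_statements(catalog: str, tables: list[str], dry_run: bool = True) -> list[str]:
    """Generate one SYNC TABLE statement per hive_metastore table ('schema.table').
    DRY RUN lets a CI job validate the upgrade before applying it."""
    suffix = " DRY RUN" if dry_run else ""
    return [
        f"SYNC TABLE {catalog}.{t} FROM hive_metastore.{t}{suffix};"
        for t in tables
    ]

stmts = sync_statements("main", ["sales.orders", "sales.customers"])
```

In a pipeline, each statement would be executed via the SQL Statement Execution API and the results logged, which is what makes the migration repeatable and consistent across many applications.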

Building and Scaling Production AI Systems With Mosaic AI

2025-06-11 Watch
talk
Kasey Uhlenhuth (Databricks)

Ready to go beyond the basics of Mosaic AI? This session will walk you through how to architect and scale production-grade AI systems on the Databricks Data Intelligence Platform. We’ll cover practical techniques for building end-to-end AI pipelines — from processing structured and unstructured data to applying Mosaic AI tools and functions for model development, deployment and monitoring. You’ll learn how to integrate experiment tracking with MLflow, apply performance tuning and use built-in frameworks to manage the full AI lifecycle. By the end, you’ll be equipped to design, deploy and maintain AI systems that deliver measurable outcomes at enterprise scale.

Building Tool-Calling Agents With Databricks Agent Framework and MCP

2025-06-11 Watch
talk
Siddharth Murching (Databricks) , Elise Gonzales (Databricks)

Want to create AI agents that can do more than just generate text? Join us to explore how combining Databricks' Mosaic AI Agent Framework with the Model Context Protocol (MCP) unlocks powerful tool-calling capabilities. We'll show you how MCP provides a standardized way for AI agents to interact with external tools, data and APIs, solving the headache of fragmented integration approaches. Learn to build agents that can retrieve both structured and unstructured data, execute custom code and tackle real enterprise challenges. Key takeaways:
Implementing MCP-enabled tool-calling in your AI agents
Prototyping in AI Playground and exporting for deployment
Integrating Unity Catalog functions as agent tools
Ensuring governance and security for enterprise deployments
Whether you're building customer service bots or data analysis assistants, you'll leave with practical know-how to create powerful, governed AI agents.
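The core idea behind tool calling, whether via MCP or a framework-specific registry, is that each tool advertises a name, description, and parameter schema, and a dispatcher routes the model's serialized tool calls to real functions. A minimal sketch (the tool, its schema, and the fake data are invented for illustration; this is not the MCP wire protocol itself):

```python
import json

TOOLS: dict[str, dict] = {}  # name -> {description, params, fn}

def tool(name: str, description: str, params: dict):
    """Decorator that registers a Python function as a callable agent tool."""
    def register(fn):
        TOOLS[name] = {"description": description, "params": params, "fn": fn}
        return fn
    return register

@tool("lookup_revenue", "Return revenue for a sales region", {"region": "string"})
def lookup_revenue(region: str) -> float:
    fake_db = {"emea": 1.2e6, "amer": 3.4e6}  # stand-in for a governed table
    return fake_db[region.lower()]

def dispatch(call_json: str):
    """Execute a tool call the model emitted as {'name': ..., 'arguments': {...}}."""
    call = json.loads(call_json)
    entry = TOOLS[call["name"]]
    return entry["fn"](**call["arguments"])

result = dispatch('{"name": "lookup_revenue", "arguments": {"region": "EMEA"}}')
```

MCP standardizes exactly this contract (tool discovery plus structured invocation) across servers, which is what removes the per-integration glue code the session mentions.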

ClickHouse and Databricks for Real-Time Analytics

2025-06-11 Watch
talk
Melvyn Peignon (ClickHouse)

ClickHouse is a C++ based, column-oriented database built for real-time analytics. While it has its own internal storage format, the rise of open lakehouse architectures has created a growing need for seamless interoperability. In response, we have developed integrations with your favorite lakehouse ecosystem to enhance compatibility, performance and governance. From integrating with Unity Catalog to embedding the Delta Kernel into ClickHouse, this session will explore the key design considerations behind these integrations, their benefits to the community, the lessons learned and future opportunities for improved compatibility and seamless integration.

Declarative Pipelines: What’s Next for the Apache Spark Ecosystem

2025-06-11 Watch
talk
Michael Armbrust (Databricks) , Sandy Ryza (Databricks)

Lakeflow Declarative Pipelines has made it dramatically easier to build production-grade Spark pipelines, using a framework that abstracts away orchestration and complexity. It’s become a go-to solution for teams who want reliable, maintainable pipelines without reinventing the wheel. But we’re just getting started. In this session, we’ll take a step back and share a broader vision for the future of Spark Declarative Pipelines — one that opens the door to a new level of openness, standardization and community momentum. We’ll cover the core concepts behind Declarative Pipelines, where the architecture is headed, and what this shift means for both existing Lakeflow users and Spark engineers building procedural code. Don’t miss this session — we’ll be sharing something new that sets the direction for what comes next.

Driving Secure AI Innovation with Obsidian Security, Databricks, and PointGuard AI

2025-06-11 Watch
talk
Alfredo Hickman (Obsidian Security) , JD Braun (Databricks) , Mali Gorantla (PointGuard AI)

As enterprises adopt AI and Large Language Models (LLMs), securing and governing these models - and the data used to train them - is essential. In this session, learn how Databricks Partner PointGuard AI helps organizations implement the Databricks AI Security Framework to manage AI-specific risks, ensuring security, compliance, and governance across the entire AI lifecycle. Then, discover how Obsidian Security provides a robust approach to AI security, enabling organizations to confidently scale AI applications.

End-to-End Interoperable Data Platform: How Bosch Leverages Databricks Supply Chain Consolidation

2025-06-11 Watch
talk
Satish Karunakaran (Robert Bosch GmbH) , Marc-Alexander Frey (Robert Bosch GmbH)

This session will showcase Bosch’s journey in consolidating supply chain information using the Databricks platform. It will dive into how Databricks not only acts as the central data lakehouse but also integrates seamlessly with transformative components such as dbt and Large Language Models (LLMs). The talk will highlight best practices, architectural considerations, and the value of an interoperable platform in driving actionable insights and operational excellence across complex supply chain processes. Key topics and sections:
Introduction & business context: a brief overview of Bosch’s supply chain challenges and the need for a consolidated data platform; the strategic importance of data-driven decision-making in a global supply chain environment
Databricks as the core data platform
Integrating dbt for transformation
Leveraging LLMs for enhanced insights

Entity Resolution for the Best Outcomes on Your Data

2025-06-11 Watch
talk
Ninad Sohoni (Databricks) , Yinxi Zhang (Databricks)

There are many ways to implement an entity resolution (ER) system — both using vendor software and open-source libraries that enable DIY entity resolution. However, we generally see common challenges with any approach: limited scalability, being bound to a single model architecture, a lack of metrics and explainability, and stagnant implementations that do not "learn" with experience. Recent experiments with transformer-based approaches, fast lookups with vector search and Databricks components such as Databricks Apps and Agent Eval provide the foundations for a composable ER system that can get better with time on your data. In this presentation, we include a demo of how to use these components to build a composable ER system that delivers the best outcomes for your data.
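A classic ER pipeline, whatever the matching model, has two stages: cheap blocking to avoid comparing every pair, then a similarity score on candidates within each block. A small sketch using stdlib string similarity in place of the transformer-based matchers the session describes (records, blocking key, and threshold are invented):

```python
from difflib import SequenceMatcher
from itertools import combinations

def block_key(record: dict) -> str:
    """Cheap blocking: only compare records sharing first letter + zip code."""
    return record["name"][0].lower() + record["zip"]

def match_pairs(records: list[dict], threshold: float = 0.7):
    """Score candidate pairs within each block; emit pairs above the threshold."""
    blocks: dict[str, list[dict]] = {}
    for r in records:
        blocks.setdefault(block_key(r), []).append(r)
    pairs = []
    for block in blocks.values():
        for a, b in combinations(block, 2):
            score = SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()
            if score >= threshold:
                pairs.append((a["id"], b["id"], round(score, 2)))
    return pairs

records = [
    {"id": 1, "name": "Acme Corp", "zip": "94105"},
    {"id": 2, "name": "ACME Corporation", "zip": "94105"},
    {"id": 3, "name": "Zenith Ltd", "zip": "94105"},
]
matches = match_pairs(records)
```

A composable system swaps each stage independently — e.g. vector-search lookups for blocking and a transformer for scoring — which is what lets the pipeline improve over time without a rewrite.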

Evolving Data Insights With Privacy at Mastercard

2025-06-11 Watch
talk
Spencer Cook (Databricks) , John Derrico (Mastercard)

Mastercard is a global technology company whose role is anchored in trust. It supports 3.4 billion cards and over 143 billion transactions annually. To address customers’ increasing data volume and complex privacy needs, Mastercard has developed a novel service atop Databricks’ Clean Rooms and broader Data Intelligence Platform. This service combines several Databricks components with Mastercard’s IP, providing an evolved method for data-driven insights and value-added services while ensuring a unique standalone turnkey service. The result is a secure environment where multiple parties can collaborate on sensitive data without directly accessing each other’s information. After this session, attendees will understand how Mastercard used its expertise in privacy-enhancing technologies to create collaboration tools powered by Databricks’ Clean Rooms, AI/BI, Apps, Unity Catalog, Workflows and DatabricksIQ — as well as how to take advantage of this new privacy-enhancing service directly.

Extending the Lakehouse: Power Interoperable Compute With Unity Catalog Open APIs

2025-06-11 Watch
talk
Tathagata Das (Databricks) , Michelle Leon (Databricks)

The lakehouse is built for storage flexibility, but what about compute? In this session, we’ll explore how Unity Catalog enables you to connect and govern multiple compute engines across your data ecosystem. With open APIs and support for the Iceberg REST Catalog, UC lets you extend access to engines like Trino, DuckDB, and Flink while maintaining centralized security, lineage, and interoperability. We will show how you can get started today working with engines like Apache Spark and Starburst to read and write to UC managed tables with some exciting demos. Learn how to bring flexibility to your compute layer—without compromising control.

Hands-On Learning: AI Agents Workshop: Create, Evaluate, and Deploy using Mosaic AI

2025-06-11
workshop
Nicolas Pelaez (Databricks) , Amber Roberts (Databricks)

Looking for a practical workshop on building an AI Agent on Databricks? Well, we have just the thing for you. This hands-on workshop takes you through the process of creating intelligent agents that can reason their way to useful outcomes. You'll start by building your own toolkit of SQL and Python functions that give your agent practical capabilities. Then we'll explore how to select the right foundation model for your needs, connect your custom tools, and watch as your agent tackles complex challenges through visible reasoning paths. The workshop doesn't just stop at building — you'll dive into evaluation techniques using evaluation datasets to identify where your agent shines and where it needs improvement. After implementing and measuring your changes, we'll explore deployment strategies, including a feedback collection interface that enables continuous improvement and governance mechanisms to ensure responsible AI usage in production environments.
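The evaluation step the workshop covers boils down to running the agent over a labeled dataset and scoring the answers. A toy harness with exact-match accuracy (real setups such as Mosaic AI Agent Evaluation use LLM judges and richer metrics; the agent, questions, and answers here are invented):

```python
def evaluate(agent, dataset: list[dict]) -> float:
    """Fraction of evaluation examples where the agent's answer matches exactly."""
    correct = sum(1 for ex in dataset if agent(ex["question"]) == ex["expected"])
    return correct / len(dataset)

def toy_agent(question: str) -> str:
    # Stand-in for a tool-calling agent: canned answers for illustration only.
    canned = {"What is 2+2?": "4", "Capital of France?": "Paris"}
    return canned.get(question, "I don't know")

dataset = [
    {"question": "What is 2+2?", "expected": "4"},
    {"question": "Capital of France?", "expected": "Paris"},
    {"question": "Largest ocean?", "expected": "Pacific"},
]
accuracy = evaluate(toy_agent, dataset)  # 2 of 3 correct
```

Running this loop before and after each change is what turns "the agent seems better" into a measurable improvement.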

Hands-on Learning: Databricks SQL in Action: Intelligent Data Warehousing, Analytics and BI Workshop (repeat)

2025-06-11
workshop
Pearl Ubaru (Databricks)

Most organizations run complex cloud data architectures that silo applications, users and data. Join this interactive hands-on workshop to learn how Databricks SQL allows you to operate a multi-cloud lakehouse architecture that delivers data warehouse performance at data lake economics — with up to 12x better price/performance than traditional cloud data warehouses. Here’s what we’ll cover:
How Databricks SQL fits in the Data Intelligence Platform, enabling you to operate a multicloud lakehouse architecture that delivers data warehouse performance at data lake economics
How to manage and monitor compute resources, data access and users across your lakehouse infrastructure
How to query directly on your data lake using your tools of choice or the built-in SQL editor and visualizations
How to use AI to increase productivity when querying, completing code or building dashboards
Ask your questions during this hands-on lab, and the Databricks experts will guide you.

Harnessing Databricks Asset Bundles: Transforming Pipeline Management at Scale at Stack Overflow

2025-06-11 Watch
talk
Chelsea Zhang (Stack Overflow)

Discover how Stack Overflow optimized its data engineering workflows using Databricks Asset Bundles (DABs) for scalable and efficient pipeline deployments. This session explores the structured pipeline architecture, emphasizing code reusability, modular design and bundle variables to ensure clarity and data isolation across projects. Learn how the data team leverages enterprise infrastructure to streamline deployment across multiple environments. Key topics include DRY-principled modular design, essential DAB features for automation and data security strategies using Unity Catalog. Designed for data engineers and teams managing multi-project workflows, this talk offers actionable insights on optimizing pipelines with Databricks' evolving toolset.
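The bundle-variable and multi-environment pattern described above centers on a `databricks.yml` at the project root. A minimal sketch — the bundle name, job, notebook path, and variable are invented for illustration, not Stack Overflow's actual configuration:

```yaml
bundle:
  name: example_data_pipelines

variables:
  catalog:
    description: Target Unity Catalog for this deployment
    default: dev

targets:
  dev:
    default: true
  prod:
    variables:
      catalog: prod

resources:
  jobs:
    daily_etl:
      name: daily-etl
      tasks:
        - task_key: transform
          notebook_task:
            notebook_path: ./src/transform
```

Deploying with `databricks bundle deploy -t prod` substitutes the prod value of `catalog`, which is how one bundle definition yields isolated per-environment deployments.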

How HMS Federation Powered Nationwide’s Seamless and Efficient Unity Catalog Migration

2025-06-11 Watch
talk

This talk takes you through the Nationwide Security and Infrastructure data team's journey of migrating from HMS to UC. Discover how HMS federation simplified our transition to UC, allowing for an incremental migration that minimized disruption to data consumers while optimizing our data layout. We’ll share the key technical decisions, challenges faced and lessons learned along the way. The migration process wasn’t without its hurdles, so we’ll walk you through our detailed, step-by-step approach covering planning, execution and validation. We will also showcase the benefits realized, such as improved data governance, more efficient data access and enhanced operational performance. Join us to gain practical insights into executing complex data migrations with a focus on security, flexibility and long-term scalability.

How the Texas Rangers Use a Unified Data Platform to Drive World Class Baseball Analytics

2025-06-11 Watch
talk
Michael Topol (Texas Rangers) , Oliver Dykstra (Texas Rangers)

Don't miss this session where we demonstrate how the Texas Rangers baseball team is staying one step ahead of the competition by going back to the basics. After the Rangers implemented a modern data strategy with Databricks and won the 2023 World Series, the rest of the league quickly followed suit. Now more than ever, data and AI are a central pillar of every baseball team's strategy, driving profound insights into player performance and game dynamics. With a 'fundamentals win games' back-to-basics focus, join us as we explain our commitment to world-class data quality, engineering, and MLOps by taking full advantage of the Databricks Data Intelligence Platform. From system tables to federated querying, find out how the Rangers use every tool at their disposal to stay one step ahead in the hypercompetitive world of baseball.

HP's Data Platform Migration Journey: Redshift to Lakehouse

2025-06-11 Watch
talk
Isaac Chan (HP Inc.) , Kavya Atmakuri (HP Inc.)

HP Print's data platform team took on a migration from a monolithic, shared AWS Redshift resource to a modular and scalable data ecosystem on the Databricks lakehouse. The result was 30–40% cost savings, scalable and isolated resources for different data consumers and ETL workloads, and performance optimization for a variety of query types. Through this migration, there were technical challenges and learnings relating to the ETL migrations with dbt, new Databricks features like Liquid Clustering, predictive optimization, Photon, SQL serverless warehouses, managing multiple teams on Unity Catalog, and others. This presentation dives into both the business and technical sides of this migration. Come along as we share our key takeaways from this journey.

Innovating Retail Data: Unilever’s Transformation with Databricks Lakeflow Declarative Pipelines

2025-06-11
talk
Evan Cherney (Unilever)

Retail data is expanding at an unprecedented rate, demanding a scalable, cost-efficient, and near real-time architecture. At Unilever, we transformed our data management approach by leveraging Databricks Lakeflow Declarative Pipelines, achieving approximately $500K in cost savings while accelerating computation speeds by 200–500%. By adopting a streaming-driven architecture, we built a system where data flows continuously across processing layers, enabling real-time updates with minimal latency. Lakeflow Declarative Pipelines' serverless simplicity replaced complex dependency management, reducing maintenance overhead and improving pipeline reliability. Lakeflow Declarative Pipelines Direct Publishing further enhanced data segmentation, concurrency, and governance, ensuring efficient and scalable data operations while simplifying workflows. This transformation empowers Unilever to manage data with greater efficiency, scalability, and reduced costs, creating a future-ready infrastructure that evolves with the needs of our retail partners and customers.

Intuit's Privacy-Safe Lending Marketplace: Leveraging Databricks Clean Rooms

2025-06-11 Watch
talk
Anurag Malik (Intuit Inc.)

Intuit leverages Databricks Clean Rooms to create a secure, privacy-safe lending marketplace, enabling small business lending partners to perform analytics and deploy ML/AI workflows on sensitive data assets. This session explores the technical foundations of building isolated clean rooms across multiple partners and cloud providers, differentiating Databricks Clean Rooms from market alternatives. We'll demonstrate our automated approach to clean room lifecycle management using APIs, covering creation, collaborator onboarding, data asset sharing, workflow orchestration and activity auditing. The integration with Unity Catalog for managing clean room inputs and outputs will also be discussed. Attendees will gain insights into harnessing collaborative ML/AI potential, supporting various languages and workloads, and enabling complex computations without compromising sensitive information in Clean Rooms.

Let’s Elevate: An Open Source Model for Data Sharing and Collaboration in Retail and Consumer Goods

2025-06-11 Watch
talk
Rob Saker (Databricks) , Richard Schwartz (Pensa Systems) , Sabrina Miller (Databricks) , Dag Liodden (Crisp, Inc.) , Adrian Bolosan (Databricks)

Retailers and suppliers face persistent financial and technical challenges to data sharing — including expensive proprietary platforms, complex data integration hurdles, fragmented governance and more — which currently restrict seamless data exchange primarily to their largest trading partners. In this session, we’ll provide an in-depth explanation of Elevate, an industry alliance focused on building open source standards for data sharing and collaboration to drive greater efficiency across the entire ecosystem. This session will highlight proposed standards for data sharing, data models, business cases on the ROI and potential areas of innovation to democratize data sharing, drastically reduce costs, simplify integration processes and foster transparent, trusted collaboration. Learn about the Elevate industry data-sharing initiative and how your company can participate and help guide standards to improve data sharing with your key partners.

Mastering Change Data Capture With Lakeflow Declarative Pipelines

2025-06-11 Watch
talk
Ray Zhu (Databricks) , Jacob Gollub (Square)

Transactional systems are a common source of data for analytics, and Change Data Capture (CDC) offers an efficient way to extract only what’s changed. However, ingesting CDC data into an analytics system comes with challenges, such as handling out-of-order events or maintaining global order across multiple streams. These issues often require complex, stateful stream processing logic. This session will explore how Lakeflow Declarative Pipelines simplifies CDC ingestion using the Apply Changes function. With Apply Changes, global ordering across multiple change feeds is handled automatically — there is no need to manually manage state or understand advanced streaming concepts like watermarks. It supports both snapshot-based inputs from cloud storage and continuous change feeds from systems like message buses, reducing complexity for common streaming use cases.
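The out-of-order problem Apply Changes solves can be shown with a pure-Python sketch (this is not Databricks code — in a pipeline you would declare the same intent with `dlt.apply_changes(..., keys=[...], sequence_by=...)`): keep only the highest-sequence event per key, and drop deleted keys from the final state. The feed below is invented:

```python
def apply_changes(events: list[dict]) -> dict:
    """Collapse a CDC feed to final row state per key, tolerating out-of-order
    arrival by honoring the sequence number (the SEQUENCE BY column's role)."""
    state: dict[str, dict] = {}
    for ev in events:
        current = state.get(ev["key"])
        if current is not None and current["seq"] >= ev["seq"]:
            continue  # stale, late-arriving event: ignore it
        state[ev["key"]] = ev
    # Rows whose latest change is a DELETE disappear from the target.
    return {k: v for k, v in state.items() if v["op"] != "DELETE"}

feed = [
    {"key": "u1", "seq": 1, "op": "INSERT", "email": "a@x.com"},
    {"key": "u1", "seq": 3, "op": "UPDATE", "email": "b@x.com"},
    {"key": "u1", "seq": 2, "op": "UPDATE", "email": "old@x.com"},  # arrives late
    {"key": "u2", "seq": 1, "op": "INSERT", "email": "c@x.com"},
    {"key": "u2", "seq": 2, "op": "DELETE", "email": None},
]
final = apply_changes(feed)
```

Note how the late seq-2 update for `u1` is discarded even though it arrives after seq 3 — exactly the stateful bookkeeping Apply Changes spares you from writing across many concurrent change feeds.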