talk-data.com

Event

Data + AI Summit 2025

2025-06-09 – 2025-06-13 Databricks Summit

Activities tracked

715

Sessions & talks

Showing 501–525 of 715 · Newest first

Sponsored by: Astronomer | Scaling Data Teams for the Future

2025-06-10 Watch
lightning_talk
Steven Hillion (Astronomer)

The role of data teams and data engineers is evolving. No longer just pipeline builders or dashboard creators, today’s data teams must drive business strategy, enable automation, and scale with growing demands. Best practices from the software engineering world and the DevOps movement (Agile development, CI/CD, and infrastructure as code) are gradually making their way into data engineering. We believe these changes have led to the rise of DataOps and a new wave of best practices that will transform the discipline of data engineering. But how do you transform a reactive team into a proactive force for innovation? We’ll explore the key principles for building a resilient, high-impact data team, from structuring for collaboration, testing, and automation to leveraging modern orchestration tools. Whether you’re leading a team or looking to future-proof your career, you’ll walk away with actionable insights on how to stay ahead in the rapidly changing data landscape.

Sponsored by: AWS | Deploying a GenAI Agent using Databricks Mosaic AI, Anthropic, LangGraph, and Amazon Bedrock

2025-06-10 Watch
lightning_talk

In this session, you’ll see how to build and deploy a GenAI agent and Model Context Protocol (MCP) with Databricks, Anthropic, Mosaic External AI Gateway, and Amazon Bedrock. You will learn the architecture and best practices for using Databricks Mosaic AI, the Anthropic Sonnet 3.7 first-party frontier model, and LangGraph for custom workflow orchestration on the Databricks Data Intelligence Platform. You’ll also see how to use Databricks Mosaic AI to provide agent evaluation and monitoring. In addition, you will see how an Amazon Bedrock inline agent, using Amazon Nova models, uses MCP to provide tools and other resources for deep research. This approach gives you the flexibility of LangGraph, the powerful managed agents offered by Amazon Bedrock, and Databricks Mosaic AI’s operational support for evaluation and monitoring.

Sponsored by: Coalesce | Bringing Order to Chaos: How to Succeed in a Data & Analytics World

2025-06-10 Watch
lightning_talk
Michael Tantrum (Coalesce)

Priorities shift, requirements change, resources fluctuate, and the demands on data teams only continue to grow. Join this session, led by Coalesce Sales Engineering Director Michael Tantrum, to hear about the most efficient way to deliver high-quality data to your organization at the speed it needs to consume it. Learn how to sidestep the common pitfalls of data development for maximum data team productivity.

Energy and Utilities Industry Forum | Sponsored by: Deloitte and AWS

2025-06-10 Watch
talk
Bryce Bartmann (Shell), Ali Marzban (NOV), Lou Martinez Sancho (Westinghouse Electric Company), Shane Powell (Alabama Power), Julien Debard (Databricks)

Join us for a compelling forum exploring how energy leaders are harnessing data and AI to build a more sustainable future. As the industry navigates the complex balance between rising global energy demands and ambitious decarbonization goals, innovative companies are discovering that intelligence-driven operations are the key to success. From optimizing renewable energy integration to revolutionizing grid management, learn how energy pioneers are using AI to transform traditional operations while accelerating the path to net zero. This session reveals how Databricks is empowering energy companies to turn their sustainability aspirations into reality, proving that the future of energy is both clean and intelligent.

Tech Industry Forum: Tip of the Spear With Data and AI | Sponsored by: Aimpoint Digital and AWS

2025-06-10 Watch
talk
Shant Hovsepian (Databricks), Jason Reid (Databricks), Madelyn Mullen (Databricks), Dev Tagare (Robinhood), Josh Clemm (Dropbox), Dan DeMeyere (ThredUp Inc.), Dan Wulin (Zillow)

Join us for the Tech Industry Forum, formerly known as the Tech Innovators Summit, now part of Databricks Industry Experience. This session will feature keynotes, panels and expert talks led by top customer speakers and Databricks experts. Tech companies are pushing the boundaries of data and AI to accelerate innovation, optimize operations and build collaborative ecosystems. In this session, we’ll explore how unified data platforms empower organizations to scale their impact, democratize analytics across teams and foster openness for building tomorrow’s products. Key topics include: scaling data platforms to support real-time analytics and AI-driven decision-making; democratizing access to data while maintaining robust governance and security; and harnessing openness and portability to enable seamless collaboration with partners and customers. After the session, connect with your peers during the exclusive Industry Forum Happy Hour. Reserve your seat today!

Automated Deployment with Databricks Asset Bundles

2025-06-10
talk

This course provides a comprehensive review of DevOps principles and their application to Databricks projects. It begins with an overview of core DevOps, DataOps, continuous integration (CI), continuous deployment (CD), and testing, and explores how these principles can be applied to data engineering pipelines. The course then focuses on continuous deployment within the CI/CD process, examining tools like the Databricks REST API, SDK, and CLI for project deployment. You will learn about Databricks Asset Bundles (DABs) and how they fit into the CI/CD process. You’ll dive into their key components, folder structure, and how they streamline deployment across various target environments in Databricks. You will also learn how to add variables to, modify, validate, deploy, and execute Databricks Asset Bundles for multiple environments with different configurations using the Databricks CLI. Finally, the course introduces Visual Studio Code as an Integrated Development Environment (IDE) for building, testing, and deploying Databricks Asset Bundles locally, optimizing your development process. The course concludes with an introduction to automating deployment pipelines using GitHub Actions to enhance the CI/CD workflow with Databricks Asset Bundles. By the end of this course, you will be equipped to automate Databricks project deployments with Databricks Asset Bundles, improving efficiency through DevOps practices. Pre-requisites: Strong knowledge of the Databricks platform, including experience with Databricks Workspaces, Apache Spark, Delta Lake, the Medallion Architecture, Unity Catalog, Delta Live Tables, and Workflows, and in particular, knowledge of leveraging Expectations with Lakeflow Declarative Pipelines. Labs: Yes Certification Path: Databricks Certified Data Engineer Professional

De-Risking Investment Decisions: QCG's Smarter Deal Evaluation Process Leveraging Databricks

2025-06-10 Watch
lightning_talk
Ian Brown (Quantum Capital Group)

Quantum Capital Group (QCG) screens hundreds of deals across the global Sustainable Energy Ecosystem, requiring deep technical due diligence. With over 1.5 billion records sourced from public, premium and proprietary datasets, their challenge was how to efficiently curate, analyze and share this data to drive smarter investment decisions. QCG partnered with Databricks & Tiger Analytics to modernize its data landscape. Using Delta tables, Spark SQL, and Unity Catalog, the team built a golden dataset that powers proprietary evaluation models and automates complex workflows. Data is now seamlessly curated, enriched and distributed — both internally and to external stakeholders — in a secure, governed and scalable way. This session explores how QCG’s investment in data intelligence has turned an overwhelming volume of information into a competitive advantage, transforming deal evaluation into a faster, more strategic process.

Gen AI Deployment and Monitoring

2025-06-10
talk

This course introduces learners to deploying, operationalizing, and monitoring generative artificial intelligence (AI) applications. First, learners will develop knowledge and skills in deploying generative AI applications using tools like Model Serving. Next, the course will discuss operationalizing generative AI applications following modern LLMOps best practices and recommended architectures. Finally, learners will be introduced to the idea of monitoring generative AI applications and their components using Lakehouse Monitoring. Pre-requisites: Familiarity with prompt engineering and retrieval-augmented generation (RAG) techniques, including data preparation, embeddings, vectors, and vector databases. A foundational knowledge of Databricks Data Intelligence Platform tools for evaluation and governance (particularly Unity Catalog). Labs: Yes Certification Path: Databricks Certified Generative AI Engineer Associate

Machine Learning Operations

2025-06-10
talk

This course will guide participants through a comprehensive exploration of machine learning model operations, focusing on MLOps and model lifecycle management. The initial segment covers essential MLOps components and best practices, providing participants with a strong foundation for effectively operationalizing machine learning models. In the latter part of the course, we will delve into the basics of the model lifecycle, demonstrating how to navigate it seamlessly using the Model Registry in conjunction with the Unity Catalog for efficient model management. By the course's conclusion, participants will have gained practical insights and a well-rounded understanding of MLOps principles, equipped with the skills needed to navigate the intricate landscape of machine learning model operations. Pre-requisites: Familiarity with Databricks workspace and notebooks, familiarity with Delta Lake and Lakehouse, intermediate level knowledge of Python (e.g. understanding of basic MLOps concepts and practices as well as infrastructure and importance of monitoring MLOps solutions) Labs: Yes Certification Path: Databricks Certified Machine Learning Associate

MLOps With Databricks

2025-06-10 Watch
lightning_talk
Maria Vechtomova (Marvelous MLOps)

Adopting MLOps is increasingly important with the rise of AI. Many different features are required to do MLOps in large organizations. In the past, you had to implement these features yourself. Luckily, the MLOps space is maturing, and end-to-end platforms like Databricks provide most of the features. In this talk, I will walk through the MLOps components and how you can simplify your processes using Databricks. Audio for this session is delivered in the conference mobile app; you must bring your own headphones to listen.

Pushing the Limits of What Your Warehouse Can Do Using Python and Databricks

2025-06-10 Watch
lightning_talk
Jakob Mund (Databricks)

SQL warehouses in Databricks can run more than just SQL. Join this session to learn how to get more out of your SQL warehouses, and any tools built on top of them, by leveraging Python. After attending this session, you will be familiar with Python user-defined functions and know how to bring in custom dependencies from PyPI or as a custom wheel, and even how to securely invoke cloud services with performance at scale.

ReguBIM AI – Transforming BIM, Engineering, and Code Compliance with Generative AI

2025-06-10 Watch
lightning_talk
Qi Qi Oh (Exyte Singapore Pte. Ltd.)

At Exyte, we design, engineer, and deliver ultra-clean and sustainable facilities for high-tech industries. One of the most complex tasks our engineers and designers face is ensuring that their building designs comply with constantly evolving codes and regulations – often a manual, error-prone process. To address this, we developed ReguBIM AI, a generative AI-powered assistant that helps our teams verify code compliance more efficiently and accurately by linking 3D Building Information Modeling (BIM) data with regulatory documents. Built on the Databricks Data Intelligence Platform, ReguBIM AI is part of our broader vision to apply AI meaningfully across engineering and design processes. We are proud to share that ReguBIM AI won the Grand Prize and EMEA Winner titles at the Databricks GenAI World Cup 2024 — a global hackathon that challenged over 1,500 data scientists and AI engineers from 18 countries to create innovative generative AI solutions for real-world problems.

Scaling GenAI Inference From Prototype to Production: Real-World Lessons in Speed & Cost

2025-06-10 Watch
lightning_talk
Anish Kumar (Scribd, Inc.)

This lightning talk dives into real-world GenAI projects that scaled from prototype to production using Databricks’ fully managed tools. Facing cost and time constraints, we leveraged four key Databricks features—Workflows, Model Serving, Serverless Compute, and Notebooks—to build an AI inference pipeline processing millions of documents (text and audiobooks). This approach enables rapid experimentation, easy tuning of GenAI prompts and compute settings, seamless data iteration and efficient quality testing—allowing Data Scientists and Engineers to collaborate effectively. Learn how to design modular, parameterized notebooks that run concurrently, manage dependencies and accelerate AI-driven insights. Whether you're optimizing AI inference, automating complex data workflows or architecting next-gen serverless AI systems, this session delivers actionable strategies to maximize performance while keeping costs low.
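As a rough illustration of the "modular, parameterized notebooks that run concurrently" idea, the sketch below runs one parameterized task per document in parallel using only the Python standard library. It is not the speakers' pipeline: `run_inference_task`, `run_concurrently`, and the parameter names `doc_id` and `prompt_template` are hypothetical stand-ins, and a thread pool stands in for whatever scheduler actually launches the notebooks.

```python
from concurrent.futures import ThreadPoolExecutor


def run_inference_task(params: dict) -> dict:
    """Hypothetical stand-in for one parameterized notebook run."""
    # Build the prompt from the parameters passed to this run.
    prompt = params["prompt_template"].format(doc=params["doc_id"])
    return {"doc_id": params["doc_id"], "prompt": prompt}


def run_concurrently(param_sets: list, max_workers: int = 4) -> list:
    """Run one task per parameter set; results come back in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_inference_task, param_sets))


# Example: three documents, one shared prompt template (illustrative values).
jobs = [{"doc_id": i, "prompt_template": "Summarize document {doc}"} for i in range(3)]
results = run_concurrently(jobs)
```

Keeping every tunable value in the parameter dictionary is what makes it cheap to try a different prompt or batch size without touching the task code.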

Site to Insight: Powering Construction Analytics Through Delta Sharing

2025-06-10 Watch
lightning_talk
Vinodh Thiagarajan (Procore), Vishnu Sreenivasan (Procore)

At Procore, we're transforming the construction industry through innovative data solutions. This session unveils how we've supercharged our analytics offerings using a unified lakehouse architecture and Delta Sharing, delivering game-changing results for our customers and our business, and shows how data professionals can unlock the full potential of their data assets and drive meaningful business outcomes. Key highlights: learn how we've implemented seamless, secure sharing of large datasets across various BI tools and programming languages, dramatically accelerating time-to-insights for our customers; discover our approach to sharing dynamically filtered subsets of data across our numerous customers with cross-platform view sharing; and see how our architecture has eliminated the need for data replication, fostering a more efficient, collaborative data ecosystem.

Sponsored by: EY | Xoople: Fueling enterprise AI with Earth data intelligence products

2025-06-10 Watch
lightning_talk

Xoople aims to provide its users with trusted AI-Ready Earth data and accelerators that unlock new insights for enterprise AI. With access to scientific-grade Earth data that provides spatial intelligence on real-world changes, data scientists and BI analysts can increase forecast accuracy for their enterprise processes and models. These improvements drive smarter, data-driven business decisions across various business functions, including supply chain, finance, and risk across industries. Xoople, which has recently introduced their product, Enterprise AI-Ready Earth Data™, on the Databricks Marketplace, will have their CEO, Fabrizio Pirondini, discuss the importance of the Databricks Data Intelligence Platform in making Xoople’s product a reality for use in the enterprise.

Sponsored by: Impetus | Supercharge AI with automated migration to Databricks with Impetus

2025-06-10 Watch
lightning_talk
Sachneet Bains (Impetus Technologies Inc.)

Migrating legacy workloads to a modern, scalable platform like Databricks can be complex and resource-intensive. Impetus, an Elite Databricks Partner and the Databricks Migration Partner of the Year 2024, simplifies this journey with LeapLogic, an automated solution for data platform modernization and migration services. LeapLogic intelligently discovers, transforms, and optimizes workloads for Databricks, ensuring minimal risk and faster time-to-value. In this session, we’ll showcase real-world success stories of enterprises that have leveraged Impetus’ LeapLogic to modernize their data ecosystems efficiently. Join us to explore how you can accelerate your migration journey, unlock actionable insights, and future-proof your analytics with a seamless transition to Databricks.

Sponsored by: Informatica | Extending Unity Catalog to Govern the Data Estate With Informatica Cloud Data Governance & Catalog

2025-06-10 Watch
lightning_talk
Ajay GOLLAPALLI (Informatica)

Join this 20-minute session to learn how Informatica CDGC integrates with and leverages Unity Catalog metadata to provide end-to-end governance and security across an enterprise data landscape. Topics covered will include: comprehensive data lineage that provides complete data transformation visibility across multicloud and hybrid environments; broad data source support to facilitate holistic cataloging and a centralized governance framework; centralized access policy management and data stewardship to enable compliance with regulatory standards; and rich data quality to ensure data is cleansed, validated and trusted for analytics and AI.

ViewShift: Dynamic Policy Enforcement With Spark and SQL Views

2025-06-10 Watch
lightning_talk
Khai Tran (LinkedIn), Walaa Moustafa (LinkedIn)

Dynamic policy enforcement is increasingly critical in today's landscape, where data compliance is a top priority for companies, individuals, and regulators alike. In this talk, Walaa explores how LinkedIn has implemented a robust dynamic policy enforcement engine, ViewShift, and integrated it within its data lake. He will demystify LinkedIn's query engine stack by demonstrating how catalogs can automatically route table resolutions to compliance-enforcing SQL views. These SQL views possess several noteworthy properties. Auto-Generated: They are created automatically from declarative data annotations. User-Centric: They honor user-level consent and preferences. Context-Aware: They apply different transformations tailored to specific use cases. Portable: Although the SQL logic is implemented in a single dialect, it remains accessible across all engines. Join this session to learn how ViewShift helps ensure that compliance is seamlessly integrated into data processing workflows.
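The routing idea in this abstract, where a catalog resolves a table name to a compliance-enforcing view, can be pictured with a toy sketch. This is not ViewShift's implementation or API: `PolicyCatalog`, its methods, and the table and view names are hypothetical. Falling back to the base table when no view is registered is an assumption made only to keep the example short.

```python
from dataclasses import dataclass, field


@dataclass
class PolicyCatalog:
    """Toy catalog that maps (table, use case) to a compliance view name."""

    views: dict = field(default_factory=dict)

    def register_view(self, table: str, use_case: str, view: str) -> None:
        # Record which view enforces policy for this table and use case.
        self.views[(table, use_case)] = view

    def resolve(self, table: str, use_case: str) -> str:
        # Return the registered view, or the base table if none exists
        # (assumed fallback; a real system might deny access instead).
        return self.views.get((table, use_case), table)


catalog = PolicyCatalog()
catalog.register_view("members", "ads", "members__ads_view")
resolved = catalog.resolve("members", "ads")
```

Because callers always go through `resolve`, queries keep using the plain table name while the catalog decides which view, if any, is actually read.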

Bridging Ontologies & Lakehouses: Palantir AIP + Databricks for Secure Autonomous AI

2025-06-10 Watch
talk
Siddhant Ekale (Palantir), Ben Abood (Databricks)

AI is moving from pilots to production, but many organizations still struggle to connect boardroom ambitions with operational reality. Palantir’s Artificial Intelligence Platform (AIP) and the Databricks Data Intelligence Platform now form a single, open architecture that closes this gap by pairing Palantir’s operational, decision-empowering Ontology with Databricks’ industry-leading scale, governance and Lakehouse economics. The result: real-time, AI-powered, autonomous workflows that are already powering mission-critical outcomes for the U.S. Department of Defense, bp and other joint customers across the public and private sectors. In this technically grounded but business-focused session, you will see the new reference architecture in action. We will walk through how Unity Catalog and Palantir Virtual Tables provide governed, zero-copy access to Lakehouse data and back mission-critical operational workflows on top of Palantir’s semantic ontology and agentic AI capabilities. We will also explore how Palantir’s no-code and pro-code tooling integrates with Databricks compute to orchestrate builds and write tables to Unity Catalog. Come hear from customers currently using this architecture to drive critical business outcomes seamlessly across Databricks and Palantir.

Databricks Apps: Turning Data and AI Into Practical, User-Friendly Applications

2025-06-10 Watch
talk
Nic Heier (Databricks), Justin DeBrabant (Databricks)

This session is repeated. In this session, we present an overview of the GA release of Databricks Apps, the new app hosting platform that integrates all the Databricks services necessary to build production-ready data and AI applications. With Apps, data and developer teams can build new interfaces into the data intelligence platform, further democratizing the transformative power of data and AI across the organization. We'll cover common use cases, including RAG chat apps, interactive visualizations and custom workflow builders, and look at several best practices and design patterns for building apps. Finally, we'll share the vision, strategy and roadmap for the year ahead.

Enabling Sleep Science Research With Databricks and Delta Sharing

2025-06-10 Watch
talk
Alexandr Rivlin (Sleep Number Labs), Sajeev Mayandi (Sleep Number)

Leveraging Databricks as a platform, we facilitate the sharing of anonymized datasets across various Databricks workspaces and accounts, spanning multiple cloud environments such as AWS, Azure, and Google Cloud. This capability, powered by Delta Sharing, extends both within and outside Sleep Number, enabling accelerated insights while ensuring compliance with data security and privacy standards. In this session, we will showcase our architecture and implementation strategy for data sharing, highlighting the use of Databricks’ Unity Catalog and Delta Sharing, along with integration with platforms like Jira, Jenkins, and Terraform to streamline project management and system orchestration.

From Datavault to Delta Lake: Streamlining Data Sync with Lakeflow Connect

2025-06-10 Watch
talk
Olivia Ren (Databricks), Andrew Clarke (Australian Red Cross Lifeblood)

In this session, we will explore the Australian Red Cross Lifeblood's approach to synchronizing an Azure SQL Datavault 2.0 (DV2.0) implementation with Unity Catalog (UC) using Lakeflow Connect. Lifeblood's DV2.0 data warehouse, which includes raw vault (RV) and business vault (BV) tables, as well as information marts defined as views, required a multi-step process to achieve data/business logic sync with UC. This involved using Lakeflow Connect to ingest RV and BV data, followed by a custom process utilizing JDBC to ingest view definitions, and the automated/manual conversion of T-SQL to Databricks SQL views, with Lakehouse Monitoring for validation. In this talk, we will share our journey, the design decisions we made, and how the resulting solution now supports analytics workloads, analysts, and data scientists at Lifeblood.

How United Airlines Transforms SWIM Data into Real-Time Operational Insight and Faster Decision Making

2025-06-10 Watch
talk
Saurabh Agarwal (United Airlines), Dheeraj Arora (United Airlines)

Discover how United Airlines, in collaboration with Databricks and Impetus Technologies, has built a next-generation data intelligence platform leveraging System Wide Information Management (SWIM) to deliver mission-critical, real-time insights for flight disruption prediction, situational analysis, and smarter, faster decision-making. In this session, United Airlines experts will share how their Databricks-based SWIM architecture enables near real-time operational awareness, enhances responsiveness during irregular operations (IRROPs), and drives proactive actions to minimize disruptions. They will also discuss how United efficiently processes and manages the large volume and variety of SWIM data, ensuring seamless integration and actionable intelligence across their operations.

Lakeflow Declarative Pipelines Integrations and Interoperability: Get Data From — and to — Anywhere

2025-06-10 Watch
talk
Ryan Nienhuis (Databricks)

This session is repeated. In this session, you will learn how to integrate Lakeflow Declarative Pipelines with external systems in order to ingest and send data virtually anywhere. Lakeflow Declarative Pipelines is most often used for ingestion and ETL into the Lakehouse. New capabilities like the Lakeflow Declarative Pipelines Sinks API and added support for Python Data Source and ForEachBatch have opened up Lakeflow Declarative Pipelines to support almost any integration. This includes popular Apache Spark™ integrations like JDBC, Kafka, external and managed Delta tables, Azure CosmosDB, MongoDB and more.

Mastering Data Security and Compliance: CoorsTek's Journey With Databricks Unity Catalog

2025-06-10 Watch
talk
Anupam Wahi (Tredence), David Tomlinson (CoorsTek)

Ensuring data security and meeting compliance requirements are critical priorities for businesses operating in regulated industries, where the stakes are high and the standards are stringent. We will showcase how CoorsTek, a global leader in technical ceramics manufacturing, partnered with Databricks to leverage the power of Unity Catalog (UC) for addressing regulatory challenges while achieving significant operational efficiency gains. We'll dive into the migration journey, highlighting the adoption of key features such as role-based access control (RBAC), comprehensive data lineage tracking and robust auditing capabilities. Attendees will gain practical insights into the strategies and tools used to manage sensitive data, ensure compliance with industry standards and optimize cloud data architectures. Additionally, we’ll share real-world lessons learned, best practices for integrating compliance into a modern data ecosystem and actionable takeaways for leveraging Databricks as a catalyst for secure and compliant data innovation.