talk-data.com talk-data.com

Event

Data + AI Summit 2025

2025-06-09 – 2025-06-13 Databricks Summit Visit website ↗

Activities tracked

509

Filtering by: Databricks ×

Sessions & talks

Showing 276–300 of 509 · Newest first

Search within this event →
Scaling Trust in BI: How Bolt Manages Thousands of Metrics Across Databricks, dbt, and Looker

Scaling Trust in BI: How Bolt Manages Thousands of Metrics Across Databricks, dbt, and Looker

2025-06-11 Watch
talk
Silja Märdla (Bolt) , Sarah Levy (Euno)

Managing metrics across teams can feel like everyone’s speaking a different language, which often leads to loss of trust in numbers. Based on a real-world use case, we’ll show you how to establish a governed source of truth for metrics that works at scale and builds a solid foundation for AI integration. You’ll explore how Bolt.eu’s data team governs consistent metrics for different data users and leverages Euno’s automations to navigate the overlap between Looker and dbt. We’ll cover best practices for deciding where your metrics belong and how to optimize engineering and maintenance workflows across Databricks, dbt and Looker. For curious analytics engineers, we’ll dive into thinking in dimensions & measures vs. tables & columns and determining when pre-aggregations make sense. The goal is to help you contribute to a self-serve experience with consistent metric definitions, so business teams and AI agents can access the right data at the right time without endless back-and-forth.

Securing Databricks Using Databricks as SIEM

2025-06-11
talk
Kishore Prabakaran Fernando (Databricks) , Yugi Reddy (Databricks)

Securing Databricks using Databricks as SIEM showcases our approach on how we leverage Databricks product capabilities to prevent and mitigate security risks for Databricks. It demonstrates how Databricks can serve as a powerful Security Information and Event Management (SIEM) platform, offering advanced capabilities for data collection and threat detection. This session explores data collection from diverse data sources and real-time threat detection.

Simplify Data Ingest and Egress with the New Python Data Source API

Simplify Data Ingest and Egress with the New Python Data Source API

2025-06-11 Watch
talk
Craig Lukasik (Databricks)

Data engineering teams are frequently tasked with building bespoke ingest and/or egress solutions for myriad custom, proprietary, or industry-specific data sources or sinks. Many teams find this work cumbersome and time-consuming. Recognizing these challenges, Databricks interviewed numerous companies across different industries to better understand their diverse data integration needs. This comprehensive feedback led us to develop the Python Data Source API for Apache Spark™.

Sponsored by: Google Cloud | Unleash the power of Gemini for Databricks

Sponsored by: Google Cloud | Unleash the power of Gemini for Databricks

2025-06-11 Watch
lightning_talk
Abhishek Bhagwat (Google Cloud)

Elevate your AI initiatives on Databricks by harnessing the latest advancements in Google Cloud's Gemini models. Learn how to integrate Gemini's built-in reasoning and powerful development tools to build more dynamic and intelligent applications within your existing Databricks platform. We'll explore concrete ideas for agentic AI solutions, showcasing how Gemini can help you unlock new value from your data in Databricks.

Sponsored by: Hightouch | Unleashing AI at PetSmart: Using AI Decisioning Agents to Drive Revenue

Sponsored by: Hightouch | Unleashing AI at PetSmart: Using AI Decisioning Agents to Drive Revenue

2025-06-11 Watch
talk
Bradley Breuer (PetSmart) , Tejas Manohar (Hightouch)

With 75M+ Treats Rewards members, PetSmart knows how to build loyalty with pet parents. But recently, traditional email testing and personalization strategies weren’t delivering the engagement and growth they wanted—especially in the Salon business. This year, they replaced their email calendar and A/B testing with AI Decisioning, achieving a +22% incremental lift in bookings. Join Bradley Breuer, VP of Marketing – Loyalty, Personalization, CRM, and Customer Analytics, to learn how his team reimagined CRM using AI to personalize campaigns and dynamically optimize creative, offers, and timing for every unique pet parent. Learn: How PetSmart blends human insight and creativity with AI to deliver campaigns that engage and convert. How they moved beyond batch-and-blast calendars with AI Decisioning Agents to optimize sends—while keeping control over brand, messaging, and frequency. How using Databricks as their source of truth led to surprising learnings and better outcomes.

Sponsored by: RowZero | Spreadsheets in the modern data stack: security, governance, AI, and self-serve analytics

Sponsored by: RowZero | Spreadsheets in the modern data stack: security, governance, AI, and self-serve analytics

2025-06-11 Watch
lightning_talk
Breck Fresen (Row Zero)

Despite the proliferation of cloud data warehousing, BI tools, and AI, spreadsheets are still the most ubiquitous data tool. Business teams in finance, operations, sales, and marketing often need to analyze data in the cloud data warehouse but don't know SQL and don't want to learn BI tools. AI tools offer a new paradigm but still haven't broadly replaced the spreadsheet. With new AI tools and legacy BI tools providing business teams access to data inside Databricks, security and governance are put at risk. In this session, Row Zero CEO, Breck Fresen, will share examples and strategies data teams are using to support secure spreadsheet analysis at Fortune 500 companies and the future of spreadsheets in the world of AI. Breck is a former Principal Engineer from AWS S3 and was part of the team that wrote the S3 file system. He is an expert in storage, data infrastructure, cloud computing, and spreadsheets.

Sponsored by: Snowplow | Snowplow Signals: Powering Tomorrow’s Customer Experiences on Databricks

Sponsored by: Snowplow | Snowplow Signals: Powering Tomorrow’s Customer Experiences on Databricks

2025-06-11 Watch
lightning_talk
Yali Sassoon (Snowplow)

The web is on the verge of a major shift. Agentic applications will redefine how customers engage with digital experiences—delivering highly personalized, relevant interactions. In this talk, Snowplow CTO Yali Sassoon explores how Snowplow Signals enables agents to perceive users through short- and long-term memory, natively on the Databricks Data Intelligence Platform.

Sponsored by: Tealium | Personalizing Experiences and Improving Engagement with a Modernized Data Infrastructure

Sponsored by: Tealium | Personalizing Experiences and Improving Engagement with a Modernized Data Infrastructure

2025-06-11 Watch
talk
Sav Khetan (Tealium) , Narendra Pandya (Western Governors University)

Join this session to hear how Western Governors University leverages Tealium & Databricks to power their data activation strategy with real-time customer data collection, activation and advanced analytics.

Transforming Data Governance for Multimodal Data at Amgen With Databricks

Transforming Data Governance for Multimodal Data at Amgen With Databricks

2025-06-11 Watch
talk
Jaison Dominic (Amgen) , Jinesh Kunjumon (AMGEN)

Amgen is advancing its Enterprise Data Fabric to securely manage sensitive multimodal data, such as imaging and research data, across formats.Databricks is already the de facto standard for governance on structured data, and Amgen seeks to extend it for unstructured multi modal data too. This approach will also allow Amgen to standardize its GenAI projects on Databricks. Key priorities include: Centralized data access: establishing a unified, secure access control system Enhanced traceability: implementing detailed processes for transparency and accountability Consistent access standards: ensuring uniform data access privilege experience User support: providing flexible access for diverse stakeholders Comprehensive auditing: enabling thorough permission audits and data usage tracking Learn strategies for implementing a comprehensive multimodal data governance framework using Databricks, as we share our experience on standardizing data governance for GenAI use cases.

Unlocking Access: Simplifying Identity Management at Scale With Databricks

Unlocking Access: Simplifying Identity Management at Scale With Databricks

2025-06-11 Watch
talk
Keegan Dubbs (Databricks) , Hari Selvarajan (Databricks)

Effective Identity and Access Management (IAM) is essential for securing enterprise environments while enabling innovation and collaboration. As companies scale, ensuring users have the right access without adding administrative overhead is critical. In this session, we’ll explore how Databricks is simplifying identity management by integrating with customers’ Identity Providers (IDPs). Learn about Automatic Identity Management in Azure Databricks, which eliminates SCIM for Entra ID users and ensures scalable identity provisioning for other IDPs. We'll also cover externally managed groups, PIM integration and upcoming enhancements like a bring-your-own-IDP model for Google Cloud. Through a customer success story and live demo, see how Databricks is making IAM more scalable, secure and user-friendly.

Building Trustworthy AI at Northwestern Mutual: Guardrail Technologies and Strategies

Building Trustworthy AI at Northwestern Mutual: Guardrail Technologies and Strategies

2025-06-10 Watch
lightning_talk
Nicholas Brathwaite (Northwestern Mutual)

This intermediate-level presentation will explore the various methods we've leveraged within Databricks to deliver and evaluate guardrail models for AI safety. From prompt engineering with custom built frameworks to hosting models served from the market place and beyond. We've utilized GPU within clusters to fine-tune and run large open sourced models at inference such as Llama Guard 3.1 and generate synthetic datasets based on questions we've received from production.

No Time for the Dad Bod: Automating Life with AI and Databricks

No Time for the Dad Bod: Automating Life with AI and Databricks

2025-06-10 Watch
lightning_talk
Sean Falconer (Confluent)

Life as a father, tech leader, and fitness enthusiast demands efficiency. To reclaim my time, I’ve built AI-driven solutions that automate everyday tasks—from research agents that prep for podcasts to multi-agent systems that plan meals—all powered by real-time data and automation. This session dives into the technical foundations of these solutions, focusing on event-driven agent design and scalable patterns for robust AI systems. You’ll discover how Databricks technologies like Delta Lake, for reliable and scalable data management, and DSPy, for streamlining the development of generative AI workflows, empower seamless decision-making and deliver actionable insights. Through detailed architecture diagrams and a live demo, I’ll showcase how to design systems that process data in motion to tackle complex, real-world problems. Whether you’re an engineer, architect, or data scientist, you’ll leave with practical strategies to integrate AI-driven automation into your workflows.

Scaling Data Quality at Zillow: Migrating and Enhancing Data Quality Systems on Databricks

2025-06-10
lightning_talk
Laura Zhou (Zillow) , Firas Farah (Databricks)

Zillow has well-established, comprehensive systems for defining and enforcing data quality contracts and detecting anomalies.In this session, we will share how we evaluated Databricks’ native data quality features and why we chose Lakeflow Declarative Pipelines expectations for Lakeflow Declarative Pipelines, along with a combination of enforced constraints and self-defined queries for other job types. Our evaluation considered factors such as performance overhead, cost and scalability. We’ll highlight key improvements over our previous system and demonstrate how these choices have enabled Zillow to enforce scalable, production-grade data quality.Additionally, we are actively testing Databricks’ latest data quality innovations, including enhancements to lakehouse monitoring and the newly released DQX project from Databricks Labs.In summary, we will cover Zillow’s approach to data quality in the lakehouse, key lessons from our migration and actionable takeaways.

Sponsored by: Actian | Beyond the Lakehouse: Unlocking Enterprise-Wide AI-Ready Data with Unified Metadata Intelligence

Sponsored by: Actian | Beyond the Lakehouse: Unlocking Enterprise-Wide AI-Ready Data with Unified Metadata Intelligence

2025-06-10 Watch
lightning_talk
Emma McGrattan (Actian)

As organizations scale AI initiatives on platforms like Databricks, one challenge remains: bridging the gap between the data in the lakehouse and the vast, distributed data that lives elsewhere. Turning massive volumes of technical metadata into trusted, business-ready insight requires more than cataloging what's inside the lakehouse—it demands true enterprise-wide intelligence. Actian CTO Emma McGrattan will explore how combining Databricks Unity Catalog with the Actian Data Platform extends visibility, governance, and trust beyond the lakehouse. Learn how leading enterprises are: Integrating metadata across all enterprise data assets for complete visibility Enriching Unity Catalog metadata with business context for broader usability Empowering non-technical users to discover, trust, and act on AI-ready data Building a foundation for scalable data productization with governance by design

Sponsored by: Alation | Better Together: Enterprise Catalog with Databricks & Alation at American Airlines

Sponsored by: Alation | Better Together: Enterprise Catalog with Databricks & Alation at American Airlines

2025-06-10 Watch
lightning_talk
Anuradha Maradapu (American Airlines)

In the era of data-driven enterprises, true democratization requires more than just access–it demands context, trust, and governance at scale. In this session, discover how to seamlessly integrate Databricks Unity Catalog with Alation’s Enterprise Data Catalog to deliver: End-to-End Lineage Storytelling: Unify technical and business views into a single, cohesive narrative that resonates with both technical engineers and non-technical stakeholders across business domains Accelerated and Democratized Insights: Automate metadata stitching to reduce time-to-insight, enabling analysts to answer critical business questions faster and drive multi-domain collaboration Empowered, Trustworthy Discovery: Equip business users with a unified platform, populated with rich documentation and usage signals, so they can find, understand, and confidently use trusted data assets

Sponsored by: Fivetran | Raw Data to Real-Time Insights: How Dropbox Revolutionized Data Ingestion

Sponsored by: Fivetran | Raw Data to Real-Time Insights: How Dropbox Revolutionized Data Ingestion

2025-06-10 Watch
lightning_talk
Kelly Kohlleffel (Fivetran) , Chris Neat (Dropbox)

Dropbox, a leading cloud storage platform, is on a mission to accelerate data insights to better understand customers’ needs and elevate the overall customer experience. By leveraging Fivetran’s data movement platform, Dropbox gained real-time visibility into customer sentiment, marketing ROI, and ad performance-empowering teams to optimize spend, improve operational efficiency, and deliver greater business outcomes.Join this session to learn how Dropbox:- Cut data pipeline time from 8 weeks to 30 minutes by automating ingestion and streamlining reporting workflows.- Enable real-time, reliable data movement across tools like Zendesk Chat, Google Ads, MySQL, and more — at global operations scale.- Unify fragmented data sources into the Databricks Data Intelligence Platform to reduce redundancy, improve accessibility, and support scalable analytics.

Sponsored by: Slalom | Nasdaq's Journey from Fragmented Customer Data to AI-Ready Insights

Sponsored by: Slalom | Nasdaq's Journey from Fragmented Customer Data to AI-Ready Insights

2025-06-10 Watch
lightning_talk
Soumya Ghosh (Slalom)

Nasdaq’s rapid growth through acquisitions led to fragmented client data across multiple Salesforce instances, limiting cross-sell potential and sales insights. To solve this, Nasdaq partnered with Slalom to build a unified Client Data Hub on the Databricks Lakehouse Platform. This cloud-based solution merges CRM, product usage, and financial data into a consistent, 360° client view accessible across all Salesforce orgs with bi-directional integration. It enables personalized engagement, targeted campaigns, and stronger cross-sell opportunities across all business units. By delivering this 360 view directly in Salesforce, Nasdaq is improving sales visibility, client satisfaction, and revenue growth. The platform also enables advanced analytics like segmentation, churn prediction, and revenue optimization. With centralized data in Databricks, Nasdaq is now positioned to deploy next-gen Agentic AI and chatbots to drive efficiency and enhance sales and marketing experiences.

Traditional MDM is Dead. How Next-Generation Data Products are Winning the Enterprise

Traditional MDM is Dead. How Next-Generation Data Products are Winning the Enterprise

2025-06-10 Watch
lightning_talk
Dan Onions (Quantexa)

Organizations continue to struggle under the weight of data that still exists across multiple siloed sources, leaving data teams caught between their crumbling legacy data foundations and the race to build new AI and data-driven applications. Modern enterprises are quickly pivoting to data products that simplify and improve reusable data pipelines by joining data at massive scale and publishing it for internal users and the applications that drive business outcomes. Learn how Quantexa with Databricks enables an internal data marketplace to deliver the value that traditional data platforms never could.

Unlocking the Power of Iceberg: Our Journey to a Unified Lakehouse on Databricks

Unlocking the Power of Iceberg: Our Journey to a Unified Lakehouse on Databricks

2025-06-10 Watch
lightning_talk
Tomer Sabag (LSports)

This session showcases our journey of adopting Apache Iceberg™ to build a modern lakehouse architecture and leveraging Databricks advanced Iceberg support to take it to the next level. We’ll dive into the key design principles behind our lakehouse, the operational challenges we tackled and how Databricks enabled us to unlock enhanced performance, scalability and streamlined data workflows. Whether you’re exploring Apache Iceberg™ or building a lakehouse on Databricks, this session offers actionable insights, lessons learned and best practices for modern data engineering.

Accelerate End-to-End Multi-Agents on Databricks and DSPy

Accelerate End-to-End Multi-Agents on Databricks and DSPy

2025-06-10 Watch
talk
Austin Choi (Databricks)

A production-ready GenAI application is more than the framework itself. Like ML, you need a unified platform to create an end-to-end workflow for production quality applications.Below is an example of how this works on Databricks: Data ETL with Lakeflow Declarative Pipelines and jobs Data storage for governance and access with Unity Catalog Code development with Notebooks Agent versioning and metric tracking with MLflow and Unity Catalog Evaluation and optimizations with Mosaic AI Agent Framework and DSPy Hosting infrastructure with monitoring with Model Serving and AI Gateway Front-end apps using Databricks Apps In this session, learn how to build agents to access all your data and models through function calling. Then, learn how DSPy enables agent interaction with each other to ensure the question is answered correctly. We will demonstrate a chatbot, powered by multiple agents, to be able to answer questions and reason answers the base LLM does not know and very specialized topics.ow and very specialized topics.

AI Meets SQL: Leverage GenAI at Scale to Enrich Your Data

AI Meets SQL: Leverage GenAI at Scale to Enrich Your Data

2025-06-10 Watch
talk
Sid Taneja (Databricks) , Youngbin Kim (Databricks)

This session is repeated. Integrating AI into existing data workflows can be challenging, often requiring specialized knowledge and complex infrastructure. In this session, we'll share how SQL users can leverage AI/ML to access large language models (LLMs) and traditional machine learning directly from within SQL, simplifying the process of incorporating AI into data workflows. We will demonstrate how to use Databricks SQL for natural language processing, traditional machine learning, retrieval augmented generation and more. You'll learn about best practices and see examples of solving common use cases such as opinion mining, sentiment analysis, forecasting and other common AI/ML tasks.

A Prescription for Success: Leveraging DABs for Faster Deployment and Better Patient Outcomes

A Prescription for Success: Leveraging DABs for Faster Deployment and Better Patient Outcomes

2025-06-10 Watch
talk
Brendon Allen (Health Catalyst) , Alex Owen (Databricks)

Health Catalyst (HCAT) transformed its CI/CD strategy by replacing a rigid, internal deployment tool with Databricks Asset Bundles (DABs), unlocking greater agility and efficiency. This shift streamlined deployments across both customer workspaces and HCAT's core platform, accelerating time to insights and driving continuous innovation. By adopting DABs, HCAT ensures feature parity, standardizes metric stores across clients, and rapidly delivers tailored analytics solutions. Attendees will gain practical insights into modernizing CI/CD pipelines for healthcare analytics, leveraging Databricks to scale data-driven improvements. HCAT's next-generation platform, Health Catalyst Ignite™, integrates healthcare-specific data models, self-service analytics, and domain expertise—powering faster, smarter decision-making.

Building Real-time Trading Dashboards with Lakeflow Declarative Pipelines, Serverless OLTP and Databricks Apps

2025-06-10
talk
Matt Slack (Databricks) , Matthew Moorcroft (Databricks)

Barclays Post Trade real-time trade monitoring platform was historically built on a complex set of legacy technologies including Java, Solace, and custom micro-services.This session will demonstrate how the power of Lakeflow Declarative Pipelines' new real-time mode, in conjunction with the foreach_batch_sink, can enable simple, cost-effective streaming pipelines that can load high volumes of data into Databricks new Serverless OLTP database with very low latency.Once in our OLTP database, this can be used to update real-time trading dashboards, securely hosted in Databricks Apps, with the latest stock trades - enabling better, more responsive decision-making and alerting.The session will walk-through the architecture, and demonstrate how simple it is to create and manage the pipelines and apps within the Databricks environment.

Databricks on Databricks: Powering Marketing Insights with Lakehouse

Databricks on Databricks: Powering Marketing Insights with Lakehouse

2025-06-10 Watch
talk
Elizabeth Dobbs (Databricks) , Anoop Muraleedharan (Databricks)

This presentation outlines the evolution of our marketing data strategy, focusing on how we’ve built a strong foundation using the Databricks Lakehouse. We will explore key advancements across data ingestion, strategy, and insights, highlighting the transition from legacy systems to a more scalable and intelligent infrastructure. Through real-world applications, we will showcase how unified Customer 360 insights drive personalization, predictive analytics enhance campaign effectiveness, and GenAI optimizes content creation and marketing execution. Looking ahead, we will demonstrate the next phase of our CDP, the shift toward an end-user-first analytics model powered by AIBI, Genie and Matik, and the growing importance of clean rooms for secure data collaboration. This is just the beginning, and we are poised to unlock even greater capabilities in the future.

From Largest to Best: How We Transformed Databricks’ Biggest Workspace With Unity Catalog

From Largest to Best: How We Transformed Databricks’ Biggest Workspace With Unity Catalog

2025-06-10 Watch
talk
Donghan Zhang (Databricks) , Li Yang (Databricks)

Join us as we unveil how we transformed the largest Databricks workspace into the best-in-class lakehouse through Unity Catalog. Discover how we harnessed lineage and unified access management to build ultimate governance automation.