talk-data.com talk-data.com

Event

Data + AI Summit 2025

2025-06-09 – 2025-06-13 Databricks Summit Visit website ↗

Activities tracked

425

Filtering by: AI/ML ×

Sessions & talks

Showing 251–275 of 425 · Newest first

Search within this event →
One-Stop Machine Translation Solution in Game Domain From Real-Time UGC Content to In-Game Text

One-Stop Machine Translation Solution in Game Domain From Real-Time UGC Content to In-Game Text

2025-06-11 Watch
talk
Junxuan Huang (Tencent)

We present Level Infinite AI Translation, a translation engine developed by Tencent, tailored specifically for the gaming industry. The primary challenge in game machine translation (MT) lies in accurately interpreting the intricate context of game texts, effectively handling terminology and adapting to the highly diverse translation formats and stylistic requirements across different games. Traditional MT approaches cannot effectively address the aforementioned challenges due to their weak context representation ability and lack of common knowledge. Leveraging large language model and related technology, our engine is crafted to capture the subtleties of localized language expression while ensuring optimization for domain-specific terminology, jargon and required formats and styles. To date, the engine has been successfully implemented in 15 international projects, translating over one billion words across 23 languages, and has demonstrated cost savings exceeding 25% for partners.

Scaling Trust in BI: How Bolt Manages Thousands of Metrics Across Databricks, dbt, and Looker

Scaling Trust in BI: How Bolt Manages Thousands of Metrics Across Databricks, dbt, and Looker

2025-06-11 Watch
talk
Silja Märdla (Bolt) , Sarah Levy (Euno)

Managing metrics across teams can feel like everyone’s speaking a different language, which often leads to loss of trust in numbers. Based on a real-world use case, we’ll show you how to establish a governed source of truth for metrics that works at scale and builds a solid foundation for AI integration. You’ll explore how Bolt.eu’s data team governs consistent metrics for different data users and leverages Euno’s automations to navigate the overlap between Looker and dbt. We’ll cover best practices for deciding where your metrics belong and how to optimize engineering and maintenance workflows across Databricks, dbt and Looker. For curious analytics engineers, we’ll dive into thinking in dimensions & measures vs. tables & columns and determining when pre-aggregations make sense. The goal is to help you contribute to a self-serve experience with consistent metric definitions, so business teams and AI agents can access the right data at the right time without endless back-and-forth.

Sponsored by: Google Cloud | Unleash the power of Gemini for Databricks

Sponsored by: Google Cloud | Unleash the power of Gemini for Databricks

2025-06-11 Watch
lightning_talk
Abhishek Bhagwat (Google Cloud)

Elevate your AI initiatives on Databricks by harnessing the latest advancements in Google Cloud's Gemini models. Learn how to integrate Gemini's built-in reasoning and powerful development tools to build more dynamic and intelligent applications within your existing Databricks platform. We'll explore concrete ideas for agentic AI solutions, showcasing how Gemini can help you unlock new value from your data in Databricks.

Sponsored by: Hightouch | Unleashing AI at PetSmart: Using AI Decisioning Agents to Drive Revenue

Sponsored by: Hightouch | Unleashing AI at PetSmart: Using AI Decisioning Agents to Drive Revenue

2025-06-11 Watch
talk
Bradley Breuer (PetSmart) , Tejas Manohar (Hightouch)

With 75M+ Treats Rewards members, PetSmart knows how to build loyalty with pet parents. But recently, traditional email testing and personalization strategies weren’t delivering the engagement and growth they wanted—especially in the Salon business. This year, they replaced their email calendar and A/B testing with AI Decisioning, achieving a +22% incremental lift in bookings. Join Bradley Breuer, VP of Marketing – Loyalty, Personalization, CRM, and Customer Analytics, to learn how his team reimagined CRM using AI to personalize campaigns and dynamically optimize creative, offers, and timing for every unique pet parent. Learn: How PetSmart blends human insight and creativity with AI to deliver campaigns that engage and convert. How they moved beyond batch-and-blast calendars with AI Decisioning Agents to optimize sends—while keeping control over brand, messaging, and frequency. How using Databricks as their source of truth led to surprising learnings and better outcomes.

Sponsored by: Onehouse | Open By Default, Fast By Design: One Lakehouse That Scales From BI to AI

Sponsored by: Onehouse | Open By Default, Fast By Design: One Lakehouse That Scales From BI to AI

2025-06-11 Watch
lightning_talk
Kyle Weller (Onehouse.ai)

You already see the value of the lakehouse. But are you truly maximizing its potential across all workloads, from BI to AI? In this session, Onehouse unveils how our open lakehouse architecture unifies your entire stack, enabling true interoperability across formats, catalogs, and engines. From lightning-fast ingestion at scale to cost-efficient processing and multi-catalog sync, Onehouse helps you go beyond trade-offs. Discover how Apache XTable (Incubating) enables cross-table-format compatibility, how OpenEngines puts your data in front of the best engine for the job, and how OneSync keeps data consistent across Snowflake, Athena, Redshift, BigQuery, and more. Meanwhile, our purpose-built lakehouse runtime slashes ingest and ETL costs. Whether you’re delivering BI, scaling AI, or building the next big thing, you need a lakehouse that’s open and powerful. Onehouse opens everything—so your data can power anything.

Sponsored by: RowZero | Spreadsheets in the modern data stack: security, governance, AI, and self-serve analytics

Sponsored by: RowZero | Spreadsheets in the modern data stack: security, governance, AI, and self-serve analytics

2025-06-11 Watch
lightning_talk
Breck Fresen (Row Zero)

Despite the proliferation of cloud data warehousing, BI tools, and AI, spreadsheets are still the most ubiquitous data tool. Business teams in finance, operations, sales, and marketing often need to analyze data in the cloud data warehouse but don't know SQL and don't want to learn BI tools. AI tools offer a new paradigm but still haven't broadly replaced the spreadsheet. With new AI tools and legacy BI tools providing business teams access to data inside Databricks, security and governance are put at risk. In this session, Row Zero CEO, Breck Fresen, will share examples and strategies data teams are using to support secure spreadsheet analysis at Fortune 500 companies and the future of spreadsheets in the world of AI. Breck is a former Principal Engineer from AWS S3 and was part of the team that wrote the S3 file system. He is an expert in storage, data infrastructure, cloud computing, and spreadsheets.

Turn Genie Into an Agent Using Conversation APIs

Turn Genie Into an Agent Using Conversation APIs

2025-06-11 Watch
talk
Prithvi Kannan (Databricks) , Hanlin Sun (Databricks)

Transform your AI/BI Genie into a text-to-SQL powerhouse using the Genie Conversation APIs. This session explores how Genie functions as an intelligent agent, translating natural language queries into SQL to accelerate insights and enhance self-service analytics. You'll learn practical techniques for configuring agents, optimizing queries and handling errors — ensuring Genie delivers accurate, relevant responses in real time. A must-attend for teams looking to level up their AI/BI capabilities and deliver smarter analytics experiences.

Industry Forum Networking Reception

2025-06-11
networking

Data + AI Summit brings together thousands of leaders from across the industry ecosystem, sharing ideas and practical applications for data and AI. At 5:00pm on Tuesday, June 10 following the Industry & Solution Forums, join us in Moscone West Level 2 for a networking event (refreshments provided) and connect with industry peers!

Building Trustworthy AI at Northwestern Mutual: Guardrail Technologies and Strategies

Building Trustworthy AI at Northwestern Mutual: Guardrail Technologies and Strategies

2025-06-10 Watch
lightning_talk
Nicholas Brathwaite (Northwestern Mutual)

This intermediate-level presentation will explore the various methods we've leveraged within Databricks to deliver and evaluate guardrail models for AI safety. From prompt engineering with custom built frameworks to hosting models served from the market place and beyond. We've utilized GPU within clusters to fine-tune and run large open sourced models at inference such as Llama Guard 3.1 and generate synthetic datasets based on questions we've received from production.

No Time for the Dad Bod: Automating Life with AI and Databricks

No Time for the Dad Bod: Automating Life with AI and Databricks

2025-06-10 Watch
lightning_talk
Sean Falconer (Confluent)

Life as a father, tech leader, and fitness enthusiast demands efficiency. To reclaim my time, I’ve built AI-driven solutions that automate everyday tasks—from research agents that prep for podcasts to multi-agent systems that plan meals—all powered by real-time data and automation. This session dives into the technical foundations of these solutions, focusing on event-driven agent design and scalable patterns for robust AI systems. You’ll discover how Databricks technologies like Delta Lake, for reliable and scalable data management, and DSPy, for streamlining the development of generative AI workflows, empower seamless decision-making and deliver actionable insights. Through detailed architecture diagrams and a live demo, I’ll showcase how to design systems that process data in motion to tackle complex, real-world problems. Whether you’re an engineer, architect, or data scientist, you’ll leave with practical strategies to integrate AI-driven automation into your workflows.

Sponsored by: Actian | Beyond the Lakehouse: Unlocking Enterprise-Wide AI-Ready Data with Unified Metadata Intelligence

Sponsored by: Actian | Beyond the Lakehouse: Unlocking Enterprise-Wide AI-Ready Data with Unified Metadata Intelligence

2025-06-10 Watch
lightning_talk
Emma McGrattan (Actian)

As organizations scale AI initiatives on platforms like Databricks, one challenge remains: bridging the gap between the data in the lakehouse and the vast, distributed data that lives elsewhere. Turning massive volumes of technical metadata into trusted, business-ready insight requires more than cataloging what's inside the lakehouse—it demands true enterprise-wide intelligence. Actian CTO Emma McGrattan will explore how combining Databricks Unity Catalog with the Actian Data Platform extends visibility, governance, and trust beyond the lakehouse. Learn how leading enterprises are: Integrating metadata across all enterprise data assets for complete visibility Enriching Unity Catalog metadata with business context for broader usability Empowering non-technical users to discover, trust, and act on AI-ready data Building a foundation for scalable data productization with governance by design

Sponsored by: Slalom | Nasdaq's Journey from Fragmented Customer Data to AI-Ready Insights

Sponsored by: Slalom | Nasdaq's Journey from Fragmented Customer Data to AI-Ready Insights

2025-06-10 Watch
lightning_talk
Soumya Ghosh (Slalom)

Nasdaq’s rapid growth through acquisitions led to fragmented client data across multiple Salesforce instances, limiting cross-sell potential and sales insights. To solve this, Nasdaq partnered with Slalom to build a unified Client Data Hub on the Databricks Lakehouse Platform. This cloud-based solution merges CRM, product usage, and financial data into a consistent, 360° client view accessible across all Salesforce orgs with bi-directional integration. It enables personalized engagement, targeted campaigns, and stronger cross-sell opportunities across all business units. By delivering this 360 view directly in Salesforce, Nasdaq is improving sales visibility, client satisfaction, and revenue growth. The platform also enables advanced analytics like segmentation, churn prediction, and revenue optimization. With centralized data in Databricks, Nasdaq is now positioned to deploy next-gen Agentic AI and chatbots to drive efficiency and enhance sales and marketing experiences.

Traditional MDM is Dead. How Next-Generation Data Products are Winning the Enterprise

Traditional MDM is Dead. How Next-Generation Data Products are Winning the Enterprise

2025-06-10 Watch
lightning_talk
Dan Onions (Quantexa)

Organizations continue to struggle under the weight of data that still exists across multiple siloed sources, leaving data teams caught between their crumbling legacy data foundations and the race to build new AI and data-driven applications. Modern enterprises are quickly pivoting to data products that simplify and improve reusable data pipelines by joining data at massive scale and publishing it for internal users and the applications that drive business outcomes. Learn how Quantexa with Databricks enables an internal data marketplace to deliver the value that traditional data platforms never could.

Accelerate End-to-End Multi-Agents on Databricks and DSPy

Accelerate End-to-End Multi-Agents on Databricks and DSPy

2025-06-10 Watch
talk
Austin Choi (Databricks)

A production-ready GenAI application is more than the framework itself. Like ML, you need a unified platform to create an end-to-end workflow for production quality applications.Below is an example of how this works on Databricks: Data ETL with Lakeflow Declarative Pipelines and jobs Data storage for governance and access with Unity Catalog Code development with Notebooks Agent versioning and metric tracking with MLflow and Unity Catalog Evaluation and optimizations with Mosaic AI Agent Framework and DSPy Hosting infrastructure with monitoring with Model Serving and AI Gateway Front-end apps using Databricks Apps In this session, learn how to build agents to access all your data and models through function calling. Then, learn how DSPy enables agent interaction with each other to ensure the question is answered correctly. We will demonstrate a chatbot, powered by multiple agents, to be able to answer questions and reason answers the base LLM does not know and very specialized topics.ow and very specialized topics.

AI Meets SQL: Leverage GenAI at Scale to Enrich Your Data

AI Meets SQL: Leverage GenAI at Scale to Enrich Your Data

2025-06-10 Watch
talk
Sid Taneja (Databricks) , Youngbin Kim (Databricks)

This session is repeated. Integrating AI into existing data workflows can be challenging, often requiring specialized knowledge and complex infrastructure. In this session, we'll share how SQL users can leverage AI/ML to access large language models (LLMs) and traditional machine learning directly from within SQL, simplifying the process of incorporating AI into data workflows. We will demonstrate how to use Databricks SQL for natural language processing, traditional machine learning, retrieval augmented generation and more. You'll learn about best practices and see examples of solving common use cases such as opinion mining, sentiment analysis, forecasting and other common AI/ML tasks.

Cross-Region AI Model Deployment for Resiliency and Compliance

Cross-Region AI Model Deployment for Resiliency and Compliance

2025-06-10 Watch
talk
Greg Wood (Databricks) , Tony Farias (Databricks)

AI for enterprises, particularly in the era of GenAI, requires rapid experimentation and the ability to productionize models and agents quickly and at scale. Compliance, resilience and commercial flexibility drive the need to serve models across regions. As cloud providers struggle with rising demand for GPUs in environments, VM shortages have become commonplace, and add to the pressure of general cloud outages. Enterprises that can quickly leverage GPU capacity in other cloud regions will be better equipped to capitalize on the promise of AI, while staying flexible to serve distinct user bases and complying with regulations. In this presentation we will show and discuss how to implement AI deployments across cloud regions, deploying a model across regions and using a load balancer to determine where to best route a user request.

Introduction to Modern Open Table Formats and Catalogs

Introduction to Modern Open Table Formats and Catalogs

2025-06-10 Watch
talk
Bart Samwel (Databricks) , Sirui Sun (Databricks)

In this session, learn about why modern open table formats like Delta and Iceberg are a big deal and how they work with catalogs. Learn about what motivated their creation, how they work, what benefits they can bring to your data and AI platform. Hear about how these formats are becoming increasingly interoperable and what our vision is for their future.

Managing the Governed Cloud

Managing the Governed Cloud

2025-06-10 Watch
talk
Sherri Adame (GM) , Johnathan Powell (General Motors)

As organizations increasingly adopt Databricks as a unified platform for analytics and AI, ensuring robust data governance becomes critical for compliance, security, and operational efficiency. This presentation will explore the end-to-end framework for governing the Databricks cloud, covering key use cases, foundational governance principles, and scalable automation strategies. We will discuss best practices for metadata, data access, catalog, classification, quality, and lineage, while leveraging automation to streamline enforcement. Attendees will gain insights into best practices and real-world approaches to building a governed data cloud that balances innovation with control.

Real-Time Mode Technical Deep Dive: How We Built Sub-300 Millisecond Streaming Into Apache Spark™

Real-Time Mode Technical Deep Dive: How We Built Sub-300 Millisecond Streaming Into Apache Spark™

2025-06-10 Watch
talk
Siying Dong (Databricks) , Jerry Peng (Databricks)

Real-time mode is a new low-latency execution mode for Apache Spark™ Structured Streaming. It can consistently provide p99 latencies less than 300 milliseconds for a broad set of stateless and stateful streaming queries. Our talk focuses on the technical aspects of making this possible in Spark. We’ll dive into the core architecture that enables these dramatic latency improvements, including a concurrent stage scheduler and a non-blocking shuffle. We’ll explore how we maintained Spark’s fault-tolerance guarantees, and we’ll also share specific optimizations we made to our streaming SQL operators. These architectural improvements have already enabled Databricks customers to build workloads with latencies up to 10x lower than before. Early adopters in our Private Preview have successfully implemented real-time enrichment pipelines and feature engineering for machine learning — use cases that were previously impossible at these latencies.

RecSys, Topic Modeling and Agents: Bridging the GenAI-Traditional ML Divide

RecSys, Topic Modeling and Agents: Bridging the GenAI-Traditional ML Divide

2025-06-10 Watch
talk
Dan Pechi (Databricks)

The rise of GenAI has led to a complete reinvention of how we conceptualize Data + AI. In this breakout, we will recontextualize the rise of GenAI in traditional ML paradigms, and hopefully unite the pre- and post-LLM eras. We will demonstrate when and where GenAI may prove more effective than traditional ML algorithms, and highlight problems for which the wheel is unnecessarily being reinvented with GenAI. This session will also highlight how MLflow provides a unified means of benchmarking traditional ML against GenAI, and lay out a vision for bridging the divide between Traditional ML and GenAI practitioners.

Revolutionizing Nuclear AI With HiVE and Bertha on Databricks Architecture

Revolutionizing Nuclear AI With HiVE and Bertha on Databricks Architecture

2025-06-10 Watch
talk
Lou Martinez Sancho (Westinghouse Electric Company)

In this session we will explore the revolutionary advancements in nuclear AI capabilities with HiVE and Bertha on Databricks architecture. HiVE, developed by Westinghouse, leverages over a century of proprietary data to deliver unparalleled AI capabilities. At its core is Bertha, a generative AI model designed to tackle the unique challenges of the nuclear industry. This session will delve into the technical architecture of HiVE and Bertha, showcasing how Databricks' scalable environment enhances their performance. We will discuss the secure data infrastructure supporting HiVE, ensuring data integrity and compliance. Real-world applications and use cases will demonstrate the impact of HiVE and Bertha on improving efficiency, innovation and safety in nuclear operations. Discover how the fusion of HiVE and Bertha with Databricks architecture is transforming the nuclear AI landscape and driving the future of nuclear technology.

Shifting Left — Setting up Your GenAI Ecosystem to Work for Business Analysts

Shifting Left — Setting up Your GenAI Ecosystem to Work for Business Analysts

2025-06-10 Watch
talk
James Lin (Experian)

At Data and AI in 2022, Databricks pioneered the term to shift left in how AI workloads would enable less data science driven people to create their own apps. In 2025, we take a look at how Experian is doing on that journey. This session highlights Databricks services that assist with the shift left paradigm for Generative AI, including how AI/BI Genie helps with Generative analytics, and how Agent Studio helps with synthetic generation of test cases to validate model performance.

Sponsored by: ThoughtSpot | How Chevron Fuels Cloud Data Modernization

Sponsored by: ThoughtSpot | How Chevron Fuels Cloud Data Modernization

2025-06-10 Watch
talk
Francois Lopitaux (ThoughtSpot) , Ben Vander Heide (Chevron)

Learn how Chevron transitioned their central finance and procurement analytics into the cloud using Databricks and ThoughtSpot’s Agentic Analytics Platform. Explore how Chevron leverages ThoughtSpot to unlock actionable insights, enhance their semantic layer with user-driven understanding, and ultimately drive more impactful strategies for customer engagement and business growth. In this session, Chevron explains the dos, don’ts, and best practices of migrating from outdated legacy business intelligence to real time, AI-powered insights.

Streaming Meets Governance: Building AI-Ready Tables With Confluent Tableflow and Unity Catalog

Streaming Meets Governance: Building AI-Ready Tables With Confluent Tableflow and Unity Catalog

2025-06-10 Watch
talk
Kasun Indrasiri Gamage (Confluent) , Victoria Bukta (Databricks)

Learn how Databricks and Confluent are simplifying the path from real-time data to governed, analytics- and AI-ready tables. This session will cover how Confluent Tableflow automatically materializes Kafka topics into Delta tables and registers them with Unity Catalog — eliminating the need for custom streaming pipelines. We’ll walk through how this integration helps data engineers reduce ingestion complexity, enforce data governance and make real-time data immediately usable for analytics and AI.

The Next Wave of AI Applications Driven by Agentic Workflow at Adidas Using Databricks

2025-06-10
talk
Joana Ferreira (Adidas AG) , Mahavir Teraiya (Databricks)

Curious to know how Adidas is transforming customer experience and business impact with agentic workflows, powered by Databricks? By leveraging cutting-edge tools like MosaicML’s deployment capabilities, Mosaic AI Gateway, and MLflow, Adidas built a scalable GenAI agentic infrastructure that delivers actionable insights from growing 2 million product reviews annually. With remarkable results: 60% latency reduction (15.5 seconds to 6 seconds) 91.67% cost savings (transitioning to more efficient LLMs) 98.5% token efficiency, reducing input tokens from 200k to just 3k 20% increase in productivity (faster time to insight) Empowering over 500 decision-makers across 150+ countries, this infrastructure is set to optimize products and services for Adidas’ 500 million members by 2025 while supporting dozens of upcoming AI-driven solutions. Join us to explore how Adidas turned agentic workflows infra into a strategic advantage using Databricks and learn how you can do the same!