talk-data.com

Event

Data + AI Summit 2025

2025-06-09 – 2025-06-13 Databricks Summit

Activities tracked

715

Sessions & talks

Showing 76–100 of 715 · Newest first

TAO and Reinforcement Learning: Building AI With the Data You Have

2025-06-12 Watch
talk
Brandon Cui (Databricks), Jonathan Frankle (Databricks)

Curious about the cutting-edge technology that's revolutionizing AI model performance? Join us for an in-depth exploration of TAO and discover how this innovative approach is transforming the capabilities of modern AI systems. This research-focused session peels back the layers of theoretical foundations, implementation challenges, and breakthrough applications that make TAO one of the most promising advancements in AI development.

Key takeaways:
- Understanding the fundamental principles behind TAO and how it differs from conventional optimization techniques
- Examining the quantifiable improvements in model accuracy, efficiency, and generalization capabilities
- Exploring real-world case studies where TAO has solved previously intractable AI challenges
- Analyzing current research directions and future potential for further enhancements

Whether you're a research scientist, AI engineer, or technical leader, this session will equip you with valuable insights into how TAO can be leveraged to push your AI models beyond current limitations.

Tech Industry Session: Building Collaborative Ecosystems With Openness and Portability

2025-06-12 Watch
talk
Matthew Houser (Tealium), Bob Pisani (Addepar), Adrian Bolosan (Databricks), Davis Matson (Health Catalyst)

Join us to discover how leading tech companies accelerate growth using open ecosystems and built-on solutions to foster collaboration, accelerate innovation and create scalable data products. This session will explore how organizations use Databricks to securely share data, integrate with partners and enable teams to build impactful applications powered by AI and analytics.

Topics include:
- Using Delta Sharing for secure, real-time data collaboration across teams and partners
- Embedding analytics and creating marketplaces to extend product capabilities
- Building with open standards and governance frameworks to ensure compliance without sacrificing agility

Hear real-world examples of how open ecosystems empower organizations to widen the aperture on collaboration, driving better business outcomes. Walk away with insights into how open data sharing and built-on solutions can help your teams innovate faster at scale.

Telecom Innovation Exchange: Demos and Dialogues

2025-06-12 Watch
talk
Steve Jones (Capgemini), Randeep Raghu (Wipro), Prakash Trivedi (Accenture), Nevash Pillay (Databricks)

Join us for an interactive breakout session designed to explore scalable, real-world solutions powered by Partners with Databricks. In this high-energy session, you'll hear from three of our leading partners — Accenture, Capgemini and Wipro — as they each deliver rapid-fire, 5-minute demos of their most impactful, production-grade solutions built for the telecom industry. From network intelligence to customer experience to AI-driven automation, these solutions are already driving tangible outcomes at scale. After the demos, you’ll have the unique opportunity to engage directly with each partner in a “speed dating” style format. Dive deep into the solutions, ask your questions and explore how these approaches can be tailored to your organization’s needs. Whether you're solving for churn, fraud, network ops or enterprise AI use cases, this session is your chance to connect, collaborate and walk away with practical ideas you can take back to your teams.

The Future of Open Table Formats: Delta Lake, Iceberg, and More

2025-06-12 Watch
talk
Daniel Weeks (Databricks), Ryan Blue (Databricks)

Open table formats are evolving quickly. In this session, we’ll explore the latest features of Delta Lake and Apache Iceberg™, including a look at the emerging Iceberg v3 specification. Join us to learn about what’s driving format innovation, how interoperability is becoming real, and what it means for the future of data architecture.

What's New and What's Next: Building Impactful AI/BI Dashboards

2025-06-12 Watch
talk
Eason Gao (Databricks), Rory Jacobs (Databricks)

Ready to take your AI/BI dashboards to the next level? This session dives into the latest capabilities in Databricks AI/BI Dashboards and how to maximize impact across your organization. Learn how data authors can tailor visualizations for different audiences, optimize performance and seamlessly integrate with Genie for a unified analytics experience. We’ll also share practical tips on how business users and data teams can better collaborate — ensuring insights are accessible, actionable and aligned to business goals.

What’s New in Apache Spark™ 4.0?

2025-06-12 Watch
talk
Wenchen Fan (Databricks), Daniel Tenedorio (Databricks)

Join this session for a concise tour of Apache Spark™ 4.0’s most notable enhancements:
- SQL features: ANSI by default, scripting, SQL pipe syntax, SQL UDF, session variable, view schema evolution, etc.
- Data type: VARIANT type, string collation
- Python features: Python data source, plotting API, etc.
- Streaming improvements: State store data source, state store checkpoint v2, arbitrary state v2, etc.
- Spark Connect improvements: More API coverage, thin client, unified Scala interface, etc.
- Infrastructure: Better error message, structured logging, new Java/Scala version support, etc.

Whether you’re a seasoned Spark user or new to the ecosystem, this talk will prepare you to leverage Spark 4.0’s latest innovations for modern data and AI pipelines.

What’s New in Unity Catalog With Live Demos

2025-06-12 Watch
talk
Paul Roome (Databricks), Murt Neemuchwala (Databricks)

Join the Unity Catalog product team for an exclusive deep dive into the latest innovations and upcoming features of Unity Catalog! Explore cutting-edge advancements in access control, discovery, lineage and monitoring — plus get a sneak peek at what’s coming next. Packed with live demos, expert insights and best practices from thousands of customers running Unity Catalog in production, this session is also your chance to engage directly with product experts and get answers to your most pressing questions. Don’t miss this opportunity to stay ahead of the curve and elevate your data governance strategy!

AI-Assisted BI: Everything You Need to Know

2025-06-12 Watch
lightning_talk
Chung Wu (Databricks), Alex Lichen (Databricks)

Explore how AI is transforming business intelligence and data analytics across the Databricks platform. This session offers a comprehensive overview of AI-assisted capabilities, from generating dashboards and visualizations to integrating Genie on dashboards for conversational analytics. Whether you’re a data engineer, analyst or BI developer, this session will equip you to leverage AI with BI for better, smarter decisions.

A No-Code ML Forecasting Platform for Retail and CPG Companies

2025-06-12 Watch
lightning_talk
Moez Ali (Zebra Technologies)

Retail and CPG companies face growing pressure to better forecast demand, optimize pricing and manage inventory — yet traditional approaches take months to deploy and often require extensive engineering support. In this session, we will showcase Workcloud Modeling Studio, a low-code/no-code ML platform designed for data scientists working in retail and CPG. Learn how this tool improves forecasting accuracy and accelerates time-to-value from months to hours. We will walk through a real-world use case of demand forecasting for a retailer using Zebra's Modeling Studio. This talk will demonstrate how to build, train and deploy an ML forecasting pipeline — without reinventing the wheel.

A Practical Roadmap to Becoming an Expert Databricks Data Engineer

2025-06-12 Watch
lightning_talk
Derar Alhussein (Acadford)

The demand for skilled Databricks data engineers continues to rise as enterprises accelerate their adoption of the Databricks platform. However, navigating the complex ecosystem of data engineering tools, frameworks and best practices can be overwhelming. This session provides a structured roadmap to becoming an expert Databricks data engineer, offering a clear progression from foundational skills to advanced capabilities. Acadford, a leading training provider, has successfully trained thousands of data engineers on Databricks, equipping them with the skills needed to excel in their careers and obtain professional certifications. Drawing on this experience, we will guide attendees through the most in-demand skills and knowledge areas through a combination of structured learning and practical insights.

Key takeaways:
- Understand the core tech stack in Databricks
- Explore real-world code examples and live demonstrations
- Receive an actionable learning path with recommended resources

Automating Engineering with AI - LLMs in Metadata Driven Frameworks

2025-06-12 Watch
lightning_talk
Simon Whiteley (Advancing Analytics)

The demand for data engineering keeps growing, but data teams are bored by repetitive tasks, stumped by growing complexity and endlessly harassed by an unrelenting need for speed. What if AI could take the heavy lifting off your hands? What if we make the move away from code-generation and into config-generation — how much more could we achieve? In this session, we’ll explore how AI is revolutionizing data engineering, turning pain points into innovation. Whether you’re grappling with manual schema generation or struggling to ensure data quality, this session offers practical solutions to help you work smarter, not harder. You’ll walk away with a good idea of where AI is going to disrupt the data engineering workload, some good tips around how to accelerate your own workflows and an impending sense of doom around the future of the industry!

Sponsored by: Airbyte | How Data Movement Powers GenAI

2025-06-12 Watch
lightning_talk
Jim Kutz (Airbyte)

In this session, discover how effective data movement is foundational to successful GenAI implementations. As organizations rush to adopt AI technologies, many struggle with the infrastructure needed to manage the massive influx of unstructured data these systems require. Jim Kutz, Head of Data at Airbyte, draws from 20+ years of experience leading data teams at companies like Grafana, CircleCI, and BlackRock to demonstrate how modern data movement architectures can enable secure, compliant GenAI applications. Learn practical approaches to data sovereignty, metadata management, and privacy controls that transform data governance into an enabler for AI innovation. This session will explore how you can securely leverage your most valuable asset—first-party data—for GenAI applications while maintaining complete control over sensitive information. Walk away with actionable strategies for building an AI-ready data infrastructure that balances innovation with governance requirements.

Sponsored by: Anomalo | Reconciling IoT, Policy, and Insurer Data to Deliver Better Customer Discounts

2025-06-12 Watch
lightning_talk
Michael Randall (Nationwide)

As insurers increasingly leverage IoT data to personalize policy pricing, reconciling disparate datasets across devices, policies, and insurers becomes mission-critical. In this session, learn how Nationwide transitioned from prototype workflows in Dataiku to a hardened data stack on Databricks, enabling scalable data governance and high-impact analytics. Discover how the team orchestrates data reconciliation across Postgres, Oracle, and Databricks to align customer driving behavior with insurer and policy data—ensuring more accurate, fair discounts for policyholders. With Anomalo’s automated monitoring layered on top, Nationwide ensures data quality at scale while empowering business units to define custom logic for proactive stewardship. We’ll also look ahead to how these foundations are preparing the enterprise for unstructured data and GenAI initiatives.

Sponsored by: IBM | How to leverage unstructured data to build more accurate, trustworthy AI agents

2025-06-12 Watch
lightning_talk

As AI adoption accelerates, unstructured data has emerged as a critical—yet often overlooked—asset for building accurate, trustworthy AI agents. But preparing and governing this data at scale remains a challenge. Traditional data integration and RAG approaches fall short. In this session, discover how IBM enables AI agents grounded in governed, high-quality unstructured data. Learn how our unified data platform streamlines integration across batch, streaming, replication, and unstructured sources—while accelerating data intelligence through built-in governance, quality, lineage, and data sharing. But governance doesn’t stop at data. We’ll explore how AI governance extends oversight to the models and agents themselves. Walk away with practical strategies to simplify your stack, strengthen trust in AI outputs, and deliver AI-ready data at scale.

Sponsored by: MathCo | Powering Contextualized Intelligence with NucliOS, MathCo’s Databricks-Native Platform

2025-06-12 Watch
lightning_talk
Aakarsh Kishore (MathCo)

In today's fast-paced digital landscape, context is everything. Decisions made without understanding the full picture often lead to missed opportunities or suboptimal outcomes. Powering contextualized intelligence is at the heart of NucliOS, MathCo’s proprietary Databricks-native platform, which uses Databricks features across the data lifecycle such as Unity Catalog, Delta Lake, MLflow, and Notebooks. Join this session to discover how NucliOS reimagines the data journey end-to-end: from data discovery and preparation to advanced analysis, dynamic visualization, and scenario modeling, all the way through to operationalizing insights within business workflows. At every step, intelligent agents act in concert, accelerating innovation and delivering speed at scale.

Sponsored by: Soda Data Inc. | Clean Energy, Clean Data: How Data Quality Powers Decarbonization

2025-06-12 Watch
lightning_talk
Robert Young (BDO Canada)

Drawing on BDO Canada’s deep expertise in the electricity sector, this session explores how clean energy innovation can be accelerated through a holistic approach to data quality. Discover BDO’s practical framework for implementing data quality and rebuilding trust in data through a structured, scalable approach. BDO will share a real-world example of monitoring data at scale—from high-level executive dashboards to the details of daily ETL and ELT pipelines. Learn how they leveraged Soda’s data observability platform to unlock near-instant insights, and how they moved beyond legacy validation pipelines with built-in checks across their production Lakehouse. Whether you're a business leader defining data strategy or a data engineer building robust data products, this talk connects the strategic value of clean data with actionable techniques to make it a reality.

Founder discussion: Matei on UC, Data Intelligence and AI Governance

2025-06-12 Watch
talk
Matei Zaharia (Databricks)

Matei is a legend of open source: he started the Apache Spark project in 2009, co-founded Databricks, and worked on other widely used data and AI software, including MLflow, Delta Lake, and Dolly. His most recent research is about combining large language models (LLMs) with external data sources, such as search systems, and improving their efficiency and result quality. This will be a conversation covering the latest and greatest of UC, Data Intelligence, AI Governance, and more.

Summit Live: Data Sharing and Collaboration

2025-06-12 Watch
talk
Zaheera Valani (Databricks)

Hear more on the latest in data collaboration, which is paramount to unlocking business success. Delta Sharing is an open-source approach to share and govern data, AI models, dashboards, and notebooks across clouds and platforms - without the costly need for replication. Databricks Clean Rooms provide safe hosting environments for data collaboration across companies, also without the costly duplication of data. And the Databricks Marketplace is the open marketplace for all your data, analytics, and AI needs.

Summit Live: Data Strategy - Democratizing Consumption and Growing ROI

2025-06-12 Watch
talk
Robin Sutara (Databricks)

With organizations collecting more and more data every day, the accuracy, trustworthiness, and accessibility of data must be prioritized if businesses want to unlock its value and grow ROI. That's where data democratization and data strategy can help. Hear from Robin Sutara, Field CDO, on top insights from her global experiences.

Wednesday Keynote (Virtual Replay)

2025-06-12
keynote

Be first to witness the latest breakthroughs from Databricks and share the success of innovative data and AI companies.

Data After Hours

2025-06-12
talk

Kick back at Data After Hours for drinks and dialogue with new (and old) friends. Enjoy live music, drinks, food and good company!

AI/BI Genie: A Look Under the Hood of Everyone's Friendly, Neighborhood GenAI Product

2025-06-12 Watch
talk
Amir Hormati (Databricks), Alnur Ali (Databricks)

Go beyond the user interface and explore the cutting-edge technology driving AI/BI Genie. This session breaks down the AI/BI Genie architecture, showcasing how LLMs, retrieval-augmented generation (RAG) and finely tuned knowledge bases work together to deliver fast, accurate responses. We’ll also explore how AI agents orchestrate workflows, optimize query performance and continuously refine their understanding. Ideal for those who want to geek out about the tech stack behind Genie, this session offers a rare look at the magic under the hood.

AI-Powered Profits: Smarter Order and Inventory Management

2025-06-12 Watch
talk
Anders Poirel (Joby Aviation), David Rogers (Databricks), Samuel Ceriale (Xylem)

Join this session to hear from two incredible companies, Xylem and Joby Aviation. Xylem shares their successful journey from fragmented legacy systems to a unified Enterprise Data Platform, demonstrating how they integrated complex ERP data across four business segments to achieve breakthrough improvements in parts management and operational efficiency. Following Xylem's story, learn how Joby Aviation leveraged Databricks to automate and accelerate flight test data checks, cutting processing times from over two hours to under thirty minutes. This session highlights how advanced cloud tools empower engineers to quickly build and run custom data checks, improving both speed and safety in flight test operations.

Better Together: Change Data Feed in a Streaming Data Flow

2025-06-12 Watch
talk
Mattias Moser (84.51 LLC), Scott Gordon (84.51˚)

Traditional streaming works great when your data source is append-only, but what if your data source includes updates and deletes? At 84.51 we used Lakeflow Declarative Pipelines and Delta Lake to build a streaming data flow that consumes inserts, updates and deletes while still taking advantage of streaming checkpoints. We combined this flow with a materialized view and Enzyme incremental refresh for a low-code, efficient and robust end-to-end data flow. We process around 8 million sales transactions each day with 80 million items purchased. This flow not only handles new transactions but also handles updates to previous transactions. Join us to learn how 84.51 combined change data feed, data streaming and materialized views to deliver a “better together” solution. 84.51 is a retail insights, media & marketing company. We use first-party retail data from 60 million households sourced through a loyalty card program to drive Kroger’s customer-centric journey.
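To make the idea of consuming inserts, updates and deletes concrete, here is a minimal sketch in plain Python. It is not the pipeline described in the talk and uses neither Spark nor Lakeflow. It only assumes that each change row carries the `_change_type` and `_commit_version` metadata columns that Delta Lake's change data feed exposes. The function name `apply_change_feed`, the `txn_id` key and the sample rows are hypothetical.

```python
from typing import Any, Dict, Iterable

Row = Dict[str, Any]

# Metadata columns that Delta Lake's change data feed adds to each change row.
CDF_METADATA = {"_change_type", "_commit_version", "_commit_timestamp"}


def apply_change_feed(state: Dict[Any, Row], changes: Iterable[Row], key: str) -> Dict[Any, Row]:
    """Apply change rows to a keyed snapshot and return an updated copy."""
    result = dict(state)  # shallow copy so the caller's snapshot is left untouched
    # Replay changes in commit order; sorted() is stable, so rows within
    # one commit keep the order in which they arrived.
    for change in sorted(changes, key=lambda r: r["_commit_version"]):
        kind = change["_change_type"]
        if kind == "update_preimage":
            continue  # the old value of an updated row is not needed for an upsert
        row = {k: v for k, v in change.items() if k not in CDF_METADATA}
        if kind in ("insert", "update_postimage"):
            result[row[key]] = row  # upsert the latest version of the row
        elif kind == "delete":
            result.pop(row[key], None)  # drop the row if it is present
        else:
            raise ValueError(f"unknown change type: {kind!r}")
    return result


# Hypothetical sample data: one existing transaction and four change rows.
snapshot = {1: {"txn_id": 1, "items": 3}}
feed = [
    {"txn_id": 2, "items": 5, "_change_type": "insert", "_commit_version": 1},
    {"txn_id": 1, "items": 3, "_change_type": "update_preimage", "_commit_version": 2},
    {"txn_id": 1, "items": 4, "_change_type": "update_postimage", "_commit_version": 2},
    {"txn_id": 2, "items": 5, "_change_type": "delete", "_commit_version": 3},
]
updated = apply_change_feed(snapshot, feed, key="txn_id")
print(updated)  # {1: {'txn_id': 1, 'items': 4}}
```

In a real streaming flow the engine would track which commits have already been processed through its checkpoints. The sketch only shows the merge logic for a single batch of changes.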