SQL

Hands-on Learning: Databricks SQL in Action: Intelligent Data Warehousing, Analytics and BI Workshop

2025-06-11 · Data + AI Summit 2025

workshop

by Pearl Ubaru (Databricks)

AI/ML Analytics BI Cloud Computing Data Lake Data Lakehouse Databricks DWH

Most organizations run complex cloud data architectures that silo applications, users and data. Join this interactive hands-on workshop to learn how Databricks SQL allows you to operate a multicloud lakehouse architecture that delivers data warehouse performance at data lake economics, with up to 12x better price/performance than traditional cloud data warehouses. Here’s what we’ll cover:
- How Databricks SQL fits in the Data Intelligence Platform, enabling you to operate a multicloud lakehouse architecture that delivers data warehouse performance at data lake economics
- How to manage and monitor compute resources, data access and users across your lakehouse infrastructure
- How to query directly on your data lake using your tools of choice or the built-in SQL editor and visualizations
- How to use AI to increase productivity when querying, completing code or building dashboards
Ask your questions during this hands-on lab, and the Databricks experts will guide you.

Revolutionizing PepsiCo BI Capabilities: From Traditional BI to Next-Gen Analytics Powerhouse

2025-06-11 · Data + AI Summit 2025 Watch

talk

by John Abraham (PepsiCo) , Joshua Sayah Lee (PepsiCo Inc.)

AI/ML Analytics BI Data Analytics Databricks

This session will provide an in-depth overview of how PepsiCo, a global leader in food and beverage, transformed its outdated data platform into a modern, unified and centralized data and AI-enabled platform using the Databricks SQL serverless environment. Through three distinct implementations that transpired at PepsiCo in 2024, we will demonstrate how the PepsiCo Data Analytics & AI Group unlocked pivotal capabilities that facilitated the delivery of diverse data-driven insights to the business, reduced operational expenses and enhanced overall performance through the newly implemented platform.

Summit Live: Best Practices for Data Warehouse Migrations

2025-06-11 · Data + AI Summit 2025 Watch

talk

by Laurent Léturgez (Databricks)

AI/ML Databricks DWH

Databricks SQL is the fastest-growing data warehouse on the market, with over 10k organizations, thanks to its price performance and AI innovations. See the best practices and common architectural challenges of migrating your legacy DW, including reference architectures. Learn how to migrate easily with the recently acquired Lakebridge migration tool and through our partners.

Sponsored by: Domo, Inc | Enabling AI-Powered Business Solutions w/Databricks & Domo

GenAI for SQL & ETL: Build Multimodal AI Workflows at Scale

2025-06-11 · Data + AI Summit 2025 Watch

talk

by Ahmed Bilal (Databricks) , Colton Peltier (Databricks)

AI/ML Databricks ETL/ELT GenAI LLM

Enterprises generate massive amounts of unstructured data, from support tickets and PDFs to emails and product images. But extracting insight from that data requires brittle pipelines and complex tools. Databricks AI Functions make this simpler. In this session, you’ll learn how to apply powerful language and vision models directly within your SQL and ETL workflows: no endpoints, no infrastructure, no rewrites. We’ll explore practical use cases and best practices for analyzing complex documents, classifying issues, translating content, and inspecting images, all in a way that’s scalable, declarative, and secure. What you’ll learn:
- How to run state-of-the-art LLMs like GPT-4, Claude Sonnet 4, and Llama 4 on your data
- How to build scalable, multimodal ETL workflows for text and images
- Best practices for prompts, cost, and error handling in production
- Real-world examples of GenAI use cases powered by AI Functions

How to Migrate from Teradata to Databricks SQL

2025-06-11 · Data + AI Summit 2025 Watch

talk

by Fabien Contaminard (Databricks) , Mehran Golestaneh (Databricks)

Databricks DWH LLM Teradata

Storage and processing costs of your legacy Teradata data warehouses impact your ability to deliver. Migrating your legacy Teradata data warehouse to the Databricks Data Intelligence Platform can accelerate your data modernization journey. In this session, learn the top strategies for completing this data migration. We will cover data type conversion, basic to complex code conversions, validation and reconciliation best practices, and how to use Databricks natively hosted LLMs to assist with migration activities. See before-and-after architectures of customers who have migrated, and learn about the benefits they realized.

How We Turned 200+ Business Users Into Analysts With AI/BI Genie

2025-06-11 · Data + AI Summit 2025 Watch

talk

by Thomas Russell (Databricks)

AI/ML Analytics BI Databricks Marketing

AI/BI Genie has transformed self-service analytics for the Databricks Marketing team. This user-friendly conversational AI tool empowers marketers to perform advanced data analysis using natural language — no SQL required. By reducing reliance on data teams, Genie increases productivity and enables faster, data-driven decisions across the organization. But realizing Genie’s full potential takes more than just turning it on. In this session, we’ll share the end-to-end journey of implementing Genie for over 200 marketing users, including lessons learned, best practices and the real business impact of this Databricks-on-Databricks solution. Learn how Genie democratizes data access, enhances insight generation and streamlines decision-making at scale.

Unity Catalog Lakeguard: Secure and Efficient Compute for Your Enterprise

2025-06-11 · Data + AI Summit 2025 Watch

talk

by Jakob Mund (Databricks) , Scott Van Woudenberg (Databricks)

Cloud Computing Databricks Cloud Functions Cyber Security Spark

Modern data workloads span multiple sources: data lakes, databases, apps like Salesforce and services like cloud functions. But as teams scale, secure data access and governance across shared compute become critical. In this session, learn how to confidently integrate external data and services into your workloads using Spark and Unity Catalog on Databricks. We'll explore compute options like serverless, clusters, workflows and SQL warehouses, and show how Unity Catalog’s Lakeguard enforces fine-grained governance, even when multiple users share compute concurrently. Walk away ready to choose the right compute model for your team’s needs without sacrificing security or efficiency.

What’s New with Databricks Assistant: From Exploration to Production

2025-06-11 · Data + AI Summit 2025 Watch

talk

by Gal Oshri (Amazon SageMaker AWS) , Samantha Banchik (Databricks)

Databricks

Databricks Assistant helps you get from initial exploration all the way to production faster and easier than ever. In this session, we'll show you how Assistant simplifies and accelerates common workflows, boosting your productivity across notebooks and the SQL editor. You'll get practical tips, see end-to-end examples in action, and hear about the latest capabilities we're excited about. We'll also discuss how we're continually improving Assistant to make your development experience faster, more contextual and more customizable. Join us to discover how to get the most out of Databricks Assistant and empower your team to build better and faster.

Accelerating Data Transformation: Best Practices for Governance, Agility and Innovation

2025-06-11 · Data + AI Summit 2025 Watch

lightning_talk

by Kevin Wilson (NCS Australia)

Analytics Data Governance Data Lakehouse Data Quality Databricks dbt ETL/ELT

In this session, we will share NCS’s approach to implementing a Databricks Lakehouse architecture, focusing on key lessons learned and best practices from our recent implementations. By integrating Databricks SQL Warehouse, the DBT Transform framework and our innovative test automation framework, we’ve optimized performance and scalability, while ensuring data quality. We’ll dive into how Unity Catalog enabled robust data governance, empowering business units with self-serve analytical workspaces to create insights while maintaining control. Through the use of solution accelerators, rapid environment deployment and pattern-driven ELT frameworks, we’ve fast-tracked time-to-value and fostered a culture of innovation. Attendees will gain valuable insights into accelerating data transformation, governance and scaling analytics with Databricks.

Bridging Big Data and AI: Empowering PySpark With Lance Format for Multi-Modal AI Data Pipelines

2025-06-11 · Data + AI Summit 2025 Watch

lightning_talk

by LU QIU (LanceDB) , Allison Wang (Databricks)

AI/ML Analytics API Big Data Data Analytics Lance PySpark Python Spark

PySpark has long been a cornerstone of big data processing, excelling in data preparation, analytics and machine learning tasks within traditional data lakes. However, the rise of multimodal AI and vector search introduces challenges beyond its capabilities. Spark’s new Python data source API enables integration with emerging AI data lakes built on the multimodal Lance format. Lance delivers unparalleled value with its zero-copy schema evolution capability and robust support for large record-size data (e.g., images, tensors and embeddings), simplifying multimodal data storage. Its advanced indexing for semantic and full-text search, combined with rapid random access, enables high-performance AI data analytics to the level of SQL. By unifying PySpark's robust processing capabilities with Lance's AI-optimized storage, data engineers and scientists can efficiently manage and analyze the diverse data types required for cutting-edge AI applications within a familiar big data framework.

Hands-on Learning: AI-Powered Data Engineering with Lakeflow: Techniques for Modern Data Professionals

2025-06-11 · Data + AI Summit 2025

talk

by Frank Munz (Databricks)

AI/ML Data Engineering Data Governance Databricks GenAI GitHub Data Streaming

This introductory workshop caters to data engineers seeking hands-on experience and data architects looking to deepen their knowledge. The workshop is structured to provide a solid understanding of the following data engineering and streaming concepts:
- Introduction to Lakeflow and the Data Intelligence Platform
- Getting started with Lakeflow Declarative Pipelines for declarative data pipelines in SQL using Streaming Tables and Materialized Views
- Mastering Databricks Workflows with advanced control flow and triggers
- Understanding serverless compute
- Data governance and lineage with Unity Catalog
- Generative AI for Data Engineers: Genie and Databricks Assistant
We believe you can only become an expert if you work on real problems and gain hands-on experience. Therefore, we will equip you with your own lab environment in this workshop and guide you through practical exercises like using GitHub, ingesting data from various sources, creating batch and streaming data pipelines, and more.

Hands-On Learning: Build Custom Data Intelligence Apps on Databricks

2025-06-11 · Data + AI Summit 2025

talk

by Justin DeBrabant (Databricks) , Giran Moodley (Databricks) , Ivan Trusov (Databricks)

AI/ML BI Databricks

Want to learn how to build your own custom data intelligence applications directly in Databricks? In this workshop, we’ll guide you through a hands-on tutorial for building a Streamlit web app that leverages many of the key products at Databricks as building blocks. You’ll integrate a live DB SQL warehouse, use Genie to ask questions in natural language, and embed AI/BI dashboards for interactive visualizations. In addition, we’ll discuss key concepts and best practices for building production-ready apps, including logging and observability, scalability, different authorization models, and deployment. By the end, you'll have a working AI app—and the skills to build more.

Lakeflow Connect: Easy, Efficient Ingestion From Databases

2025-06-11 · Data + AI Summit 2025 Watch

talk

by Peter Pogorski (Databricks) , Bret Grantham (Databricks)

Cyber Security postgresql

Lakeflow Connect streamlines the ingestion of incremental data from popular databases like SQL Server and PostgreSQL. In this session, we’ll review best practices for networking, security, minimizing database load, monitoring and more — tailored to common industry scenarios. Join us to gain practical insights into Lakeflow Connect's functionality so that you’re ready to build your own pipelines. Whether you're looking to optimize data ingestion or enhance your database integrations, this session will provide you with a deep understanding of how Lakeflow Connect works with databases.

Retail Genie: No-Code AI Apps for Empowering BI Users to be Self-Sufficient

2025-06-11 · Data + AI Summit 2025 Watch

talk

by Harish Rajagopalan (Databricks) , Siddhesh Pore (Databricks)

AI/ML Analytics BI Databricks GenAI NLP

Explore how Databricks AI/BI Genie revolutionizes retail analytics, empowering business users to become self-reliant data explorers. This session highlights no-code AI apps that create a conversational interface for retail data analysis. Genie spaces harness NLP and generative AI to convert business questions into actionable insights, bypassing complex SQL queries. We'll showcase retail teams effortlessly analyzing sales trends, inventory and customer behavior through Genie's intuitive interface. Witness real-world examples of AI/BI Genie's adaptive learning, enhancing accuracy and relevance over time. Learn how this technology democratizes data access while maintaining governance via Unity Catalog integration. Discover Retail Genie's impact on decision-making, accelerating insights and cultivating a data-driven retail culture. Join us to see the future of accessible, intelligent retail analytics in action.

Selectively Overwrite Data With Delta Lake’s Dynamic Insert Overwrite

2025-06-11 · Data + AI Summit 2025 Watch

lightning_talk

by Bart Samwel (Databricks) , Thang Long Vu (Databricks)

Databricks dbt Delta ETL/ELT

Dynamic Insert Overwrite is an important Delta Lake feature that allows fine-grained updates by selectively overwriting specific rows, eliminating the need for full-table rewrites. For example, this capability is essential for:
- dbt-databricks incremental models/workloads, enabling efficient data transformations by processing only new or updated records
- ETL Slowly Changing Dimension (SCD) Type 2
In this lightning talk, we will:
- Introduce Dynamic Insert Overwrite: understand its functionality and how it works
- Explore key use cases: learn how it optimizes performance and reduces costs
- Share best practices: discover practical tips for leveraging this feature on Databricks, including on the cutting-edge Serverless SQL Warehouses
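
The selective-overwrite idea in this abstract can be illustrated with a small plain-Python model. This is only a sketch of the general semantics, assuming that the slices to replace are identified by the values of one column in the incoming batch. It is not Delta Lake code and calls no Databricks API; the function name `dynamic_insert_overwrite`, the `key` parameter and the sample rows are all hypothetical.

```python
def dynamic_insert_overwrite(table, incoming, key):
    """Model a selective overwrite on an in-memory table.

    `table` and `incoming` are lists of dict rows. `key` (hypothetical) names
    the column whose values decide which slices of the table are replaced.
    Returns a new list; the input table is left unchanged.
    """
    # Key values present in the incoming batch mark the slices to overwrite.
    touched = {row[key] for row in incoming}
    # Rows outside the touched slices are carried over untouched.
    kept = [row for row in table if row[key] not in touched]
    # Rows inside the touched slices are dropped and replaced by the batch.
    return kept + list(incoming)


# Hypothetical sample data: daily sales with a correction for one day.
sales = [
    {"day": "2025-06-09", "amount": 10},
    {"day": "2025-06-10", "amount": 20},
    {"day": "2025-06-10", "amount": 5},
    {"day": "2025-06-11", "amount": 7},
]
corrections = [{"day": "2025-06-10", "amount": 30}]

result = dynamic_insert_overwrite(sales, corrections, key="day")
for row in result:
    print(row)
```

Only the rows for the day present in the incoming batch are replaced; the other days are never rewritten, which is the contrast with a full-table overwrite that the abstract draws.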

Using Clean Rooms for Privacy-Centric Data Collaboration

2025-06-11 · Data + AI Summit 2025 Watch

talk

by DJ Sharkey (Databricks) , Nikhil Gaekwad (Databricks)

AI/ML Analytics Databricks Delta Python

Databricks Clean Rooms make privacy-safe collaboration possible for data, analytics, and AI across clouds and platforms. Built on Delta Sharing, Clean Rooms enable organizations to securely share and analyze data together in a governed, isolated environment without ever exposing raw data. In this session, you’ll learn how to get started with Databricks Clean Rooms and unlock advanced use cases including:
- Cross-platform collaboration and joint analytics
- Training machine learning and AI models
- Enforcing custom privacy policies
- Analyzing unstructured data
- Incorporating proprietary libraries in Python and SQL notebooks
- Auditing clean room activity for compliance
Whether you're a data scientist, engineer or data leader, this session will equip you to drive high-value collaboration while maintaining full control over data privacy and governance.

What’s New in Security and Compliance on the Databricks Data Intelligence Platform

2025-06-11 · Data + AI Summit 2025 Watch

talk

by Filippo Seracini (Databricks) , Suresh Thiru (Databricks)

AI/ML AWS Azure Cloud Computing Databricks GCP Cyber Security

In this session, we’ll walk through the latest advancements in platform security and compliance on Databricks — from networking updates to encryption, serverless security and new compliance certifications across AWS, Azure and Google Cloud. We’ll also share our roadmap and best practices for how to securely configure workloads on Databricks SQL Serverless, Unity Catalog, Mosaic AI and more — at scale. If you're building on Databricks and want to stay ahead of evolving risk and regulatory demands, this session is your guide.

Your Wish is AI Command — Get to Grips With Databricks Genie

2025-06-11 · Data + AI Summit 2025 Watch

talk

by Simon Whiteley (Advancing Analytics)

AI/ML Analytics BI Databricks LLM

Picture the scene: you're exploring a deep, dark cave looking for insights to unearth when, in a burst of smoke, Genie appears and offers you not three but unlimited data wishes. This isn't a folk tale; it's the growing wave of Generative BI that is going to be a part of analytics platforms. Databricks Genie is a tool powered by a SQL-writing LLM that redefines how we interact with data. We'll look at the basics of creating a new Genie room, scoping its data tables and asking questions. We'll help it out with some complex pre-defined questions and ensure it has the best chance of success. We'll give the tool a personality, set some behavioural guidelines and prepare some hidden easter eggs for our users to discover. Generative BI is going to be a fundamental part of the analytics toolset used across businesses. If you're using Databricks, you should be aware of Genie; if you're not, you should be planning your Generative BI roadmap, and this session will answer your wishes.

Enterprise Cost Management for Data Warehousing with Databricks SQL

2025-06-11 · Data + AI Summit 2025 Watch

talk

by Patrick Yang (Databricks) , Joo Ho Yeo (Databricks)

Databricks DWH

This session shows you how to gain visibility into your Databricks SQL spend and ensure cost efficiency. Learn about the latest features to gain detailed insights into Databricks SQL expenses so you can easily monitor and control your costs. Find out how you can enable attribution to internal projects, understand the Total Cost of Ownership, set up proactive controls and find ways to continually optimize your spend.
