SQL

Declarative Pipelines — Ask Us Anything

2025-06-12 · Data + AI Summit 2025

lightning_talk

by Sandy Ryza (Databricks) , Denny Lee (Databricks) , Xiao Li (Databricks)

ETL/ELT

Join us for an insightful Ask Me Anything (AMA) session on Declarative Pipelines — a powerful approach to simplify and optimize data workflows. Learn how to define data transformations using high-level, SQL-like semantics, reducing boilerplate code while improving performance and maintainability. Whether you're building ETL processes, feature engineering pipelines, or analytical workflows, this session will cover best practices, real-world use cases and how Declarative Pipelines can streamline your data applications. Bring your questions and discover how to make your data processing more intuitive and efficient!

Performance Best Practices for Fast Queries, High Concurrency, and Scaling on Databricks SQL

2025-06-12 · Data + AI Summit 2025 Watch

talk

by Mostafa Mokhtar (Databricks) , Jeremy Lewallen (Databricks)

Databricks DWH

Data warehousing in enterprise and mission-critical environments needs special consideration for price/performance. This session will explain how Databricks SQL addresses the most challenging requirements for high-concurrency, low-latency performance at scale. We will also cover the latest advancements in resource-based scheduling, autoscaling and caching enhancements that allow for seamless performance and workload management.

What’s New in Databricks SQL: Latest Features and Live Demos

2025-06-12 · Data + AI Summit 2025 Watch

talk

by Gaurav Saraf (Databricks) , Kent Marten (Databricks)

AI/ML BI Databricks Data Streaming

Databricks SQL has added significant features in the last year at a fast pace. This session will share the most impactful features and the customer use cases that inspired them. We will highlight the new SQL editor, SQL coding features, streaming tables and materialized views, BI integrations, cost management features, system tables and observability features, and more. We will also share AI-powered performance optimizations.

Cooking With SQL: From Ingredients to Insights With Minimal Prep

2025-06-12 · Data + AI Summit 2025 Watch

talk

by Fabien Contaminard (Databricks) , Serge Rielau (Databricks)

In this session we’ll dive into the SQL kitchen and use a combination of SQL staples and nouvelle cuisine such as recursive queries, temporary tables, and stored procedures. We’ll leave you with well-scripted recipes to execute immediately or store for later consumption in your Unity Catalog. Think of this session as building your go-to cookbook of SQL techniques. Bon appétit!

Hands-on Learning: AI-Powered Data Engineering with Lakeflow: Techniques for Modern Data Professionals (repeat)

2025-06-12 · Data + AI Summit 2025

talk

by Frank Munz (Databricks)

AI/ML Data Engineering Data Governance Databricks GenAI GitHub Data Streaming

This session is repeated. This introductory workshop caters to data engineers seeking hands-on experience and data architects looking to deepen their knowledge. The workshop is structured to provide a solid understanding of the following data engineering and streaming concepts: Introduction to Lakeflow and the Data Intelligence Platform Getting started with Lakeflow Declarative Pipelines for declarative data pipelines in SQL using Streaming Tables and Materialized Views Mastering Databricks Workflows with advanced control flow and triggers Understanding serverless compute Data governance and lineage with Unity Catalog Generative AI for Data Engineers: Genie and Databricks Assistant We believe you can only become an expert if you work on real problems and gain hands-on experience. Therefore, we will equip you with your own lab environment in this workshop and guide you through practical exercises like using GitHub, ingesting data from various sources, creating batch and streaming data pipelines, and more.

Hands-On Learning: Build Custom Data Intelligence Apps on Databricks (repeat)

2025-06-12 · Data + AI Summit 2025

talk

by Aakrati Talati (Databricks) , Giran Moodley (Databricks) , Ivan Trusov (Databricks)

AI/ML BI Databricks

Want to learn how to build your own custom data intelligence applications directly in Databricks? In this workshop, we’ll guide you through a hands-on tutorial for building a Streamlit web app that leverages many of the key products at Databricks as building blocks. You’ll integrate a live DB SQL warehouse, use Genie to ask questions in natural language, and embed AI/BI dashboards for interactive visualizations. In addition, we’ll discuss key concepts and best practices for building production-ready apps, including logging and observability, scalability, different authorization models, and deployment. By the end, you'll have a working AI app—and the skills to build more.

What’s New in Apache Spark™ 4.0?

2025-06-12 · Data + AI Summit 2025 Watch

talk

by Daniel Tenedorio (Databricks) , Wenchen Fan (Databricks)

AI/ML API Java Python Scala Spark Data Streaming

Join this session for a concise tour of Apache Spark™ 4.0’s most notable enhancements: SQL features: ANSI by default, scripting, SQL pipe syntax, SQL UDF, session variable, view schema evolution, etc. Data type: VARIANT type, string collation Python features: Python data source, plotting API, etc. Streaming improvements: State store data source, state store checkpoint v2, arbitrary state v2, etc. Spark Connect improvements: More API coverage, thin client, unified Scala interface, etc. Infrastructure: Better error message, structured logging, new Java/Scala version support, etc. Whether you’re a seasoned Spark user or new to the ecosystem, this talk will prepare you to leverage Spark 4.0’s latest innovations for modern data and AI pipelines.

Elevate SQL Productivity: The Power of Notebooks and SQL Editor

2025-06-12 · Data + AI Summit 2025 Watch

talk

by Jason Messer (Databricks)

Databricks

Writing SQL is a core part of any data analyst’s workflow, but small inefficiencies can add up, slowing down analysis and making it harder to iterate quickly. In this session, we’ll explore our powerful features in the Databricks SQL editor and notebook that help you to be more productive when writing SQL on Databricks. We’ll demo the new features and the customer use cases that inspired them.

From 10 Hours to 10 Minutes: Unleashing the Power of Lakeflow Declarative Pipelines

2025-06-12 · Data + AI Summit 2025

talk

by Sidney Cardoso (Michelin) , Yash Joshi (Accenture)

Analytics Azure ADF BI Data Quality Databricks Power BI

How do you transform a data pipeline from sluggish 10-hour batch processing into a real-time powerhouse that delivers insights in just 10 minutes? This was the challenge we tackled at one of France's largest manufacturing companies, where data integration and analytics were mission-critical for supply chain optimization. Power BI dashboards needed to refresh every 15 minutes. Our team struggled with legacy Azure Data Factory batch pipelines. These outdated processes couldn’t keep up, delaying insights and generating up to three daily incident tickets. We identified Lakeflow Declarative Pipelines and Databricks SQL as the game-changing solution to modernize our workflow, implement quality checks, and reduce processing times.In this session, we’ll dive into the key factors behind our success: Pipeline modernization with Lakeflow Declarative Pipelines: improving scalability Data quality enforcement: clean, reliable datasets Seamless BI integration: Using Databricks SQL to power fast, efficient queries in Power BI

Healthcare Interoperability: End-to-End Streaming FHIR Pipelines With Databricks & Redox

2025-06-12 · Data + AI Summit 2025 Watch

talk

by Tim Kessler (Redox, Inc.) , Matthew Giglia (Databricks)

AI/ML API Amazon EMR BI Data Lakehouse Databricks Delta ETL/ELT Data Streaming

Redox & Databricks direct integration can streamline your interoperability workflows from responding in record time to preauthorization requests to letting attending physicians know about a change in risk for sepsis and readmission in near real time from ADTs. Data engineers will learn how to create fully-streaming ETL pipelines for ingesting, parsing and acting on insights from Redox FHIR bundles delivered directly to Unity Catalog volumes. Once available in the Lakehouse, AI/BI Dashboards and Agentic Frameworks help write FHIR messages back to Redox for direct push down to EMR systems. Parsing FHIR bundle resources has never been easier with SQL combined with the new VARIANT data type in Delta and streaming table creation against Serverless DBSQL Warehouses. We'll also use Databricks accelerators dbignite and redoxwrite for writing and posting FHIR bundles back to Redox integrated EMRs and we'll extend AI/BI with Unity Catalog SQL UDFs and the Redox API for use in Genie.

How to Migrate From Snowflake to Databricks SQL

2025-06-12 · Data + AI Summit 2025 Watch

talk

by Koundinya Srinivasarao (Databricks) , Matt Holzapfel (Databricks)

Cloud Computing Databricks DWH ETL/ELT Snowflake

Migrating your Snowflake data warehouse to the Databricks Data Intelligence Platform can accelerate your data modernization journey. Though a cloud platform-to-cloud platform migration should be relatively easy, the breadth of the Databricks Platform provides flexibility and hence requires careful planning and execution. In this session, we present the migration methodology, technical approaches, automation tools, product/feature mapping, a technical demo and best practices using real-world case studies for migrating data, ELT pipelines and warehouses from Snowflake to Databricks.

Hands-On Learning: AI Agents Workshop: Create, Evaluate, and Deploy using Mosaic AI

2025-06-11 · Data + AI Summit 2025

workshop

by Amber Roberts (Databricks) , Nicolas Pelaez (Databricks)

AI/ML Databricks Python

Looking for a practical workshop on building an AI Agent on Databricks? Well, we have just the thing for you.This hands-on workshop takes you through the process of creating intelligent agents that can reason their way to useful outcomes. You'll start by building your own toolkit of SQL and Python functions that give your agent practical capabilities. Then we'll explore how to select the right foundation model for your needs, connect your custom tools, and watch as your agent tackles complex challenges through visible reasoning paths.The workshop doesn't just stop at building—you'll dive into evaluation techniques using evaluation datasets to identify where your agent shines and where it needs improvement. After implementing and measuring your changes, we'll explore deployment strategies, including a feedback collection interface that enables continuous improvement and governance mechanisms to ensure responsible AI usage in production environments.

Hands-on Learning: Databricks SQL in Action: Intelligent Data Warehousing, Analytics and BI Workshop (repeat)

2025-06-11 · Data + AI Summit 2025

workshop

by Pearl Ubaru (Databricks)

AI/ML Analytics BI Cloud Computing Data Lake Data Lakehouse Databricks DWH

Most organizations run complex cloud data architectures that silo applications, users and data. Join this interactive hands-on workshop to learn how Databricks SQL allows you to operate a multi-cloud lakehouse architecture that delivers data warehouse performance at data lake economics — with up to 12x better price/performance than traditional cloud data warehouses. Here’s what we’ll cover: How Databricks SQL fits in the Data Intelligence Platform, enabling you to operate a multicloud lakehouse architecture that delivers data warehouse performance at data lake economics How to manage and monitor compute resources, data access and users across your lakehouse infrastructure How to query directly on your data lake using your tools of choice or the built-in SQL editor and visualizations How to use AI to increase productivity when querying, completing code or building dashboards Ask your questions during this hands-on lab, and the Databricks experts will guide you.

HP's Data Platform Migration Journey: Redshift to Lakehouse

2025-06-11 · Data + AI Summit 2025 Watch

talk

by Isaac Chan (HP Inc.) , Kavya Atmakuri (HP Inc.)

AWS Data Lakehouse Databricks dbt ETL/ELT Redshift

HP Print's data platform team took on a migration from a monolithic, shared resource of AWS Redshift, to a modular and scalable data ecosystem on Databricks lakehouse. The result was 30–40% cost savings, scalable and isolated resources for different data consumers and ETL workloads, and performance optimization for a variety of query types. Through this migration, there were technical challenges and learnings relating to the ETL migrations with DBT, new Databricks features like Liquid Clustering, predictive optimization, Photon, SQL serverless warehouses, managing multiple teams on Unity Catalog, and others. This presentation dives into both the business and technical sides of this migration. Come along as we share our key takeaways from this journey.

Crypto at Scale: Building a High-Performance Platform for Real-Time Blockchain Data

2025-06-11 · Data + AI Summit 2025 Watch

talk

by Matthew Moorcroft (Databricks) , Ferran Cabezas Castellvi (Elliptic)

Analytics Blockchain Databricks Delta Data Streaming

In today’s fast-evolving crypto landscape, organizations require fast, reliable intelligence to manage risk, investigate financial crime, and stay ahead of evolving threats. In this session we will discover how Elliptic built a scalable, high-performance Data Intelligence Platform that delivers real-time, actionable Blockchain insights to their customers. We’ll walk you through some of the key components of the Elliptic Platform, including the Elliptic Entity Graph and our User-Facing Analytics. Our focus will be put on the evolution of our User-Facing Analytics capabilities, and specifically how components from the Databricks ecosystem such as Structured Streaming, Delta Lake, and SQL Warehouse have played a vital role. We’ll also share some of the optimizations we’ve made to our streaming jobs to maximize performance and ensure Data Completeness. Whether you’re looking to enhance your streaming capabilities, expand your knowledge of how crypto analytics works or simply discover novel approaches to data processing at scale, this session will provide concrete strategies and valuable lessons learned.

How to Migrate From Oracle to Databricks SQL

2025-06-11 · Data + AI Summit 2025 Watch

talk

by Laurent Léturgez (Databricks)

CSV Databricks DWH Oracle PySpark

Migrating your legacy Oracle data warehouse to the Databricks Data Intelligence Platform can accelerate your data modernization journey. In this session, learn the top strategies for completing this data migration. We will cover data type conversion, basic to complex code conversions, validation and reconciliation best practices. Discover the pros and cons of using CSV files to PySpark or using pipelines to Databricks tables. See before-and-after architectures of customers who have migrated, and learn about the benefits they realized.

Summit Live: Spark Talk - Everything Spark, Lakeflow Declarative Pipelines, and Open Source

2025-06-11 · Data + AI Summit 2025 Watch

talk

by Michael Armbrust (Databricks)

Databricks Delta Spark

Databricks co-founders created Spark, the wildly popular open source foundation of Databricks, way back in 2009. Learn from Michael Armbrust, creator of Spark SQL and leader of Databricks Delta, about the latest happenings in Spark, Lakeflow Declarative Pipelines, and open source.

Multi-Statement Transactions: How to Improve Data Consistency and Performance

2025-06-11 · Data + AI Summit 2025 Watch

lightning_talk

by Franco Patano (Databricks)

Data Lakehouse Databricks DWH

Multi-statement transactions bring the atomicity and reliability of traditional databases to modern data warehousing on the lakehouse. In this session, we’ll explore real-world patterns enabled by multi-statement transactions — including multi-table updates, deduplication pipelines and audit logging — and show how Databricks ensures atomicity and consistency across complex workflows. We’ll also dive into demos and share tips to getting started and migrations with this feature in Databricks SQL.

Sponsored by: Insight Enterprises | Unity Catalog Agent Assistant

Data Triggers and Advanced Control Flow With Lakeflow Jobs

2025-06-11 · Data + AI Summit 2025 Watch

talk

by Anthony Podgorsak (Databricks) , Prashanth Babu Velanati Venkata (Databricks)

AI/ML BI Data Lakehouse Delta Power BI

Lakeflow Jobs is the production-ready fully managed orchestrator for the entire Lakehouse with 99.95% uptime. Join us for a dive into how you can orchestrate your enterprise data operations, from triggering your jobs only when your data is ready to advanced control flow with conditionals, looping and job modularity — with demos! Attendees will gain practical insights into optimizing their data operations by orchestrating with Lakeflow Jobs: New task types: Publish AI/BI Dashboards, push to Power BI or ingest with Lakeflow Connect Advanced execution control: Reference SQL Task outputs, run partial DAGs and perform targeted backfills Repair runs: Re-run failed pipelines with surgical precision using task-level repair Control flow upgrades: Native for-each loops and conditional logic make DAGs more dynamic + expressive Smarter triggers: Kick off jobs based on file arrival or Delta table changes, enabling responsive workflows Code-first approach to pipeline orchestration

talk-data.com

Activity Trend

Top Events

Top Speakers

Declarative Pipelines — Ask Us Anything

Performance Best Practices for Fast Queries, High Concurrency, and Scaling on Databricks SQL

What’s New in Databricks SQL: Latest Features and Live Demos

Cooking With SQL: From Ingredients to Insights With Minimal Prep

Hands-on Learning: AI-Powered Data Engineering with Lakeflow: Techniques for Modern Data Professionals (repeat)

Hands-On Learning: Build Custom Data Intelligence Apps on Databricks (repeat)

What’s New in Apache Spark™ 4.0?

Elevate SQL Productivity: The Power of Notebooks and SQL Editor

From 10 Hours to 10 Minutes: Unleashing the Power of Lakeflow Declarative Pipelines

Healthcare Interoperability: End-to-End Streaming FHIR Pipelines With Databricks & Redox

How to Migrate From Snowflake to Databricks SQL

Hands-On Learning: AI Agents Workshop: Create, Evaluate, and Deploy using Mosaic AI

Hands-on Learning: Databricks SQL in Action: Intelligent Data Warehousing, Analytics and BI Workshop (repeat)

HP's Data Platform Migration Journey: Redshift to Lakehouse

Crypto at Scale: Building a High-Performance Platform for Real-Time Blockchain Data

How to Migrate From Oracle to Databricks SQL

Summit Live: Spark Talk - Everything Spark, Lakeflow Declarative Pipelines, and Open Source

Multi-Statement Transactions: How to Improve Data Consistency and Performance

Sponsored by: Insight Enterprises | Unity Catalog Agent Assistant

Data Triggers and Advanced Control Flow With Lakeflow Jobs