SQL

Crafting Business Brilliance: Leveraging Databricks SQL for Next-Gen Applications

2025-06-10 · Data + AI Summit 2025 Watch

talk

by Mohammad Shalchi (Haleon) , Wasim Ahmad (Databricks)

API Data Lakehouse Databricks GenAI SAP

At Haleon, we've leveraged Databricks APIs and serverless compute to develop customer-facing applications for our business. This innovative solution enables us to efficiently deliver SAP invoice and order management data through front-end applications developed and served via our API Gateway. The Databricks lakehouse architecture has been instrumental in eliminating the friction associated with directly accessing SAP data from operational systems, while enhancing our performance capabilities. Our system acheived response times of less than 3 seconds from API call, with ongoing efforts to optimise this performance. This architecture not only streamlines our data and application ecosystem but also paves the way for integrating GenAI capabilities with robust governance measures for our future infrastructure. The implementation of this solution has yielded significant benefits, including a 15% reduction in customer service costs and a 28% increase in productivity for our customer support team.

Italgas’ AI Factory and the Future of Gas Distribution

2025-06-10 · Data + AI Summit 2025 Watch

talk

by Nicola Giorcelli (Cluster Reply) , Delli, Serena (Italgas)

AI/ML Azure BI Databricks GenAI Synapse

At Italgas, Europe’s leading gas distributor both by network size and number of customers, we are spearheading digital transformation through a state-of-the-art, fully-fledged Databricks Intelligent platform. Achieved 50% cost reduction and 20% performance boost migrating from Azure Synapse to Databricks SQL Deployed 41 ML/GenAI models in production, with 100% of workloads governed by Unity Catalog Empowered 80% of employees with self-service BI through Genie Dashboards Enabled natural language queries for control-room operators analyzing network status The future of gas distribution is data-driven: predictive maintenance, automated operations, and real-time decision making are now realities. Our AI Factory isn't just digitizing infrastructure—it's creating a more responsive, efficient, and sustainable gas network that anticipates needs before they arise.

Sponsored by: Firebolt | The Power of Low-latency Data for AI Apps

Composing High-Accuracy AI Systems With SLMs and Mini-Agents

2025-06-10 · Data + AI Summit 2025 Watch

talk

by Sharon Zhou (Lamini)

AI/ML LLM RAG

This session is repeated. For most companies, building compound AI systems remains aspirational. LLMs are powerful, but imperfect, and their non-deterministic nature makes steering them to high accuracy a challenge. In this session, we’ll demonstrate how to build compound AI systems using SLMs and highly accurate mini-agents that can be integrated into agentic workflows. You'll learn about breakthrough techniques, including: memory RAG, an embedding algorithm that reduces hallucinations using embed-time compute to generate contextual embeddings, improving indexing and retrieval, and memory tuning, a finetuning algorithm that reduces hallucinations using a Mixture of Memory Experts (MoME) to specialize models with proprietary data. We’ll also share real-world examples (text-to-SQL, factual reasoning, function calling, code analysis and more) across various industries. With these building blocks, we’ll demonstrate how to create high accuracy mini-agents that can be composed into larger AI systems.

SQL-Based ETL: Options for SQL-Only Databricks Development

2025-06-10 · Data + AI Summit 2025 Watch

talk

by Dustin Vannoy (Databricks)

Analytics Databricks dbt Delta ETL/ELT Git SQLMesh Data Streaming

Using SQL for data transformation is a powerful way for an analytics team to create their own data pipelines. However, relying on SQL often comes with tradeoffs such as limited functionality, hard-to-maintain stored procedures or skipping best practices like version control and data tests. Databricks supports building high-performing SQL ETL workloads. Attend this session to hear how Databricks supports SQL for data transformation jobs as a core part of your Data Intelligence Platform. In this session we will cover 4 options to use Databricks with SQL syntax to create Delta tables: Lakeflow Declarative Pipelines: A declarative ETL option to simplify batch and streaming pipelines dbt: An open-source framework to apply engineering best practices to SQL based data transformations SQLMesh: an open-core product to easily build high-quality and high-performance data pipelines SQL notebooks jobs: a combination of Databricks Workflows and parameterized SQL notebooks

Tracing the Path of a Row Through a GPU-Enabled Query Engine on the Grace-Blackwell Architecture

2025-06-10 · Data + AI Summit 2025 Watch

talk

by Thomas Graves (NVIDIA) , Clemens Lutz (NVIDIA)

Analytics Data Analytics Spark

Grace-Blackwell is NVIDIA’s most recent GPU system architecture. It addresses a key concern of query engines: fast data access. In this session, we will take a close look at how GPUs can accelerate data analytics by tracing how a row flows through a GPU-enabled query engine.Query engines read large data from CPU memory or from disk. On Blackwell GPUs, a query engine can rely on hardware-accelerated decompression of compact formats. The Grace-Blackwell system takes data access performance even further, by reading data at up to 450 GB/s across its CPU to GPU interconnect. We demonstrate full end-to-end SQL query acceleration using GPUs in a prototype query engine using industry standard benchmark queries. We compare the results to existing CPU solutions.Using Apache Spark™ and the RAPIDS Accelerator for Apache Spark, we demonstrate the impact GPU acceleration has on the performance of SQL queries at the 100TB scale using NDS, a suite that simulates real-world business scenarios.

Transforming HP’s Print ELT Reporting with GenIT: Real-Time Insights Tool Powered by Databricks AI

2025-06-10 · Data + AI Summit 2025 Watch

talk

by Weiwei Hu (HP)

AI/ML DataViz Databricks ETL/ELT GenAI

Timely and actionable insights are critical for staying competitive in today’s fast-paced environment. At HP Print, manual reporting for executive leadership (ELT) has been labor-intensive, hindering agility and productivity. To address this, we developed the Generative Insights Tool (GenIT) using Databricks Genie and Mosaic AI to create a real-time insights engine automating SQL generation, data visualization, and narrative creation. GenIT delivers instant insights, enabling faster decisions, greater productivity, and improved consistency while empowering leaders to respond to printer trends. With automated querying, AI-powered narratives, and a chatbot, GenIT reduces inefficiencies and ensures quality data and insights. Our roadmap integrates multi-modal data, enhances chatbot functionality, and scales globally. This initiative shows how HP Print leverages GenAI to improve decision-making, efficiency, and agility, and we will showcase this transformation at the Databricks AI Summit.

Unify Your Data and Governance With Lakehouse Federation

2025-06-10 · Data + AI Summit 2025 Watch

talk

by Zeashan Pappa (Databricks) , Fuat Can Efeoglu (Databricks)

AI/ML Analytics Data Lakehouse Hive Snowflake

In today's data landscape, organizations often grapple with fragmented data spread across various databases, data warehouses and catalogs. Lakehouse Federation addresses this challenge by enabling seamless discovery, querying, and governance of distributed data without the need for duplication or migration. This session will explore how Lakehouse Federation integrates external data sources like Hive Metastore, Snowflake, SQL Server and more into a unified interface, providing consistent access controls, lineage tracking and auditing across your entire data estate. Learn how to streamline analytics and AI workloads, enhance compliance and reduce operational complexity by leveraging a single, cohesive platform for all your data needs.

Using Databricks to Power News Sentiment, a Capital IQ Pro Application

2025-06-10 · Data + AI Summit 2025 Watch

talk

by Debbie Connolly (S&P Global)

Databricks ETL/ELT

The News Sentiment application enhances the discoverability of news content through our flagship platform, Capital IQ Pro. We processed news articles for 10,000+ public companies through entity recognition, along with a series of proprietary financial sentiment models to assess whether the news was positive or negative, as well as its significance and relevance to the company. We built a database containing over 1.5 million signals and operationalized the end-to-end ETL as a daily Workflow on Databricks. The development process included model training and selection. We utilized training data from our internal financial analysts to train Google’s T5-Flan to create our proprietary sentiment model and two additional models. Our models are deployed on Databricks Model-Serving as serverless endpoints that can be queried on-demand. The last phase of the project was to develop a UI, in which we utilized Databricks serverless SQL warehouses to surface this data in real-time.

GPU Accelerated Spark Connect

2025-06-10 · Data + AI Summit 2025 Watch

talk

by Gera Shegalov (NVIDIA) , Erik eordentlich (NVIDIA)

AI/ML API ETL/ELT Cyber Security Spark

Spark Connect, first included for SQL/DataFrame API in Apache Spark 3.4 and recently extended to MLlib in 4.0, introduced a new way to run Spark applications over a gRPC protocol. This has many benefits, including easier adoption for non-JVM clients, version independence from applications and increased stability and security of the associated Spark clusters. The recent Spark Connect extension for ML also included a plugin interface to configure enhanced server-side implementations of the MLlib algorithms when launching the server. In this talk, we shall demonstrate how this new interface, together with Spark SQL’s existing plugin interface, can be used with NVIDIA GPU-accelerated plugins for ML and SQL to enable no-code change, end-to-end GPU acceleration of Spark ETL and ML applications over Spark Connect, with optimal performance up to 9x at 80% cost reduction compared to CPU baselines.

How to Get the Most Out of Your BI Tools on Databricks

2025-06-10 · Data + AI Summit 2025 Watch

talk

by Kyle Hale (Databricks)

AI/ML Analytics BI Databricks DWH Power BI

Unlock the full potential of your BI tools with Databricks. This session explores how features like Photon, Databricks SQL, Liquid Clustering, AI/BI Genie and Publish to Power BI enhance performance, scalability and user experience. Learn how Databricks accelerates query performance, optimizes data layouts and integrates seamlessly with BI tools. Gain actionable insights and best practices to improve analytics efficiency, reduce latency and drive better decision-making. Whether migrating from a data warehouse or optimizing an existing setup, this talk provides the strategies to elevate your BI capabilities.

Introduction to Databricks SQL

2025-06-10 · Data + AI Summit 2025 Watch

talk

by Himanshu Raja (Databricks) , Pearl Ubaru (Databricks)

Databricks DWH

This session is repeated. If you are brand new to Databricks SQL and want to get a lightning tour of this intelligent data warehouse, this session is for you. Learn about the architecture of Databricks SQL. Then show how simple, streamlined interfaces are making it easier for analysts, developers, admins and business users to get their jobs done and questions answered. We’ll show how easy it is to create a warehouse, get data, transform it and build queries and dashboards. By the end of the session, you’ll be able to build a Databricks SQL warehouse in 5 minutes.

Simplifying Data Pipelines With Lakeflow Declarative Pipelines: A Beginner’s Guide

2025-06-10 · Data + AI Summit 2025 Watch

talk

by Matt Jones (Databricks) , Brad Turnbaugh (84.51)

AI/ML Analytics Data Engineering ETL/ELT Kafka Data Streaming

As part of the new Lakeflow data engineering experience, Lakeflow Declarative Pipelines makes it easy to build and manage reliable data pipelines. It unifies batch and streaming, reduces operational complexity and ensures dependable data delivery at scale — from batch ETL to real-time processing.Lakeflow Declarative Pipelines excels at declarative change data capture, batch and streaming workloads, and efficient SQL-based pipelines. In this session, you’ll learn how we’ve reimagined data pipelining with Lakeflow Declarative Pipelines, including: A brand new pipeline editor that simplifies transformations Serverless compute modes to optimize for performance or cost Full Unity Catalog integration for governance and lineage Reading/writing data with Kafka and custom sources Monitoring and observability for operational excellence “Real-time Mode” for ultra-low-latency streaming Join us to see how Lakeflow Declarative Pipelines powers better analytics and AI with reliable, unified pipelines.

Sponsored by: dbt Labs | Empowering the Enterprise for the Next Era of AI and BI

Accelerating Analytics: Integrating BI and Partner Tools to Databricks SQL

2025-06-10 · Data + AI Summit 2025 Watch

talk

by Fuat Can Efeoglu (Databricks) , Toussaint Webb (Databricks)

Analytics BI Databricks dbt Microsoft Fabric Power BI Tableau

This session is repeated. Did you know that you can integrate with your favorite BI tools directly from Databricks SQL? You don’t even need to stand up an additional warehouse. This session shows the integrations with Microsoft Power Platform, Power BI, Tableau and dbt so you can have a seamless integration experience. Directly connect your Databricks workspace with Fabric and Power BI workspaces or Tableau to publish and sync data models, with defined primary and foreign keys, between the two platforms.

Data Management and Governance With UC

2025-06-10 · Data + AI Summit 2025

talk

AI/ML API Cloud Computing Data Governance Data Management Databricks Python Spark

In this course, you'll learn concepts and perform labs that showcase workflows using Unity Catalog - Databricks' unified and open governance solution for data and AI. We'll start off with a brief introduction to Unity Catalog, discuss fundamental data governance concepts, and then dive into a variety of topics including using Unity Catalog for data access control, managing external storage and tables, data segregation, and more. Pre-requisites: Beginner familiarity with the Databricks Data Intelligence Platform (selecting clusters, navigating the Workspace, executing notebooks), cloud computing concepts (virtual machines, object storage, etc.), production experience working with data warehouses and data lakes, intermediate experience with basic SQL concepts (select, filter, groupby, join, etc), beginner programming experience with Python (syntax, conditions, loops, functions), beginner programming experience with the Spark DataFrame API (Configure DataFrameReader and DataFrameWriter to read and write data, Express query transformations using DataFrame methods and Column expressions, etc.) Labs: Yes Certification Path: Databricks Certified Data Engineer Associate

Deploy Workloads with Lakeflow Jobs (previously Databricks Workflows)

2025-06-10 · Data + AI Summit 2025

talk

Analytics API Cloud Computing Dashboard Databricks Python Spark

In this course, you’ll learn how to orchestrate data pipelines with Lakeflow Jobs (previously Databricks Workflows) and schedule dashboard updates to keep analytics up-to-date. We’ll cover topics like getting started with Lakeflow Jobs, how to use Databricks SQL for on-demand queries, and how to configure and schedule dashboards and alerts to reflect updates to production data pipelines. Pre-requisites: Beginner familiarity with the Databricks Data Intelligence Platform (selecting clusters, navigating the Workspace, executing notebooks), cloud computing concepts (virtual machines, object storage, etc.), production experience working with data warehouses and data lakes, intermediate experience with basic SQL concepts (select, filter, groupby, join, etc), beginner programming experience with Python (syntax, conditions, loops, functions), beginner programming experience with the Spark DataFrame API (Configure DataFrameReader and DataFrameWriter to read and write data, Express query transformations using DataFrame methods and Column expressions, etc.) Labs: No Certification Path: Databricks Certified Data Engineer Associate

Getting Started With Lakeflow Connect

2025-06-10 · Data + AI Summit 2025 Watch

talk

by Peter Pogorski (Databricks) , Giselle Goicochea (Databricks)

AI/ML Analytics CI/CD Databricks Google Analytics SaaS postgresql

Hundreds of customers are already ingesting data with Lakeflow Connect from SQL Server, Salesforce, ServiceNow, Google Analytics, SharePoint, PostgreSQL and more to unlock the full power of their data. Lakeflow Connect introduces built-in, no-code ingestion connectors from SaaS applications, databases and file sources to help unlock data intelligence. In this demo-packed session, you’ll learn how to ingest ready-to-use data for analytics and AI with a few clicks in the UI or a few lines of code. We’ll also demonstrate how Lakeflow Connect is fully integrated with the Databricks Data Intelligence Platform for built-in governance, observability, CI/CD, automated pipeline maintenance and more. Finally, we’ll explain how to use Lakeflow Connect in combination with downstream analytics and AI tools to tackle common business challenges and drive business impact.

The Future of Real Time Insights with Databricks and SAP

2025-06-10 · Data + AI Summit 2025 Watch

talk

by Alejandro Saucedo (Zalando SE) , Jon Levine (JPL) (Databricks) , Olaf Melchior (Zalando SE)

AI/ML Analytics Databricks ETL/ELT SAP

Tired of waiting on SAP data? Join this session to see how Databricks and SAP make it easy to query business-ready data—no ETL. With Databricks SQL, you’ll get instant scale, automatic optimizations, and built-in governance across all your enterprise analytics data. Fast and AI-powered insights from SAP data are finally possible—and this is how.

ThredUp’s Journey with Databricks: Modernizing Our Data Infrastructure

2025-06-10 · Data + AI Summit 2025 Watch

talk

by Aniket Mane (ThredUp Inc.) , Chintan Patel (NVIDIA)

AI/ML Analytics Data Management Databricks Delta

Building an AI-ready data platform requires strong governance, performance optimization, and seamless adoption of new technologies. At ThredUp, our Databricks journey began with a need for better data management and evolved into a full-scale transformation powering analytics, machine learning, and real-time decision-making. In this session, we’ll cover: Key inflection points: Moving from legacy systems to a modernized Delta Lake foundation Unity Catalog’s impact: Improving governance, access control, and data discovery Best practices for onboarding: Ensuring smooth adoption for engineering and analytics teams What’s next? Serverless SQL and conversational analytics with Genie Whether you’re new to Databricks or scaling an existing platform, you’ll gain practical insights on navigating the transition, avoiding pitfalls, and maximizing AI and data intelligence.

talk-data.com

Activity Trend

Top Events

Top Speakers

Crafting Business Brilliance: Leveraging Databricks SQL for Next-Gen Applications

Italgas’ AI Factory and the Future of Gas Distribution

Sponsored by: Firebolt | The Power of Low-latency Data for AI Apps

Composing High-Accuracy AI Systems With SLMs and Mini-Agents

SQL-Based ETL: Options for SQL-Only Databricks Development

Tracing the Path of a Row Through a GPU-Enabled Query Engine on the Grace-Blackwell Architecture

Transforming HP’s Print ELT Reporting with GenIT: Real-Time Insights Tool Powered by Databricks AI

Unify Your Data and Governance With Lakehouse Federation

Using Databricks to Power News Sentiment, a Capital IQ Pro Application

GPU Accelerated Spark Connect

How to Get the Most Out of Your BI Tools on Databricks

Introduction to Databricks SQL

Simplifying Data Pipelines With Lakeflow Declarative Pipelines: A Beginner’s Guide

Sponsored by: dbt Labs | Empowering the Enterprise for the Next Era of AI and BI

Accelerating Analytics: Integrating BI and Partner Tools to Databricks SQL

Data Management and Governance With UC

Deploy Workloads with Lakeflow Jobs (previously Databricks Workflows)

Getting Started With Lakeflow Connect

The Future of Real Time Insights with Databricks and SAP

ThredUp’s Journey with Databricks: Modernizing Our Data Infrastructure