DWH

Welcome Lakehouse, from a DWH transformation to a M&A data sharing

2025-06-12 · Data + AI Summit 2025 Watch

lightning_talk

by Gianfranco Arena (DXC Technology)

AWS Data Lakehouse Databricks Delta

At DXC, we helped our customer Fastweb with their "Welcome Lakehouse" project, a data warehouse transformation from on-premises to Databricks on AWS. But the implementation became something more. Thanks to features such as Lakehouse Federation and Delta Sharing, from the first day of the Fastweb+Vodafone merger we have been able to connect two different platforms with ease, letting the business focus on the value of data rather than on IT integration. This session will feature our customer Alessandro Gattolin of Fastweb, who will talk about the experience.

Iceberg Geo Type: Transforming Geospatial Data Management at Scale

2025-06-12 · Data + AI Summit 2025 Watch

talk

by Szehon Ho (Databricks), Jia Yu (Wherobots Inc.)

Analytics Data Management Iceberg Spark

The Apache Iceberg™ community is introducing native geospatial type support, addressing key challenges in managing geospatial data at scale, including fragmented formats and inefficiencies in storing large spatial datasets. This talk will delve into the origins of the Iceberg geo type, its specification design and future goals. We will examine its impact on both the geospatial and Iceberg communities: it introduces a standard data warehouse storage layer to the geospatial community and enables optimized geospatial analytics for Iceberg users. We will also present a live demonstration of the Iceberg geo data type with Apache Sedona™ and Apache Spark™, showcasing how it simplifies and accelerates geospatial analytics workflows and queries. Finally, we will provide an in-depth look at its current capabilities, outline the roadmap for future developments and offer a perspective on its role in advancing geospatial data management in the industry.

Performance Best Practices for Fast Queries, High Concurrency, and Scaling on Databricks SQL

2025-06-12 · Data + AI Summit 2025 Watch

talk

by Mostafa Mokhtar (Databricks), Jeremy Lewallen (Databricks)

Databricks SQL

Data warehousing in enterprise and mission-critical environments needs special consideration for price/performance. This session will explain how Databricks SQL addresses the most challenging requirements for high-concurrency, low-latency performance at scale. We will also cover the latest advancements in resource-based scheduling, autoscaling and caching enhancements that allow for seamless performance and workload management.

How Navy Federal's Enterprise Data Ecosystem Leverages Unity Catalog for Data + AI Governance

2025-06-12 · Data + AI Summit 2025 Watch

talk

by Krishnakumar Sivasubramanian (NFCU), Ricardo Portilla (Databricks)

AI/ML Cloud Computing Data Engineering Data Lake

Navy Federal Credit Union has 200+ enterprise data sources in the enterprise data lake. These data assets are used for training 100+ machine learning models and hydrating a semantic layer that serves an average of 4,000 business users daily across the credit union. The only option for extracting data from the analytic semantic layer was to allow consuming applications to access it via an already-overloaded cloud data warehouse. Visualizing data lineage for 1,000+ data pipelines and associated metadata is impossible, and understanding the granular cost of running data pipelines is a challenge. Implementing Unity Catalog opened an alternate path for accessing analytic semantic data from the lake. It also opened the door to removing duplicate data assets stored across multiple lakes, which will save hundreds of thousands of dollars in data engineering effort, compute and storage costs.

How to Migrate From Snowflake to Databricks SQL

2025-06-12 · Data + AI Summit 2025 Watch

talk

by Koundinya Srinivasarao (Databricks), Matt Holzapfel (Databricks)

Cloud Computing Databricks ETL/ELT Snowflake SQL

Migrating your Snowflake data warehouse to the Databricks Data Intelligence Platform can accelerate your data modernization journey. Though a cloud platform-to-cloud platform migration should be relatively easy, the breadth of the Databricks Platform provides flexibility and hence requires careful planning and execution. In this session, we present the migration methodology, technical approaches, automation tools, product/feature mapping, a technical demo and best practices using real-world case studies for migrating data, ELT pipelines and warehouses from Snowflake to Databricks.

Hands-on Learning: Databricks SQL in Action: Intelligent Data Warehousing, Analytics and BI Workshop (repeat)

2025-06-11 · Data + AI Summit 2025

workshop

by Pearl Ubaru (Databricks)

AI/ML Analytics BI Cloud Computing Data Lake Data Lakehouse Databricks SQL

Most organizations run complex cloud data architectures that silo applications, users and data. Join this interactive hands-on workshop to learn how Databricks SQL allows you to operate a multi-cloud lakehouse architecture that delivers data warehouse performance at data lake economics, with up to 12x better price/performance than traditional cloud data warehouses. Here’s what we’ll cover: how Databricks SQL fits in the Data Intelligence Platform, enabling you to operate a multi-cloud lakehouse architecture that delivers data warehouse performance at data lake economics; how to manage and monitor compute resources, data access and users across your lakehouse infrastructure; how to query directly on your data lake using your tools of choice or the built-in SQL editor and visualizations; and how to use AI to increase productivity when querying, completing code or building dashboards. Ask your questions during this hands-on lab, and the Databricks experts will guide you.

How to Migrate From Oracle to Databricks SQL

2025-06-11 · Data + AI Summit 2025 Watch

talk

by Laurent Léturgez (Databricks)

CSV Databricks Oracle PySpark SQL

Migrating your legacy Oracle data warehouse to the Databricks Data Intelligence Platform can accelerate your data modernization journey. In this session, learn the top strategies for completing this data migration. We will cover data type conversion, basic to complex code conversions, validation and reconciliation best practices. Discover the pros and cons of using CSV files to PySpark or using pipelines to Databricks tables. See before-and-after architectures of customers who have migrated, and learn about the benefits they realized.

Iceberg Table Format Adoption and Unified Metadata Catalog Implementation in Lakehouse Platform

2025-06-11 · Data + AI Summit 2025 Watch

talk

by Ruotian Wang (DoorDash), Sergey Zavgorodni (DoorDash)

Amazon EMR Data Lake Data Lakehouse Databricks Iceberg Snowflake Trino

The DoorDash Data organization is actively adopting the lakehouse paradigm. This presentation describes the methodology for migrating the classic data warehouse and data lake platforms to a unified lakehouse solution. The objectives of this effort include: elimination of excessive data movement; seamless integration and consolidation of the query engine layers, including Snowflake, Databricks, EMR and Trino; query performance optimization; abstracting away the complexity of underlying storage layers and table formats; and a strategic, justified decision on the unified metadata catalog used across various compute platforms.

Multi-Statement Transactions: How to Improve Data Consistency and Performance

2025-06-11 · Data + AI Summit 2025 Watch

lightning_talk

by Franco Patano (Databricks)

Data Lakehouse Databricks SQL

Multi-statement transactions bring the atomicity and reliability of traditional databases to modern data warehousing on the lakehouse. In this session, we’ll explore real-world patterns enabled by multi-statement transactions (including multi-table updates, deduplication pipelines and audit logging) and show how Databricks ensures atomicity and consistency across complex workflows. We’ll also dive into demos and share tips for getting started and for migrations with this feature in Databricks SQL.
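The multi-table update pattern named in this abstract can be illustrated with a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in engine and is not Databricks SQL syntax; the table names (orders, audit_log) and the helper functions are hypothetical. It only shows the general idea that several statements either all commit or all roll back.

```python
import sqlite3


def apply_order_update(conn, order_id, new_status, fail_midway=False):
    """Update an order and write an audit row as one atomic unit."""
    try:
        # "with conn" commits on success and rolls back if an exception escapes.
        with conn:
            conn.execute(
                "UPDATE orders SET status = ? WHERE id = ?", (new_status, order_id)
            )
            if fail_midway:
                raise RuntimeError("simulated failure between statements")
            conn.execute(
                "INSERT INTO audit_log (order_id, status) VALUES (?, ?)",
                (order_id, new_status),
            )
        return True
    except RuntimeError:
        return False


def current_status(conn, order_id):
    """Read back the status of one order."""
    row = conn.execute("SELECT status FROM orders WHERE id = ?", (order_id,)).fetchone()
    return row[0]


def demo():
    # Hypothetical two-table schema, held in memory for the example.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
    conn.execute("CREATE TABLE audit_log (order_id INTEGER, status TEXT)")
    conn.execute("INSERT INTO orders VALUES (1, 'new')")
    conn.commit()

    failed = apply_order_update(conn, 1, "shipped", fail_midway=True)
    status_after_failure = current_status(conn, 1)
    succeeded = apply_order_update(conn, 1, "shipped")
    status_after_success = current_status(conn, 1)
    audit_rows = conn.execute("SELECT COUNT(*) FROM audit_log").fetchone()[0]
    return failed, status_after_failure, succeeded, status_after_success, audit_rows
```

In this sketch the failed attempt leaves the order unchanged and writes no audit row, so the two tables never disagree.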

Hands-on Learning: Databricks SQL in Action: Intelligent Data Warehousing, Analytics and BI Workshop

2025-06-11 · Data + AI Summit 2025

workshop

by Pearl Ubaru (Databricks)

AI/ML Analytics BI Cloud Computing Data Lake Data Lakehouse Databricks SQL

Most organizations run complex cloud data architectures that silo applications, users and data. Join this interactive hands-on workshop to learn how Databricks SQL allows you to operate a multi-cloud lakehouse architecture that delivers data warehouse performance at data lake economics, with up to 12x better price/performance than traditional cloud data warehouses. Here’s what we’ll cover: how Databricks SQL fits in the Data Intelligence Platform, enabling you to operate a multi-cloud lakehouse architecture that delivers data warehouse performance at data lake economics; how to manage and monitor compute resources, data access and users across your lakehouse infrastructure; how to query directly on your data lake using your tools of choice or the built-in SQL editor and visualizations; and how to use AI to increase productivity when querying, completing code or building dashboards. Ask your questions during this hands-on lab, and the Databricks experts will guide you.

Summit Live: Best Practices for Data Warehouse Migrations

2025-06-11 · Data + AI Summit 2025 Watch

talk

by Laurent Léturgez (Databricks)

AI/ML Databricks SQL

Databricks SQL is the fastest-growing data warehouse on the market, used by over 10k organizations thanks to its price performance and AI innovations. See the best practices and common architectural challenges of migrating your legacy DW, including reference architectures. Learn how to easily migrate with the recently acquired Lakebridge migration tool and through our partners.

How to Migrate from Teradata to Databricks SQL

2025-06-11 · Data + AI Summit 2025 Watch

talk

by Fabien Contaminard (Databricks), Mehran Golestaneh (Databricks)

Databricks LLM SQL Teradata

Storage and processing costs of your legacy Teradata data warehouses impact your ability to deliver. Migrating your legacy Teradata data warehouse to the Databricks Data Intelligence Platform can accelerate your data modernization journey. In this session, learn the top strategies for completing this data migration. We will cover data type conversion, basic to complex code conversions, and validation and reconciliation best practices. We will also show how to use Databricks natively hosted LLMs to assist with migration activities. See before-and-after architectures of customers who have migrated, and learn about the benefits they realized.

Sponsored by: Informatica | Modernize analytics and empower AI in Databricks with trusted data using Informatica

Delta and Databricks as a Performant Exabyte-Scale Application Backend

2025-06-11 · Data + AI Summit 2025 Watch

lightning_talk

by Scott Schenkein (Capital One Financial)

Databricks Delta ETL/ELT NoSQL Cyber Security Spark Data Streaming

The Delta Lake architecture promises to provide a single, highly functional, and high-scale copy of data that can be leveraged by a variety of tools to satisfy a broad range of use cases. To date, most use cases have focused on interactive data warehousing, ETL, model training, and streaming. Real-time access is generally delegated to costly and sometimes difficult-to-scale NoSQL, indexed storage, and domain-specific specialty solutions, which provide limited functionality compared to Spark on Delta Lake. In this session, we will explore the Delta data-skipping and optimization model and discuss how Capital One leveraged it along with Databricks Photon and Spark Connect to implement a real-time web application backend. We’ll share how we built a highly functional, performant and cost-effective security information and event management user experience (SIEM UX).
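The general idea behind data skipping can be sketched in a few lines: per-file minimum and maximum values for a column let a reader skip files whose range cannot match a filter. The sketch below is plain Python under that assumption, not Delta Lake's or Capital One's implementation; the FileStats type, the files_to_scan helper and the file names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileStats:
    """Hypothetical per-file statistics for one column (here: an event timestamp)."""
    path: str
    min_ts: int
    max_ts: int


def files_to_scan(files, lo, hi):
    """Keep only files whose [min_ts, max_ts] range overlaps the query range [lo, hi]."""
    return [f.path for f in files if f.max_ts >= lo and f.min_ts <= hi]


# Three hypothetical data files with non-overlapping timestamp ranges.
FILES = [
    FileStats("part-000.parquet", 0, 99),
    FileStats("part-001.parquet", 100, 199),
    FileStats("part-002.parquet", 200, 299),
]

# A query for timestamps 150..160 only needs the middle file.
selected = files_to_scan(FILES, 150, 160)
```

Pruning of this kind works best when the data is laid out so that file ranges overlap as little as possible, which is why skipping and layout optimization are usually discussed together.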

Enterprise Cost Management for Data Warehousing with Databricks SQL

2025-06-11 · Data + AI Summit 2025 Watch

talk

by Patrick Yang (Databricks), Joo Ho Yeo (Databricks)

Databricks SQL

This session shows you how to gain visibility into your Databricks SQL spend and ensure cost efficiency. Learn about the latest features to gain detailed insights into Databricks SQL expenses so you can easily monitor and control your costs. Find out how you can enable attribution to internal projects, understand the Total Cost of Ownership, set up proactive controls and find ways to continually optimize your spend.

Sponsored by: RowZero | Spreadsheets in the modern data stack: security, governance, AI, and self-serve analytics

AI Powering Epsilon's Identity Strategy: Unified Marketing Platform on Databricks

2025-06-10 · Data + AI Summit 2025 Watch

talk

by Gairik Chakraborty (Epsilon Data Management), Boaz Super (Epsilon Data Management)

AI/ML Cloud Computing Data Lakehouse Data Management Data Science Databricks Delta Hadoop LLM Marketing

Join us to hear about how Epsilon Data Management migrated Epsilon’s unique, AI-powered marketing identity solution from multi-petabyte on-prem Hadoop and data warehouse systems to a unified Databricks Lakehouse platform. This transition enabled Epsilon to further scale its Decision Sciences solution and enable new cloud-based AI research capabilities on time and within budget, without being bottlenecked by the resource constraints of on-prem systems. Learn how Delta Lake, Unity Catalog, MLflow and LLM endpoints powered massive data volume, reduced data duplication, improved lineage visibility, accelerated Data Science and AI, and enabled new data to be immediately available for consumption by the entire Epsilon platform in a privacy-safe way. Using the Databricks platform as the base for AI and Data Science at global internet scale, Epsilon deploys marketing solutions across multiple cloud providers and multiple regions for many customers.

From Datavault to Delta Lake: Streamlining Data Sync with Lakeflow Connect

2025-06-10 · Data + AI Summit 2025 Watch

talk

by Olivia Ren (Databricks), Andrew Clarke (Australian Red Cross Lifeblood)

Analytics Azure Data Lakehouse Data Vault Databricks Delta SQL

In this session, we will explore the Australian Red Cross Lifeblood's approach to synchronizing an Azure SQL Datavault 2.0 (DV2.0) implementation with Unity Catalog (UC) using Lakeflow Connect. Lifeblood's DV2.0 data warehouse, which includes raw vault (RV) and business vault (BV) tables, as well as information marts defined as views, required a multi-step process to achieve data/business logic sync with UC. This involved using Lakeflow Connect to ingest RV and BV data, followed by a custom process utilizing JDBC to ingest view definitions, and the automated/manual conversion of T-SQL to Databricks SQL views, with Lakehouse Monitoring for validation. In this talk, we will share our journey, the design decisions we made, and how the resulting solution now supports analytics workloads, analysts, and data scientists at Lifeblood.

The JLL Training and Upskill Program for Our Warehouse Migration to Databricks

2025-06-10 · Data + AI Summit 2025 Watch

lightning_talk

by Kristopher Curtis (JLL)

Cloud Computing Data Lakehouse Databricks

Databricks Odyssey is JLL’s bespoke training program designed to upskill and prepare data professionals for a new world of data lakehouse. Based on the concepts of learn, practice and certify, participants earn points, moving through five levels by completing activities with business application of Databricks key features. Databricks Odyssey facilitates cloud data warehousing migration by providing best practice frameworks, ensuring efficient use of pay-per-compute platforms. JLL/T Insights and Data fosters a data culture through learning programs that develop in-house talent and create career pathways. Databricks Odyssey offers JLL-specific hands-on learning; a gamified 'level up' approach; and practical, applicable skills. Benefits include improved platform efficiency, enhanced data accuracy and client insights, ongoing professional development and potential cost savings through better utilization.

Comprehensive Data Warehouse Migrations to Databricks SQL

2025-06-10 · Data + AI Summit 2025 Watch

talk

by Simon Eligulashvili (Databricks), Sundar Shankar (Databricks)

Databricks SQL

This session is repeated. Databricks has a free, comprehensive solution for migrating legacy data warehouses from a wide range of source systems. See how we accelerate migrations from legacy data warehouses to Databricks SQL, achieving 50% faster migration than traditional methods. We'll cover the tool’s automated migration process: discovery (source system profiling), assessment (legacy code analysis), conversion (advanced code transpilation) and reconciliation (data validation). This comprehensive approach increases the predictability of migration projects, allowing businesses to plan and execute migrations with greater confidence.
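One simple form the reconciliation step could take is comparing row counts and an order-independent checksum between source and target tables. The sketch below assumes exactly that and does not describe how the tool in this session works; table_fingerprint, reconcile and the sample rows are hypothetical.

```python
import hashlib


def table_fingerprint(rows):
    """Return (row_count, checksum); the checksum does not depend on row order."""
    count = 0
    checksum = 0
    for row in rows:
        # Encode each row deterministically, with a separator between values.
        encoded = "\x1f".join(repr(value) for value in row).encode("utf-8")
        row_hash = int.from_bytes(hashlib.sha256(encoded).digest(), "big")
        # Summing (rather than XOR-ing) keeps duplicate rows from cancelling out.
        checksum = (checksum + row_hash) % (1 << 256)
        count += 1
    return count, checksum


def reconcile(source_rows, target_rows):
    """Compare two row sets by count and by order-independent checksum."""
    src_count, src_sum = table_fingerprint(source_rows)
    tgt_count, tgt_sum = table_fingerprint(target_rows)
    return {
        "row_count_match": src_count == tgt_count,
        "checksum_match": src_sum == tgt_sum,
    }


# Hypothetical sample data: same rows in a different order, then one changed value.
source = [(1, "a"), (2, "b")]
same_rows_reordered = [(2, "b"), (1, "a")]
one_value_changed = [(1, "a"), (2, "c")]
```

A matching count with a mismatched checksum points at changed values rather than missing rows, which narrows down where to look.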

talk-data.com
