AWS re:Invent 2025 - Modernize your data warehouse by moving to Amazon Redshift (ANT317)

2025-12-06 · AWS re:Invent 2024 Watch

video

Agile/Scrum AI/ML Analytics AWS Cloud Computing Redshift

Are you spending too much time on data warehouse management tasks like hardware provisioning, software patching, and performance tuning and not enough time building your applications and innovating with data? Tens of thousands of customers rely on AWS Analytics every day to run and scale analytics in seconds on all their data without managing data warehouse infrastructure. In this session, you’ll learn best practices and proven strategies for modernizing your data warehouse, helping your build powerful analytics and machine learning applications that operate at scale while keeping costs low.

Learn more: More AWS events: https://go.aws/3kss9CP

Subscribe: More AWS videos: http://bit.ly/2O3zS75 More AWS events videos: http://bit.ly/316g9t4

ABOUT AWS: Amazon Web Services (AWS) hosts events, both online and in-person, bringing the cloud computing community together to connect, collaborate, and learn from AWS experts. AWS is the world's most comprehensive and broadly adopted cloud platform, offering over 200 fully featured services from data centers globally. Millions of customers—including the fastest-growing startups, largest enterprises, and leading government agencies—are using AWS to lower costs, become more agile, and innovate faster.

AWSreInvent #AWSreInvent2025 #AWS

AWS re:Invent 2025 - What's new in Amazon Redshift and Amazon Athena (ANT206)

2025-12-03 · AWS re:Invent 2024 Watch

video

Agile/Scrum AI/ML Analytics Athena AWS Cloud Computing Redshift S3 SQL

Learn how AWS is enhancing its SQL analytics offerings with new capabilities in Amazon Redshift and Amazon Athena. Discover how Redshift's AI-powered data warehousing capabilities are enabling customers to modernize their analytics workloads with enhanced performance and cost optimization. Explore Athena's latest features for interactively querying data directly in their Amazon S3 data lakes. This session showcases new features and real-world examples of how organizations are using these services to accelerate business insights while optimizing costs.

Learn more: More AWS events: https://go.aws/3kss9CP

Subscribe: More AWS videos: http://bit.ly/2O3zS75 More AWS events videos: http://bit.ly/316g9t4

ABOUT AWS: Amazon Web Services (AWS) hosts events, both online and in-person, bringing the cloud computing community together to connect, collaborate, and learn from AWS experts. AWS is the world's most comprehensive and broadly adopted cloud platform, offering over 200 fully featured services from data centers globally. Millions of customers—including the fastest-growing startups, largest enterprises, and leading government agencies—are using AWS to lower costs, become more agile, and innovate faster.

AWSreInvent #AWSreInvent2025 #AWS

AWS re:Invent 2025 - Scaling Amazon Redshift with a multi-warehouse architecture (ANT318)

2025-12-02 · AWS re:Invent 2024 Watch

video

Agile/Scrum Analytics AWS Cloud Computing GenAI Redshift

Enterprise analytics platforms are undergoing a major transformation—from centralized, overloaded data warehouses to federated, governed, GenAI-ready multi-warehouse architectures. In this session, you’ll learn how to design your data warehouse architecture to scale with your business needs. We’ll explore the end-to-end architectural evolution from a monolithic Redshift cluster to a modern multi-warehouse architecture and the best practices to deploy them in a cost-effective manner.

Learn more: More AWS events: https://go.aws/3kss9CP

Subscribe: More AWS videos: http://bit.ly/2O3zS75 More AWS events videos: http://bit.ly/316g9t4

ABOUT AWS: Amazon Web Services (AWS) hosts events, both online and in-person, bringing the cloud computing community together to connect, collaborate, and learn from AWS experts. AWS is the world's most comprehensive and broadly adopted cloud platform, offering over 200 fully featured services from data centers globally. Millions of customers—including the fastest-growing startups, largest enterprises, and leading government agencies—are using AWS to lower costs, become more agile, and innovate faster.

AWSreInvent #AWSreInvent2025 #AWS

Bilt's conversational data layer: How we connected data to LLMs with dbt

2025-10-14 · dbt Coalesce 2025 Watch

talk

by Nicholas Heron (Bilt Rewards) , Ben Kramer (Bilt Rewards) , James Dorado (Bilt Rewards)

AI/ML dbt LLM

Bilt Rewards turned their dbt project into a natural language interface. By connecting their semantic layer and underlying data warehouse to an LLM, business users and data analysts can ask real business questions and get trusted and creative insights. This session shows how they modeled their data for AI, how they kept accuracy intact, and increased data driven conversations across the business.

Minus Three Tier: Data Architecture Turned Upside Down

2025-09-26 · PyData Amsterdam 2025 Watch

talk

by Hannes Mühleisen (DuckDB Labs)

API Data Engineering Data Lakehouse

Every data architecture diagram out there makes it abundantly clear who's in charge: At the bottom sits the analyst, above that is an API server, and on the very top sits the mighty data warehouse. This pattern is so ingrained we never ever question its necessity, despite its various issues like slow data response time, multi-level scaling issues, and massive cost.

But there is another way: Disconnect of storage and compute enables localization of query processing closer to people, leading to much snappier responses, natural scaling with client-side query processing, and much reduced cost.

In this talk, it will be discussed how modern data engineering paradigms like decomposition of storage, single-node query processing, and lakehouse formats enable a radical departure from the tired three-tier architecture. By inverting the architecture we can put user's needs first. We can rely on commoditised components like object store to enable fast, scalable, and cost-effective solutions.

Legacy to Snowflake: How to Avoid Failing Your Data Warehouse Migration

2025-06-17 · Summit 2025 - On Demand Watch

session

AI/ML Cloud Computing Snowflake

Migrating a legacy data warehouse to Snowflake should be a predictable task. However, after participating in numerous projects, common failure patterns have emerged. In this session, we’ll explore typical pitfalls when moving to the Snowflake AI Data Cloud and offer recommendations for avoiding them. We’ll cover mistakes at every stage of the process, from technical details to end-user involvement and everything in between — code conversion (using SnowConvert!), data migration, deployment, optimization, testing and project management.

Welcome Lakehouse, from a DWH transformation to a M&A data sharing

2025-06-12 · Data + AI Summit 2025 Watch

lightning_talk

by Gianfranco Arena (Dxc Technology)

AWS Data Lakehouse Databricks Delta

At DXC, we helped our customer FastWeb with their "Welcome Lakehouse" project - a data warehouse transformation from on-premises to Databricks on AWS. But the implementation became something more. Thanks to features such as Lakehouse Federation and Delta Sharing, from the first day of the Fastweb+Vodafone merger, we have been able to connect two different platforms with ease and make the business focus on the value of data and not on the IT integration. This session will feature our customer Alessandro Gattolin of Fastweb to talk about the experience.

Iceberg Geo Type: Transforming Geospatial Data Management at Scale

2025-06-12 · Data + AI Summit 2025 Watch

talk

by Szehon Ho (Databricks) , Jia Yu (Wherobots Inc.)

Analytics Data Management Iceberg Spark

The Apache Iceberg™ community is introducing native geospatial type support, addressing key challenges in managing geospatial data at scale, including fragmented formats and inefficiencies in storing large spatial datasets. This talk will delve into the origins of the Iceberg geo type, its specification design and future goals. We will examine the impact on both the geospatial and Iceberg communities, in introducing a standard data warehouse storage layer to the geospatial community, and enabling optimized geospatial analytics for Iceberg users. We will also present a live demonstration of the Iceberg geo data type with Apache Sedona™ and Apache Spark™, showcasing how it simplifies and accelerates geospatial analytics workflows and queries. Finally, we will also provide an in-depth look at its current capabilities and outline the roadmap for future developments, and offer a perspective on its role in advancing geospatial data management in the industry.

Performance Best Practices for Fast Queries, High Concurrency, and Scaling on Databricks SQL

2025-06-12 · Data + AI Summit 2025 Watch

talk

by Mostafa Mokhtar (Databricks) , Jeremy Lewallen (Databricks)

Databricks SQL

Data warehousing in enterprise and mission-critical environments needs special consideration for price/performance. This session will explain how Databricks SQL addresses the most challenging requirements for high-concurrency, low-latency performance at scale. We will also cover the latest advancements in resource-based scheduling, autoscaling and caching enhancements that allow for seamless performance and workload management.

How Navy Federal's Enterprise Data Ecosystem Leverages Unity Catalog for Data + AI Governance

2025-06-12 · Data + AI Summit 2025 Watch

talk

by Krishnakumar Sivasubramanian (NFCU) , Ricardo Portilla (Databricks)

AI/ML Cloud Computing Data Engineering Data Lake

Navy Federal Credit Union has 200+ enterprise data sources in the enterprise data lake. These data assets are used for training 100+ machine learning models and hydrating a semantic layer for serving, at an average 4,000 business users daily across the credit union. The only option for extracting data from analytic semantic layer was to allow consuming application to access it via an already-overloaded cloud data warehouse. Visualizing data lineage for 1,000 + data pipelines and associated metadata is impossible and understanding the granular cost for running data pipelines is a challenge. Implementing Unity Catalog opened alternate path for accessing analytic semantic data from lake. It also opened the doors to remove duplicate data assets stored across multiple lakes which will save hundred thousands of dollars in data engineering efforts, compute and storage costs.

How to Migrate From Snowflake to Databricks SQL

2025-06-12 · Data + AI Summit 2025 Watch

talk

by Koundinya Srinivasarao (Databricks) , Matt Holzapfel (Databricks)

Cloud Computing Databricks ETL/ELT Snowflake SQL

Migrating your Snowflake data warehouse to the Databricks Data Intelligence Platform can accelerate your data modernization journey. Though a cloud platform-to-cloud platform migration should be relatively easy, the breadth of the Databricks Platform provides flexibility and hence requires careful planning and execution. In this session, we present the migration methodology, technical approaches, automation tools, product/feature mapping, a technical demo and best practices using real-world case studies for migrating data, ELT pipelines and warehouses from Snowflake to Databricks.

How to Migrate From Oracle to Databricks SQL

2025-06-11 · Data + AI Summit 2025 Watch

talk

by Laurent Léturgez (Databricks)

CSV Databricks Oracle PySpark SQL

Migrating your legacy Oracle data warehouse to the Databricks Data Intelligence Platform can accelerate your data modernization journey. In this session, learn the top strategies for completing this data migration. We will cover data type conversion, basic to complex code conversions, validation and reconciliation best practices. Discover the pros and cons of using CSV files to PySpark or using pipelines to Databricks tables. See before-and-after architectures of customers who have migrated, and learn about the benefits they realized.

Iceberg Table Format Adoption and Unified Metadata Catalog Implementation in Lakehouse Platform

2025-06-11 · Data + AI Summit 2025 Watch

talk

by Ruotian Wang (Doordash) , Sergey Zavgorodni (DoorDash)

Amazon EMR Data Lake Data Lakehouse Databricks Iceberg Snowflake Trino

DoorDash Data organization actively adopts LakeHouse paradigm. This presentation describes the methodology which allows to migrate the classic Data Warehouse and Data Lake platforms to unified LakeHouse solution.The objective of this effort include Elimination of excessive data movement.Seamless integration and consolidation of the query engine layers, including Snowflake, Databricks, EMR and Trino.Query performance optimization.Abstracting away complexity of underlying storage layers and table formatsStrategic and justified decision on the Unified Metadata catalog used across varios compute platforms

Multi-Statement Transactions: How to Improve Data Consistency and Performance

2025-06-11 · Data + AI Summit 2025 Watch

lightning_talk

by Franco Patano (Databricks)

Data Lakehouse Databricks SQL

Multi-statement transactions bring the atomicity and reliability of traditional databases to modern data warehousing on the lakehouse. In this session, we’ll explore real-world patterns enabled by multi-statement transactions — including multi-table updates, deduplication pipelines and audit logging — and show how Databricks ensures atomicity and consistency across complex workflows. We’ll also dive into demos and share tips to getting started and migrations with this feature in Databricks SQL.

Summit Live: Best Practices for Data Warehouse Migrations

2025-06-11 · Data + AI Summit 2025 Watch

talk

by Laurent Léturgez (Databricks)

AI/ML Databricks SQL

Databricks SQL is the fastest-growing data warehouse on the market, with over 10k organizations thanks to its price performance and AI innovations. See the best practices and common architectural challenges of migrating your legacy DW, including reference architectures. Learn how to easily migrate per the recently acquired the Lakebridge migration tool, and through our partners.

How to Migrate from Teradata to Databricks SQL

2025-06-11 · Data + AI Summit 2025 Watch

talk

by Fabien Contaminard (Databricks) , Mehran Golestaneh (Databricks)

Databricks LLM SQL Teradata

Storage and processing costs of your legacy Teradata data warehouses impact your ability to deliver. Migrating your legacy Teradata data warehouse to the Databricks Data Intelligence Platform can accelerate your data modernization journey. In this session, learn the top strategies for completing this data migration. We will cover data type conversion, basic to complex code conversions, validation and reconciliation best practices. How to use Databricks natively hosted LLMs to assist with migration activities. See before-and-after architectures of customers who have migrated, and learn about the benefits they realized.

Sponsored by: Informatica | Modernize analytics and empower AI in Databricks with trusted data using Informatica

Delta and Databricks as a Performant Exabyte-Scale Application Backend

2025-06-11 · Data + AI Summit 2025 Watch

lightning_talk

by Scott Schenkein (Capital One Financial)

Databricks Delta ETL/ELT NoSQL Cyber Security Spark Data Streaming

The Delta Lake architecture promises to provide a single, highly functional, and high-scale copy of data that can be leveraged by a variety of tools to satisfy a broad range of use cases. To date, most use cases have focused on interactive data warehousing, ETL, model training, and streaming. Real-time access is generally delegated to costly and sometimes difficult-to-scale NoSQL, indexed storage, and domain-specific specialty solutions, which provide limited functionality compared to Spark on Delta Lake. In this session, we will explore the Delta data-skipping and optimization model and discuss how Capital One leveraged it along with Databricks photon and Spark Connect to implement a real-time web application backend. We’ll share how we built a highly-functional and performant security information and event management user experience (SIEM UX) that is cost effective.

Enterprise Cost Management for Data Warehousing with Databricks SQL

2025-06-11 · Data + AI Summit 2025 Watch

talk

by Patrick Yang (Databricks) , Joo Ho Yeo (Databricks)

Databricks SQL

This session shows you how to gain visibility into your Databricks SQL spend and ensure cost efficiency. Learn about the latest features to gain detailed insights into Databricks SQL expenses so you can easily monitor and control your costs. Find out how you can enable attribution to internal projects, understand the Total Cost of Ownership, set up proactive controls and find ways to continually optimize your spend.

Sponsored by: RowZero | Spreadsheets in the modern data stack: security, governance, AI, and self-serve analytics

talk-data.com

DWH

Activity Trend

Top Events

Top Speakers

AWS re:Invent 2025 - Modernize your data warehouse by moving to Amazon Redshift (ANT317)

AWSreInvent #AWSreInvent2025 #AWS

AWS re:Invent 2025 - What's new in Amazon Redshift and Amazon Athena (ANT206)

AWSreInvent #AWSreInvent2025 #AWS

AWS re:Invent 2025 - Scaling Amazon Redshift with a multi-warehouse architecture (ANT318)

AWSreInvent #AWSreInvent2025 #AWS

Bilt's conversational data layer: How we connected data to LLMs with dbt

Minus Three Tier: Data Architecture Turned Upside Down

Legacy to Snowflake: How to Avoid Failing Your Data Warehouse Migration

Welcome Lakehouse, from a DWH transformation to a M&A data sharing

Iceberg Geo Type: Transforming Geospatial Data Management at Scale

Performance Best Practices for Fast Queries, High Concurrency, and Scaling on Databricks SQL

How Navy Federal's Enterprise Data Ecosystem Leverages Unity Catalog for Data + AI Governance

How to Migrate From Snowflake to Databricks SQL

How to Migrate From Oracle to Databricks SQL

Iceberg Table Format Adoption and Unified Metadata Catalog Implementation in Lakehouse Platform

Multi-Statement Transactions: How to Improve Data Consistency and Performance

Summit Live: Best Practices for Data Warehouse Migrations

How to Migrate from Teradata to Databricks SQL

Sponsored by: Informatica | Modernize analytics and empower AI in Databricks with trusted data using Informatica

Delta and Databricks as a Performant Exabyte-Scale Application Backend

Enterprise Cost Management for Data Warehousing with Databricks SQL

Sponsored by: RowZero | Spreadsheets in the modern data stack: security, governance, AI, and self-serve analytics