talk-data.com

Topic

Cloud Storage

object_storage file_storage cloud

Activity Trend: peak of 5 activities per quarter, 2020-Q1 to 2026-Q2

Top Events

O'Reilly Data Engineering Books: 21
Google Cloud Next '25: 10
Google Cloud Next '24: 7
Databricks DATA + AI Summit 2023: 6
Microsoft Ignite 2025: 5
Data + AI Summit 2025: 5
Data Engineering Podcast: 2
The Analytics Power Hour: 1
SciPy 2025: 1
DATA MINER Big Data Europe Conference 2020: 1
Get things in order with GCP Workflows - PART 2 || FREE Community Training: 1
Snowflake World Tour - Stockholm: 1

Top Speakers

Larry Coyne: 6
Joe Hew: 5
Michael Scott: 5
Derek Erdmann: 5
Alberto Barajas Ortiz: 5
Aderson Pacini: 4
Bert Dufrasne: 4
Tomoaki Ogino: 4
Chen Zhu: 4
Taisei Takai: 4
Carlos Villuendas (Microsoft): 3
Trevor Davis (Microsoft): 3

Activities

Filtered by: Data + AI Summit 2025

Sponsored by: Google Cloud | Powering AI & Analytics: Innovations in Google Cloud Storage for Data Lakes

Mastering Change Data Capture With Lakeflow Declarative Pipelines

2025-06-11 · Data + AI Summit 2025

talk

by Ray Zhu (Databricks), Jacob Gollub (Square)

Analytics Cloud Computing Data Streaming

Transactional systems are a common source of data for analytics, and Change Data Capture (CDC) offers an efficient way to extract only what has changed. However, ingesting CDC data into an analytics system comes with challenges, such as handling out-of-order events or maintaining global order across multiple streams. These issues often require complex, stateful stream processing logic.

This session will explore how Lakeflow Declarative Pipelines simplifies CDC ingestion using the Apply Changes function. With Apply Changes, global ordering across multiple change feeds is handled automatically, so there is no need to manually manage state or understand advanced streaming concepts like watermarks. It supports both snapshot-based inputs from cloud storage and continuous change feeds from systems like message buses, reducing complexity for common streaming use cases.
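
To make the out-of-order problem concrete, here is a minimal conceptual sketch in plain Python. It is not the Lakeflow Apply Changes API; the `ChangeEvent` type, the `apply_changes` function, and the integer `sequence` column are hypothetical names chosen for illustration. The sketch assumes every change event carries a key and a globally comparable sequence value, and it keeps only the highest-sequence event per key, so events from several feeds can arrive in any order.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, Optional


@dataclass(frozen=True)
class ChangeEvent:
    """One CDC record (hypothetical shape, for illustration only)."""
    key: str                              # primary key of the source row
    sequence: int                         # assumed globally comparable ordering value
    op: str                               # "upsert" or "delete"
    row: Optional[Dict[str, Any]] = None  # row contents for upserts


def apply_changes(events: Iterable[ChangeEvent]) -> Dict[str, Dict[str, Any]]:
    """Return the current table state implied by the events, in any arrival order."""
    latest: Dict[str, ChangeEvent] = {}
    for event in events:
        current = latest.get(event.key)
        # Keep the event with the highest sequence per key. Deletes are kept
        # as tombstones so that a late, older upsert cannot resurrect the row.
        if current is None or event.sequence > current.sequence:
            latest[event.key] = event
    # Materialize the table: drop keys whose latest event is a delete.
    return {key: event.row for key, event in latest.items() if event.op != "delete"}


# Events from two feeds, interleaved and out of order.
events = [
    ChangeEvent("a", 2, "upsert", {"v": 2}),
    ChangeEvent("a", 1, "upsert", {"v": 1}),   # arrives late, must be ignored
    ChangeEvent("b", 5, "delete"),
    ChangeEvent("b", 4, "upsert", {"v": 4}),   # older than the delete
    ChangeEvent("c", 1, "upsert", {"v": 10}),
]
table = apply_changes(events)
print(table)
```

A real streaming implementation would also have to bound how long tombstones and per-key state are retained, which is the kind of state management the abstract says the declarative function handles for the user.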

Real-Time Analytics Pipeline for IoT Device Monitoring and Reporting

2025-06-11 · Data + AI Summit 2025

talk

by Nayan Sharma (CKDelta), Padraic Kirrane (CK Delta)

Analytics API Cloud Computing Data Quality Databricks IoT Data Streaming

This session will show how we implemented a solution to support high-frequency data ingestion from smart meters. We implemented a robust API endpoint that interfaces directly with IoT devices. This API processes messages in real time from millions of distributed IoT devices and meters across the network. The architecture leverages cloud storage as a landing zone for the raw data, followed by a streaming pipeline built on Lakeflow Declarative Pipelines. This pipeline implements a multi-layer medallion architecture to progressively clean, transform and enrich the data. The pipeline operates continuously to maintain near real-time data freshness in our gold layer tables. These datasets connect directly to Databricks Dashboards, providing stakeholders with immediate insights into their operational metrics. This solution demonstrates how modern data architecture can handle high-volume IoT data streams while maintaining data quality and providing accessible real-time analytics for business users.
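
The layered flow described above (raw landing zone, then progressively cleaned and aggregated tables) can be sketched in plain Python. This is only an illustration of the medallion idea, not the speakers' pipeline or any Databricks API; the field names `device_id` and `kwh`, the validation rules, and the functions `to_silver` and `to_gold` are all assumptions made for the example.

```python
import json
from collections import defaultdict
from typing import Dict, Iterable, List


def to_silver(bronze_lines: Iterable[str]) -> List[Dict[str, object]]:
    """Clean raw (bronze) messages: parse JSON and drop malformed or invalid readings."""
    silver = []
    for line in bronze_lines:
        try:
            record = json.loads(line)
            device_id = record["device_id"]
            kwh = float(record["kwh"])
        except (ValueError, KeyError, TypeError):
            continue  # unparseable message or missing field
        if kwh < 0:
            continue  # assumed rule: negative consumption is invalid
        silver.append({"device_id": device_id, "kwh": kwh})
    return silver


def to_gold(silver: Iterable[Dict[str, object]]) -> Dict[str, float]:
    """Aggregate cleaned readings into a reporting (gold) view: total kWh per device."""
    totals: Dict[str, float] = defaultdict(float)
    for record in silver:
        totals[record["device_id"]] += record["kwh"]
    return dict(totals)


# Raw messages as they might land in cloud storage (hypothetical payloads).
bronze = [
    '{"device_id": "m1", "kwh": 1.5}',
    "not json",
    '{"device_id": "m1", "kwh": 2.5}',
    '{"device_id": "m2", "kwh": -1}',
    '{"kwh": 3}',
]
silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)
```

In a streaming setting each layer would run continuously over new arrivals rather than over a fixed list, but the separation of concerns between layers is the same.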

Sponsored by: Fivetran | Raw Data to Real-Time Insights: How Dropbox Revolutionized Data Ingestion

Lakeflow Connect: Smarter, Simpler File Ingestion With the Next Generation of Auto Loader

2025-06-10 · Data + AI Summit 2025

talk

by Sandip Agarwala (Databricks), Chavdar Botev (Databricks)

API Cloud Computing Data Lakehouse Data Quality Databricks Delta

Auto Loader is the definitive tool for ingesting data from cloud storage into your lakehouse. In this session, we’ll unveil new features and best practices that simplify every aspect of cloud storage ingestion. We’ll demo out-of-the-box observability for pipeline health and data quality, walk through improvements for schema management, introduce a series of new data formats and unveil recent strides in Auto Loader performance. Along the way, we’ll provide examples and best practices for optimizing cost and performance. Finally, we’ll introduce a preview of what’s coming next — including a REST API for pushing files directly to Delta, a UI for creating cloud storage pipelines and more. Join us to help shape the future of file ingestion on Databricks.