talk-data.com

Ray Zhu

Speaker · 2 talks

Director of Product Management, Databricks

Ray Zhu is a Director of Product Management at Databricks, where he focuses on data engineering and streaming products. He's passionate about building products that empower data practitioners of different skill sets to simplify and scale their data pipelines, enabling greater customer and business value.

Bio from: Data + AI Summit 2025

Filtering by: Data + AI Summit 2025

Talks & appearances

Showing 2 of 2 activities

Mastering Change Data Capture With Lakeflow Declarative Pipelines

Transactional systems are a common source of data for analytics, and Change Data Capture (CDC) offers an efficient way to extract only what has changed. However, ingesting CDC data into an analytics system comes with challenges, such as handling out-of-order events or maintaining global order across multiple streams. These issues often require complex, stateful stream processing logic. This session will explore how Lakeflow Declarative Pipelines simplifies CDC ingestion using the Apply Changes function. With Apply Changes, global ordering across multiple change feeds is handled automatically; there is no need to manually manage state or understand advanced streaming concepts like watermarks. It supports both snapshot-based inputs from cloud storage and continuous change feeds from systems like message buses, reducing complexity for common streaming use cases.
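The core idea in this abstract, reconciling out-of-order change events per key using a sequence column, can be illustrated in plain Python. This is a hypothetical sketch of the general CDC-merge pattern, not the Lakeflow Apply Changes API itself; the `Change` type, field names, and `apply_changes` helper here are invented for illustration.

```python
# Hypothetical sketch of CDC reconciliation: for each key, keep only the
# change event with the highest sequence value, then materialize the final
# table state. Illustrative only; not the Lakeflow Declarative Pipelines API.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Change:
    key: str
    seq: int                    # ordering column, e.g. a commit timestamp
    op: str                     # "upsert" or "delete"
    value: Optional[dict]

def apply_changes(events: list) -> dict:
    latest = {}
    for ev in events:
        # Out-of-order arrival is handled by comparing sequence values,
        # so the order events arrive in does not matter.
        cur = latest.get(ev.key)
        if cur is None or ev.seq > cur.seq:
            latest[ev.key] = ev
    # Materialize: deletes drop the row, upserts keep the latest value.
    return {k: ev.value for k, ev in latest.items() if ev.op == "upsert"}

# Out-of-order feed: the delete for "a" arrives before a stale upsert.
events = [
    Change("a", 2, "delete", None),
    Change("a", 1, "upsert", {"name": "old"}),
    Change("b", 1, "upsert", {"name": "bee"}),
]
print(apply_changes(events))  # {'b': {'name': 'bee'}}
```

In a real stream this per-key state would have to be managed by the engine across micro-batches, which is exactly the complexity the abstract says Apply Changes hides.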

A Comprehensive Guide to Streaming on the Data Intelligence Platform

This session is repeated. Is stream processing the future? We think so, and we're building it with you using the latest capabilities in Apache Spark™ Structured Streaming. If you're a power user, this session is for you: we'll demo new advanced features, from state transformations to real-time mode. If you prefer simplicity, this session is also for you: we'll show how Lakeflow Declarative Pipelines simplifies managing streaming pipelines. And if you're somewhere in between, we've got you covered: we'll explain when to use your own streaming jobs versus Lakeflow Declarative Pipelines.