talk-data.com

YouTube 2022-07-19 at 16:18

How Robinhood Built a Streaming Lakehouse to Bring Data Freshness from 24h to Less Than 15 Mins

Description

Robinhood’s data lake is the foundation that powers business analytics, product experimentation, and machine learning applications throughout our organization. Join this session, where we will share our journey of building a scalable streaming data lakehouse with Spark, Postgres, and other leading open source technologies.

We will lay out our architecture in depth and describe how we perform change data capture (CDC) streaming ingestion and incremental processing of thousands of Postgres tables into our data lake.
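
To make the idea of incremental processing concrete, here is a minimal sketch in plain Python of how a micro-batch of CDC change events could be merged into a keyed table snapshot. It is an illustration only, not Robinhood's implementation: the event shape (`op`, `lsn`, `before`, `after`) is an assumption loosely modeled on Debezium-style change events, and the function name `apply_cdc_batch` and the `id` key column are hypothetical.

```python
from typing import Any, Dict, Iterable

Row = Dict[str, Any]
Event = Dict[str, Any]


def apply_cdc_batch(
    snapshot: Dict[Any, Row],
    events: Iterable[Event],
    key: str = "id",  # hypothetical primary-key column
) -> Dict[Any, Row]:
    """Merge a micro-batch of change events into a keyed table snapshot.

    Assumed event shape (illustrative, Debezium-like):
      op     : "c" (create), "u" (update), or "d" (delete)
      lsn    : monotonically increasing log sequence number
      before : row image before the change (None for creates)
      after  : row image after the change (None for deletes)
    """
    # Step 1: keep only the newest event per primary key, so that
    # out-of-order or repeated events within the batch cannot win.
    latest: Dict[Any, Event] = {}
    for ev in events:
        row = ev["before"] if ev["op"] == "d" else ev["after"]
        pk = row[key]
        if pk not in latest or ev["lsn"] > latest[pk]["lsn"]:
            latest[pk] = ev

    # Step 2: apply the surviving events as upserts or deletes
    # against a copy, leaving the input snapshot untouched.
    merged = dict(snapshot)
    for pk, ev in latest.items():
        if ev["op"] == "d":
            merged.pop(pk, None)
        else:
            merged[pk] = ev["after"]
    return merged


snapshot = {
    1: {"id": 1, "balance": 100},
    2: {"id": 2, "balance": 50},
}
events = [
    {"op": "u", "lsn": 11, "before": {"id": 1, "balance": 100},
     "after": {"id": 1, "balance": 120}},
    {"op": "c", "lsn": 12, "before": None,
     "after": {"id": 3, "balance": 10}},
    {"op": "d", "lsn": 13, "before": {"id": 2, "balance": 50},
     "after": None},
    # An older event arriving late; it must not overwrite lsn 11.
    {"op": "u", "lsn": 10, "before": {"id": 1, "balance": 80},
     "after": {"id": 1, "balance": 100}},
]
updated = apply_cdc_batch(snapshot, events)
```

Only the rows touched by the batch are rewritten, which is what lets freshness be bounded by the batch interval rather than by a full daily reload. A production pipeline would typically express the same upsert and delete logic as a merge into a lakehouse table format instead of an in-memory dictionary.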
