Zillow has well-established, comprehensive systems for defining and enforcing data quality contracts and detecting anomalies.

In this session, we will share how we evaluated Databricks’ native data quality features and why we chose expectations for Lakeflow Declarative Pipelines, along with a combination of enforced constraints and self-defined queries for other job types. Our evaluation considered factors such as performance overhead, cost and scalability. We’ll highlight key improvements over our previous system and demonstrate how these choices have enabled Zillow to enforce scalable, production-grade data quality.

Additionally, we are actively testing Databricks’ latest data quality innovations, including enhancements to lakehouse monitoring and the newly released DQX project from Databricks Labs.

In summary, we will cover Zillow’s approach to data quality in the lakehouse, key lessons from our migration and actionable takeaways.
talk-data.com
Speaker
Firas Farah
1 talk
Sr. Solutions Architect
Databricks
Firas Farah is a Solutions Architect at Databricks, where he supports digital-native customers in building and scaling data and AI solutions. He has worked across industries to help organizations unlock the full potential of the Lakehouse platform. Prior to joining Databricks, Firas was a consultant specializing in MLOps and DataOps.
Bio from: Data + AI Summit 2025
Talks & appearances