Data quality is the most important attribute of a successful data platform, one that can accelerate data adoption and empower any organization to make data-driven decisions. However, traditional approaches to data quality, whether profiling-based, counts-based, or business-rules-based, are outdated and impractical for petabyte-scale data platforms where billions of rows are processed every day. In this talk, Sandhya Devineni and Rajesh Gundugollu will present a framework for using machine learning to detect data quality issues at scale in data products. The two data leaders at Asurion will highlight the lessons learned over years of advancing the state of data quality using machine learning at scale, and discuss the pain points and blind spots of traditional data quality processes. After sharing these lessons, the pair will dive into their implemented framework, which can be used to improve the accuracy and reliability of data-driven decisions by identifying bad-quality data records and to revolutionize how organizations approach data-driven decision making.
Speaker
Rajesh Gundugollu
Solving and simplifying petabyte scale problems!
Rajesh is an experienced data and thought leader with an accomplished history of making an impact, driven by customer obsession, simplified design, imagination, creativity, and a passion to innovate. As Director of Engineering and Architecture for Data, ML and AI, Rajesh helped drive rapid innovation of the data and ML platforms at Asurion, keeping them simple, available, and cost-effective while processing petabytes of data and helping the business unlock insights from data.
Bio from: Data Universe 2024