talk-data.com

Rajesh Gundugollu

Speaker

2 talks

Solving and simplifying petabyte-scale problems!

Rajesh is an experienced data and thought leader with an accomplished history of making an impact, driven by customer obsession, simplified design, imagination, creativity, and a passion to innovate. As Director of Engineering and Architecture for Data, ML and AI at Asurion, he helped the company's data and ML platforms innovate rapidly while staying simple, available, and cost-effective, processing petabytes of data and helping the business unlock insights from that data.

Bio from: Data Universe 2024

Talks & appearances

Data quality is the most important attribute of a successful data platform, one that can accelerate data adoption and empower any organization to make data-driven decisions. However, traditional profiling-based, counts-based, and business-rules-based data quality checks are outdated and impractical on petabyte-scale data platforms, where billions of rows are processed every day. In this talk, Sandhya Devineni and Rajesh Gundugollu will present a framework for using machine learning to detect data quality issues at scale in data products. The two data leaders at Asurion will highlight the lessons learned over years of advancing the state of data quality using machine learning at scale, and will discuss the pain points and blind spots of traditional data quality processes. After sharing these lessons, the pair will dive into the framework they implemented, which can be used to improve the accuracy and reliability of data-driven decisions by identifying bad-quality data records, revolutionizing how organizations approach data-driven decision making.

Workload orchestration is at the heart of a successful data lakehouse implementation, especially for the “house” part, which represents the data warehouse workloads. These workloads are often complex because warehouse data, by its very nature, brings dependency orchestration problems. We at Asurion have spent years perfecting our Airflow solution to make it a superpower for our data engineers. We have innovated in key areas such as a single operator for all use cases, automatic DAG code generation, custom UI components for data engineers, and monitoring tools. With more than a few million job runs per year on a platform with over three nines of availability, we have condensed years of learnings into valuable ideas that can inspire and help other data enthusiasts. This session will walk the audience through some blind spots and pain points of Airflow architecture, scaling, and engineering culture.