talk-data.com

Speaker

Sergiy Kanyshchev

1 talk

Staff Software Engineer, Databricks

Sergiy Kanyshchev is a Staff Software Engineer at Databricks with a background in web-scale data processing. As a member of the Data Platform team, he focuses on scaling best practices for data handling and stewardship within the internal data lakehouse. He also contributes to the federated Feature Usage pipeline, which provides insights into customer engagement with Databricks products. Prior to joining Databricks, Sergiy spent 12 years at Google enhancing the company's main web crawler, Googlebot. This included serving as a Tech Lead on a team responsible for guiding the Crawl towards the useful parts of the Web and measuring the Crawl’s impact via analytics and quality metrics at the scale of trillions of URLs.

Bio from: Data + AI Summit 2025

Talks & appearances

Trust You Can Measure: Data Quality Standards in The Lakehouse

Do you trust your data? If you’ve ever struggled to figure out which datasets are reliable, well-governed, or safe to use, you’re not alone. At Databricks, our own internal lakehouse faced the same challenge—hundreds of thousands of tables, but no easy way to tell which data met quality standards. In this talk, the Databricks Data Platform team shares how we tackled this problem by building the Data Governance Score—a way to systematically measure and surface trust signals across the entire lakehouse. You’ll learn how we leverage Unity Catalog, governed tags, and enforcement to drive better data decisions at scale. Whether you're a data engineer, platform owner, or business leader, you’ll leave with practical ideas on how to raise the bar for data quality and trust in your own data ecosystem.