talk-data.com

Kushal Thakkar

Speaker

2 talks

Member of Technical Staff, Anthropic

Talks & appearances

Zero-footprint SQL testing: From framework to culture shift

We built a zero-footprint SQL testing framework using mock data and the full power of the pytest ecosystem to catch syntactic and semantic issues before they reach production. More than just a tool, it helped shift our team’s mindset by integrating into CI/CD, encouraging contract-driven development, and promoting testable SQL. In this session, we’ll share our journey, key lessons learned, and how we open-sourced the framework to make it available for everyone.
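To make the idea concrete, here is a minimal sketch of what a zero-footprint SQL test with mock data might look like. It is not the framework from the talk, whose API is not described here: the query, the `orders` table, and the `run_with_mock_data` helper are all hypothetical, and an in-memory SQLite database stands in for the real engine so that nothing is written to a shared warehouse.

```python
import sqlite3

# Hypothetical query under test; a real pipeline would likely load it from a .sql file.
DAILY_REVENUE_SQL = """
SELECT order_date, SUM(amount) AS revenue
FROM orders
WHERE status = 'paid'
GROUP BY order_date
ORDER BY order_date
"""


def run_with_mock_data(sql, tables):
    """Run `sql` against an in-memory database seeded with mock rows.

    `tables` maps a table name to (column names, list of row tuples).
    This helper is an illustration, not part of any published library.
    """
    conn = sqlite3.connect(":memory:")
    try:
        for name, (columns, rows) in tables.items():
            conn.execute(f"CREATE TABLE {name} ({', '.join(columns)})")
            placeholders = ", ".join("?" for _ in columns)
            conn.executemany(f"INSERT INTO {name} VALUES ({placeholders})", rows)
        # A syntax error in `sql` surfaces here as sqlite3.OperationalError.
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


def test_daily_revenue_ignores_unpaid_orders():
    mock_orders = (
        ["order_date", "amount", "status"],
        [
            ("2024-01-01", 10.0, "paid"),
            ("2024-01-01", 5.0, "refunded"),
            ("2024-01-02", 7.5, "paid"),
        ],
    )
    result = run_with_mock_data(DAILY_REVENUE_SQL, {"orders": mock_orders})
    assert result == [("2024-01-01", 10.0), ("2024-01-02", 7.5)]
```

Because the test is an ordinary function with a plain `assert`, pytest can collect it as-is, and it can run in CI without credentials or network access. A test like this covers both kinds of issue the abstract mentions: malformed SQL fails when executed, and a wrong filter or grouping fails the assertion.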

This session explores how to bring unit testing to SQL pipelines using Airflow. I’ll walk through the development of a SQL testing library that allows isolated testing of SQL logic by injecting mock data into base tables. To support this, we built a type system for AWS Glue tables using Pydantic, enabling schema validation and mock data generation. Over time, this type system also powered production data quality checks via a custom Airflow operator. Learn how this approach improves reliability, accelerates development, and scales testing across data workflows.
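The combination of a typed table schema, mock data generation, and injection into base tables can be sketched as follows. This is only an illustration under stated assumptions: the talk uses Pydantic models for AWS Glue tables, whereas this sketch swaps in standard-library dataclasses to stay dependency-free, and SQLite stands in for the query engine. The `OrderRow` schema and the helpers `validate_row`, `make_mock_rows`, and `run_with_injected_rows` are hypothetical names, and shadowing the base table with a common table expression is one possible injection strategy, not necessarily the one the library uses.

```python
import sqlite3
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class OrderRow:
    """Hypothetical schema for an `orders` base table."""
    order_id: int
    customer: str
    amount: float


def validate_row(schema, raw):
    """Build a typed row, rejecting missing, extra, or mistyped columns."""
    expected = {f.name: f.type for f in fields(schema)}
    if set(raw) != set(expected):
        raise ValueError(f"columns {sorted(raw)} do not match {sorted(expected)}")
    for name, typ in expected.items():
        if not isinstance(raw[name], typ):
            raise TypeError(f"{name} should be {typ.__name__}, got {raw[name]!r}")
    return schema(**raw)


# Assumed per-type generators for mock values; a real type system would be richer.
_GENERATORS = {int: lambda i: i, str: lambda i: f"value_{i}", float: lambda i: float(i)}


def make_mock_rows(schema, count, **overrides):
    """Generate `count` schema-valid rows; `overrides` pin specific columns."""
    rows = []
    for i in range(count):
        raw = {f.name: _GENERATORS[f.type](i) for f in fields(schema)}
        raw.update(overrides)
        rows.append(validate_row(schema, raw))
    return rows


def run_with_injected_rows(sql, table, rows):
    """Shadow `table` with a CTE built from mock rows, then run `sql`.

    Assumes `sql` does not already begin with its own WITH clause.
    """
    if not rows:
        raise ValueError("at least one mock row is required")
    columns = [f.name for f in fields(rows[0])]
    one_row = "(" + ", ".join("?" for _ in columns) + ")"
    values = ", ".join(one_row for _ in rows)
    cte = f"WITH {table}({', '.join(columns)}) AS (VALUES {values}) "
    params = [getattr(row, col) for row in rows for col in columns]
    conn = sqlite3.connect(":memory:")
    try:
        return conn.execute(cte + sql, params).fetchall()
    finally:
        conn.close()


mock_orders = make_mock_rows(OrderRow, 3, customer="acme")
totals = run_with_injected_rows(
    "SELECT customer, SUM(amount) FROM orders GROUP BY customer",
    "orders",
    mock_orders,
)
```

Here the query text is left untouched and never reads a real `orders` table, since the CTE of the same name takes precedence. The same schema class that generates mock rows can also validate real rows, which is the property that would let a type system like this double as a production data quality check.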