talk-data.com

Topic

Pydantic

schemas python data_modeling data_validation

Activities

Filtering by: Gurmeet Saran

This session explores how to bring unit testing to SQL pipelines using Airflow. I'll walk through the development of a SQL testing library that allows isolated testing of SQL logic by injecting mock data into base tables. To support this, we built a type system for AWS Glue tables using Pydantic, enabling schema validation and mock data generation. Over time, this type system also powered production data quality checks via a custom Airflow operator. Learn how this approach improves reliability, accelerates development, and scales testing across data workflows.
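
To make the idea in the abstract concrete, here is a minimal sketch of a typed table schema used for mock data generation, isolated SQL testing, and a row-level quality check. It is not the speaker's library: the `orders` table, its columns, and the helpers `model_for`, `mock_rows`, `count_invalid`, and `run_sql_test` are all hypothetical, and an in-memory SQLite database stands in for the Glue-backed base table and the Airflow-run query.

```python
import sqlite3

from pydantic import ValidationError, create_model

# Glue column type -> Python type (only the three types this sketch needs).
GLUE_TO_PYTHON = {"bigint": int, "string": str, "double": float}

# Hypothetical base table `orders`, described as column name -> Glue type.
ORDERS_COLUMNS = {"order_id": "bigint", "customer": "string", "amount": "double"}


def model_for(name, columns):
    """Build a Pydantic model with one required field per table column."""
    fields = {col: (GLUE_TO_PYTHON[glue_type], ...) for col, glue_type in columns.items()}
    return create_model(name, **fields)


def mock_rows(columns, n):
    """Generate n deterministic rows whose values match each column's type."""
    makers = {
        "bigint": lambda i: i,
        "string": lambda i: f"mock_{i}",
        "double": lambda i: i * 10.0,
    }
    return [{col: makers[t](i) for col, t in columns.items()} for i in range(1, n + 1)]


def count_invalid(model, rows):
    """Data quality check: count rows that fail schema validation."""
    invalid = 0
    for row in rows:
        try:
            model(**row)
        except ValidationError:
            invalid += 1
    return invalid


def run_sql_test(sql, table, columns, rows):
    """Inject mock rows into an in-memory table, then run the SQL under test."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(f"CREATE TABLE {table} ({', '.join(columns)})")
        placeholders = ", ".join("?" for _ in columns)
        conn.executemany(
            f"INSERT INTO {table} VALUES ({placeholders})",
            [tuple(row[col] for col in columns) for row in rows],
        )
        return conn.execute(sql).fetchone()
    finally:
        conn.close()


OrdersRow = model_for("OrdersRow", ORDERS_COLUMNS)
rows = mock_rows(ORDERS_COLUMNS, 3)  # amounts are 10.0, 20.0, 30.0

# The SQL logic under test sees only the injected mock rows.
result = run_sql_test(
    "SELECT COUNT(*), SUM(amount) FROM orders WHERE amount >= 20",
    "orders",
    ORDERS_COLUMNS,
    rows,
)
```

Because the same column-to-type mapping drives both the mock rows and the validation model, a check like `count_invalid` could in principle be reused against production rows, which is the kind of reuse the abstract attributes to its custom Airflow operator.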