Data quality has become a much-discussed topic in data engineering and data science, and data validation is now widely recognized as crucial to the reliability of the data products and insights that an organization’s data pipelines produce. This session outlines patterns for combining three popular open source tools in the data ecosystem (dbt, Airflow, and Great Expectations) to build a robust data pipeline with data validation at each critical step.
talk-data.com
Speaker
Sam Bail
2 talks
Independent Data Professional
Superconductive
Talks & appearances
How do dbt and Great Expectations complement each other? In this video, Sam Bail of Superconductive outlines a convenient pattern for using these tools together and highlights where each one can play to its strengths: data pipelines are built and tested during development using dbt, while Great Expectations handles data validation, pipeline control flow, and alerting in a production environment.
Check out the sample repo here: https://github.com/spbail/dag-stack
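The control-flow idea described above (validate, proceed only on success, alert on failure) can be illustrated with a minimal, library-free sketch. This is not code from the sample repo and it does not use the dbt, Airflow, or Great Expectations APIs; every name here (`validate`, `run_pipeline`, `ValidationResult`, the check names) is a hypothetical stand-in. In a real setup, the transform step would presumably be a dbt run, the checks a Great Expectations validation, and the sequencing an Airflow DAG.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

Row = Dict[str, object]
Check = Callable[[Row], bool]


@dataclass
class ValidationResult:
    """Outcome of one validation gate (hypothetical stand-in type)."""
    success: bool
    failures: List[str] = field(default_factory=list)


def validate(rows: List[Row], checks: Dict[str, Check]) -> ValidationResult:
    """Run every named check against every row and collect the failures."""
    failures = [
        f"{name} failed on row {i}"
        for i, row in enumerate(rows)
        for name, check in checks.items()
        if not check(row)
    ]
    return ValidationResult(success=not failures, failures=failures)


def run_pipeline(
    source_rows: List[Row],
    transform: Callable[[Row], Row],
    checks: Dict[str, Dict[str, Check]],
    alert: Callable[[str, List[str]], None],
) -> Optional[List[Row]]:
    """Validate the source, transform, validate the output; halt and alert on failure."""
    # Gate 1: do not transform data that already fails validation.
    source_result = validate(source_rows, checks["source"])
    if not source_result.success:
        alert("source", source_result.failures)
        return None

    # Transformation step (stand-in for the models a tool like dbt would build).
    transformed = [transform(row) for row in source_rows]

    # Gate 2: do not publish transformed data that fails validation.
    output_result = validate(transformed, checks["output"])
    if not output_result.success:
        alert("output", output_result.failures)
        return None

    return transformed


# Usage with made-up data and checks.
alerts = []
checks = {
    "source": {"amount_not_null": lambda r: r.get("amount") is not None},
    "output": {"amount_cents_non_negative": lambda r: r["amount_cents"] >= 0},
}


def to_cents(row: Row) -> Row:
    return {"id": row["id"], "amount_cents": round(row["amount"] * 100)}


def record_alert(stage: str, failures: List[str]) -> None:
    alerts.append((stage, failures))


good = run_pipeline([{"id": 1, "amount": 2.5}], to_cents, checks, record_alert)
bad = run_pipeline([{"id": 2, "amount": None}], to_cents, checks, record_alert)
```

In this sketch the first run returns the transformed rows, while the second stops at the source gate and records an alert instead of running the transform. Returning `None` on failure is just one simple way to model halting the pipeline; an orchestrator would more likely mark the task as failed and skip the downstream tasks.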