talk-data.com

Company

SnapTravel

Speakers: 1

Activities: 1

Speakers from SnapTravel

Talks & appearances

1 activity from SnapTravel speakers

To improve automation of data pipelines, I propose a universal approach to ELT pipelines that optimizes for data integrity, extensibility, and speed of delivery. The workflow is built using open source tools and standards like Apache Airflow, Singer, Great Expectations, and DBT.

Templating ETLs is challenging! Creating and maintaining data pipelines in production requires hard work to manage bugs in code and bad data. I would like to propose a data pipeline pattern that simplifies building pipelines while optimizing for data integrity and observability. The workflow is built using open source tools like Singer, Great Expectations, and DBT.

Goals:

- Make ELT simple and fast to implement
- Validate your assumptions about the data before you make it available for use
- Allow analysts and data scientists to make pain-free contributions to ELT using SQL
- Generate data documentation and failure logs for quick recovery, and fix outages in your pipeline

Target Audience:

- Approachable to any level of developer
- Novice data professionals interested in starting an ELT workflow and learning about the different tools in the ecosystem
- Intermediate+ developers interested in supercharging their pipelines with the Write Audit Publish pattern and reducing pipeline debt
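
The Write Audit Publish pattern named in the abstract can be sketched in a few lines. The abstract does not describe the speaker's implementation, so the following is only a minimal illustration of the general idea, using Python's built-in sqlite3 module as a stand-in for a warehouse. The table names (bookings, bookings_staging), the columns, and the three checks are all hypothetical.

```python
import sqlite3


def write_audit_publish(conn, rows):
    """Load a batch into staging, validate it, and publish only if it passes.

    Returns a dict of failed checks (empty when the batch was published).
    All table and column names are hypothetical, chosen for illustration.
    """
    cur = conn.cursor()

    # Write: land the batch in a staging table that consumers never query.
    cur.execute("DROP TABLE IF EXISTS bookings_staging")
    cur.execute("CREATE TABLE bookings_staging (id INTEGER, price REAL)")
    cur.executemany("INSERT INTO bookings_staging VALUES (?, ?)", rows)

    # Audit: each query counts the rows that violate one assumption.
    checks = {
        "id is not null":
            "SELECT COUNT(*) FROM bookings_staging WHERE id IS NULL",
        "id is unique":
            "SELECT COUNT(*) - COUNT(DISTINCT id) FROM bookings_staging "
            "WHERE id IS NOT NULL",
        "price is non-negative":
            "SELECT COUNT(*) FROM bookings_staging "
            "WHERE price IS NULL OR price < 0",
    }
    failures = {}
    for name, sql in checks.items():
        bad_rows = cur.execute(sql).fetchone()[0]
        if bad_rows > 0:
            failures[name] = bad_rows

    if failures:
        # Keep the rejected batch in staging so it can be inspected later;
        # the published table is left untouched.
        conn.commit()
        return failures

    # Publish: replace the consumer-facing table with the audited batch.
    cur.execute("DROP TABLE IF EXISTS bookings")
    cur.execute("ALTER TABLE bookings_staging RENAME TO bookings")
    conn.commit()
    return {}


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    print(write_audit_publish(conn, [(1, 10.0), (2, 20.0)]))
    print(write_audit_publish(conn, [(3, -5.0), (3, 1.0)]))
```

In a pipeline built from the tools the abstract lists, the audit step would presumably be handled by a validation tool such as Great Expectations and the SQL by DBT models, with Airflow ordering the three stages; the sketch only shows why bad data never reaches the published table.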