ETL data pipelines are the bread and butter of data teams, which must design, develop, and author DAGs to accommodate varied business requirements. dbt has become one of the most widely used tools for running SQL transformations in the data warehouse, letting teams harness the power of queries at scale. Airflow users are constantly finding new ways to integrate dbt with the Airflow ecosystem and build a single pane of glass where data engineers can manage and administer their pipelines. Astronomer Cosmos, an open-source package, was introduced to integrate Airflow with dbt Core seamlessly, so you can see your dbt pipelines fully rendered inside Airflow. You will learn:
- How to integrate dbt Core with Airflow
- How to use Cosmos
- How to build data pipelines at scale
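As a rough sketch of what that integration looks like, the snippet below uses Cosmos's DbtDag to render a dbt project as an Airflow DAG. The project path, connection ID, profile name, and schema are illustrative placeholders, and exact import paths and accepted keyword arguments can vary between Cosmos and Airflow versions.

```python
from datetime import datetime

from cosmos import DbtDag, ProjectConfig, ProfileConfig
from cosmos.profiles import PostgresUserPasswordProfileMapping

# Map an existing Airflow connection to a dbt profile.
# "postgres_default" and the schema are placeholder values.
profile_config = ProfileConfig(
    profile_name="jaffle_shop",
    target_name="dev",
    profile_mapping=PostgresUserPasswordProfileMapping(
        conn_id="postgres_default",
        profile_args={"schema": "public"},
    ),
)

# Cosmos parses the dbt project and expands each model into its own
# Airflow task, preserving the dbt dependency graph in the Airflow UI.
dbt_dag = DbtDag(
    project_config=ProjectConfig("/usr/local/airflow/dags/dbt/jaffle_shop"),
    profile_config=profile_config,
    schedule="@daily",
    start_date=datetime(2024, 1, 1),
    catchup=False,
    dag_id="dbt_jaffle_shop",
)
```

Because each model becomes its own task, failures can be retried per model rather than rerunning the whole project.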
Airflow is a popular choice for organizations looking to integrate open-source dbt within their existing data infrastructure. This talk will explore two primary methods of running dbt in Airflow: job-level and model-level. We'll discuss the tradeoffs associated with each approach, highlighting the simplicity and efficiency of job-level orchestration, contrasted with the enhanced observability and control provided by model-level orchestration. We'll also explain how the balance has shifted in recent years, with improvements to dbt Core making model-level orchestration more efficient and innovative Airflow extensions like Cosmos making it easier to implement. Finally, we'll provide benchmarks to help you determine which paradigm is the best fit for your needs.
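To make the contrast concrete, here is a minimal sketch of job-level orchestration, where the entire dbt project runs as a single Airflow task; the project path and DAG settings are placeholders. Compare it with the model-level Cosmos sketch above, where each model is its own task.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

# Job-level orchestration: the whole dbt project is one opaque task.
# Airflow sees a single success/failure, with no per-model retries
# or lineage (the project path is a placeholder).
with DAG(
    dag_id="dbt_job_level",
    schedule="@daily",
    start_date=datetime(2024, 1, 1),
    catchup=False,
) as dag:
    run_dbt = BashOperator(
        task_id="dbt_build",
        bash_command="cd /usr/local/airflow/dags/dbt/jaffle_shop && dbt build",
    )
```

The job-level version is simpler to operate and cheaper in scheduler overhead; the model-level version trades that simplicity for finer-grained observability, retries, and control over individual models.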