talk-data.com

Topic

Dagster

data_orchestration data_pipelines data_engineering airflow etl

6

tagged

Activities

6 activities · Newest first

Sponsored by: Dagster Labs | The Age of AI is Changing Data Engineering for Good

The last major shift in data engineering came during the rise of the cloud, transforming how we store, manage, and analyze data. Today, we stand at the cusp of the next revolution: AI-driven data engineering. This shift promises not just faster pipelines, but a fundamental change in the way data systems are designed and maintained. AI will redefine who builds data infrastructure, automating routine tasks, enabling more teams to contribute to data platforms, and (if done right) freeing up engineers to focus on higher-value work. However, this transformation also brings heightened pressure around governance, risk, and data security, requiring new approaches to control and oversight. For those prepared, this is a moment of immense opportunity – a chance to embrace a future of smarter, faster, and more responsive data systems.

Outgrowing a single `dbt run`

When does your team decide it’s time to move beyond a single dbt run? For most analytics engineers, there comes a time when dbt run commands on fixed schedules simply won’t make the cut. Join Prratek Ramchandani (Vox Media) as he breaks down an alternative approach to orchestrating your dbt project with Dagster, one that balances meeting SLAs with safely handling the edge cases a simple schedule-based dbt run might create.

Check the slides here: https://docs.google.com/presentation/d/1zivYO_EpN6T9JYM9HjAJAz3bK3e2TREwdKffylkzuUw/edit?usp=sharing

Rethinking Orchestration as Reconciliation: Software-Defined Assets in Dagster

This talk discusses “software-defined assets”, a declarative approach to orchestration and data management that makes it drastically easier to trust and evolve datasets and ML models. Dagster is an open source orchestrator built for maintaining software-defined assets.

In traditional data platforms, code and data are only loosely coupled. As a consequence, deploying changes to data feels dangerous, backfills are error-prone and irreversible, and it’s difficult to trust data, because you don’t know where it comes from or how it’s intended to be maintained. Each time you run a job that mutates a data asset, you add a new variable to account for when debugging problems.

Dagster proposes an alternative approach to data management that tightly couples data assets to code - each table or ML model corresponds to the function that’s responsible for generating it. This results in a “Data as Code” approach that mimics the “Infrastructure as Code” approach that’s central to modern DevOps. Your git repo becomes your source of truth on your data, so pushing data changes feels as safe as pushing code changes. Backfills become easy to reason about. You trust your data assets because you know how they’re computed and can reproduce them at any time. The role of the orchestrator is to ensure that physical assets in the data warehouse match the logical assets that are defined in code, so each job run is a step towards order.

Software-defined assets are a natural approach to orchestration for the modern data stack, in part because dbt models are a type of software-defined asset.

Attendees of this session will learn how to build and maintain lakehouses of software-defined assets with Dagster.
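
The reconciliation idea in this abstract can be illustrated with a small, library-free sketch. It is not Dagster's API: the `asset` decorator, the `ASSETS` registry, the `reconcile` function, and the in-memory `warehouse` dict are all hypothetical names chosen for illustration. Each asset is tied to the function that computes it, and the "orchestrator" only materializes what is defined in code but missing from storage.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical registry: asset name -> (upstream asset names, compute function).
AssetDef = Tuple[List[str], Callable[..., object]]
ASSETS: Dict[str, AssetDef] = {}


def asset(*deps: str):
    """Register a function as the definition of the asset named after it."""
    def decorator(fn):
        ASSETS[fn.__name__] = (list(deps), fn)
        return fn
    return decorator


@asset()
def raw_orders():
    # Made-up sample rows standing in for a source table.
    return [{"id": 1, "amount": 20}, {"id": 2, "amount": 35}]


@asset("raw_orders")
def order_totals(raw_orders):
    # Derived asset: computed only from its declared upstream asset.
    return sum(row["amount"] for row in raw_orders)


def reconcile(store: Dict[str, object]) -> List[str]:
    """Materialize every defined asset missing from `store`, upstream first."""
    materialized: List[str] = []

    def ensure(name: str) -> None:
        if name in store:
            return  # physical asset already matches a logical definition
        deps, fn = ASSETS[name]
        for dep in deps:
            ensure(dep)
        store[name] = fn(*(store[d] for d in deps))
        materialized.append(name)

    for name in ASSETS:
        ensure(name)
    return materialized


warehouse: Dict[str, object] = {}
first_run = reconcile(warehouse)   # builds both assets, upstream first
second_run = reconcile(warehouse)  # nothing left to reconcile
```

A real orchestrator would also have to detect stale assets (for example, when the code or an upstream asset changes); this sketch only handles the "missing" case to keep the idea visible.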

Orchestrating dbt with Dagster

dbt defined an entirely new subspecialty of software engineering: Analytics Engineering. But it is one discipline among many: analytics engineers must collaborate with data scientists, data engineers, and data platform engineers to deliver a cohesive data platform. In this video, Nick Schrock of Elementl talks about how orchestrating dbt with Dagster allows you to place dbt in context, de-silo your operational systems, improve monitoring, and enable self-service operations.

Analytics on your analytics, Drizly

Using dbt's metadata on dbt runs (run_results.json), Drizly analytics is able to track, monitor, and alert on its dbt models, using Looker to visualize the data. In this video, Emily Hawkins covers how Drizly did this before, using dbt macros and inserts, and how the process was improved using run_results.json in conjunction with Dagster (and teamwork with Fishtown Analytics!).
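
As a rough illustration of the kind of monitoring described here, the sketch below reads a run_results.json payload and pulls out failed and slow nodes. It is not Drizly's implementation: the sample payload, the `summarize` helper, and the 3-second threshold are made up for this example, and only the `results`, `unique_id`, `status`, and `execution_time` fields of the artifact are used.

```python
import json

# Hypothetical, trimmed-down sample of a run_results.json artifact.
SAMPLE = """
{
  "results": [
    {"unique_id": "model.shop.orders", "status": "success", "execution_time": 1.5},
    {"unique_id": "model.shop.customers", "status": "error", "execution_time": 0.25},
    {"unique_id": "model.shop.payments", "status": "success", "execution_time": 4.0}
  ],
  "elapsed_time": 6.0
}
"""

# Statuses treated as failures in this sketch (model errors and failed tests).
FAILURE_STATUSES = {"error", "fail"}


def summarize(run_results: dict, slow_threshold_s: float = 3.0) -> dict:
    """Return counts plus the ids of failed and slow nodes from one dbt run."""
    results = run_results.get("results", [])
    failed = [r["unique_id"] for r in results
              if r.get("status") in FAILURE_STATUSES]
    slow = [r["unique_id"] for r in results
            if r.get("execution_time", 0.0) > slow_threshold_s]
    return {"total": len(results), "failed": failed, "slow": slow}


summary = summarize(json.loads(SAMPLE))
```

In practice the summary would be loaded into a warehouse table after each run so that a BI tool can chart it and an alert can fire when `failed` is non-empty.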