talk-data.com

Speaker

Pádraic Slattery

2 talks

Analytics Engineer, Xebia Data

Talks & appearances

2 activities · Newest first

Coalesce 2024: Empowering dbt developers: Self-serve dbt Cloud jobs from your dbt repo

I work as an Analytics Engineer for a data consultancy, and as part of this work I frequently help clients orchestrate dbt Cloud jobs. As a result, I’ve seen many of the pain points that come up when doing this, along with many different approaches to overcoming them. Let's discuss open-source packages that can help us with these challenges.

Speakers: Pádraic Slattery (Analytics Engineer, Xebia Data)

Read the blog to learn about the latest dbt Cloud features announced at Coalesce, designed to help organizations embrace analytics best practices at scale: https://www.getdbt.com/blog/coalesce-2024-product-announcements

Airflow, traditionally used by Data Engineers, is now popular among Analytics Engineers who aim to provide analysts with high-quality tooling while adhering to software engineering best practices. dbt, an open-source project that uses SQL to create data transformation pipelines, is one such tool. One approach to orchestrating dbt with Airflow is to use dynamic task mapping to automatically create a task for each sub-directory inside dbt’s staging, intermediate, and marts directories. Analysts can then write SQL code that is automatically picked up as a dedicated Airflow task at runtime. Combining this Airflow feature with dbt best practices offers several benefits: analysts do not need to make Airflow changes, and engineers can re-run subsets of dbt models should errors occur. In this talk, I would like to share some lessons I have learned while successfully implementing this approach for several clients.
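As a rough illustration of the idea described above (not the speaker's actual implementation), the sketch below builds one `dbt run` command per sub-directory and hands the list to Airflow's dynamic task mapping. The helper `dbt_commands`, the DAG name `dbt_by_directory`, the `build_dag` wrapper, and the use of dbt's `path:` selector are all assumptions made for this example. It also assumes Airflow 2.4 or later and that the commands run from the dbt project root.

```python
from pathlib import Path

# Layer names taken from the abstract; adjust to the project's own layout.
LAYERS = ("staging", "intermediate", "marts")


def dbt_commands(models_dir: Path) -> list[str]:
    """Return one `dbt run` command per sub-directory of each layer (hypothetical helper)."""
    commands = []
    for layer in LAYERS:
        layer_dir = models_dir / layer
        if not layer_dir.is_dir():
            continue  # a project may not use every layer
        for sub in sorted(p for p in layer_dir.iterdir() if p.is_dir()):
            # `path:` is dbt's selector method for models under a directory.
            selector = f"path:{models_dir.name}/{layer}/{sub.name}"
            commands.append(f"dbt run --select {selector}")
    return commands


def build_dag(models_dir: Path):
    """Define the DAG. Airflow is imported lazily so the helper above works without it."""
    from datetime import datetime

    from airflow.decorators import dag, task
    from airflow.operators.bash import BashOperator

    @dag(schedule=None, start_date=datetime(2024, 1, 1), catchup=False)
    def dbt_by_directory():
        @task
        def list_commands() -> list[str]:
            # Evaluated at runtime, so new sub-directories appear without DAG edits.
            return dbt_commands(models_dir)

        # Dynamic task mapping: one mapped BashOperator instance per command.
        BashOperator.partial(task_id="dbt_run").expand(bash_command=list_commands())

    return dbt_by_directory()
```

Because each sub-directory becomes its own mapped task instance, a failed instance can be cleared and re-run on its own. This sketch does not enforce ordering between layers; a real setup would likely map each layer separately and chain them so that staging runs before intermediate and marts.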