talk-data.com

Topic

dbt

dbt (data build tool)

data_transformation analytics_engineering sql

758

tagged

Activities

758 activities · Newest first

Coalesce 2024: Leveraging column-level lineage to scale your dbt projects

Today, tools such as dbt_project_evaluator let us enforce quality checks on projects at the model level. Those tools are indispensable for teams scaling their dbt transformations.

So far, though, we have focused on rules at the model level. Could we now leverage column-level lineage (CLL) to define rules at the column level as well?

The idea of this talk is to build an open source tool and present the problems it can solve.

Speaker: Benoit Perigaud Senior Resident Architect dbt Labs

Read the blog to learn about the latest dbt Cloud features announced at Coalesce, designed to help organizations embrace analytics best practices at scale https://www.getdbt.com/blog/coalesce-2024-product-announcements

Coalesce 2024: And for my next Mesh trick... Helping teams scale beyond the monolith

Are you looking to bring more colleagues into the work of defining data transformations with dbt? Wondering how your already-expansive DAG could scale to more teams? Join Jeremy and his magical assistants for a résumé of the dbt Mesh pattern that’s supporting multi-project collaboration in dbt Cloud, for hundreds of data teams large & small — including a few new tricks they’ve got up their sleeves for managing dbt at scale.

Speakers: Jeremy Cohen, Anders Swanson

Coalesce 2024: Mixed model arts: The convergence of data modeling across apps, analytics, and AI

For decades, data modeling has been siloed by domain: applications, analytics, and machine learning/AI. However, the emergence of AI, streaming data, and "shifting left" are changing data modeling, making siloed approaches insufficient for the diverse world of data use cases. Today's practitioners must possess an end-to-end understanding of the myriad techniques for modeling data throughout the data lifecycle. This presentation covers "mixed model arts," which advocates converging various data modeling methods and innovating new ones.

Speaker: Joe Reis Author Nerd Herd

Coalesce 2024: Orchestration as code with dbt Cloud and Snowflake

In this talk, data engineers from AB CarVal will discuss how to orchestrate jobs efficiently and on time for business-critical data that arrives on an irregular cadence. They will also cover why Infrastructure-as-Code is important and how to extend it to your dbt Cloud jobs running on Snowflake.

Speaker: Rafael Cohn-Gruenwald Sr. Data Engineer Alliance Bernstein

Coalesce 2024: Empowering dbt developers: Self-serve dbt Cloud jobs from your dbt repo

I work as an Analytics Engineer for a data consultancy, and as part of this work I frequently help clients orchestrate dbt Cloud jobs. As a result, I’ve seen many of the pain points that come up along the way, as well as many different approaches to overcoming them. Let's discuss open-source packages that can empower us in these experiences.

Speaker: Pádraic Slattery Analytics Engineer Xebia Data

Coalesce 2024: Data alone is not enough

Data initiatives often prioritize democratizing access to information without sufficiently focusing on driving business impact. While access to information is crucial, it alone cannot drive organizational change. For meaningful transformation, companies must integrate their data with tools that enable action. In this talk, Preston will share insights on why merely providing data is insufficient for fostering significant change. He will outline three key strategies, centered around dbt, that Settle employs within its data team and across the broader organization to bridge the gap between data access and actionable outcomes.

Speaker: Preston Wong Analytics Engineer Settle

Coalesce 2024: How Amplify optimized their incremental models with dbt on Snowflake

Like many other dbt users, Amplify has some very large data sets. Their largest model needs to be updated every two hours and would cost $2.6 million to build annually if they fully refreshed it every time. Turning this into an incremental model was a natural choice, and helped a lot. However, they found that simply adding materialized = 'incremental' didn’t solve all of their problems.

Specifically, they still had issues running not_null and unique tests against such a large model, issues sizing their Snowflake warehouse appropriately to accommodate both incremental builds and full-refreshes, and perhaps most importantly, the model was still costing $50,000 annually to build (which can quickly add up when you have dozens of similarly sized models). In this talk they discuss several innovative solutions that they implemented to address these issues, including how they ultimately brought the cost of building this particular model down to just $600 annually!
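
The general idea behind an incremental build can be sketched outside of dbt. The Python below is a minimal illustration under stated assumptions, not Amplify's implementation (which the abstract does not describe) and not dbt's own code: the function `incremental_refresh` and the column names `id`, `updated_at`, and `amount` are hypothetical. It processes only source rows newer than the latest timestamp already in the target, rather than rebuilding everything.

```python
from datetime import datetime


def incremental_refresh(target, source, key="id", ts="updated_at"):
    """Upsert only the source rows newer than the target's high-water mark.

    `target` is a dict keyed by `key`; `source` is a list of row dicts.
    All names here are hypothetical and chosen for illustration only.
    """
    # High-water mark: the latest timestamp already present in the target.
    watermark = max((row[ts] for row in target.values()), default=None)
    # On the first run (empty target) every row qualifies; afterwards only
    # rows strictly newer than the watermark are picked up.
    new_rows = [r for r in source if watermark is None or r[ts] > watermark]
    # Upsert by key so a newer version of a row replaces the older one.
    for row in new_rows:
        target[row[key]] = row
    return len(new_rows)


source = [
    {"id": 1, "updated_at": datetime(2024, 1, 1), "amount": 10},
    {"id": 2, "updated_at": datetime(2024, 1, 2), "amount": 20},
]
target = {}
first = incremental_refresh(target, source)   # first run: full build

# Later, one existing row changes and one new row arrives.
source.append({"id": 2, "updated_at": datetime(2024, 1, 3), "amount": 25})
source.append({"id": 3, "updated_at": datetime(2024, 1, 3), "amount": 30})
second = incremental_refresh(target, source)  # only the newer rows are processed
```

The second call touches two rows instead of four, which is where the cost saving of an incremental approach comes from. Rows that arrive late with an older timestamp would be missed by this simple watermark, which is one reason incremental builds usually need more care than a single configuration switch.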

Coalesce 2024: Needle in the (data) stack: How Spotify powers Salesforce

Spotify has absurd quantities of data. This is a huge asset, but it makes it difficult to power their frontline partnership team in Salesforce with the relevant cuts of that data they need. After struggling with both ad-hoc solutions and Salesforce consultant-led solutions, they've landed on a flexible, secure, and automated data strategy: they use dbt and Hightouch to refine critical data in Google BigQuery, sync updated records to Salesforce, and then close the loop for intelligence and analytics.

They'll share their optimal solution, with no caveats, for the real, everyday data issues that many teams encounter at scale with Salesforce.

Speaker: Tim Leonard Sr Insights Manager Spotify

Coalesce 2024: Your first 90 days in dbt Cloud

Are you new to or interested in dbt Cloud? We invite data practitioners to join us and learn how to get started, implement best practices, and optimize their dbt Cloud journey!

Speaker: Brian Jan Lead Cloud Onboarding Architect dbt Labs

Coalesce 2024: Customer health with dbt Cloud: A LiveRamp data journey

We aim to illustrate the transition from an antiquated methodology for generating final tables/views in Google Cloud Platform (GCP) to the implementation of a structured process utilizing dbt.

This transition involves defining how we develop source, staging, intermediate, and final models within dbt, facilitating enhanced change management and error detection mechanisms. We will talk about how far we have come and our plan to maintain this work-stream.

Speaker: Kyle Salomon Business Analytics Manager LiveRamp

Coalesce 2024: Breaking the mold: A smarter approach to data testing

Current data testing practices—meticulously testing individual models and methods—are not only outdated but also costly and inefficient. In this talk, Aiven challenges this traditional approach, which they argue accumulates unnecessary technical debt and inflates warehousing costs without improving data quality.

Speakers: Anton Heikinheimo Senior Data Engineer Aiven

Emiel Verkade Senior Analytics Engineer Aiven

Coalesce 2024: The iceberg cometh: Open table formats for analytics engineering

Want to learn more about dbt Cloud’s cross-platform mesh vision? Want to brainstorm how open table formats are of value to analytics engineers? Want to talk with peers who are considering or scoping how to use dbt and lakehouses to orchestrate workloads across platforms to solve business problems?

In this peer exchange, participants talk with peers and think through any and all of the above.

Speakers: Anders Swanson I serve the realm dbt Labs

Ulrik Svanborg Møller Lead Data Engineer Vestas Wind Systems

William Krill Data Engineer Vestas Wind Systems

Coalesce 2024: How Cox Automotive turbocharged data engineering with ELT

A migration story covering the underlying philosophy and strategic approach behind moving from a low-code ETL tool like Alteryx to a modern data engineering mindset with dbt. This transition is not just about adopting new tools but about embracing a code-first philosophy that promotes best practices in software engineering, such as modularity, reusability, and transparency.

Speakers: Somnath Chatterjee Lead Data Engineer Cox Automotive

Brett Darcy Lead Software Engineer Cox Automotive

Coalesce 2024: Tale of two clouds: Pacific Life’s quest for data enlightenment

This session will detail Pacific Life’s strategic transition to advanced data platforms, dbt and Snowflake. It will provide a comprehensive overview of the challenges encountered, the strategies employed to overcome them, and the significant advantages gained through this transformation.

Speakers: Trang Do Data Engineer Excellence Pacific Life

Jacob Emerson Data Engineer Pacific Life

Coalesce 2024: How Eleanor Health modernized its data stack

This session will cover the before and after of the Eleanor Health analytics architecture story, moving from dbt Core to dbt Cloud. It will also go over the foundational architectural principles and design patterns that helped us create a new Analytics Layer that increased productivity for our analysts.

Speaker: Scott Parent Director, Data & Analytics Eleanor Health

Coalesce 2024: How USAA moves quickly and manages vulnerabilities with dbt Cloud

As part of a rapid modernization initiative, USAA Property and Casualty migrated from a legacy, GUI-based ETL tool and on-prem servers to dbt Cloud and a cloud database. Adopting dbt Cloud enabled near-real-time data delivery, but dbt Python models brought with them the need for dependency management. In this session, USAA shares how to go fast and manage vulnerabilities.

Speakers: Kit Alderson USAA Data Engineer, CI/CD Wizard USAA

Ted Douglass Senior Data Engineer USAA

Coalesce 2024: Day in the life of a data person: 2029 edition

AI is changing software engineering, and it will change analytics too. Come for a glimpse of the day-to-day job of an analyst in the future, and some strategies for how to maximally benefit from AI.

Speaker: Bryan Bischof Head of AI Hex

Coalesce 2024: What does enterprise AI lose by not investing in semantics and knowledge?

In this talk, we will make the case that the success of enterprise AI depends on an investment in semantics and knowledge, not just data. Our LLM Accuracy benchmark research provided evidence that layering semantic layers/knowledge graphs on enterprise SQL databases increases the accuracy of LLMs at least 4X for question answering. This work has been reproduced and validated by many others, including dbt Labs. It's fantastic that semantics and knowledge are getting the attention they deserve. We need more.

This talk is targeted to 1) those who believe AI accuracy can be improved by simply adding more data to fine-tune/train models, and 2) the believers in semantics and knowledge who need help getting executive buy-in.

We will dive into:
- the knowledge engineering work that needs to be done
- who should be leading this work (hint: analytics engineers)
- what companies lose by not doing this knowledge engineering work

Speaker: Juan Sequeda Principal Scientist and Head of AI Lab data.world

Coalesce 2024: Designing Figma Data Science’s first ML system

Despite the popularity of ML as a technical solution, there are few resources on the practical aspects of deploying your first ML model. This talk covers Figma’s journey from ideation to post launch: how we decided to invest, how we designed and built our first pipeline with dbt, what we wish we did differently on the way to production, and what came after our first launch.

Speaker: Emily Jia Data Scientist Figma

Coalesce 2024: Transitioning from dbt Core to dbt Cloud: A user story

Join us as we share our journey of migrating from dbt Core to dbt Cloud. We'll discuss why we made this shift – focusing on security, ownership, and standardization. Starting with separate team-based projects on dbt Core, we moved towards a unified structure, and eventually embraced dbt Cloud. Now, all teams follow a common structure and standardized requirements, ensuring better security and collaboration.

In our session, we'll explore how we improved our data analytics processes by migrating from dbt Core to dbt Cloud. Initially, each team had its own way of working on dbt Core, leading to security risks and inconsistent practices. To address this, we transitioned to a more unified approach on dbt Core. This year we migrated to dbt Cloud, which allowed us to centralize our data analytics workflows, enhancing security and promoting collaboration.

For scheduling, we manage our own Airflow instance on AWS EKS. We use DataHub as our data catalog.

Key points:
- Enhanced Security: dbt Cloud provided robust security features, helping us safeguard our data pipelines.
- Ownership and Collaboration: With dbt Cloud, teams took ownership of their projects while collaborating more effectively.
- Standardization: We enforced standardized requirements across all projects, ensuring consistency and efficiency, using dbt-project-evaluator.

Speakers: Alejandro Ivanez Platform Engineer DPG Media

Mathias Lavaert Principal Platform Engineer DPG Media
