talk-data.com

Topic

YAML

YAML Ain't Markup Language (YAML)

data_serialization configuration_file_format human_readable file_format

6

tagged

Activity Trend

Peak: 9 activities per quarter (2020-Q1 to 2026-Q1)

Activities

6 activities · Newest first

How We Automate Chaos: Agentic AI and Community Ops at PyCon DE & PyData

Using AI agents and automation, PyCon DE & PyData volunteers have transformed chaos into streamlined conference ops. From YAML files to LLM-powered assistants, they automate speaker logistics, FAQs, video processing, and more while keeping humans focused on creativity. This case study reveals practical lessons on making AI work in real-world scenarios: structured workflows, validation, and clear context beat hype. Live demos and open-source tools included.

dbt for rapid deployment of a data product - Coalesce 2023

The team at nib Health has internal projects that contain standardized packages for running a dbt project, such as pipeline management, data testing, and data modeling macros. In this talk, they share how they used the YAML documentation files in dbt to create standardized tagging for data security (PII), project, and product domain tags that get pushed into Snowflake, Immuta, and Select Star.
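As a sketch of the approach described (model names, tag values, and the `pii` meta key are illustrative, not nib's actual conventions), dbt lets you attach tags and meta fields to models and columns in a YAML documentation file, which downstream tools can then read:

```yaml
# models/schema.yml -- illustrative standardized tagging in dbt
models:
  - name: members
    description: "Member dimension table"
    config:
      tags: ["project:core", "domain:membership"]
    columns:
      - name: member_id
        description: "Surrogate key"
      - name: email
        description: "Member contact email"
        meta:
          pii: true   # flagged for data security policies downstream
```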

Speaker: Pip Sidaway, Data Product Manager, nib

Register for Coalesce at https://coalesce.getdbt.com

Unlocking model governance and multi-project deployments with dbt-meshify - Coalesce 2023

Join us for story hour, as we follow two intrepid analytics engineers working in a large dbt project as they go on a journey to meshify their dbt project, with help from a ✨special guest✨

Along the way, learn about dbt-meshify - a new CLI tool to automate the creation of model governance and cross-project lineage features in your dbt project. dbt-meshify refactors your code for you, helping you add model contracts, versions, groups, access, cross-project lineage, and more -- all in a matter of minutes! No bespoke YAML writing needed.
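The governance features dbt-meshify generates are ordinary dbt YAML properties. A hand-written equivalent (group, owner, and model names are invented for illustration) looks roughly like:

```yaml
# models/schema.yml -- the kind of YAML dbt-meshify can write for you
groups:
  - name: finance
    owner:
      name: Finance Team

models:
  - name: fct_orders
    access: public        # other projects may ref() this model
    group: finance
    latest_version: 2
    config:
      contract:
        enforced: true    # the columns below become a model contract
    columns:
      - name: order_id
        data_type: integer
    versions:
      - v: 1
      - v: 2
```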

Speakers: Grace Goheen, Product Manager, dbt Labs; Nicholas Yager, Principal Analytics Engineer, HubSpot; Dave Connors, DX, dbt Labs

Register for Coalesce at https://coalesce.getdbt.com

One to many: Moving from a monolithic dbt project to multi-project collaboration - Coalesce 2023

At the beginning of this year, Cityblock Health was afforded a unique opportunity: to rebuild their existing dbt project from scratch.

Launched in mid-2019, the legacy project had grown organically into a tangled mess of 1800+ models, with further development becoming more and more difficult.

Faced with the challenge of retroactively imposing order on the existing project, their leadership gave them the opportunity to start fresh instead.

They jumped at the chance, and began applying many of the lessons they learned at Coalesce 2022 to set the new project up for success:

- SQL linting, with SQLFluff
- YAML linting, with yamllint
- dbt best practices, with dbt-checkpoint and dbt-project-evaluator
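Each of those linters is driven by a small config file checked into the repo. A minimal yamllint setup, for instance (the rule choices here are illustrative, not Cityblock's actual configuration):

```yaml
# .yamllint -- lint all project YAML against the default rule set,
# relaxing line length and pinning two-space indentation
extends: default
rules:
  line-length:
    max: 120
  indentation:
    spaces: 2
```

Running `yamllint .` in CI (or via a pre-commit hook) then enforces the rules on every change.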

As a result, this core project has become the model for multi-project collaboration at Cityblock. Rather than a single monolithic project, the new state features a collection of smaller projects, each governed by a high bar for code quality.

Speakers: Katie Claiborne, Staff analytics engineer, Cityblock Health; Nathaniel Burren, Analytics Engineer, Cityblock Health

Register for Coalesce at https://coalesce.getdbt.com

Databricks Asset Bundles: A Standard, Unified Approach to Deploying Data Products on Databricks

In this session, we will introduce Databricks Asset Bundles, demonstrate how they work for a variety of data products, and show how to fit them into an overall CI/CD strategy for the well-architected Lakehouse.

Data teams produce a variety of assets: datasets, reports and dashboards, ML models, and business applications. These assets depend upon code (notebooks, repos, queries, pipelines), infrastructure (clusters, SQL warehouses, serverless endpoints), and supporting services/resources like Unity Catalog, Databricks Workflows, and DBSQL dashboards. Today, each organization must figure out a deployment strategy for the variety of data products they build on Databricks, as there is no consistent way to describe the infrastructure and services associated with project code.

Databricks Asset Bundles is a new capability on Databricks that standardizes and unifies the deployment strategy for all data products developed on the platform. It allows developers to describe the infrastructure and resources of their project through a YAML configuration file, regardless of whether they are producing a report, dashboard, online ML model, or Delta Live Tables pipeline. Behind the scenes, these configuration files use Terraform to manage resources in a Databricks workspace, but knowledge of Terraform is not required to use Databricks Asset Bundles.
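A minimal bundle configuration sketch (the bundle name, job, notebook path, and workspace URL are placeholders, not values from the talk):

```yaml
# databricks.yml -- minimal Databricks Asset Bundle sketch
bundle:
  name: my_data_product

resources:
  jobs:
    nightly_refresh:
      name: nightly_refresh
      tasks:
        - task_key: transform
          notebook_task:
            notebook_path: ./notebooks/transform.py

targets:
  dev:
    default: true
    workspace:
      host: https://example-workspace.cloud.databricks.com
```

With a file like this in place, `databricks bundle deploy -t dev` provisions the declared resources into the target workspace.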

Talk by: Rafi Kurlansik and Pieter Noordhuis

Connect with us: Website: https://databricks.com Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/databricks Instagram: https://www.instagram.com/databricksinc Facebook: https://www.facebook.com/databricksinc

Why Metrics Are Even More Valuable Than You Think They Are

Creating or migrating metric metadata to dbt can be a pain because of the level of underlying data knowledge required to create the YAML files properly. You might have found yourself wondering, “Is this worth it just to standardize metric definitions?” This talk will tell you why it is definitely worth it, because the functionality you unlock goes beyond standard metric definitions. Adopting the dbt standard metric syntax unlocks three additional possibilities for your data:

  1. Automated time-aware metric calculations

  2. Dynamic drill downs and segmentation to empower slice and dice analysis

  3. Self-service dynamic transforms using templated SQL
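For context, the dbt metric syntax discussed in this era looked roughly like the following (model, column, and dimension names are invented for illustration):

```yaml
# models/metrics.yml -- sketch of a dbt metric definition
metrics:
  - name: total_revenue
    label: Total Revenue
    model: ref('fct_orders')
    calculation_method: sum
    expression: amount
    timestamp: ordered_at           # enables time-aware calculations (point 1)
    time_grains: [day, week, month]
    dimensions: [region, channel]   # enables drill-downs and segmentation (point 2)
```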

Check slides here: https://docs.google.com/presentation/d/1nJHP2E6NGZ-KHG4_gNiI6w2lq4kjWgQIanAC9yf3cng

Coalesce 2023 is coming! Register for free at https://coalesce.getdbt.com/.