talk-data.com

Event

Data Universe 2024

2024-04-10 – 2024-04-11 Big Data LDN/Paris

Activities tracked

3

Filtering by: dbt

Sessions & talks

Empowering Data Ownership and Operational Excellence with Galaxy: A Data Mesh Approach

2024-04-11
Face To Face

Governance is difficult for an organization of any size, and many struggle to execute on data management efficiently. At Assurance, the team has used Starburst Galaxy to embed ownership within the data mesh framework, completely transforming the way organizations handle data. By granting data owners complete control and visibility over their data, Assurance enables a more nuanced and effective approach to data management. This approach not only fosters a sense of responsibility but also ensures that data is relevant, up-to-date, and aligned with the evolving needs of the organization. In this presentation, Shen Weng and Mitchell Polsons will discuss the strategic implementation of compute ownership in Starburst Galaxy, showing how it empowers teams to identify and resolve issues quickly, significantly improving the uptime of key computing operations. This approach is vital for achieving operational excellence, characterized by enhanced efficiency, reliability, and quality. Additionally, the new data setup has enabled the Assurance team to simplify data transformation processes using dbt and to improve data quality monitoring with Monte Carlo, further streamlining and strengthening the team's data management practices.

Building Telemetry Curations and Effective Data Pipelines

2024-04-11
Face To Face

Have you ever wondered how a data company does data? In this session, Isaac Obezo, Staff Data Engineer at Starburst, will take you for a peek behind the curtain into Starburst’s own data architecture, built to support batch processing of telemetry data within Galaxy data pipelines. Isaac will walk you through our architecture, which uses tools like git, dbt, and Starburst Galaxy to create a CI/CD process that lets our data engineering team iterate quickly: deploying new models, developing and landing data, and improving existing models in the data lake. Isaac will also discuss Starburst’s approach to data quality, the use of data products, and the process for delivering quality analytics.

Finding Novel Data Issues with CI/CD Using dbt and Datafold

2024-04-11
Face To Face

Join the team from Moody's Analytics as they take you on a personal journey of optimizing their data pipelines for data quality and governance. Like many data practitioners, Ryan understands the frustration and anxiety that come with accidentally introducing bad code into production pipelines. He's spent countless hours putting out fires caused by these unexpected changes. In this session, Ryan will recount his experiences with a previous data stack that lacked standardized testing methods and visibility into the impact of code changes on production data. He'll also share how their new data stack is safeguarded by Datafold's data diffing and continuous integration (CI) capabilities, which enable his team to work with greater confidence, peace of mind, and speed.