talk-data.com talk-data.com

Topic

IaC

Infrastructure as Code (IaC)

cloud devops automation infrastructure_as_code

12

tagged

Activity Trend

6 peak/qtr
2020-Q1 2026-Q1

Activities

12 activities · Newest first

How to scale dbt across independent teams with IaC

How do you deliver a modern, governed analytics stack to dozens of independent and competing companies, each with their own priorities, budgets, and data platforms? SpareBank 1 built a platform-as-a-service using Infrastructure as Code to provision dbt environments on demand. This session shares how SB1 modernized legacy data warehouses and scaled dbt across a network of banks, offering lessons for any organization supporting multiple business units or regions.

MLOps That Ships: Accelerating AI Deployment at Vizient

Deploying AI models efficiently and consistently is a challenge many organizations face. This session will explore how Vizient built a standardized MLOps stack using Databricks and Azure DevOps to streamline model development, deployment and monitoring. Attendees will gain insights into how Databricks Asset Bundles were leveraged to create reproducible, scalable pipelines and how Infrastructure-as-Code principles accelerated onboarding for new AI projects. The talk will cover: End-to-end MLOps stack setup, ensuring efficiency and governance CI/CD pipeline architecture, automating model versioning and deployment Standardizing AI model repositories, reducing development and deployment time Lessons learned, including challenges and best practices By the end of this session, participants will have a roadmap for implementing a scalable, reusable MLOps framework that enhances operational efficiency across AI initiatives.

FinOps: Automated Unity Catalog Cost Observability, Data Isolation and Governance Framework

Westat, a leader in data-driven research for 60 years+, has implemented a centralized Databricks platform to support hundreds of research projects for government, foundations, and private clients. This initiative modernizes Westat’s technical infrastructure while maintaining rigorous statistical standards and streamlining data science. The platform enables isolated project environments with strict data boundaries, centralized oversight, and regulatory compliance. It allows project-specific customization of compute and analytics, and delivers scalable computing for complex analyses. Key features include config-driven Infrastructure as Code (IaC) with Terragrunt, custom tagging and AWS cost integration for ROI tracking, budget policies with alerts for proactive cost management, and a centralized dashboard with row-level security for self-service cost analytics. This unified approach provides full financial visibility and governance while empowering data teams to deliver value. Audio for this session is delivered in the conference mobile app, you must bring your own headphones to listen.

Sponsored by: Astronomer | Scaling Data Teams for the Future

The role of data teams and data engineers is evolving. No longer just pipeline builders or dashboard creators, today’s data teams must evolve to drive business strategy, enable automation, and scale with growing demands. Best practices seen in the software engineering world (Agile development, CI/CD, and Infrastructure-as-code) from the DevOps movement are gradually making their way into data engineering. We believe these changes have led to the rise of DataOps and a new wave of best practices that will transform the discipline of data engineering. But how do you transform a reactive team into a proactive force for innovation? We’ll explore the key principles for building a resilient, high-impact data team—from structuring for collaboration, testing, automation, to leveraging modern orchestration tools. Whether you’re leading a team or looking to future-proof your career, you’ll walk away with actionable insights on how to stay ahead in the rapidly changing data landscape.

Coalesce 2024: Orchestration as code with dbt Cloud and Snowflake

In this talk, data engineers from AB CarVal will discuss how to orchestrate jobs in an efficient and timely manner for business-critical data that arrives on a non-regular cadence, and why Infrastructure-as-Code is important and how to extend this to your dbt Cloud jobs running on Snowflake.

Speaker: Rafael Cohn-Gruenwald Sr. Data Engineer Alliance Bernstein

Read the blog to learn about the latest dbt Cloud features announced at Coalesce, designed to help organizations embrace analytics best practices at scale https://www.getdbt.com/blog/coalesce-2024-product-announcements

Coalesce 2024: From Core to Cloud: Unlocking dbt at Warner Brothers Discovery (CNN)

Since the beginning of 2024, the Warner Brothers Discovery team supporting the CNN data platform has been undergoing an extensive migration project from dbt Core to dbt Cloud. Concurrently, the team is also segmenting their project into multi-project frameworks utilizing dbt Mesh. In this talk, Zachary will review how this transition has simplified data pipelines, improved pipeline performance and data quality, and made data collaboration at scale more seamless.

He'll discuss how dbt Cloud features like the Cloud IDE, automated testing, documentation, and code deployment have enabled the team to standardize on a single developer platform while also managing dependencies effectively. He'll share details on how the automation framework they built using Terraform streamlines dbt project deployments with dbt Cloud to a ""push-button"" process. By leveraging an infrastructure as code experience, they can orchestrate the creation of environment variables, dbt Cloud jobs, Airflow connections, and AWS secrets with a unified approach that ensures consistency and reliability across projects.

Speakers: Mamta Gupta Staff Analytics Engineer Warner Brothers Discovery

Zachary Lancaster Manager, Data Engineering Warner Brothers Discovery

Read the blog to learn about the latest dbt Cloud features announced at Coalesce, designed to help organizations embrace analytics best practices at scale https://www.getdbt.com/blog/coalesce-2024-product-announcements

Coalesce 2024: Why analytics engineering and DevOps go hand-in-hand

There are undoubtedly similarities between the disciplines of analytics engineering and DevOps: in fact, dbt was founded with the goal of helping data professionals embrace DevOps principles as part of the data workflow. As the embedded DevOps engineer for a mature analytics engineering function, Katie Claiborne, Founding Analytics Engineer at Duet, observed parallels between analytics-as-code and infrastructure-as-code, particularly tools like Terraform. In this talk, she'll examine how analytics engineering is a means of empowerment for data practitioners and discuss infrastructure engineering as a means of scaling dbt Cloud deployments. Learn about similarities between analytics and infrastructure configuration tools, how to apply the concepts you've learned about analytics engineering towards new disciplines like DevOps, and how to extend engineering principles beyond data transformation and into the world of infrastructure.

Speaker: Katie Claiborne Founding Analytics Engineer Duet

Read the blog to learn about the latest dbt Cloud features announced at Coalesce, designed to help organizations embrace analytics best practices at scale https://www.getdbt.com/blog/coalesce-2024-product-announcements

AWS re:Inforce 2024 - Generative AI to identify potential risks in architectural diagrams (ARC222)

Improving your security posture requires continuous evaluation. This lightning talk demonstrates how generative AI can analyze architectural diagrams to identify AWS security best practices from the AWS Well-Architected Framework. Elements like identity providers or secret management within architectural diagrams can provide insights into your architectural design and help you identify areas for improvement. Additionally, see how to apply this concept to infrastructure as code for deeper inspection of your cloud architecture against AWS security best practices.

Learn more about AWS re:Inforce at https://go.aws/reinforce.

Subscribe: More AWS videos: http://bit.ly/2O3zS75 More AWS events videos: http://bit.ly/316g9t4

ABOUT AWS Amazon Web Services (AWS) hosts events, both online and in-person, bringing the cloud computing community together to connect, collaborate, and learn from AWS experts.

AWS is the world's most comprehensive and broadly adopted cloud platform, offering over 200 fully featured services from data centers globally. Millions of customers—including the fastest-growing startups, largest enterprises, and leading government agencies—are using AWS to lower costs, become more agile, and innovate faster.

reInforce2024 #CloudSecurity #AWS #AmazonWebServices #CloudComputing

Maciej Marek: Peer Beneath the Surface of the Earth

Explore the depths of subsurface analysis with Maciej Marek as he unveils 'Peer Beneath the Surface of the Earth.' 🌍📡 Discover Widmo's innovative Spectral Ground Penetrating Radar and their journey from single to multi-cloud solutions, facing challenges along the way. Join the session to learn about data warehousing, Infrastructure as Code, and the role of AI in automation! 💡🛠️ #SubsurfaceAnalysis #Innovation #MultiCloud

Angelika Postaremczak: Best Practices for Storing Data in BigQuery

Join Angelika Postaremczak in an enlightening session on 'Best Practices for Storing Data in BigQuery' and discover the keys to optimizing data storage for lightning-fast queries without breaking the bank! 🚀💾 Explore table design strategies, data partitioning, clustering, and resource management through Infrastructure as Code for maximizing the potential of cloud data storage. ☁️📊 #BigQuery #datastorage

✨ H I G H L I G H T S ✨

🙌 A huge shoutout to all the incredible participants who made Big Data Conference Europe 2023 in Vilnius, Lithuania, from November 21-24, an absolute triumph! 🎉 Your attendance and active participation were instrumental in making this event so special. 🌍

Don't forget to check out the session recordings from the conference to relive the valuable insights and knowledge shared! 📽️

Once again, THANK YOU for playing a pivotal role in the success of Big Data Conference Europe 2023. 🚀 See you next year for another unforgettable conference! 📅 #BigDataConference #SeeYouNextYear

Rethinking Orchestration as Reconciliation: Software-Defined Assets in Dagster

This talk discusses “software-defined assets”, a declarative approach to orchestration and data management that makes it drastically easier to trust and evolve datasets and ML models. Dagster is an open source orchestrator built for maintaining software-defined assets.

In traditional data platforms, code and data are only loosely coupled. As a consequence, deploying changes to data feels dangerous, backfills are error-prone and irreversible, and it’s difficult to trust data, because you don’t know where it comes from or how it’s intended to be maintained. Each time you run a job that mutates a data asset, you add a new variable to account for when debugging problems.

Dagster proposes an alternative approach to data management that tightly couples data assets to code - each table or ML model corresponds to the function that’s responsible for generating it. This results in a “Data as Code” approach that mimics the “Infrastructure as Code” approach that’s central to modern DevOps. Your git repo becomes your source of truth on your data, so pushing data changes feels as safe as pushing code changes. Backfills become easy to reason about. You trust your data assets because you know how they’re computed and can reproduce them at any time. The role of the orchestrator is to ensure that physical assets in the data warehouse match the logical assets that are defined in code, so each job run is a step towards order.

Software-defined assets is a natural approach to orchestration for the modern data stack, in part because dbt models are a type of software-defined asset.

Attendees of this session will learn how to build and maintain lakehouses of software-defined assets with Dagster.

Connect with us: Website: https://databricks.com Facebook: https://www.facebook.com/databricksinc Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/data... Instagram: https://www.instagram.com/databricksinc/

Turning Big Biology Data into Insights on Disease – The Power of Circulating Biomarkers

Profiling small molecules in human blood across global populations gives rise to a greater understanding of the varied biological pathways and processes that contribute to human health and diseases. Herein, we describe the development of a comprehensive Human Biology Database, derived from nontargeted molecular profiling of over 300,000 human blood samples from individuals across diverse backgrounds, demographics, geographical locations, lifestyles, diseases, and medication regimens, and its applications to inform drug development.

Approximately 11,000 circulating molecules have been captured and measured per sample using Sapient’s high-throughput, high-specificity rapid liquid chromatography-mass spectrometry (rLC-MS) platform. The samples come from cohorts with adjudicated clinical outcomes from prospective studies lasting 10-25 years, as well as data on individuals’ diet, nutrition, physical exercise, and mental health. Genetic information for a subset of subjects is also included and we have added microbiome sequencing data from over 150,000 human samples in diverse diseases.

An efficient data science environment is established to enable effective health insight mining across this vast database. Built on a customized AWS and Databricks “infrastructure-as-code” Terraform configuration, we employ streamlined data ETL and machine learning-based approaches for rapid rLC-MS data extraction. In mining the database, we have been able to identify circulating molecules potentially causal to disease; illuminate the impact of human exposures like diet and environment on disease development, aging, and mortality over decades of time; and support drug development efforts through identification of biomarkers of target engagement, pharmacodynamics, safety, efficacy, and more.

Connect with us: Website: https://databricks.com Facebook: https://www.facebook.com/databricksinc Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/data... Instagram: https://www.instagram.com/databricksinc/