talk-data.com

Topic: CI/CD

Continuous Integration/Continuous Delivery (CI/CD)

Tags: devops, automation, software_development, ci_cd

54 tagged activities

Activity Trend: peak of 21 activities per quarter, 2020-Q1 to 2026-Q1

Activities

54 activities · Newest first

AWS re:Invent 2025 - Improve agent quality in production with Bedrock AgentCore Evaluations (AIM3348)

Amazon Bedrock AgentCore Evaluations provides developers with a unified way to test and validate AI agent performance. In this session, you’ll learn how to apply pre-built metrics for key dimensions such as task success, response quality, and tool accuracy, or define custom success criteria tailored to your needs. See how Evaluations integrates into CI/CD pipelines to catch regressions early and supports online evaluation in production by sampling and scoring live traces to surface real-world issues. Finally, learn how Evaluations helps teams deploy reliable agents faster, reduce operational risk, and continuously assess an agent’s performance at scale through practical implementation patterns.
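
In CI terms, the integration described above boils down to a regression gate on agent quality. A minimal pytest-style sketch of that idea follows; run_agent_evaluation and the thresholds are hypothetical placeholders, not the Bedrock AgentCore Evaluations API.

    # Hypothetical CI regression gate on agent quality. run_agent_evaluation is a
    # placeholder for whatever evaluation client you use; the actual Bedrock
    # AgentCore Evaluations API is not reproduced here.
    import pytest

    THRESHOLDS = {"task_success": 0.90, "response_quality": 0.85, "tool_accuracy": 0.95}

    def run_agent_evaluation(agent_id: str, dataset: str) -> dict:
        # Placeholder: call your evaluation service here and return metric scores.
        return {"task_success": 0.93, "response_quality": 0.88, "tool_accuracy": 0.97}

    @pytest.mark.parametrize("metric,minimum", sorted(THRESHOLDS.items()))
    def test_agent_meets_quality_bar(metric, minimum):
        scores = run_agent_evaluation(agent_id="support-agent", dataset="regression-suite")
        assert scores[metric] >= minimum, f"{metric} fell below the {minimum} gate"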


Towards a more perfect pipeline: CI/CD in the dbt Platform
talk
by Aaiden Witten (United Services Automobile Association), Michael Sturm (United Services Automobile Association), Timothy Shiveley (United Services Automobile Association)

In this session we’ll show how we integrated dbt jobs into our CI/CD pipeline to validate data and run tests on every merge request. Attendees will walk away with a blueprint for implementing CI/CD for dbt, lessons learned from our journey, and best practices to keep data quality high without slowing down development.
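
The abstract does not include the job definitions themselves; as a rough sketch of the kind of CI step it describes, assuming dbt's common state-comparison ("slim CI") pattern, a merge-request check might shell out to dbt like this (paths and selectors are illustrative):

    # Minimal CI-step sketch for running dbt checks on a merge request.
    # Assumes dbt's "slim CI" pattern (state comparison against production
    # artifacts); the session's actual dbt Platform job configuration may differ.
    import subprocess

    def run(cmd: list[str]) -> None:
        print("+", " ".join(cmd))
        subprocess.run(cmd, check=True)  # fail the CI job on any non-zero exit

    # Build and test only models changed in this merge request, deferring
    # unchanged upstream models to production artifacts in ./prod-artifacts.
    run(["dbt", "deps"])
    run(["dbt", "build", "--select", "state:modified+",
         "--defer", "--state", "./prod-artifacts"])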

Zero-footprint SQL testing: From framework to culture shift

We built a zero-footprint SQL testing framework using mock data and the full power of the pytest ecosystem to catch syntactic and semantic issues before they reach production. More than just a tool, it helped shift our team’s mindset by integrating into CI/CD, encouraging contract-driven development, and promoting testable SQL. In this session, we’ll share our journey, key lessons learned, and how we open-sourced the framework to make it available for everyone.
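
The framework's API is not shown in the abstract, so the sketch below only illustrates the underlying idea: seed mock input data, run the SQL under test, and assert on the result with pytest. It uses an in-memory SQLite database for portability; the framework in the talk presumably targets the team's own SQL dialect and tooling.

    # Illustration of zero-footprint SQL testing: mock data in, SQL under test,
    # assertions out. In-memory SQLite keeps the example self-contained.
    import sqlite3
    import pytest

    DAILY_REVENUE_SQL = """
    SELECT order_date, SUM(amount) AS revenue
    FROM orders
    GROUP BY order_date
    """

    @pytest.fixture
    def conn():
        con = sqlite3.connect(":memory:")
        con.execute("CREATE TABLE orders (order_date TEXT, amount REAL)")
        con.executemany("INSERT INTO orders VALUES (?, ?)",
                        [("2024-01-01", 10.0), ("2024-01-01", 5.0), ("2024-01-02", 7.5)])
        yield con
        con.close()

    def test_daily_revenue_aggregates_per_day(conn):
        rows = dict(conn.execute(DAILY_REVENUE_SQL).fetchall())
        assert rows == {"2024-01-01": 15.0, "2024-01-02": 7.5}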

Lakeflow in Production: CI/CD, Testing and Monitoring at Scale

Building robust, production-grade data pipelines goes beyond writing transformation logic — it requires rigorous testing, version control, automated CI/CD workflows and a clear separation between development and production. In this talk, we’ll demonstrate how Lakeflow, paired with Databricks Asset Bundles (DABs), enables Git-based workflows, automated deployments and comprehensive testing for data engineering projects. We’ll share best practices for unit testing, CI/CD automation, data quality monitoring and environment-specific configurations. Additionally, we’ll explore observability techniques and performance tuning to ensure your pipelines are scalable, maintainable and production-ready.
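
As one small illustration of the unit-testing practice mentioned above, keeping transformation logic in plain, importable functions lets pytest exercise it in CI before a deployment; the function and data below are hypothetical, not from the talk.

    # Hypothetical example of unit-testing pipeline transformation logic in CI.
    # The function stands in for a Lakeflow transformation; keeping the logic in
    # a plain Python function makes it testable without a cluster.
    def flag_late_shipments(rows: list[dict], sla_days: int = 3) -> list[dict]:
        """Add an 'is_late' flag when shipping took longer than the SLA."""
        return [{**r, "is_late": (r["days_to_ship"] > sla_days)} for r in rows]

    def test_flag_late_shipments_respects_sla():
        rows = [{"order_id": 1, "days_to_ship": 2},
                {"order_id": 2, "days_to_ship": 5}]
        flagged = flag_late_shipments(rows)
        assert [r["is_late"] for r in flagged] == [False, True]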

MLOps That Ships: Accelerating AI Deployment at Vizient

Deploying AI models efficiently and consistently is a challenge many organizations face. This session will explore how Vizient built a standardized MLOps stack using Databricks and Azure DevOps to streamline model development, deployment and monitoring. Attendees will gain insights into how Databricks Asset Bundles were leveraged to create reproducible, scalable pipelines and how Infrastructure-as-Code principles accelerated onboarding for new AI projects. The talk will cover:

- End-to-end MLOps stack setup, ensuring efficiency and governance
- CI/CD pipeline architecture, automating model versioning and deployment
- Standardizing AI model repositories, reducing development and deployment time
- Lessons learned, including challenges and best practices

By the end of this session, participants will have a roadmap for implementing a scalable, reusable MLOps framework that enhances operational efficiency across AI initiatives.
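
Vizient's pipeline definitions are not included in the abstract; a minimal sketch of the model-versioning step that such a CI/CD pipeline typically automates with MLflow might look like this (run ID, model name and alias are placeholders):

    # Minimal sketch of automating model versioning in a CI/CD job with MLflow.
    # Run ID, model name and alias are placeholders; Vizient's actual pipeline
    # is not described in the abstract.
    import mlflow
    from mlflow.tracking import MlflowClient

    RUN_ID = "abc123"                      # produced by the training stage
    MODEL_NAME = "readmission_risk_model"  # placeholder registry name

    # Register the model version produced by this pipeline run...
    version = mlflow.register_model(f"runs:/{RUN_ID}/model", MODEL_NAME)

    # ...and point the "challenger" alias at it so deployment jobs can pick it up.
    client = MlflowClient()
    client.set_registered_model_alias(MODEL_NAME, "challenger", version.version)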

The Full Stack of Innovation: Building Data and AI Products With Databricks Apps

In this deep-dive technical session, Ivan Trusov (Sr. SSA @ Databricks) and Giran Moodley (SA @ Databricks) will explore the full-stack development of Databricks Apps, covering everything from frameworks to deployment. We'll walk through essential topics, including:

- Frameworks & tooling: Pythonic (Dash, Streamlit, Gradio) vs. JS + Python stack
- Development lifecycle: debugging, issue resolution and best practices
- Testing: unit, integration and load testing strategies
- CI/CD & deployment: automating with Databricks Asset Bundles
- Monitoring & observability: OpenTelemetry, metrics collection and analysis

Expect a highly practical session with several live demos, showcasing the development loop, testing workflows and CI/CD automation. Whether you're building internal tools or AI-powered products, this talk will equip you with the knowledge to ship robust, scalable Databricks Apps.
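
For orientation on the frameworks portion, a minimal Streamlit app is shown below as a generic example of the kind of code that gets packaged and deployed (for instance with a Databricks Asset Bundle); it is not the session's demo code.

    # Minimal Streamlit sketch of the kind of app discussed in the session.
    # Generic example only; any Databricks-specific integration (Model Serving
    # calls, SQL warehouses) is intentionally left out.
    import streamlit as st

    st.title("Order volume explorer")

    region = st.selectbox("Region", ["EMEA", "AMER", "APAC"])
    days = st.slider("Lookback window (days)", min_value=7, max_value=90, value=30)

    # In a real app this is where you would query a warehouse or call a served model.
    st.write(f"Showing the last {days} days of orders for {region}.")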

LLMOps at Intermountain Health: A Case Study on AI Inventory Agents

In this session, we will delve into the creation of an infrastructure, CI/CD processes and monitoring systems that facilitate the responsible and efficient deployment of Large Language Models (LLMs) at Intermountain Healthcare. Using the "AI Inventory Agents" project as a case study, we will showcase how an LLM Agent can assist in effort and impact estimates, as well as provide insights into various AI products, both custom-built and third-party hosted. This includes their responsible AI certification status, development status and monitoring status (lights on, performance, drift, etc.). Attendees will learn how to build and customize their own LLMOps infrastructure to ensure seamless deployment and monitoring of LLMs, adhering to responsible AI practices.

Streamlining AI Application Development With Databricks Apps

Think Databricks is just for data and models? Think again. In this session, you’ll see how to build and scale a full-stack AI app capable of handling thousands of queries per second entirely on Databricks. No extra cloud platforms, no patchwork infrastructure. Just one unified platform with native hosting, LLM integration, secure access, and built-in CI/CD. Learn how Databricks Apps, along with services like Model Serving, Jobs, and Gateways, streamline your architecture, eliminate boilerplate, and accelerate development, from prototype to production.

Comprehensive Guide to MLOps on Databricks

This in-depth session explores advanced MLOps practices for implementing production-grade machine learning workflows on Databricks. We'll examine the complete MLOps journey from foundational principles to sophisticated implementation patterns, covering essential tools including MLflow, Unity Catalog, Feature Stores and version control with Git. Dive into Databricks' latest MLOps capabilities including MLflow 3.0, which enhances the entire ML lifecycle from development to deployment with particular focus on generative AI applications. Key session takeaways include:

- Advanced MLflow 3.0 features for LLM management and deployment
- Enterprise-grade governance with Unity Catalog integration
- Robust promotion patterns across development, staging and production
- CI/CD pipeline automation for continuous deployment
- GenAI application evaluation and streamlined deployment
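
The promotion pattern in the list above can be sketched with MLflow's Unity Catalog registry, where three-level names separate environments; catalog, schema and model names are placeholders, and the exact mechanics shown in the session may differ.

    # Sketch of promoting a model across environments with Unity Catalog names.
    # Catalog/schema/model names are placeholders.
    import mlflow
    from mlflow.tracking import MlflowClient

    mlflow.set_registry_uri("databricks-uc")  # use the Unity Catalog registry
    client = MlflowClient()

    # Copy the staging "champion" version into the production catalog...
    src = "models:/staging.ml.churn_model@champion"
    prod_version = client.copy_model_version(src, "prod.ml.churn_model")

    # ...and mark it as the version that serving endpoints should load.
    client.set_registered_model_alias("prod.ml.churn_model", "champion",
                                      prod_version.version)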

From Imperative to Declarative Paradigm: Rebuilding a CI/CD Infrastructure Using Hatch and DABs

Building and deploying PySpark pipelines to Databricks should be effortless. However, our team at FreeWheel has, for the longest time, struggled with a convoluted and hard-to-maintain CI/CD infrastructure. It followed an imperative paradigm, demanding that every project implement custom scripts to build artifacts and deploy resources, and resulting in redundant boilerplate code and awkward interactions with the Databricks REST API. We set our minds on rebuilding it from scratch, following a declarative paradigm instead. We will share how we were able to eliminate thousands of lines of code from our repository, create a fully configuration-driven infrastructure where projects can be easily onboarded, and improve the quality of our codebase using Hatch and Databricks Asset Bundles as our tools of choice. In particular, DABs have made deploying across our 3 environments a breeze, and have allowed us to quickly adopt new features as soon as they are released by Databricks.

A Prescription for Success: Leveraging DABs for Faster Deployment and Better Patient Outcomes

Health Catalyst (HCAT) transformed its CI/CD strategy by replacing a rigid, internal deployment tool with Databricks Asset Bundles (DABs), unlocking greater agility and efficiency. This shift streamlined deployments across both customer workspaces and HCAT's core platform, accelerating time to insights and driving continuous innovation. By adopting DABs, HCAT ensures feature parity, standardizes metric stores across clients, and rapidly delivers tailored analytics solutions. Attendees will gain practical insights into modernizing CI/CD pipelines for healthcare analytics, leveraging Databricks to scale data-driven improvements. HCAT's next-generation platform, Health Catalyst Ignite™, integrates healthcare-specific data models, self-service analytics, and domain expertise—powering faster, smarter decision-making.

From Days to Seconds — Reducing Query Times on Large Geospatial Datasets by 99%

The Global Water Security Center translates environmental science into actionable insights for the U.S. Department of Defense. Prior to incorporating Databricks, responding to these requests required querying approximately five hundred thousand raster files representing over five hundred billion points. By leveraging lakehouse architecture, Databricks Auto Loader, Spark Streaming, Databricks Spatial SQL, H3 geospatial indexing and Databricks Liquid Clustering, we were able to drastically reduce our “time to analysis” from multiple business days to a matter of seconds. Now, our data scientists execute queries on pre-computed tables in Databricks, resulting in a “time to analysis” that is 99% faster, giving our teams more time for deeper analysis of the data. Additionally, we’ve incorporated Databricks Workflows, Databricks Asset Bundles, Git and GitHub Actions to support CI/CD across workspaces. We completed this work in close partnership with Databricks.
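
The abstract stays at the architecture level; as an illustration of the H3 pre-computation step, a Spark SQL statement run from Python might look like the sketch below. Table names, columns and the H3 resolution are invented for the example.

    # Sketch of pre-computing an H3-indexed aggregate so ad hoc queries hit a
    # small, liquid-clustered table instead of raw points. Names and the H3
    # resolution are illustrative; see Databricks' H3 expression docs.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    spark.sql("""
      CREATE OR REPLACE TABLE precomputed.h3_rainfall
      CLUSTER BY (h3_cell) AS
      SELECT
        h3_longlatash3(longitude, latitude, 7) AS h3_cell,  -- resolution-7 cells
        DATE(observed_at)                      AS obs_date,
        AVG(rainfall_mm)                       AS avg_rainfall_mm
      FROM raw.rainfall_points
      GROUP BY 1, 2
    """)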

Sponsored by: Astronomer | Scaling Data Teams for the Future

The role of data teams and data engineers is evolving. No longer just pipeline builders or dashboard creators, today’s data teams are expected to drive business strategy, enable automation, and scale with growing demands. Best practices seen in the software engineering world (Agile development, CI/CD, and Infrastructure-as-Code) from the DevOps movement are gradually making their way into data engineering. We believe these changes have led to the rise of DataOps and a new wave of best practices that will transform the discipline of data engineering. But how do you transform a reactive team into a proactive force for innovation? We’ll explore the key principles for building a resilient, high-impact data team, from structuring for collaboration, testing and automation to leveraging modern orchestration tools. Whether you’re leading a team or looking to future-proof your career, you’ll walk away with actionable insights on how to stay ahead in the rapidly changing data landscape.

How Databricks Powers Real-Time Threat Detection at Barracuda XDR

As cybersecurity threats grow in volume and complexity, organizations must efficiently process security telemetry for best-in-class detection and mitigation. Barracuda’s XDR platform is redefining security operations by layering advanced detection methodologies over a broad range of supported technologies. Our vision is to deliver unparalleled protection through automation, machine learning and scalable detection frameworks, ensuring threats are identified and mitigated quickly. To achieve this, we have adopted Databricks as the foundation of our security analytics platform, providing greater control and flexibility while decoupling from traditional SIEM tools. By leveraging Lakeflow Declarative Pipelines, Spark Structured Streaming and detection-as-code CI/CD pipelines, we have built a real-time detection engine that enhances scalability, accuracy and cost efficiency. This session explores how Databricks is shaping the future of XDR through real-time analytics and cloud-native security.
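
As a rough illustration of "detection as code" on streaming telemetry, the sketch below expresses one rule as a Spark Structured Streaming query; table names, columns and the rule itself are invented, not Barracuda's.

    # Sketch of a detection rule running continuously on streaming telemetry.
    # All names and the rule are illustrative.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.getOrCreate()

    events = spark.readStream.table("security.auth_events")
    bad_ips = spark.read.table("security.known_bad_ips")  # small static lookup

    # Rule: failed logins originating from a known-bad IP address.
    alerts = (events
        .where(F.col("event_type") == "login_failed")
        .join(bad_ips, on="source_ip", how="inner")
        .withColumn("detection", F.lit("failed_login_from_known_bad_ip")))

    (alerts.writeStream
        .option("checkpointLocation", "/checkpoints/failed_login_bad_ip")
        .outputMode("append")
        .toTable("detections.alerts"))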

Using Identity Security With Unity Catalog for Faster, Safer Data Access

Managing authentication effectively is key to securing your data platform. In this session, we’ll explore best practices from Databricks for overcoming authentication challenges, including token visibility, MFA/SSO, CI/CD token federation and risk containment. Discover how to map your authentication maturity journey while maximizing security ROI. We'll showcase new capabilities like access token reports for improved visibility, streamlined MFA implementation and secure SSO with token federation. Learn strategies to minimize token risk through TTL limits, scoped tokens and network policies. You'll walk away with actionable insights to enhance your authentication practices and strengthen platform security on Databricks.
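
One concrete piece of the token-risk guidance, TTL limits, can be sketched with the Databricks SDK for Python; treat this as an illustration of minting short-lived tokens, not the session's full recommendations.

    # Sketch: prefer short-lived tokens for automation. Uses the Databricks SDK
    # for Python; authentication is resolved from the environment. This only
    # illustrates TTL limits, not SSO/MFA, token federation or network policies.
    from databricks.sdk import WorkspaceClient

    w = WorkspaceClient()

    # Mint a token that expires in one hour rather than a long-lived credential;
    # pair this with workspace-level maximum-lifetime and scoping policies.
    short_lived = w.tokens.create(comment="ci-pipeline", lifetime_seconds=3600)
    # short_lived.token_value holds the secret; hand it to the job that needs it
    # and let it expire instead of relying on manual revocation.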

CI/CD for Databricks: Advanced Asset Bundles and GitHub Actions

This session is repeated. Databricks Asset Bundles (DABs) provide a way to use the command line to deploy and run a set of Databricks assets, such as notebooks, Python code, Lakeflow Declarative Pipelines and workflows. To automate deployments, you create a deployment pipeline that uses the power of DABs along with other validation steps to ensure high-quality deployments. In this session you will learn how to automate CI/CD processes for Databricks while following best practices to keep deployments easy to scale and maintain. After a brief explanation of why Databricks Asset Bundles are a good option for CI/CD, we will walk through a working project including advanced variables, target-specific overrides, linting, integration testing and automatic deployment upon code review approval. You will leave the session clear on how to build your first GitHub Action using DABs.
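
The session's automation lives in a GitHub Actions workflow (YAML); to keep the examples here in one language, the same gate is sketched as a Python script a CI job could run, shelling out to the Databricks CLI. Target names and the lint/test commands are illustrative.

    # Sketch of a deployment gate driven from CI: lint, test, validate the
    # bundle, then deploy to the requested target. In practice this usually
    # lives directly in a GitHub Actions workflow; commands are illustrative.
    import subprocess
    import sys

    TARGET = sys.argv[1] if len(sys.argv) > 1 else "dev"

    def run(cmd: list[str]) -> None:
        print("+", " ".join(cmd))
        subprocess.run(cmd, check=True)  # stop the pipeline on the first failure

    run(["ruff", "check", "."])                                  # lint
    run(["pytest", "tests/"])                                    # unit/integration tests
    run(["databricks", "bundle", "validate", "--target", TARGET])
    run(["databricks", "bundle", "deploy", "--target", TARGET])  # deploy on approval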

Getting Started With Lakeflow Connect

Hundreds of customers are already ingesting data with Lakeflow Connect from SQL Server, Salesforce, ServiceNow, Google Analytics, SharePoint, PostgreSQL and more to unlock the full power of their data. Lakeflow Connect introduces built-in, no-code ingestion connectors from SaaS applications, databases and file sources to help unlock data intelligence. In this demo-packed session, you’ll learn how to ingest ready-to-use data for analytics and AI with a few clicks in the UI or a few lines of code. We’ll also demonstrate how Lakeflow Connect is fully integrated with the Databricks Data Intelligence Platform for built-in governance, observability, CI/CD, automated pipeline maintenance and more. Finally, we’ll explain how to use Lakeflow Connect in combination with downstream analytics and AI tools to tackle common business challenges and drive business impact.

Shifting From Reactive to Proactive at Glassdoor | Zakariah Siyaji | Shift Left Data Conference 2025

As Glassdoor scaled to petabytes of data, ensuring data quality became critical for maintaining trust and supporting strategic decisions. Glassdoor implemented a proactive, “shift left” strategy focused on embedding data quality practices directly into the development process. This talk will detail how Glassdoor leveraged data contracts, static code analysis integrated into the CI/CD pipeline, and automated anomaly detection to empower software engineers and prevent data issues at the source. Attendees will learn how proactive data quality management reduces risk, promotes stronger collaboration across teams, enhances operational efficiency, and fosters a culture of trust in data at scale.
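
As a small illustration of the contract checks described above, a CI test can compare a produced schema against a declared contract before a change ships; the contract format and dataset below are hypothetical, not Glassdoor's tooling.

    # Illustration of a data-contract check of the kind that can run in CI
    # before a producer change ships. Contract format and dataset are made up.
    CONTRACT = {
        "job_postings": {
            "required_columns": {"posting_id": "string", "company_id": "string",
                                 "posted_at": "timestamp"},
        }
    }

    def validate_against_contract(dataset: str, schema: dict[str, str]) -> list[str]:
        """Return a list of human-readable violations (empty list means pass)."""
        expected = CONTRACT[dataset]["required_columns"]
        violations = [f"missing column '{col}'" for col in expected if col not in schema]
        violations += [f"column '{col}' is {schema[col]}, expected {typ}"
                       for col, typ in expected.items()
                       if col in schema and schema[col] != typ]
        return violations

    def test_job_postings_matches_contract():
        produced_schema = {"posting_id": "string", "company_id": "string",
                           "posted_at": "timestamp", "salary_min": "double"}
        assert validate_against_contract("job_postings", produced_schema) == []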

Coalesce 2024: Making data rewarding at Bilt (Rewards)

Dive into the technical evolution of Bilt’s data infrastructure as they moved from fragmented, slow, and costly analytics to a streamlined, scalable, and holistic solution with dbt Cloud. In this session, the Bilt team will share how they implemented data modeling practices, established a robust CI/CD pipeline, and leveraged dbt’s Semantic Layer to enable a more efficient and trusted analytics environment. Attendees will gain a deep understanding of Bilt’s approach to data, including cost optimization, enhancing data accessibility and reliability and, most importantly, supporting scale and growth.

Speakers: Ben Kramer, Director, Data & Analytics, Bilt Rewards

James Dorado, VP, Data Analytics, Bilt Rewards

Nick Heron, Senior Manager, Data Analytics, Bilt Rewards

Read the blog to learn about the latest dbt Cloud features announced at Coalesce, designed to help organizations embrace analytics best practices at scale: https://www.getdbt.com/blog/coalesce-2024-product-announcements