talk-data.com
Airflow Summit
session
2025-07-01
Scaling ML Infrastructure: Lessons from Building Distributed Systems
Event:
Airflow Summit 2025
Topics
Description
In today’s data-driven world, scalable ML infrastructure is mission-critical. As ML workloads grow, orchestration tools like Apache Airflow become essential for managing pipelines, training, deployment, and observability. In this talk, I’ll share lessons from building distributed ML systems across cloud platforms, including GPU-based training and AI-powered healthcare. We’ll cover patterns for scaling Airflow DAGs, integrating telemetry and auto-healing, and aligning cross-functional teams. Whether you’re launching your first pipeline or managing ML at scale, you’ll gain practical strategies to make Airflow the backbone of your ML infrastructure.