talk-data.com

Topic: Airflow (Apache Airflow)

Tags: workflow_management, data_orchestration, etl

139 tagged activities

Activity Trend: 157 peak/qtr (2020-Q1 to 2026-Q1)

Activities

Showing filtered results

Filtering by: Airflow Summit 2025

Operating within the stringent regulatory landscape of Corporate Banking, Deutsche Bank relies heavily on robust data orchestration. This session explores how Deutsche Bank’s Corporate Bank leverages Apache Airflow across diverse environments, including both on-premises infrastructure and cloud platforms. Discover their approach to managing critical data & analytics workflows, encompassing areas like regulatory reporting, data integration and complex data processing pipelines. Gain insights into the architectural patterns and operational best practices employed to ensure compliance, security, and scalability when running Airflow at scale in a highly regulated, hybrid setting.

Ever seen a DAG go rogue and deploy itself? Or try to time travel back to 1999? Join us for a light-hearted yet painfully relatable look at how not to scale your Airflow deployment, so you can avoid the chaos and debugging nightmares that follow. We’ll cover the classics: hardcoded secrets, unbounded retries (hello, immortal task!), and the infamous spaghetti DAG where 200 tasks are lovingly connected by hand and no one dares open the Airflow UI anymore. If you’ve ever used datetime.now() in your DAG definition and watched your backfills implode, this talk is for you. From the BashOperator that became sentient to the XCom that tried to pass a whole Pandas DataFrame and the key to your mother’s house, we’ll walk through real-world bloopers with practical takeaways. You’ll learn why overusing PythonOperator is a recipe for mess, how careless sensor usage invites resource starvation, and why scheduling in local timezones is basically asking for a daylight saving time horror story. Other highlights include:
- Over-provisioning resources in KubernetesPodOperator: many teams allocate excessive memory/CPU “just in case”, leading to cluster contention and resource waste.
- Dynamic task mapping gone wild: 10,000 mapped tasks later… the scheduler is still crying.
- SLAs used as data quality guarantees: creating alerts so noisy, nobody listens.
- Design-free DAGs: no docs, no comments, no idea why a task has a 3-day timeout.
Finally, we’ll round it out with some dos and don’ts: using environment variables, avoiding memory-hungry monolith DAGs, skipping global imports, and not allocating 10x more memory “just in case.” Whether you’re new to Airflow or battle-hardened from a thousand failed backfills, come learn how to scale your pipelines without losing your mind (or your cluster).
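
As a minimal sketch of the fix for the datetime.now() blooper (not code from the talk, and assuming an Airflow 2.x-style DAG file; the dag_id, task, and path are invented for illustration):

```python
import pendulum
from datetime import timedelta
from airflow import DAG
from airflow.operators.python import PythonOperator

# Anti-pattern: start_date=datetime.now() gives the scheduler a new start date on
# every parse, so intervals never settle and backfills misbehave. Use a static,
# timezone-aware start date and bounded retries instead.
with DAG(
    dag_id="well_behaved_pipeline",
    start_date=pendulum.datetime(2025, 1, 1, tz="UTC"),  # static + timezone-aware
    schedule="@daily",
    catchup=False,  # opt in to backfills deliberately, not by accident
    default_args={
        "retries": 3,  # bounded retries, no immortal tasks
        "retry_delay": timedelta(minutes=5),
    },
) as dag:

    def extract(**context):
        # Push small references (paths, counts) through XCom, never whole DataFrames.
        return "s3://bucket/run.parquet"

    PythonOperator(task_id="extract", python_callable=extract)
```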

In this talk, we’ll share our journey and lessons learned from developing a new open-source Airflow operator that integrates a newly-launched AWS service with the Airflow ecosystem. This real-world case study will illuminate the complete lifecycle of building an Airflow operator, from initial design to successful community contribution. We’ll dive deep into the practical challenges and solutions encountered throughout the journey, including:
- Evaluating when to build a new operator versus extending existing ones
- Navigating the Apache Airflow open-source contribution process
- Best practices for operator design and implementation
- Key learnings and common pitfalls to avoid during the testing and release process
Whether you’re looking to contribute to Apache Airflow or build custom operators, this session will provide valuable insights into the development process, common pitfalls to avoid, and best practices when contributing to and collaborating with the Apache Airflow community. Expect to leave with a practical roadmap for your own contributions and the confidence to successfully engage with the Apache Airflow ecosystem.
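
For context, this is roughly the shape a new operator takes. The sketch below is a generic skeleton, not the AWS operator described in the talk; the class name, parameters, and connection id are placeholders:

```python
from airflow.models.baseoperator import BaseOperator


class StartExampleJobOperator(BaseOperator):
    """Hypothetical operator that starts a job in an external service."""

    # Fields rendered with Jinja templates at runtime
    template_fields = ("job_name",)

    def __init__(self, *, job_name: str, aws_conn_id: str = "aws_default", **kwargs):
        super().__init__(**kwargs)
        self.job_name = job_name
        self.aws_conn_id = aws_conn_id

    def execute(self, context):
        # Real operators delegate API calls to a hook so the logic is reusable and
        # testable; here we only log a placeholder message.
        self.log.info("Starting job %s using connection %s", self.job_name, self.aws_conn_id)
        return {"job_name": self.job_name}  # the return value is pushed to XCom
```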

Apache Airflow 3 is the new state-of-the-art version of Airflow. For many users planning to adopt Airflow 3, it’s important to understand how it behaves from a performance perspective compared to Airflow 2. This presentation shares performance results for various Airflow 3 configurations and gives potential Airflow 3 adopters a good understanding of its performance. The reference Airflow 3 configuration uses a Kubernetes cluster as the compute layer and PostgreSQL as the Airflow database, and the tests run on Google Cloud Platform. Performance tests are performed using the community version of the performance-test framework, and there may be references to Cloud Composer (the managed service for Apache Airflow). The tests are done in production-grade configurations that can serve as good references for Airflow community users. Attendees will get a comparison of Airflow 3 and Airflow 2 from a performance standpoint, and will learn how to optimize Airflow scheduler performance by understanding DAG file processing and task scheduling, and by configuring the scheduler to run tens of thousands of DAGs and tasks in Airflow 3.

At LinkedIn, our data pipelines process exabytes of data, with our offline infrastructure executing 300K ETL workflows daily with 10K concurrent executions. Historically, these workloads ran on our legacy system, Azkaban, which faced UX, scalability, and operational challenges. To modernize our infrastructure, we built a managed Airflow service, leveraging its enhanced developer & operator experience, rich feature set, and strong OSS community support. That initiated LinkedIn’s largest-ever infrastructure migration, transitioning thousands of legacy workflows to Airflow. In this talk, we will share key lessons from migrating massive-scale pipelines with minimal production disruption. We will discuss:
- Overall migration strategy
- Custom tooling enhancements for testing, deployment, and observability
- Architectural innovations decoupling orchestration and compute
- GenAI-powered migration automating code rewrites
- Post-migration challenges & Airflow 3.0
Attendees will walk away with battle-tested strategies for large-scale Airflow adoption and practical insights into scaling Airflow in enterprise environments.

Apache Airflow is a powerful workflow orchestrator, but as workloads grow, its Python-based components can become performance bottlenecks. This talk explores how Rust, with its speed, safety, and concurrency advantages, can enhance Airflow’s core components (e.g., the scheduler and DAG processor). We’ll dive into the motivations behind using Rust, architectural trade-offs, and the challenges of bridging the gap between Python and Rust. A proof-of-concept showcasing an Airflow scheduler rewritten in Rust will demonstrate the potential benefits of this approach.

Last year, we shared how LinkedIn’s continuous deployment platform (LCD) leveraged Apache Airflow to streamline and automate deployment workflows. LCD is the deployment platform inside LinkedIn, actively used by all 10,000+ engineers at the company. This year, we take a deeper dive into the challenges, solutions, and engineering innovations that helped us scale Airflow to support thousands of concurrent tasks while maintaining usability and reliability. Key takeaways:
- Abstracting Airflow for a better user experience: how we designed a system where users could define and update their workflows without directly interacting with Airflow.
- Scaling to 10,000+ concurrent tasks: the architectural and configuration changes that enabled us to scale execution efficiently.
- Enhanced observability & monitoring: the tools and techniques we implemented to track Airflow’s health, detect failures, and improve reliability.
- Lessons from the field: key learnings, trade-offs, and best practices for managing large-scale Airflow deployments.

Airflow 3 brings several exciting new features that better support MLOps:
- Native, intuitive backfills
- Removal of the unique execution date for dag runs
- Native support for event-driven scheduling
These features, combined with the Airflow AI SDK, enable dag authors to easily build scalable, maintainable, and performant LLMOps pipelines. In this talk, we’ll go through a series of workflows that use the Airflow AI SDK to empower Astronomer’s support staff to more quickly resolve problems faced by Astronomer’s customers.
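
To make the LLMOps angle concrete, here is a rough sketch of the kind of task the Airflow AI SDK enables. The @task.llm decorator follows the SDK’s published examples, but the model name, prompt, dag, and parameters here are illustrative assumptions rather than the talk’s actual pipelines; check the SDK documentation for the exact API:

```python
import pendulum
from airflow.decorators import dag, task  # @task.llm is contributed by the airflow-ai-sdk package


@dag(start_date=pendulum.datetime(2025, 1, 1, tz="UTC"), schedule=None, catchup=False)
def support_ticket_triage():

    @task
    def fetch_ticket() -> str:
        # Placeholder: a real pipeline would pull a ticket from a queue or API.
        return "Scheduler pods keep restarting after upgrading to Airflow 3."

    @task.llm(
        model="gpt-4o-mini",  # assumed model name, purely illustrative
        result_type=str,
        system_prompt="Summarize the support ticket and suggest a first debugging step.",
    )
    def triage(ticket_text: str) -> str:
        # The function body can transform the input before it is sent to the LLM.
        return ticket_text

    triage(fetch_ticket())


support_ticket_triage()
```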

A real-world journey of how my small team at Xena Intelligence built robust data pipelines for our enterprise customers using Airflow. If you’re a data engineer, or part of a small team, this talk is for you. Learn how we orchestrated a complex workflow to process millions of public reviews. What you’ll learn:
- Cost-efficient DAG design: decomposing complex processes into atomic tasks using the TaskFlow API, XComs, mapped tasks, and task groups. We’ll dive into one of our DAGs as a concrete example of how our approach optimizes parallelism, error handling, delivery speed, and reliability.
- Integrating LLM analysis: how we integrated LLM-based analysis into our pipeline, and how we designed the database, queries, and ingestion to Postgres.
- Extending the Airflow UI: we developed a custom Airflow UI plugin that filters and visualizes DAG runs by customer, product, and marketplace, delivering clear insights for faster troubleshooting.
- Leveraging the Airflow REST API: how we leveraged the API to trigger DAGs on demand, elevating the UX by tracking mapped DAG progress and computing ETAs.
- CI/CD and cost management: practical tips for deploying DAGs with CI/CD.
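
A minimal sketch of the TaskFlow-plus-mapping pattern the first bullet describes (not the team’s actual DAG; the review source, batching, and names are invented for illustration):

```python
import pendulum
from airflow.decorators import dag, task


@dag(start_date=pendulum.datetime(2025, 1, 1, tz="UTC"), schedule="@daily", catchup=False)
def review_processing():

    @task
    def list_review_batches() -> list[str]:
        # Return small references (e.g. object-store paths), not raw data, so XCom stays light.
        return [f"s3://reviews/batch_{i}.json" for i in range(10)]

    @task(retries=2)
    def process_batch(batch_path: str) -> int:
        # Placeholder for parsing + LLM analysis of one batch; returns a row count.
        return 0

    @task
    def report(counts: list[int]) -> None:
        print(f"Processed {sum(counts)} reviews across {len(counts)} batches")

    # Dynamic task mapping: one parallel task instance per batch.
    counts = process_batch.expand(batch_path=list_review_batches())
    report(counts)


review_processing()
```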

MWAA is an AWS-managed service that simplifies the deployment and maintenance of the open-source Apache Airflow data orchestration platform. MWAA has recently introduced several new features to enhance the experience for data engineering teams. Features such as the graceful worker replacement strategy, which enables seamless MWAA environment updates with zero downtime, IPv6 support, and in-place minor Airflow version downgrades are some of the many improvements MWAA has brought to its users in 2025. Last but not least, the release of Airflow 3.0 support brings the latest open-source features, introducing a new web-server UI and better isolation and security for environments. These enhancements demonstrate Amazon’s continued investment in making Airflow more accessible and scalable for enterprises through the MWAA service.

In today’s data-driven world, effective workflow management and AI are crucial for success. However, there’s a notable gap between Airflow and AI. Our presentation offers a solution to close this gap: an MCP (Model Context Protocol) server that acts as a bridge. We’ll dive into two paths:
- AI-augmented Airflow: enhancing Airflow with AI to improve error handling, automate DAG generation, proactively detect issues, and optimize resource use.
- Airflow-powered AI: utilizing Airflow’s reliability to empower LLMs in executing complex tasks, orchestrating AI agents, and supporting decision-making with real-time data.
Key takeaways:
- Understanding how to integrate AI insights directly into your workflow orchestration.
- Learning how MCP empowers AI with robust orchestration capabilities, offering full logging, monitoring, and auditability.
- Gaining insights into how to transform LLMs from reactive responders into proactive, intelligent, and reliable executors.
We invite you to explore how MCP can help workflow management, making AI-driven decisions more reliable and turning workflow systems into intelligent, autonomous agents.
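
A rough sketch of the “Airflow-powered AI” direction, assuming the official MCP Python SDK’s FastMCP helper and Airflow’s stable REST API. The server name, base URL, credentials, and the basic-auth setup are placeholders (and Airflow 3 serves its API under a different path), so treat this as an illustration rather than the presenters’ implementation:

```python
import os

import requests
from mcp.server.fastmcp import FastMCP  # official MCP Python SDK (assumed installed)

# Hypothetical MCP server exposing one Airflow action as a tool an LLM agent can call.
mcp = FastMCP("airflow-bridge")

AIRFLOW_URL = os.environ.get("AIRFLOW_URL", "http://localhost:8080")
AUTH = (os.environ.get("AIRFLOW_USER", "admin"), os.environ.get("AIRFLOW_PASSWORD", ""))


@mcp.tool()
def trigger_dag(dag_id: str) -> str:
    """Trigger a DAG run through the Airflow stable REST API (v1 path, Airflow 2.x)."""
    resp = requests.post(
        f"{AIRFLOW_URL}/api/v1/dags/{dag_id}/dagRuns",
        json={"conf": {}},
        auth=AUTH,
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["dag_run_id"]


if __name__ == "__main__":
    mcp.run()
```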

This session details practical strategies for introducing Apache Airflow in strict, compliance-heavy organizations. Learn how on-premise deployment and hybrid tooling can help modernize legacy workflows when public cloud solutions and container technologies are restricted. Discover how cross-platform engineering teams can collaborate securely using CI/CD bridges, and what it takes to meet rigorous security and governance standards. Key lessons address navigating resistance to change, achieving production sign-off, and avoiding common compliance pitfalls, relevant to anyone automating in public sector settings.

As data engineers, our jobs regularly include scheduling or scaling workflows. But have you ever asked yourself: can I scale my scheduling? It turns out that you can! But doing so raises a number of issues that need to be addressed. In this talk we’ll be:
- Recapping asset-aware scheduling in Apache Airflow
- Discussing diverse methods to upscale our scheduling
- Solving the issue of keeping our Airflow assets synchronized between instances
- Comparing our own push-based solution with the built-in solution from AIP-82, and the pros and cons of each method
I hope you will enjoy it!
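
For readers who want the recap up front, here is a minimal sketch of asset-aware scheduling, assuming the Airflow 3 Task SDK import path (on Airflow 2.x the equivalent is Dataset from airflow.datasets); the asset URI and dag names are made up:

```python
import pendulum
from airflow.sdk import Asset, dag, task  # Airflow 3 Task SDK

# A logical data asset: the producer declares it as an outlet, the consumer schedules on it.
reviews_asset = Asset("s3://warehouse/reviews/daily.parquet")


@dag(start_date=pendulum.datetime(2025, 1, 1, tz="UTC"), schedule="@daily", catchup=False)
def producer():
    @task(outlets=[reviews_asset])
    def publish():
        # Finishing successfully emits an asset event for reviews_asset.
        pass

    publish()


@dag(start_date=pendulum.datetime(2025, 1, 1, tz="UTC"), schedule=[reviews_asset], catchup=False)
def consumer():
    @task
    def transform():
        # Runs whenever the asset above receives a new event.
        pass

    transform()


producer()
consumer()
```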

At Yahoo, we built a secure, scalable, and cost-efficient batch processing platform using Amazon MWAA to orchestrate Apache Flink jobs on EKS, managed by the Flink Kubernetes Operator. This setup enables dynamic job orchestration while meeting strict enterprise compliance standards. In this session, we’ll share how Airflow DAGs:
- Dynamically launch, monitor, and clean up isolated Flink clusters per batch job, improving resource efficiency.
- Securely fetch the EKS kubeconfig, submit FlinkDeployment CRDs using FlinkKubernetesOperator, and poll job status using Airflow sensors.
- Integrate IAM for access control and meet Yahoo’s security requirements, including mutual TLS (mTLS) with Athenz.
- Optimize for cost and resilience through automated cleanup of jobs and the operator, and handle job failures and retries.
Join us for practical strategies and lessons from Yahoo’s production-scale Flink workflows in a Kubernetes environment.
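
A bare-bones sketch of the submit-and-poll pattern this session covers, using the Apache Flink provider’s operator and sensor; the manifest file, namespace, connection id, and application name are placeholders, and the exact parameters should be checked against the apache-airflow-providers-apache-flink documentation rather than taken as Yahoo’s setup:

```python
import pendulum
from airflow import DAG
from airflow.providers.apache.flink.operators.flink_kubernetes import FlinkKubernetesOperator
from airflow.providers.apache.flink.sensors.flink_kubernetes import FlinkKubernetesSensor

with DAG(
    dag_id="flink_batch_job",
    start_date=pendulum.datetime(2025, 1, 1, tz="UTC"),
    schedule=None,
    catchup=False,
) as dag:
    # Submit a FlinkDeployment CRD; the Flink Kubernetes Operator in the cluster
    # turns it into an isolated, per-job Flink cluster.
    submit = FlinkKubernetesOperator(
        task_id="submit_flink_job",
        application_file="flink_deployment.yaml",  # FlinkDeployment manifest (placeholder)
        namespace="flink-jobs",
        kubernetes_conn_id="eks_default",
    )

    # Poll the FlinkDeployment until the job reaches a terminal state.
    wait = FlinkKubernetesSensor(
        task_id="wait_for_flink_job",
        application_name="batch-job",  # metadata.name from the manifest (placeholder)
        namespace="flink-jobs",
        kubernetes_conn_id="eks_default",
    )

    submit >> wait
```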

Our development workflows look dramatically different than they did a year ago. Code generation, automated testing, and AI-assisted documentation tools are now part of many developers’ daily work. Yet as these tools reshape how we code, I’ve noticed something worth examining: while our toolbox is changing rapidly, the core of being a good developer hasn’t. Problem-solving, collaborative debugging, and systems thinking remain as crucial as ever. In this keynote, I’ll share observations about:
- Which parts of our workflow are genuinely enhanced by new tools.
- The development skills that continue to separate good code from great code.
- How teams can collaborate effectively when everyone’s tools are evolving.
- What Airflow’s journey teaches us about balancing innovation with stability.
No hype or grand pronouncements, just an honest look at incorporating new tools while preserving the craft that makes us developers in the first place.

Before Airflow, our BigQuery pipelines at Create Music Group operated like musicians without a conductor—each playing on its own schedule, regardless of whether upstream data was ready. As our data platform grew, this chaos led to spiralling costs, performance bottlenecks, and became utterly unsustainable. This talk tells the story of how Create Music Group brought harmony to its data workflows by adopting Apache Airflow and the Medallion architecture, ultimately slashing our data processing costs by 50%. We’ll show how moving to event-driven scheduling with datasets helped eliminate stale data issues, dramatically improved performance, and unlocked faster iteration across teams. Discover how we replaced repetitive SQL with standardized dimension/fact tables, empowering analysts in a safer sandbox.

How a Complete Beginner in Data Engineering / Junior Computer Science Student Became an Apache Airflow Committer in Just 5 Months, With 70+ PRs and 300 Hours of Contributions

This talk is aimed at those who are still hesitant about contributing to Apache Airflow. I hope to inspire and encourage anyone to take the first step and start their journey in open-source. Let’s build together!

Apache Airflow® 3 is here, bringing major improvements to data orchestration. In this keynote, core Airflow contributors will walk through key enhancements that boost flexibility, efficiency, and user experience. Vikram Koka will kick things off with an overview of Airflow 3, followed by deep dives into DAG versioning (Jed Cunningham), enhanced backfilling (Daniel Standish), and a modernized UI (Brent Bovenzi & Pierre Jeambrun). Next, Ash Berlin-Taylor, Kaxil Naik, and Amogh Desai will introduce the Task Execution Interface and Task SDK, enabling tasks in any environment and language. Jens Scheffler will showcase the Edge Executor, while Tzu-ping Chung and Vincent Beck will demo event-driven scheduling and data assets. Finally, Buğra Öztürk will unveil CLI enhancements for automation and debugging. This keynote sets the stage for Airflow 3—don’t miss the chance to learn from the experts shaping the future of workflow orchestration!

Yes, you read that right — 200,000 pipelines, nearly 1 million task executions per day, all powered by a single Airflow instance. In this session, we’ll take you behind the scenes of one of the boldest orchestration projects ever attempted: how Uber’s data platform team is executing what might be the largest Apache Airflow migration in history — and doing it straight to Airflow 3. From scaling challenges and architectural choices to lessons learned in high-throughput orchestration, this is a deep dive into the tech, the chaos, and the strategy behind making data fly at unprecedented scale.