talk-data.com

Event

Airflow Summit 2024

2024-07-01 Airflow Summit

Activities tracked

16

Airflow Summit 2024 program

Filtering by: Cloud Computing

Sessions & talks

Showing 1–16 of 16 · Newest first


Adaptive Memory Scaling for Robust Airflow Pipelines

2024-07-01
session

At Vibrant Planet, we’re on a mission to make the world’s communities and ecosystems more resilient in the face of climate change. Our cloud-based platform is designed for collaborative scenario planning to tackle wildfires, climate threats, and ecosystem restoration at massive scale. In this talk we will dive into how we use Airflow, focusing on how we make our pipelines smarter and more resilient, especially when processing large satellite imagery and other geospatial data.

Self-healing pipelines: how our pipelines identify likely out-of-memory events and incrementally allocate more memory for task instance retries, ensuring robust and uninterrupted workflow execution.

Initial memory recommendations: how we set intelligent initial memory allocations for each task instance, enhancing resource efficiency from the outset.
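
The retry-with-more-memory pattern described above can be sketched in a few lines. This is an illustrative approximation, not Vibrant Planet’s actual implementation; the base size and scale factor are assumed values.

```python
# Illustrative sketch of escalating memory on retries (assumed values,
# not Vibrant Planet's actual code).

def memory_for_attempt(try_number: int, base_gi: int = 4, factor: int = 2) -> str:
    """Return a Kubernetes-style memory request that grows with each retry.

    Attempt 1 gets the base allocation; each retry after a suspected
    out-of-memory kill requests `factor` times more than the previous one.
    """
    return f"{base_gi * factor ** (try_number - 1)}Gi"

# In a DAG, the value could feed a pod's resource request, e.g. (hypothetical):
#   request_memory=memory_for_attempt(context["ti"].try_number)
```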

A Game of Constant Learning & Adjustment: Orchestrating ML Pipelines at the Philadelphia Phillies

2024-07-01
session

When developing Machine Learning (ML) models, the biggest challenges are often infrastructural. How do we deploy our model and expose an inference API? How can we retrain? Can we continuously evaluate performance and monitor model drift? In this talk, we will present how we are tackling these problems at the Philadelphia Phillies by developing a suite of tools that enable our software engineering and analytics teams to train, test, evaluate, and deploy ML models, all of which can be orchestrated entirely in Airflow. This framework abstracts away the infrastructural complexities that productionizing ML pipelines presents and allows our analysts to focus on developing robust baseball research for baseball operations stakeholders across player evaluation, acquisition, and development. We’ll also look at how we use Airflow, MLflow, MLServer, cloud services, and GitHub Actions to architect a platform that supports our framework across all points of the ML lifecycle.

Airflow and Control-M: Where Data Pipelines Meet Business Applications in Production

2024-07-01
session

This talk is presented by BMC Software. With Airflow’s mainstream acceptance in the enterprise, the operational challenges of running it alongside business applications in production have emerged. At last year’s Airflow Summit in Toronto, three providers of Apache Airflow met to discuss “The Future of Airflow: What Users Want”. Among the user requirements raised in that session were:

An improved security model allowing “Alice” and “Bob” to run their individual DAGs without each requiring a separate Airflow cluster, while still adhering to their organization’s compliance requirements.

An “orchestrator of orchestrators” relationship in which Airflow oversees the myriad orchestrators embedded in many tools and provided by cloud vendors.

That panel discussion described what Airflow users now understand to be mandatory for enterprise production workloads, and defined the exact operational requirements our customers have successfully tackled for decades. Join us in this session to learn how Control-M’s Airflow integration helps data engineers do what they need to do with Airflow and gives IT Ops the key to delivering enterprise business application results in production.

Airflow at Ford: A Job Router for Training Advanced Driver Assistance Systems

2024-07-01
session

Ford Motor Company operates extensively across various nations. The Data Operations (DataOps) team for Advanced Driver Assistance Systems (ADAS) at Ford is tasked with processing terabyte-scale daily data from lidar, radar, and video. To manage this, the DataOps team must orchestrate diverse, compute-intensive pipelines across both on-premises infrastructure and GCP, while handling sensitive customer data in both environments. The team is also responsible for facilitating the execution of on-demand, compute-intensive algorithms at scale. To achieve these objectives, the team employs Astronomer/Airflow at the core of its strategic approach. This involves various deployments of Astronomer/Airflow that integrate seamlessly and securely (via Apigee) to initiate batch data processing and ML jobs on the cloud, as well as compute-intensive computer vision tasks on-premises, with essential alerting provided through the ELK stack. This presentation will delve into the architecture and strategic planning surrounding the hybrid batch router, highlighting its pivotal role in promoting rapid innovation and scalability in the development of ADAS features.

Airflow Datasets and Pub/Sub for Dynamic DAG Triggering

2024-07-01
session

Looking for a way to streamline your data workflows and master the art of orchestration? As we navigate the complexities of modern data engineering, dynamic workflows and complex data pipeline dependencies are becoming increasingly common. To empower data engineers to use Airflow as their main orchestrator, Airflow Datasets can be easily integrated into your data journey. This session will showcase dynamic workflow orchestration in Airflow and how to manage multi-DAG dependencies with multi-Dataset listening. We’ll walk through a real-time data pipeline with Pub/Sub messaging integration and dbt in a Google Cloud environment, ensuring data transformations are triggered only upon new data ingestion, moving away from rigid time-based scheduling and from sensors and other legacy ways of triggering a DAG.
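
The multi-Dataset trigger rule referred to above can be made concrete with a simplified stand-in for Airflow’s own scheduling logic (this is not Airflow’s actual implementation):

```python
# Simplified stand-in for Airflow's multi-dataset scheduling semantics
# (not Airflow's actual implementation): a consuming DAG is triggered only
# once every dataset it listens to has been updated since its last run.

def should_trigger(listened_datasets: set, updated_datasets: set) -> bool:
    """True when all listened-to dataset URIs have fresh updates."""
    return listened_datasets.issubset(updated_datasets)

# In Airflow 2.4+ the real wiring is declarative, e.g.:
#   from airflow.datasets import Dataset
#   raw = Dataset("gcs://my-bucket/raw_events")  # hypothetical URI
#   with DAG("transform", schedule=[raw], ...):  # runs after `raw` updates
#       ...
```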

Airflow - Path to Industry Orchestration Standard

2024-07-01
session

In the realms of data engineering, machine learning pipelines, and cloud and web services, there is huge demand for orchestration technologies. Apache Airflow is among the most popular orchestration technologies, arguably the most popular one. In this presentation we will focus on the aspects of Airflow that make it so popular, and ask whether it has become the industry standard for orchestration.

Airflow, Spark, and LLMs: Turbocharging MLOps at ASAPP

2024-07-01
session

This talk will explore ASAPP’s use of Apache Airflow to streamline and optimize our machine learning operations (MLOps). Key highlights include:

Integrating with our custom Spark solution to achieve speedup, efficiency, and cost gains for generative AI transcription, summarization, and intent categorization pipelines.

Different design patterns for integrating with efficient LLM servers, such as TGI, vLLM, and TensorRT, for summarization pipelines with and without Spark.

An overview of batched LLM inference using Airflow, as opposed to real-time inference outside of it.

[Tentative] Possible extension of this scaffolding to Reinforcement Learning from Human Feedback (RLHF) and Reinforcement Learning from AI Feedback (RLAIF) for fine-tuning LLMs, using Airflow as the orchestrator.

Additionally, the talk will cover ASAPP’s MLOps journey with Airflow over the past few years, including an overview of our cloud infrastructure, various data backends, and sources. The primary focus will be on the machine learning workflows at ASAPP, rather than the data workflows, providing a detailed look at how Airflow enhances our MLOps processes.

Boost Airflow Monitoring and Alerting with Automation Analytics & Intelligence by Broadcom

2024-07-01
session

This talk is presented by Broadcom. Airflow’s “workflow as code” approach has many benefits, including enabling dynamic pipeline generation and flexibility and extensibility in a seamless development environment. However, what challenges do you face as you expand your Airflow footprint in your organization? What if you could enhance Airflow’s monitoring capabilities, forecast DAG and task executions, obtain predictive alerting, visualize trends, and get more robust logging? Broadcom’s Automation Analytics & Intelligence (AAI) offers advanced analytics for workload automation across cloud and on-premises environments. It connects easily with Airflow to offer improved visibility into dependencies between tasks in Airflow DAGs along with the workload’s critical path, dynamic SLA management, and more. Join our presentation to hear more about how AAI can help you improve service delivery. We will also lead a workshop that will allow you to dive deeper into how easy it is to install our Airflow Connector and get started visualizing your Airflow DAGs to optimize your workload and identify issues before they impact your business.

Bronco: Managing Terraform at Scale with Airflow

2024-07-01
session

Airflow is not just purpose-built for data applications. It is a job scheduler on steroids. This is exactly what a cloud platform team needs: a configurable and scalable automation tool that can handle thousands of administrative tasks. Come learn how one enterprise platform team used Airflow to support cloud infrastructure at unprecedented scale.

Empowering Airflow Users: A framework for performance testing and transparent resource optimization

2024-07-01
session

Apache Airflow is the backbone of countless data pipelines, but optimizing performance and resource utilization can be a challenge. This talk introduces a novel performance testing framework designed to measure, monitor, and improve the efficiency of Airflow deployments. I’ll delve into the framework’s modular architecture, showcasing how it can be tailored to various Airflow setups (Docker, Kubernetes, cloud providers). By measuring key metrics across schedulers, workers, triggers, and databases, this framework provides actionable insights to identify bottlenecks and compare performance across different versions or configurations. Attendees will learn:

The motivation behind developing a standardized performance testing approach.

Key design considerations and challenges in measuring performance across diverse Airflow environments.

How to leverage the framework to construct test suites for different use cases (e.g., version comparison).

Practical tips for interpreting performance test results and making informed decisions about resource allocation.

How this framework contributes to greater transparency in Airflow release notes, empowering users with performance data.

Evolution of Orchestration at GoDaddy: A Journey from On-prem to Cloud-based Single Pane Model

2024-07-01
session

Explore the evolutionary journey of orchestration within GoDaddy, tracing its transformation from initial on-premise deployment to a robust cloud-based Apache Airflow orchestration model. This session will detail the pivotal shifts in design, organizational decisions, and governance that have streamlined GoDaddy’s Data Platform and enhanced overall governance. Attendees will gain insights valuable for optimizing Airflow deployments and simplifying complex orchestration processes.

Recap of the transformation journey and its impact on GoDaddy’s data operations.

Future directions and ongoing improvements in orchestration at GoDaddy.

This session will benefit attendees by providing a comprehensive case study on optimizing orchestration in a complex enterprise environment, emphasizing practical insights and scalable solutions.

Overcoming Custom Python Package Hurdles in Airflow

2024-07-01
session

DAG authors generally construct DAGs using the native libraries provided by Airflow in conjunction with Python libraries available from public PyPI repositories. But sometimes, DAG authors need to build DAGs with libraries that are in-house or otherwise unavailable from public PyPI repositories. This poses a serious challenge for users who want to run their custom code with Airflow DAGs, particularly when Airflow is deployed in a cloud-native fashion. Traditionally, these packages are baked into Airflow Docker images. This won’t work post-deployment, and it is impractical if your library is still under development. We propose a solution that creates a dedicated Airflow-global Python environment: it dynamically generates the requirements, establishes a version-compatible pyenv adhering to Airflow’s policies, and manages custom pip repository authentication seamlessly. Importantly, the service executes these steps in a fail-safe manner, without compromising core components. Join us as we discuss this solution to a common problem, touching on its design and seeing it in action. We will also candidly discuss some challenges and shortcomings of the proposed solution.
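
One small piece of such an environment, assembling the install command for packages from a custom index, might look roughly like this. This is a hypothetical helper for illustration, not the proposed service itself:

```python
import sys

# Hypothetical helper (not the proposed service): build the pip argv for
# installing in-house packages from a custom index into a dedicated
# environment, without touching Airflow's core dependencies.

def pip_install_cmd(packages: list, extra_index_url: str = "") -> list:
    """Return the argv for installing custom packages with pip."""
    cmd = [sys.executable, "-m", "pip", "install", "--no-cache-dir", *packages]
    if extra_index_url:
        # Credentials for a private index are typically supplied via ~/.netrc
        # or a keyring helper rather than embedded in this URL.
        cmd += ["--extra-index-url", extra_index_url]
    return cmd
```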

Scale and Security: How Autodesk Securely Develops and Tests PII Pipelines with Airflow

2024-07-01
session

In today’s data-driven era, ensuring data reliability and enhancing our testing and development capabilities are paramount. Local unit testing has its merits but falls short when dealing with the volume of big data. One major challenge is running Spark jobs pre-deployment to ensure they produce expected results and handle production-level data volumes. In this talk, we will discuss how Autodesk leveraged Astronomer to improve pipeline development. We’ll explore how it addresses challenges with sensitive and large data sets that cannot be transferred to local machines or non-production environments. Additionally, we’ll cover how this approach supports over 10 engineers working simultaneously on different feature branches within the same repo. We will highlight the benefits, such as conflict-free development and testing, and eliminating concerns about data corruption when running DAGs on production Airflow servers. Join me to discover how solutions like Astronomer empower developers to work with increased efficiency and reliability. This talk is perfect for those interested in big data, cloud solutions, and innovative development practices.

Unlocking FMOps/LLMOps with Airflow: A guide to operationalizing and managing Large Language Models

2024-07-01
session
Parnab Basak (Amazon Web Services)

In the last few years Large Language Models (LLMs) have risen to prominence as outstanding tools capable of transforming businesses. However, bringing such solutions and models to the business-as-usual operations is not an easy task. In this session, we delve into the operationalization of generative AI applications using MLOps principles, leading to the introduction of foundation model operations (FMOps) or LLM operations using Apache Airflow. We further zoom into aspects of expected people and process mindsets, new techniques for model selection and evaluation, data privacy, and model deployment. Additionally, know how you can use the prescriptive features of Apache Airflow to aid your operational journey. Whether you are building using out of the box models (open-source or proprietary), creating new foundation models from scratch, or fine-tuning an existing model, with the structured approaches described you can effectively integrate LLMs into your operations, enhancing efficiency and productivity without causing disruptions in the cloud or on-premises.

Using Airflow operational data to optimize Cloud services

2024-07-01
session

Cost management is a continuous challenge for our data teams at Astronomer. Understanding the expenses associated with running our workflows is not always straightforward, and identifying which process ran a query causing unexpected usage on a given day can be time-consuming. In this talk, we will showcase an Airflow plugin and specific DAGs developed and used internally at Astronomer to track and optimize the costs of running DAGs. Our internal tool monitors Snowflake query costs, provides insights, and sends alerts for abnormal usage. With it, Astronomer identified and refactored its most costly DAGs, resulting in an almost 25% reduction in Snowflake spending. We will demonstrate how to track Snowflake-related DAG costs and discuss how the tool can be adapted to any database that supports query tagging, such as BigQuery, Oracle, and others. This talk will cover the implementation details and show how Airflow users can effectively adopt this tool to monitor and manage their DAG costs.
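
Cost attribution of this kind typically relies on query tagging. A minimal sketch of building such a tag (assumed field names, not Astronomer’s plugin code) might look like:

```python
import json

# Hypothetical sketch of DAG-level query tagging for cost attribution,
# not Astronomer's actual plugin. Field names are assumed.

def build_query_tag(dag_id: str, task_id: str, run_id: str) -> str:
    """Serialize Airflow task identity into a JSON tag that the warehouse
    stores alongside each query, so spend can later be grouped by DAG."""
    return json.dumps({"dag_id": dag_id, "task_id": task_id, "run_id": run_id})

# In Snowflake, the tag would be applied per session, e.g.:
#   ALTER SESSION SET QUERY_TAG = '<tag>'
# and later joined against the query history to sum credits per dag_id.
```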

Weathering the Cloud Storms With Multi-Region Airflow Workflows

2024-07-01
session

Cloud availability zones and regions are not immune to outages. These zones regularly go down, and regions become unavailable due to natural disasters or human-caused incidents. Thus, if an availability zone or region goes down, so do your Airflow workflows and applications… unless your Airflow workflows function across multiple geographic locations. This hands-on session introduces you to the design patterns of multi-region Airflow workflows in the cloud, which can tolerate zone and region-level incidents. We will start with a traditional single-region configuration and then switch to a multi-region setting. By the end, we’ll have a working prototype of a multi-region Airflow pipeline that recovers from region-level outages within a few seconds, with no data loss or disruption to the application layer.
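
The failover behavior described can be illustrated with a toy region-selection helper. The region names and priority order are assumptions for illustration; a real multi-region setup would rely on health checks against each deployment:

```python
# Toy sketch of region failover for a multi-region deployment (hypothetical;
# a real setup would probe scheduler/database health in each region).

REGION_PRIORITY = ["us-east-1", "us-west-2", "eu-west-1"]  # assumed regions

def pick_active_region(healthy_regions: set) -> str:
    """Return the highest-priority region that is currently healthy."""
    for region in REGION_PRIORITY:
        if region in healthy_regions:
            return region
    raise RuntimeError("no healthy region available")
```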