Apache Airflow’s executor landscape has traditionally presented users with a clear trade-off: choose either the speed of local execution or the scalability, isolation and configurability of remote execution. The AWS Lambda Executor introduces a new paradigm that bridges this gap, offering near-local execution speeds with the benefits of remote containerization. This talk will begin with a brief overview of Airflow’s executors, how they work and what they are responsible for, highlighting the compromises between different executors. We will explore the emerging niche for fast, yet remote execution and demonstrate how the AWS Lambda Executor fills this space. We will also address practical considerations when using such an executor, such as working within Lambda’s 15 minute execution limit, and how to mitigate this using multi-executor configuration. Whether you’re new to Airflow or an experienced user, this session will provide valuable insights into task execution and how you can combine the best of both local and remote execution paradigms.
talk-data.com
Topic
AWS Lambda
2
tagged
Activity Trend
Top Events
Discover how Apache Airflow powers scalable ELT pipelines, enabling seamless data ingestion, transformation, and machine learning-driven insights. This session will walk through: Automating Data Ingestion: Using Airflow to orchestrate raw data ingestion from third-party sources into your data lake (S3, GCP), ensuring a steady pipeline of high-quality training and prediction data. Optimizing Transformations with Serverless Computing: Offloading intensive transformations to serverless functions (GCP Cloud Run, AWS Lambda) and machine learning models (BigQuery ML, Sagemaker), integrating their outputs seamlessly into Airflow workflows. Real-World Impact: A case study on how INTRVL leveraged Airflow, BigQuery ML, and Cloud Run to analyze early voting data in near real-time, generating actionable insights on voter behavior across swing states. This talk not only provides a deep dive into the Political Tech space but also serves as a reference architecture for building robust, repeatable ELT pipelines. Attendees will gain insights into modern serverless technologies from AWS and GCP that enhance Airflow’s capabilities, helping data engineers design scalable, cloud-agnostic workflows.