Reliability is a complex and important topic. I will focus on both reliability definition and best practices. I will begin by reviewing the Apache Airflow components that impact reliability. I will subsequently examine those aspects, showing the single points of failure, mitigations, and tradeoffs. The journey starts with the scheduling process. I will focus on the aspects of Scheduler infrastructure and configuration that address reliability improvements. It doesn’t run in a vacuum therefore I’ll share my observations on the reliability aspect of Scheduler infrastructure. We recommend tasks to be idempotent but that is not always possible. I will share the challenges of running user’s code in the distributed architecture of Cloud Composer. I will refer to the volatility of some cloud resources and mitigation methods in various scenarios. Deferrability plays important part in the reliability, but there are also other elements we shouldn’t ignore.
talk-data.com
Topic
Cloud Composer
Google Cloud Composer
workflow_orchestration
airflow
google_cloud
2
tagged
Activity Trend
7
peak/qtr
2020-Q1
2026-Q1
Top Events
Filtering by:
Bartosz Jankiewicz
×
This workshop is sold out Hands on workshop showing how easy it is to deploy Airflow in a public Cloud. Workshop consists of 3 parts: Setting up Airflow environment and CI/CD for DAG deployment Authoring a DAG Troubleshoot Airflow DAG/Task execution failures This workshop will be based on Cloud Composer ( https://cloud.google.com/composer ) This workshop is mostly targeted at Airflow newbies and users who would like to learn more about Cloud Composer and how to develop DAGs using Google Cloud Platform services like BigQuery, Vertex AI, Dataflow.