Airflow 3 introduced a game-changing feature: Dag versioning. Gone are the days of “latest only” Dags and confusing, inconsistent UI views when pipelines change mid-flight. This talk covers:
- Visualizing Dag changes over time in the UI
- How Dag code is versioned and can be pulled from external sources
- Executing a whole Dag run against the same code version
- Dynamic Dags? Where do they fit in?!
You’ll see real-world scenarios and UI demos, and learn how these advancements help avoid “Airflow amnesia”.
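As a rough sketch of pulling Dag code from an external source, the snippet below configures a Git-backed Dag bundle. The section, option, and classpath names follow the Airflow 3 bundle mechanism as I understand it, and the connection ID, branch, and subdirectory are placeholders; none of this is taken from the talk itself, so verify against the docs for your Airflow and git-provider versions.

```ini
# airflow.cfg — hypothetical Git-backed Dag bundle configuration (Airflow 3).
# Classpath and kwargs are assumptions; check your installed versions.
[dag_processor]
dag_bundle_config_list = [
    {
      "name": "my-git-dags",
      "classpath": "airflow.providers.git.bundles.git.GitDagBundle",
      "kwargs": {"git_conn_id": "my_git_conn", "tracking_ref": "main", "subdir": "dags"}
    }
  ]
```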
Speaker
Ephraim Anierobi
5 talks
Talks & appearances
This session presents a comprehensive guide to building applications that integrate with Apache Airflow’s database migration system. We’ll explore how to harness Airflow’s robust Alembic-based migration toolchain to maintain schema compatibility between Airflow and custom applications, enabling developers to create solutions that evolve alongside the Airflow ecosystem without disruption.
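As a minimal sketch of what integrating with this toolchain can look like, the Alembic revision below adds a custom application table to the same database that holds Airflow’s metadata. The table and revision names are illustrative rather than from the session, and a real setup would keep the custom application’s version chain separate from Airflow’s own migrations.

```python
# migrations/versions/0001_add_app_table.py
# Hypothetical Alembic revision for a custom application table that lives
# alongside the Airflow metadata schema.
import sqlalchemy as sa
from alembic import op

revision = "0001_add_app_table"
down_revision = None  # first revision in the custom app's own chain


def upgrade():
    op.create_table(
        "my_app_run_audit",  # illustrative table name
        sa.Column("id", sa.Integer, primary_key=True),
        sa.Column("dag_id", sa.String(250), nullable=False),
        sa.Column("note", sa.Text, nullable=True),
    )


def downgrade():
    op.drop_table("my_app_run_audit")
```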
Apache Airflow has a lot of configuration options, and changing some of them can significantly affect performance. If you are wondering why your Airflow instance is not running as many tasks as you expected, this talk will give you a better understanding of the options available for increasing task throughput. We will cover DAG parsing options, scheduler scalability options, and more, along with the pros and cons of each.
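For orientation, several of the throughput-related options mentioned above live in airflow.cfg. The snippet below shows a few of them with illustrative values (these are Airflow 2.x option names, and the numbers are not recommendations from the talk):

```ini
# airflow.cfg — a few throughput-related options; values are illustrative.
[core]
# Maximum number of task instances allowed to run concurrently.
parallelism = 32
# Default cap on concurrently running tasks within a single DAG.
max_active_tasks_per_dag = 16

[scheduler]
# Number of processes used to parse DAG files in parallel.
parsing_processes = 2
# Minimum seconds before the same DAG file is re-parsed.
min_file_process_interval = 30
```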
Have you ever wondered what comes next after learning the basics of software development, and how you can improve your programming skills and gain more experience? These questions trouble many people new to software development, who often don't realize they can leverage open-source projects to build their careers and land their dream job. In this session, I will share how you can use open-source projects to improve your skills, the challenges you are likely to encounter, and how to overcome them and become a successful software engineer.
In Apache Airflow, XCom is the default mechanism for passing data between tasks in a DAG. In practice, it has been restricted to small data elements, since XCom data is persisted in the Airflow metadata database and is constrained by database and performance limitations. With the TaskFlow API introduced in Airflow 2.0, passing data between tasks is seamless and the use of XCom is invisible. However, the ability to pass data is restricted to a relatively small set of data types that can be natively converted to JSON. This tutorial describes how to go beyond these limitations by developing and deploying a custom XCom backend within Airflow, enabling the sharing of large and varied data elements, such as Pandas DataFrames, between tasks in a data pipeline using cloud storage such as Google Cloud Storage or Amazon S3.
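The core of such a backend is subclassing BaseXCom and overriding its serialization hooks so that only a storage key passes through the metadata database. Below is a minimal sketch assuming Airflow 2.x and the Amazon provider’s S3Hook; the bucket name and key prefix are placeholders, and pickle is just one simple serialization choice.

```python
# Hypothetical custom XCom backend that offloads payloads to S3.
# Bucket and prefix are placeholders; error handling and cleanup are omitted.
import pickle
import uuid

from airflow.models.xcom import BaseXCom
from airflow.providers.amazon.aws.hooks.s3 import S3Hook


class S3XComBackend(BaseXCom):
    BUCKET = "my-xcom-bucket"  # placeholder bucket
    PREFIX = "xcom/"

    @staticmethod
    def serialize_value(value, **kwargs):
        # Write the real payload (e.g. a Pandas DataFrame) to S3 and store
        # only the object key in the Airflow metadata database.
        key = f"{S3XComBackend.PREFIX}{uuid.uuid4()}.pkl"
        S3Hook().load_bytes(
            pickle.dumps(value), key=key, bucket_name=S3XComBackend.BUCKET
        )
        return BaseXCom.serialize_value(key)

    @staticmethod
    def deserialize_value(result):
        # Resolve the stored key back to the original payload.
        key = BaseXCom.deserialize_value(result)
        obj = S3Hook().get_key(key=key, bucket_name=S3XComBackend.BUCKET)
        return pickle.loads(obj.get()["Body"].read())
```

Airflow is pointed at the backend via the xcom_backend option in the [core] section (or the AIRFLOW__CORE__XCOM_BACKEND environment variable), for example xcom_backend = my_package.S3XComBackend.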