talk-data.com

Topic

Airflow

Apache Airflow

workflow_management data_orchestration etl

15 tagged

Activity Trend

Peak: 157 activities per quarter (2020-Q1 to 2026-Q1)

Activities

Showing filtered results

Filtering by: Jarek Potiuk

Have you ever wondered why Apache Airflow builds are asymptotically(*) green? That drive for a “perennial green build” is not magic; it’s the result of continuous, often unseen engineering effort within our CI/CD pipelines and dev environments. This dedication ensures that maintainers can work efficiently and contributors can onboard smoothly. To support the ever-growing contributor base, we have a CI/CD team run by volunteers putting significant work into the foundational tooling. In this talk, we reveal some of the innovative solutions we have implemented, such as:

- Handling GitHub Actions pull_request_target challenges
- Restructuring the repo for better clarity
- A Slack bot for CI failure alerts
- A cherry-picker workflow for releases
- Pre-commit hooks
- Faster website and image builds
- Tackling the new GitHub API rate limits
- Solving chicken-and-egg build issues during releases

Join us to understand the “why” and “how” behind these infra components. You’ll gain insights into the continuous effort required to support a thriving open-source project like Airflow and, hopefully, be inspired to contribute to these areas.

(*) asymptotically = we fix failures as quickly as we can when they happen

Airflow’s power comes from its vast ecosystem, but securing this intricate web requires a united front. This talk unveils a groundbreaking collaborative effort between the Python Software Foundation (PSF), the Apache Software Foundation (ASF), the Airflow Project Management Committee (PMC), and the Alpha-Omega Fund, aimed at securing not only Airflow but the whole ecosystem. We’ll explore this new project dedicated to improving security across the Airflow landscape.

Apache Airflow relies on a silent symphony behind the scenes: its CI/CD (Continuous Integration/Continuous Delivery) and development tooling. This presentation explores the critical role these often-overlooked tools play in keeping Airflow efficient and innovative. We’ll delve into how robust CI/CD pipelines catch and fix issues early and ensure bug fixes and improvements are seamlessly integrated, while well-maintained development tools empower developers to code, collaborate and contribute effectively. Discover how you can use and contribute to a thriving Airflow ecosystem by helping keep these crucial tools in top shape.

Apache Airflow has over 650 Python dependencies. In case you did not know already, dependencies in Python are a difficult subject, and Airflow has its own, custom ways of managing them. Airflow has a rather complex system for managing dependencies in its CI system, but this talk is not about that. This talk is directed at users of Airflow who want to keep their dependencies updated, describing ways they can do it. The presentation will explain how to effectively manage and handle custom dependencies in Airflow. Jarek will guide you through practical solutions and best practices to make your Airflow experience with dependencies - yes, you guessed it - a breeze.
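The main user-facing mechanism the Airflow documentation describes for keeping those 650+ dependencies consistent is the per-version constraint files the community publishes. A minimal sketch of using them (the Airflow and Python version numbers here are illustrative; check the docs for current ones):

```shell
# Install Airflow pinned against the community-published constraint file,
# which fixes every transitive dependency to a combination known to work.
# AIRFLOW_VERSION and PYTHON_VERSION are illustrative placeholders.
AIRFLOW_VERSION=2.9.2
PYTHON_VERSION=3.11
CONSTRAINT_URL="https://raw.githubusercontent.com/apache/airflow/constraints-${AIRFLOW_VERSION}/constraints-${PYTHON_VERSION}.txt"
# The actual install command (shown rather than executed here):
echo pip install "apache-airflow==${AIRFLOW_VERSION}" --constraint "${CONSTRAINT_URL}"
```

Upgrading then means bumping AIRFLOW_VERSION and re-running the install against the matching constraint file, rather than resolving dependencies by hand.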

session
by Jarek Potiuk (Apache Software Foundation), Vincent Beck

This session is about the current state of implementation of the multi-tenancy feature of Airflow. This is a long-term feature that involves multiple changes and separate AIPs to implement, with the long-term vision of having a single Airflow instance support multiple, independent teams - either from the same company or as part of an Airflow-as-a-Service implementation.

This talk is a walkthrough of a number of ways maintainers of open-source projects (for example Airflow) can improve communication with their users by exercising empathy. This subject is often overlooked in the curriculum of the average developer and contributor, but it can make or break the product you developed, simply because good communication makes it more approachable for users. Maintainers often forget, or simply do not realize, how many assumptions they carry in their heads. There are a number of techniques maintainers can use to improve this. This talk will walk through a number of examples (from Airflow and other projects), the reasoning behind them, and ways communication between maintainers and users can be improved - in the code, documentation and communication, but also by involving and engaging the users they are communicating with; more often than not, users can be of great help when it comes to communication - if only asked. This talk is for both maintainers and users, as I consider communication between users and maintainers a two-way street.

session
by Jarek Potiuk (Apache Software Foundation)

This session is about the state and future plans of the multi-tenancy feature of Airflow. Airflow has traditionally been a single-tenant product. Multiple instances could be bound together to provide a multi-tenant implementation, and when using modern infrastructure - Kubernetes - you could even share resources between them, but it was not a true “multi-tenant” solution. Airflow is becoming more of a platform now, and multi-tenancy as a feature of the platform is highly anticipated by a number of users. In 2022 we started to add multi-tenant features, and we are aiming to make Airflow multi-tenant in the near future. This talk is about the state of multi-tenancy now and the future plans we have for Airflow becoming a fully multi-tenant platform.

session
by Elad Kalif, Jarek Potiuk (Apache Software Foundation)

This workshop is sold out. By attending this workshop, you will learn how you can become a contributor to the Apache Airflow project. You will learn how to set up a development environment, how to pick your first issue, how to communicate effectively within the community and how to make your first PR - experienced committers of the Apache Airflow project will give you step-by-step instructions and will guide you through the process. When you finish the workshop you will be equipped with everything that is needed to make further contributions to the Apache Airflow project.

session
by Jarek Potiuk (Apache Software Foundation), Kaxil Naik

In this talk Jarek and Kaxil will talk about official, community support for running Airflow in the Kubernetes environment. Full support for Kubernetes deployments was in development by the community for quite a while; in the past, users of Airflow had to rely on 3rd-party images and Helm charts to run Airflow on Kubernetes. Over the last year community members made an enormous effort to provide robust, simple and versatile support for those deployments that would serve all kinds of Airflow users: starting with the official container image, through the quick-start docker-compose configuration, and culminating in April with the release of the official Helm Chart for Airflow. This talk is aimed at Airflow users who would like to make use of all this effort. Users will learn how to:

- Extend or customize the Airflow official Docker image to adapt it to their needs
- Run the quick-start docker-compose environment where they can quickly verify their images
- Configure and deploy Airflow on Kubernetes using the official Airflow Helm chart
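Extending the official image, as the first bullet above describes, follows the documented pattern of building on top of the community image. A minimal sketch of such a Dockerfile (the image tag and the extra packages are illustrative placeholders, not a recommendation):

```dockerfile
# Build on the official community image (tag is illustrative).
FROM apache/airflow:2.9.2

# Extra OS packages require switching to root and back.
USER root
RUN apt-get update \
    && apt-get install -y --no-install-recommends vim \
    && rm -rf /var/lib/apt/lists/*
USER airflow

# Extra Python dependencies are installed as the airflow user.
RUN pip install --no-cache-dir apache-airflow-providers-ssh
```

The resulting image can then be referenced from the docker-compose quick start or from the Helm chart's image settings in place of the stock image.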

You might have heard some recent news about ransomware attacks on many companies. Quite recently, the U.S. Department of Justice elevated the priority of ransomware attack investigations to the same level as terrorism. Certainly, security aspects of running software and so-called “supply-chain attacks” have made the press recently. You might also have read about a security researcher who made USD 13,000 via bounties by finding and contacting companies that were running old, un-patched versions of Airflow - even though the ASF security process worked well and the Airflow PMC had fixed those issues a long time ago. If any of this rings a bell, then this session is for you. In this session Dolev (a security expert and researcher who recently submitted security issues to Airflow), Ash and Jarek (Airflow PMC members) will discuss the state of security, best practices for keeping your Airflow secure, and why it is important. The discussion will be moderated by Tomasz Urbaszek, Airflow PMC member. You can get a glimpse of what they will talk about through this blog post.

Participation in this workshop requires previous registration and has limited capacity. Get your ticket at https://ti.to/airflowsummit/2021-contributor

By attending this workshop, you will learn how you can become a contributor to the Apache Airflow project. You will learn how to set up a development environment, how to pick your first issue, how to communicate effectively within the community and how to make your first PR - experienced committers of the Apache Airflow project will give you step-by-step instructions and will guide you through the process. When you finish the workshop you will be equipped with everything that is needed to make further contributions to the Apache Airflow project.

Prerequisites: You need to have Python experience. Previous experience with Airflow is nice-to-have. The session is geared towards Mac and Linux users. If you are a Windows user, it is best if you install Windows Subsystem for Linux (WSL).

In preparation for the class, please make sure you have set up the following:

- Make a fork of https://github.com/apache/airflow
- Clone the forked repository locally
- Follow the Breeze prerequisites: https://github.com/apache/airflow/blob/master/BREEZE.rst#prerequisites
- Run ./breeze --python 3.6
- Create a virtualenv as described in https://github.com/apache/airflow/blob/master/LOCAL_VIRTUALENV.rst - part of preparing the virtualenv is initializing it with ./breeze initialize-local-virtualenv
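The preparation steps above can be sketched as a shell session. GITHUB_USER is a placeholder for your own GitHub username (you fork apache/airflow in the web UI first); the commands are shown rather than executed, since they require network access and the Breeze tooling:

```shell
# Sketch of the workshop preparation steps; GITHUB_USER is a placeholder.
GITHUB_USER=yourname
REPO_URL="https://github.com/${GITHUB_USER}/airflow.git"

# Commands to run manually after forking apache/airflow:
echo "git clone ${REPO_URL}"
echo "cd airflow"
echo "./breeze --python 3.6"
echo "./breeze initialize-local-virtualenv"
```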