talk-data.com

Event

Airflow Summit 2022

2022-07-01 Airflow Summit

Activities tracked

5

Airflow Summit 2022 program

Filtering by: CI/CD

Sessions & talks

Showing 1–5 of 5 · Newest first

Happy DAGs + Happy Teammates: How a little CI/CD can go a long way

2022-07-01
session

With a small amount of Cloud Build automation and GitHub version control, your Airflow DAGs will always be tested and in sync, no matter who is working on them. Leah will walk you through a sample CI/CD workflow for keeping your Airflow DAGs tested and in sync across environments and teammates.
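The talk describes running automated checks on DAGs before they are deployed. As a minimal sketch of the kind of pre-deploy check such a CI step might run (the function names and checks here are illustrative, not from the talk), a pure-stdlib script can parse DAG files and flag duplicate `dag_id`s without needing Airflow installed in the CI container:

```python
# Illustrative sketch: a lightweight CI check over a DAG folder.
# It statically collects every dag_id= keyword passed to a call
# (e.g. DAG(dag_id="...")) and reports duplicates across files.
import ast
import pathlib


def dag_ids_in(source: str):
    """Yield string literals passed as dag_id= in any call expression."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            for kw in node.keywords:
                if kw.arg == "dag_id" and isinstance(kw.value, ast.Constant):
                    yield kw.value.value


def duplicate_dag_ids(dag_dir: str) -> list:
    """Return dag_ids declared in more than one .py file under dag_dir."""
    seen, duplicates = set(), []
    for path in sorted(pathlib.Path(dag_dir).glob("*.py")):
        for dag_id in dag_ids_in(path.read_text()):
            if dag_id in seen:
                duplicates.append(dag_id)
            seen.add(dag_id)
    return duplicates
```

A CI job (Cloud Build, GitHub Actions, or similar) could fail the build when `duplicate_dag_ids("dags/")` returns a non-empty list, catching a common copy-paste mistake before it reaches the scheduler.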

Kyte: Scalable and Isolated DAG Development Experience at Lyft

2022-07-01
session

Developer velocity starts to become an issue as your user base grows and becomes more varied. This is compounded by the fact that it’s not easy to end-to-end test data pipelines as part of continuous integration. In this talk, we’ll go over what we’ve done at Lyft to build an effective development and testing environment, serving over 1,000 users who have created over 5,000 DAGs, at a rate of about 50 developers per week.

Managing Apache Airflow at Scale

2022-07-01
session

In this session we’ll discuss the considerations and challenges of running Apache Airflow at scale. We’ll start by defining what it means to run Airflow at scale, then dive deep into the limitations of the Airflow architecture, scheduler processes, and configuration options. We’ll cover scaling workloads via containers and leveraging pools and priority, followed by scaling DAGs via dynamic DAGs/DAG factories, CI/CD, and DAG access control. Finally, we’ll look at managing multiple Airflow environments: how to split up workloads, and how to provide central governance for Airflow environment creation and monitoring, with an example of distributing workloads across environments.
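The dynamic DAGs/DAG factory pattern mentioned above generates DAGs from configuration instead of copy-pasting files. A minimal sketch of the idea, with a stand-in dataclass in place of `airflow.models.DAG` so the example stays self-contained (the team names and factory function are hypothetical):

```python
# Illustrative DAG-factory sketch. In real Airflow code you would
# construct airflow.models.DAG objects; DagSpec is a stand-in here.
from dataclasses import dataclass


@dataclass
class DagSpec:
    dag_id: str
    schedule: str


# Hypothetical per-team config; in practice this often comes from YAML.
TEAMS = {"payments": "@daily", "growth": "@hourly"}


def make_dag(team: str, schedule: str) -> DagSpec:
    """Factory: one DAG per team, generated from config rather than copied."""
    return DagSpec(dag_id=f"{team}_pipeline", schedule=schedule)


# Airflow discovers DAGs at module top level, so a factory is typically
# invoked in a loop and each result bound into the module namespace.
for team, schedule in TEAMS.items():
    globals()[f"{team}_pipeline"] = make_dag(team, schedule)
```

The trade-off the talk's abstract hints at: factories keep many similar DAGs consistent, but they also make every generated DAG share one code path, which is why CI/CD and access control become part of the same scaling story.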

Wisdoms learnt when contributing to Apache Airflow

2022-07-01
session

In this talk, I am going to share things that I learned while contributing to Apache Airflow. I am an Outreachy intern for Apache Airflow, and I made my first open-source contribution in the Apache Airflow project. I will also give a short description of myself, my experience working in software engineering, and how needing help with contributing to open source led me to the Outreachy internship. I would also like to share my first contribution to the Apache Airflow docs and how much confidence it gave me to keep contributing. Key things I learned while contributing to Apache Airflow: clear written communication is very powerful; code is not an asset, so don’t worry about throwing it away; don’t feel shy about asking questions; open source is a rich ecosystem where projects help each other and thrive; and things I once considered trivial are trivial no more. Beyond these general lessons about open-source contribution, I also gained specific skills: writing unit tests, communicating with developers across the globe, improving my written communication, learning about many Python libraries, and understanding the CI pipeline.

Workshop: Running Airflow within Cloud Composer

2022-07-01
session

This workshop is sold out. A hands-on workshop showing how easy it is to deploy Airflow in a public cloud. The workshop consists of three parts: setting up an Airflow environment and CI/CD for DAG deployment; authoring a DAG; and troubleshooting Airflow DAG/task execution failures. This workshop is based on Cloud Composer ( https://cloud.google.com/composer ). It is mostly targeted at Airflow newbies and users who would like to learn more about Cloud Composer and how to develop DAGs using Google Cloud Platform services like BigQuery, Vertex AI, and Dataflow.