talk-data.com

Event

Airflow Summit 2022

2022-07-01 Airflow Summit

Activities tracked

6

Airflow Summit 2022 program

Filtering by: Kubernetes

Sessions & talks

Airflow in the Cloud: Lessons from the Field

2022-07-01
session

Airflow users love to run Airflow in public clouds and on distributed infrastructure like Kubernetes. Running Airflow environments is easier than ever: the community offers a Helm-based installation for self-managed Airflow, and there are many Airflow-based managed services. The commoditization of Airflow and a broader user base bring new challenges. This talk presents the observations of an Airflow service provider delivering “Airflow as a Service” to cloud users (very technical, less technical, and not technical at all). The information presented is directed at Apache Airflow committers and contributors, in the hope that it can influence Airflow’s future roadmap so that Apache Airflow becomes easy to use.

Airflow / Kubernetes: Running on and using k8s

2022-07-01
session

Apache Airflow and Kubernetes work well together. Not only does Airflow have native support for running tasks on Kubernetes, there is also an official Helm chart that makes it easy to run Airflow itself on Kubernetes! Confused about the differences between KubernetesExecutor and KubernetesPodOperator? What about CeleryKubernetesExecutor? Or the new LocalKubernetesExecutor? After this talk you will understand how they all fit into the ecosystem. We will talk about the ways you can run Airflow on Kubernetes, run tasks on Kubernetes, or do both. We will also cover things to consider if you want a reliable Airflow instance.
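As a rough orientation for the names in this abstract, the sketch below models where a task instance ends up running under each executor. It is plain Python, not Airflow code: the function `where_task_runs` and the constant `KUBERNETES_QUEUE` are hypothetical names for illustration, and the queue value assumes the default of Airflow's `kubernetes_queue` setting. KubernetesPodOperator is not covered by the function because it is an operator rather than an executor: it launches a separate pod for a container image regardless of which executor schedules the task.

```python
# Simplified model (not Airflow code) of where a task instance runs under
# each executor named in the abstract. The hybrid executors route by queue:
# tasks on the Kubernetes queue go to KubernetesExecutor, everything else
# goes to the primary executor (Celery or Local).

KUBERNETES_QUEUE = "kubernetes"  # assumed default of the `kubernetes_queue` setting


def where_task_runs(executor: str, queue: str = "default") -> str:
    """Return a short description of where a task instance would execute."""
    if executor == "KubernetesExecutor":
        # Every task instance gets its own worker pod.
        return "dedicated worker pod"
    if executor == "CeleryKubernetesExecutor":
        return "dedicated worker pod" if queue == KUBERNETES_QUEUE else "Celery worker"
    if executor == "LocalKubernetesExecutor":
        return "dedicated worker pod" if queue == KUBERNETES_QUEUE else "local subprocess"
    raise ValueError(f"unknown executor: {executor}")


if __name__ == "__main__":
    for name in ("KubernetesExecutor", "CeleryKubernetesExecutor", "LocalKubernetesExecutor"):
        for q in ("default", KUBERNETES_QUEUE):
            print(f"{name:26} queue={q:10} -> {where_task_runs(name, q)}")
```

In this model the only per-task decision is the queue name, which is why the hybrid executors can keep short tasks on long-lived workers while sending isolated or resource-heavy tasks to their own pods.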

Automatic Speech Recognition at Scale Using Tensorflow, Kubernetes and Airflow

2022-07-01
session

Automatic speech recognition is a compute-intensive task that depends on complex deep learning models. To do this at scale, we leveraged the power of TensorFlow, Kubernetes, and Airflow. In this session, you will learn about our journey in tackling this problem, the main challenges, and how Airflow made it possible to create a solution that is powerful yet simple and flexible.

Data Science Platform at PlayStation and Apache Airflow

2022-07-01
session

In this talk, we explain how Apache Airflow is at the center of our Kubernetes-based Data Science Platform at PlayStation. We talk about how we built a flexible development environment for Data Scientists to interact with Apache Airflow, and we explain the tools and processes we built to help Data Scientists promote their DAGs from development to production. We will also talk about the impact of containerization, the usage of KubernetesOperator and the new SparkKubernetesOperator, and the benefits of deploying Airflow in Kubernetes using the KubernetesExecutor across multiple environments.

Multitenancy is coming

2022-07-01
session

This session is about the state of and future plans for the multi-tenancy feature of Airflow. Airflow has traditionally been a single-tenant product. Multiple instances could be bound together to provide a multi-tenant implementation, and on modern infrastructure such as Kubernetes you could even reuse resources between them, but it was not a true “multi-tenant” solution. Airflow is now becoming more of a platform, and a number of users expect multi-tenancy as a platform feature. In 2022 we started to add multi-tenant features, and we are aiming to make Airflow multi-tenant in the near* future. This talk covers the current state of multi-tenancy and our plans for making Airflow a fully multi-tenant platform.

Running +150 production Airflow on Kubernetes, is that HARD?

2022-07-01
session

This talk will cover the challenges of managing a large number of Airflow instances in a private environment: monitoring and metrics layers for production environments, collecting and customizing logs, resource consumption and green IT, providing support for users and shared responsibility, and pain points.