talk-data.com

Vikram Koka

Speaker

Chief Strategy Officer, Astronomer

Vikram Koka is the Chief Strategy Officer at Astronomer, based in the San Francisco Bay Area. He is an experienced engineering leader with a background in distributed systems and data infrastructure. At Astronomer for six years, he led the Engineering and Open Source teams and contributed to Apache Airflow as a member of the Airflow PMC, focusing on architectural initiatives such as Scheduler High Availability, Data-Driven Scheduling, Dynamic Tasks, and the client/server architecture in Airflow 3.

Bio from: Airflow Summit 2021

Talks & appearances

In Apache Airflow, XCom is the default mechanism for passing data between tasks in a DAG. In practice, it has been restricted to small data elements, because XCom data is persisted in the Airflow metadata database and is constrained by database and performance limitations. With the TaskFlow API introduced in Airflow 2.0, passing data between tasks is seamless and the use of XCom is invisible. However, only a relatively small set of data types, those that can be natively converted to JSON, can be passed this way. This tutorial describes how to go beyond these limitations by developing and deploying a custom XCom backend within Airflow, so that large and varied data elements such as pandas DataFrames can be shared between tasks in a data pipeline using cloud storage such as Google Cloud Storage or Amazon S3.
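
The idea behind a custom XCom backend can be sketched without Airflow itself: small JSON-friendly values stay inline, while anything else is written to object storage and only a short reference string is kept in the metadata database. The sketch below is an illustration under stated assumptions, not the tutorial's implementation. `FAKE_BUCKET` is a hypothetical in-memory stand-in for an S3 or Google Cloud Storage bucket, the `xcom-ref://` prefix is an invented marker, and a Python `set` stands in for a pandas DataFrame as a value that JSON cannot encode. In a real backend, the two functions would typically be the `serialize_value` and `deserialize_value` static methods of a subclass of Airflow's `BaseXCom`, and the dictionary accesses would be replaced by calls to a cloud storage client.

```python
import json
import pickle
import uuid
from typing import Any, Dict

# Hypothetical stand-in for a cloud bucket (S3 / GCS): object key -> bytes.
FAKE_BUCKET: Dict[str, bytes] = {}

# Hypothetical marker identifying values that were offloaded to the bucket.
PREFIX = "xcom-ref://"


def serialize_value(value: Any) -> bytes:
    """Return the bytes that would be stored in the metadata database."""
    try:
        # JSON-friendly values are stored inline, as in the default behaviour.
        return json.dumps(value).encode("utf-8")
    except TypeError:
        # Anything JSON cannot encode is pickled into the bucket;
        # only a short reference string is stored in the database.
        key = f"{uuid.uuid4()}.pkl"
        FAKE_BUCKET[key] = pickle.dumps(value)
        return json.dumps(PREFIX + key).encode("utf-8")


def deserialize_value(stored: bytes) -> Any:
    """Resolve a stored value, fetching it from the bucket if it is a reference."""
    value = json.loads(stored.decode("utf-8"))
    if isinstance(value, str) and value.startswith(PREFIX):
        key = value[len(PREFIX):]
        return pickle.loads(FAKE_BUCKET[key])
    return value


if __name__ == "__main__":
    stored = serialize_value({1, 2, 3})  # a set is not JSON-serializable
    print(stored)                        # a short reference, not the payload
    print(deserialize_value(stored))
```

Because downstream tasks only ever see the deserialized value, the storage location stays an implementation detail of the backend. Pickle is used here purely for brevity; a format such as Parquet or CSV may be a better fit for data frames shared across environments.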