TensorFlow Extended (TFX) can run machine learning pipelines on Airflow, but by default every step runs in the same workers that run the Airflow DAG. This can lead to excessive resource usage, and it breaks the assumption that Airflow is only a scheduler: it also becomes the data processing platform. In this session, we will see how to use TFX with third-party services on top of Google Cloud Platform. The data processing steps can run on Dataflow, Spark, Flink, and other runners (parallelizing the data processing and scaling up to petabytes), and the training steps can run on Vertex AI or other external services. After this workshop, you will have learnt how to move any heavyweight TFX computation out of Airflow while keeping Airflow as the orchestrator for your machine learning pipelines.
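As a rough illustration of the idea in this abstract, the sketch below builds the Apache Beam pipeline flags that select Dataflow as the runner. The helper name `dataflow_beam_args` and the project, region, and bucket values are hypothetical placeholders, not taken from the workshop. In a TFX pipeline definition, a list like this is usually passed as `beam_pipeline_args`; the workshop's actual setup may differ.

```python
from typing import List


def dataflow_beam_args(project: str, region: str, temp_location: str) -> List[str]:
    """Build Apache Beam flags that send data processing to Dataflow.

    Hypothetical helper for illustration; all values come from the caller.
    """
    return [
        # Run Beam jobs on Dataflow instead of inside the Airflow worker.
        "--runner=DataflowRunner",
        f"--project={project}",
        f"--region={region}",
        # Cloud Storage path that Dataflow uses for temporary files.
        f"--temp_location={temp_location}",
    ]


if __name__ == "__main__":
    # Placeholder values; replace with a real project, region, and bucket.
    args = dataflow_beam_args("my-gcp-project", "europe-west1", "gs://my-bucket/tmp")
    for flag in args:
        print(flag)
```

Changing the first flag to another Beam runner, such as `FlinkRunner` or `SparkRunner`, would in principle target those engines instead, although each runner expects its own additional options.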
talk-data.com
Topic
Dataflow
Google Cloud Dataflow
data_processing
stream_processing
google_cloud
2
tagged
Filtering by:
Airflow Summit 2022
A hands-on workshop showing how easy it is to deploy Airflow in a public cloud. The workshop consists of three parts: setting up an Airflow environment and CI/CD for DAG deployment; authoring a DAG; and troubleshooting Airflow DAG and task execution failures. The workshop is based on Cloud Composer (https://cloud.google.com/composer). It is mostly aimed at Airflow newcomers and at users who would like to learn more about Cloud Composer and how to develop DAGs using Google Cloud Platform services such as BigQuery, Vertex AI, and Dataflow.