2025-06-08 at 15:15
Scaling AI workloads with Ray & Airflow
Event: PyData London 2025
Description
Ray is an open-source framework for scaling Python applications, particularly machine learning and AI workloads. It provides the compute layer for parallel processing and distributed computing. Many large language models (LLMs), including OpenAI's GPT models, are trained using Ray.
Apache Airflow, on the other hand, is a well-established data orchestration framework that is downloaded more than 20 million times per month.
This talk presents the Airflow Ray provider package, which lets users interact with Ray from an Airflow workflow. I'll show how to use the package to create Ray clusters and how Airflow can trigger Ray pipelines on those clusters.