talk-data.com talk-data.com

Airflow Summit session 2025-07-01

Designing Scalable Retrieval-Augmented Generation (RAG) Pipelines at SAP with Apache Airflow

Description

At SAP Business AI, we’ve transformed Retrieval-Augmented Generation (RAG) pipelines into enterprise-grade powerhouses using Apache Airflow. Our Generative AI Foundations Team developed a cutting-edge system that effectively grounds Large Language Models (LLMs) with rich SAP enterprise data. Powering Joule for Consultants, our innovative AI copilot, this pipeline manages the seamless ingestion, sophisticated metadata enrichment, and efficient lifecycle management of over a million structured and unstructured documents. By leveraging Airflow’s Dynamic DAGs, TaskFlow API, XCom, and Kubernetes Event-Driven Autoscaling (KEDA), we achieved unprecedented scalability and flexibility. Join our session to discover actionable insights, innovative scaling strategies, and a forward-looking vision for Pipeline-as-a-Service, empowering seamless integration of customer-generated content into scalable AI workflows