Until a few years ago, teams at Uber used multiple data workflow systems. Some were based on open source projects such as Apache Oozie, Apache Airflow, and Jenkins, while others were custom-built solutions written in Python and Clojure. Every user who needed to move data around had to learn about these systems and choose among them, depending on the task at hand. Each system carried its own maintenance and operational burden: keeping it running, troubleshooting issues, fixing bugs, and educating users. After evaluating these systems, and with the goal of converging on a single workflow system capable of supporting Uber’s scale, we settled on an Airflow-based system. The Airflow-based DSL provided the best trade-off of flexibility, expressiveness, and ease of use while remaining accessible to our broad range of users, which includes data scientists, developers, machine learning experts, and operations employees. This talk will focus on running Airflow at Uber’s scale and on providing a seamless no-code user experience.
Shobhit Shah