talk-data.com talk-data.com

P

Speaker

Pavan kumar Gopidesu

2

talks

Technical Lead at Tavant, Member of The Apache Software Foundation & Apache Airflow Committer

Filter by Event / Source

Talks & appearances

2 activities · Newest first

Search activities →

What if your Airflow tasks could understand natural language AND adapt to schema changes automatically, while maintaining the deterministic, observable workflows we rely on? This talk introduces practical patterns for AI-native orchestration that preserve Airflow’s strengths while adding intelligence where it matters most. Through a real-world example, we’ll demonstrate AI-powered tasks that detect schema drift across multi-cloud systems and perform context-aware data quality checks that go beyond simple validation—understanding business rules, detecting anomalies, and generating validation queries from prompts like “check data quality across regions.” All within static DAG structures you can test and debug normally. We’ll show how AI becomes a first-class citizen by combining Airflow’s features, assets for schema context, Human-in-the-Loop for approvals, and AssetWatchers for automated triggers, with engines such as Apache DataFusion for high-performance query execution and support for cross-cloud data processing with unified access to multiple storage formats. These patterns apply directly to schema validation and similar cases where natural language can simplify complex operations. This isn’t about bolting AI onto Airflow. It’s about evolving how we build workflows, from brittle rules to intelligent adaptation, while keeping everything testable, auditable, and production-ready.

Have you ever wondered why Apache Airflow builds are asymptotically() green? That thrive for “perennial green build” is not magic, it’s the result of continuous, often unseen engineering effort within our CI/CD pipelines & dev environments. This dedication ensures that maintainers can work efficiently & contributors can onboard smoothly. To tackle the ever growing contributor base, we have a CI/CD team run by volunteers putting in significant work in the foundational tooling. In this talk, we reveal some innovative solutions we have implemented like: Handling GitHub Actions pull_request_target challenges Restructuring the repo for better clarity Slack bot for CI failure alerts A cherry picker workflow for releases Pre-commit hooks Faster website and image builds Tackling the new GitHub API rate limits Solving chicken-and-egg build issues during releases Join us to understand the “why” & “how” behind these infra components. You’ll gain insights into the continuous effort required to support a thriving open-source project like Airflow and, hopefully, be inspired to contribute to these areas. () asymptotically = we fix failures as quickly as we can when they happen