Enterprises want the flexibility to operate across multiple clouds, whether to optimize costs, improve resiliency, avoid vendor lock-in, or meet data sovereignty requirements. But for developers, that flexibility usually comes at the cost of extra complexity and redundant code. The goal here is simple: write once, run anywhere, with minimal boilerplate. In Apache Airflow, we’ve already begun tackling this problem with abstractions like Common-SQL, which lets you write database queries once and run them on 20+ databases, from Snowflake to Postgres to SQLite to SAP HANA. Similarly, Common-IO standardizes cloud blob storage interactions across all public clouds. With Airflow 3.0, we are pushing this further by introducing a Common Message Bus provider, an abstraction that initially supports Amazon SQS, with Google PubSub and Apache Kafka following soon after. We expect additional implementations such as Amazon Kinesis and Managed Kafka over time. This talk will dive into why these abstractions matter, how they reduce friction for developers while giving enterprises true multi-cloud optionality, and what’s next for Airflow’s evolving provider ecosystem.
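To make the write-once idea concrete, here is a minimal sketch using the Common-SQL provider's SQLExecuteQueryOperator. The DAG ID, connection ID, and query are hypothetical; the point is that only the Airflow connection changes per target database, not the task.

```python
# Minimal sketch of the Common-SQL "write once" pattern. The connection
# ID "warehouse_conn" is a placeholder: pointing it at Snowflake,
# Postgres, SQLite, or SAP HANA requires no change to the task itself.
import pendulum

from airflow import DAG
from airflow.providers.common.sql.operators.sql import SQLExecuteQueryOperator

with DAG(
    dag_id="common_sql_demo",  # hypothetical DAG for illustration
    start_date=pendulum.datetime(2025, 1, 1, tz="UTC"),
    schedule=None,
):
    daily_rollup = SQLExecuteQueryOperator(
        task_id="daily_rollup",
        conn_id="warehouse_conn",  # swap per environment, not per query
        sql="SELECT order_date, SUM(amount) FROM orders GROUP BY order_date",
    )
```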
In this talk, I’ll walk through how we built an end-to-end analytics pipeline using open-source tools (Airbyte, dbt, Airflow, and Metabase). At WirePick, we extract data from multiple sources using Airbyte OSS into PostgreSQL, transform it into business-specific data marts with dbt, and automate the entire workflow using Airflow. Our Metabase dashboards provide real-time insights, and we integrate Slack notifications to alert stakeholders when key business metrics change. This session will cover:
- Data extraction: using Airbyte OSS to pull data from multiple sources
- Transformation & modeling: how dbt helps create reusable data marts
- Automation & orchestration: managing the workflow with Airflow (see the sketch after this list)
- Data-driven decision-making: delivering insights through Metabase & Slack alerts
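As a rough sketch of how the orchestration layer ties these tools together, the DAG below chains an Airbyte sync, a dbt run, and a Slack alert. The connection IDs, the Airbyte connection UUID, and the dbt project path are placeholders, and the imports assume Airflow 2-style paths with recent versions of the Airbyte and Slack providers.

```python
# Hypothetical extract -> transform -> alert chain; all IDs and paths
# below are placeholders, not the actual WirePick configuration.
import pendulum

from airflow import DAG
from airflow.operators.bash import BashOperator
from airflow.providers.airbyte.operators.airbyte import AirbyteTriggerSyncOperator
from airflow.providers.slack.operators.slack_webhook import SlackWebhookOperator

with DAG(
    dag_id="analytics_pipeline_sketch",
    start_date=pendulum.datetime(2025, 1, 1, tz="UTC"),
    schedule="@daily",
):
    # Trigger an Airbyte OSS sync that lands raw data in PostgreSQL.
    extract = AirbyteTriggerSyncOperator(
        task_id="airbyte_sync",
        airbyte_conn_id="airbyte_default",
        connection_id="airbyte-connection-uuid",  # placeholder UUID
    )

    # Build the business-specific dbt data marts on top of the raw tables.
    transform = BashOperator(
        task_id="dbt_run",
        bash_command="dbt run --project-dir /opt/dbt/project",  # placeholder path
    )

    # Notify stakeholders once fresh marts are available.
    notify = SlackWebhookOperator(
        task_id="slack_alert",
        slack_webhook_conn_id="slack_default",
        message="Daily data marts refreshed.",
    )

    extract >> transform >> notify
```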
Apache Airflow 3 is a new state-of-the-art version of Airflow. For many users who plan to adopt Airflow 3, it’s important to understand how it behaves from a performance perspective compared to Airflow 2. This presentation shows performance results for various Airflow 3 configurations and gives potential adopters a good understanding of its performance. The reference Airflow 3 configuration uses a Kubernetes cluster as the compute layer and PostgreSQL as the Airflow database, running on Google Cloud Platform. Performance tests are run using the community version of the performance-test framework, and there may be references to Cloud Composer (the managed service for Apache Airflow). The tests are done in production-grade configurations that can serve as good references for Airflow community users. Attendees will get a comparison of Airflow 3 and Airflow 2 from a performance standpoint. They will also learn how to optimize Airflow scheduler performance by understanding DAG file processing and task scheduling, and by configuring the Scheduler to run tens of thousands of DAGs/tasks in Airflow 3 (illustrative configuration knobs follow below).
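As a rough illustration of the kind of Scheduler tuning the session covers, here are a few airflow.cfg knobs that commonly govern scheduling throughput. The values are placeholders rather than recommendations, and the option names follow the Airflow 2.x configuration reference; consult the Airflow 3 reference as well, since some DAG-parsing options moved to the standalone dag processor.

```ini
; Illustrative scheduler-throughput knobs; values are placeholders,
; not tuning advice. Names per the Airflow 2.x configuration reference.
[core]
; Upper bound on concurrently running task instances per scheduler.
parallelism = 512
; Cap on concurrently running tasks within a single DAG.
max_active_tasks_per_dag = 64

[scheduler]
; How often the scheduler loop heartbeats, in seconds.
scheduler_heartbeat_sec = 5
; How many DagRuns the scheduler may create per loop iteration.
max_dagruns_to_create_per_loop = 10
```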
A real-world journey of how my small team at Xena Intelligence built robust data pipelines for our enterprise customers using Airflow. If you’re a data engineer, or part of a small team, this talk is for you. Learn how we orchestrated a complex workflow to process millions of public reviews. What You’ll Learn:
- Cost-efficient DAG design: decomposing complex processes into atomic tasks using the TaskFlow API, XComs, mapped tasks, and task groups (see the sketch after this list), and diving into one of our DAGs as a concrete example of how our approach optimizes parallelism, error handling, delivery speed, and reliability.
- Integrating LLM analysis: explore how we integrated LLM-based analysis into our pipeline, and learn how we designed the database, queries, and ingestion into Postgres.
- Extending the Airflow UI: we developed a custom Airflow UI plugin that filters and visualizes DAG runs by customer, product, and marketplace, delivering clear insights for faster troubleshooting.
- Leveraging the Airflow REST API: discover how we leveraged the API to trigger DAGs on demand, elevating the UX by tracking mapped DAG progress and computing ETAs.
- CI/CD and cost management: get practical tips for deploying DAGs with CI/CD.
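As a minimal sketch of that atomic-task pattern, the TaskFlow DAG below fans out over review batches with dynamic task mapping and passes results between tasks via XComs. All names and the batch logic are hypothetical placeholders, not the actual Xena Intelligence pipeline.

```python
# Hypothetical TaskFlow DAG: one mapped "analyze" task per review batch,
# with results flowing between tasks via XComs.
import pendulum

from airflow.decorators import dag, task


@dag(
    dag_id="review_analysis_sketch",
    start_date=pendulum.datetime(2025, 1, 1, tz="UTC"),
    schedule=None,
)
def review_analysis():
    @task
    def list_batches() -> list[str]:
        # The real pipeline would page through millions of public
        # reviews; this placeholder just returns a few batch IDs.
        return ["batch-001", "batch-002", "batch-003"]

    @task
    def analyze(batch_id: str) -> dict:
        # Placeholder for the LLM-based analysis step.
        return {"batch": batch_id, "sentiment": "positive"}

    @task
    def load_to_postgres(results: list[dict]) -> None:
        # Placeholder for the Postgres ingestion step.
        rows = list(results)
        print(f"Loading {len(rows)} batch results")

    # expand() creates one mapped task instance per batch; the collected
    # mapped results are then handed to the load step.
    load_to_postgres(analyze.expand(batch_id=list_batches()))


review_analysis()
```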