talk-data.com


Activities & events

Title & Speakers Event
Tatiana Al-Chueyr Martins – Principal Software Engineer @ Astronomer

dbt has become the de facto standard for transforming data in modern analytics stacks. But as projects grow, so does the question: where should dbt run in production, and how can we make it faster? In this talk, we’ll compare the performance trade-offs between running dbt natively and orchestrating it through Airflow using Cosmos, with a focus on workflow efficiency at scale. Using a 200-model dbt project as a case study, we’ll show how workflow execution time in Cosmos was reduced from 15 minutes to just 5 minutes. We’ll also discuss opportunities to push performance further, ranging from better DAG optimization to warehouse-aware scheduling strategies. Whether you’re a data engineer, analytics engineer, or platform owner, you’ll leave with practical strategies to optimize dbt execution and inspiration for what’s next in large-scale orchestration.

Airflow dbt cosmos astronomer cosmos
Giovanni Corsetti Silva – Data Core Engineer @ GetYourGuide

Apache Airflow is the go-to platform for data orchestration, while dbt is widely recognized for analytical data transformations. With the astronomer-cosmos library, integrating dbt projects into Airflow becomes straightforward: each dbt model can be turned into an individual task or task group equipped with Airflow features like retries and callbacks. However, organizing dbt models into separate Airflow DAGs based on domain or user-defined filters makes it hard to maintain dependencies across these distinct DAGs. Ensuring that downstream dbt tasks only execute after the corresponding upstream tasks in different DAGs have successfully completed is crucial for data consistency, yet this functionality is not supported by default. Join GetYourGuide as we explore our method for dynamically creating inter-DAG sensors in Airflow using Astronomer Cosmos for dbt. We will show how we maintained dbt model dependencies across multiple DAGs, making our pipeline modular, scalable, and robust.
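As a rough illustration of the idea in this abstract (not GetYourGuide's actual implementation, which the listing does not show), the sketch below works out which cross-DAG waits a set of dbt models would need. The model names, DAG ids, and the `plan_cross_dag_sensors` helper are all hypothetical. In a real deployment each resulting pair could be handed to a sensor such as Airflow's `ExternalTaskSensor`, but the sketch stays in plain Python so it runs without Airflow installed.

```python
def plan_cross_dag_sensors(model_deps, model_to_dag):
    """Map each model to the (upstream_dag, upstream_model) pairs it must wait for.

    Only dependencies that cross a DAG boundary are returned; dependencies
    inside the same DAG are assumed to be handled by ordinary task ordering.
    """
    plan = {}
    for model, upstreams in model_deps.items():
        own_dag = model_to_dag[model]
        cross_dag = sorted(
            (model_to_dag[upstream], upstream)
            for upstream in upstreams
            if model_to_dag[upstream] != own_dag
        )
        if cross_dag:
            plan[model] = cross_dag
    return plan


# Hypothetical dbt dependency graph: model -> list of upstream models.
MODEL_DEPS = {
    "stg_orders": [],
    "stg_customers": [],
    "fct_orders": ["stg_orders", "stg_customers"],
    "marketing_report": ["fct_orders"],
}

# Hypothetical split of models into domain DAGs.
MODEL_TO_DAG = {
    "stg_orders": "dbt_staging",
    "stg_customers": "dbt_staging",
    "fct_orders": "dbt_core",
    "marketing_report": "dbt_marketing",
}

plan = plan_cross_dag_sensors(MODEL_DEPS, MODEL_TO_DAG)
for model, waits in plan.items():
    for upstream_dag, upstream_model in waits:
        print(f"{model} waits for {upstream_dag}.{upstream_model}")
```

Deriving the waits from the dependency graph, rather than writing them by hand, means that moving a model to another DAG only requires updating the model-to-DAG mapping.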

Airflow dbt astronomer cosmos
Robert Bemmann – Senior Data Engineer @ GetYourGuide

This presentation highlights practical validation techniques to prevent misconfigurations and enhance reliability in Apache Airflow environments. We cover two key safeguards: validating that sensors are correctly tied to their upstream tasks, and checking that critical DAGs have PagerDuty alerting enabled. Both validations are automated and integrated into our CI/CD pipeline using GitHub Actions, ensuring continuous enforcement and early detection of potential issues before deployment. In addition, we’ve implemented a solution to track Service Level Objectives (SLOs) for our DAGs, enabling better insight into reliability and performance over time. These checks form a practical defense against operational blind spots, promoting workflow reliability and robust incident response. Join us as we uncover practical strategies to streamline workflow monitoring and enhance system resilience using Apache Airflow's robust capabilities.
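The two safeguards described here can be sketched as a small check that a CI job might run before deployment. This is an illustrative assumption rather than the speakers' code: the dictionary layout, the `critical` and `pagerduty_alerting` flags, and the `validate_dags` helper are hypothetical stand-ins for whatever metadata a real project would extract from its DAG definitions.

```python
def validate_dags(dags):
    """Return a list of human-readable problems found in the DAG metadata."""
    problems = []
    for dag_id, dag in dags.items():
        # Safeguard 1: critical DAGs must have PagerDuty alerting enabled.
        if dag.get("critical") and not dag.get("pagerduty_alerting"):
            problems.append(f"{dag_id}: critical DAG has no PagerDuty alerting")

        # Safeguard 2: every sensor must point at an existing upstream task.
        for sensor in dag.get("sensors", []):
            target = dags.get(sensor["external_dag_id"])
            target_name = f"{sensor['external_dag_id']}.{sensor['external_task_id']}"
            if target is None or sensor["external_task_id"] not in target["tasks"]:
                problems.append(
                    f"{dag_id}: sensor {sensor['task_id']} points at missing task {target_name}"
                )
    return problems


# Hypothetical metadata for two DAGs; the second one has both kinds of problem.
DAGS = {
    "dbt_staging": {
        "critical": True,
        "pagerduty_alerting": True,
        "tasks": {"stg_orders"},
        "sensors": [],
    },
    "dbt_core": {
        "critical": True,
        "pagerduty_alerting": False,
        "tasks": {"fct_orders"},
        "sensors": [
            {"task_id": "wait_for_stg_orders",
             "external_dag_id": "dbt_staging",
             "external_task_id": "stg_orders"},
            {"task_id": "wait_for_stg_payments",
             "external_dag_id": "dbt_staging",
             "external_task_id": "stg_payments"},
        ],
    },
}

problems = validate_dags(DAGS)
for problem in problems:
    print(problem)
```

A CI step could fail the build whenever the returned list is non-empty, which surfaces misconfigurations before they reach production.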

Airflow github actions PagerDuty
Airflow Meetup @ G-Research 2023-08-24 · 17:00

Save the date!! Let's meet at the G-Research office for an evening of great talks with pizza and drinks!

***

Talk #1: OpenLineage in Airflow: A Comprehensive Guide

Speaker: Maciej Obuchowski (Software engineer @ GetInData and OpenLineage committer)

With native support for OpenLineage in Airflow, users can now easily observe and manage their data pipelines. This talk will cover the benefits of using OpenLineage, its implementation in Airflow, practical examples of how to take advantage of it, and what’s on our roadmap. Whether you’re an Airflow user or provider maintainer, this session will give you the knowledge to make the most of this tool.

***

Talk #2: Running dbt pipelines in Airflow

Speaker: Tatiana Al-Chueyr (Staff Software Engineer @ Astronomer)

dbt is an open-source project that allows data teams to transform data by defining pipelines, mostly with SQL files. Using Airflow to orchestrate and execute dbt projects as DAGs gives users a reliable and scalable environment to run them.

This talk introduces Cosmos, an open-source package from Astronomer that helps users run dbt pipelines as Airflow DAGs with a few lines of code. Cosmos allows users to have fine-grained control over dbt resources while benefiting from various Airflow features, such as data-aware scheduling and retries.

***

Thanks to G-Research for hosting the event!


We are looking for speakers for future Meetup Sessions. Please fill out https://forms.gle/ES1YDE6wsHy95xKf8 if you are interested.

Airflow Meetup @ G-Research

** Big give-away at the end: two copies of Designing Machine Learning Systems (O'Reilly) **

GDG Cloud London is thrilled to be collaborating with AICamp (https://www.meetup.com/London-AI-Tech-Talk/) for a deep dive into the AI world.

The event will be held at the GoCardless London office.

This meet-up is a unique opportunity to connect with fellow AI enthusiasts, industry practitioners, and researchers in a dynamic and interactive setting. Whether you are a seasoned AI professional or just curious about the latest advancements in NLP, LLMs and Airflow, this meet-up is for you! Join us for an insightful and thought-provoking discussion on the forefront of AI innovation.

At the end of the event, we'll give away two O'Reilly books: Designing Machine Learning Systems (paperback).

Don't miss out, RSVP now!

A big shout-out to our sponsor, Transparent (https://heytransparent.io).

Agenda

6:00 PM: Arrivals and Check In

6:30 PM: Welcome / Community Update

A quick intro from GDG Cloud London, AICamp and Transparent.io.

6:45 PM: Tatiana Al-Chueyr - Creating your own ChatGPT with Apache Airflow

Apache Airflow is an orchestration tool which allows users to build all sorts of pipelines, automating steps and allowing them to run on schedule reliably. Thousands of companies use Airflow to process ETL and machine learning pipelines worldwide. This talk will illustrate creating an Airflow pipeline to process data and train a custom ChatGPT.

7:15 PM: Marty Pitt - Using AI to create data pipelines and Service Orchestration with Orbital and Google Cloud

7:45 PM: Wrap up, Networking and Raffle!

We are going to give away two copies of Designing Machine Learning Systems (O'Reilly) after the last talk!


Speakers

Tatiana Al-Chueyr - Astronomer (Staff Software Engineer)

Tatiana is a Staff Software Engineer at Astronomer and builds open-source authoring tools on top of Apache Airflow. She graduated in Computer Engineering and has worked for over 18 years building highly scalable software for multiple organisations, including the Ministry of Science and Technology in Brazil, TV Globo and the BBC.

Marty Pitt - Orbital (Founder)

Hosted By

Amanda Cavallaro, GDG Organizer

I'm an Aikidoka, Developer Advocate, Software Developer, Google Developers Expert, LinkedIn Learning Author and a Full Stack Web Development Specialist.

Saverio Terracciano, GDG Organizer

Stefano Le Pera, GDG Organizer

Lorenzo Turrino, GDG Organizer

Alessandro Puccetti, GDG Organizer

My name is Alessandro Puccetti, I am Italian 🇮🇹 but I am in fact a citizen of the world 🌎. I love travelling and meeting new people from different cultures, and I enjoy having a particular focus on their food 😉.

Kubra Harmankaya, GDG Organizer

Natalie Godec, GDG Organizer

Nodir Siddikov, GDG Organizer

Bruno Ripa, GDG Organizer

I am an Italian software architect and have been in the industry since 2006. I was an entrepreneur in Italy for 6 years and then continued my career in the United Kingdom, in London (2012), a city (or, better, the City) which I consider my second home. I have worked in several industries (gaming, fintech, digital asset management) and in many companies, with a three-year interlude in Spain (2017-2020), in Barcelona, where I worked as a contractor for a few US startups and a European company working in IoT. In March 2020 I made my way back to London, working for Erlang Solutions. Currently I am a contractor and consultant at the BBC.

Arianna Capizzi, GDG Organizer

Complete your event RSVP here: https://gdg.community.dev/events/details/google-gdg-cloud-london-presents-ai-deep-dive-creating-your-own-chatgpt-with-apache-airflow/.

AI Deep Dive: Creating your own ChatGPT with Apache Airflow

Welcome to our in-person monthly ML meetup, in collaboration with Google Developers Group Cloud London. Join us for deep-dive tech talks on AI/ML, food and drink, networking with speakers and fellow developers, and a chance to win lucky-draw prizes.

Pre-registration is required here: https://www.aicamp.ai/event/eventdetails/W2023071310

[RSVP instructions]

  • Pre-register at the event website (venue security may not let you in if you don't pre-register).
  • Contact us to submit topics or to sponsor the meetup (venue, food, swag, prizes): https://forms.gle/JkMt91CZRtoJBSFUA
  • Join the community on Slack for event chat, speaker office hours, learning resources, job openings and project collaboration (search for and join the #london channel).

Description: This meet-up is a unique opportunity to connect with fellow AI enthusiasts, industry practitioners, and researchers in a dynamic and interactive setting. Whether you are a seasoned AI professional or just curious about the latest advancements in ChatGPT and Airflow, this meet-up is for you! Join us for an insightful and thought-provoking discussion on the forefront of AI innovation.

Agenda (BST):

  • 6:00pm~6:30pm: Check-in, food/snacks/drinks and networking
  • 6:30pm~6:45pm: Welcome/community update
  • 6:45pm~7:45pm: Tech talks
  • 7:45pm: Open discussion & mixer

Tech Talk 1: Creating your own ChatGPT with Apache Airflow

Speaker: Tatiana Al-Chueyr, Staff Software Engineer @Astronomer

Abstract: Apache Airflow is an orchestration tool which allows users to build all sorts of pipelines, automating steps and allowing them to run on schedule reliably. Thousands of companies use Airflow to process ETL and machine learning pipelines worldwide. This talk will illustrate creating an Airflow pipeline to process data and train a custom ChatGPT.

Tech Talk 2: TBD

AI Deep Dive: Creating your own ChatGPT with Apache Airflow