Speaker

Julian LaNeve

6 talks

Chief Technology Officer at Astronomer

CTO at Astronomer.

Bio from: NYC Airflow Rooftop Happy Hour ft. PMC Member Jarek Potiuk!

Talks & appearances

6 activities · Newest first

As the demand for data products grows, data engineering teams face mounting pressure to deliver more, even faster, and often become bottlenecks. Astro IDE, an AI-powered code editor built for Apache Airflow, changes the game. It helps data teams go from idea to production in minutes by generating production-ready DAGs, enabling in-browser testing, and integrating directly with Git. In this session, see how Astro IDE accelerates DAG creation, debugging, and deployment so data engineering teams can deliver more, 10x faster.

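For readers new to Airflow, the artifact Astro IDE generates is an ordinary Python DAG file. Below is a minimal, hypothetical sketch of such a DAG using Airflow's TaskFlow API; it is illustrative only, not output from Astro IDE, and the extract/transform/load task names are invented.

```python
# Illustrative only: a minimal Airflow DAG of the kind described in the talk.
from datetime import datetime

from airflow.decorators import dag, task


@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False)
def example_etl():
    @task
    def extract() -> list[dict]:
        # Stand-in for pulling rows from a source system.
        return [{"id": 1, "value": 10}, {"id": 2, "value": 20}]

    @task
    def transform(rows: list[dict]) -> list[dict]:
        return [{**r, "value": r["value"] * 2} for r in rows]

    @task
    def load(rows: list[dict]) -> None:
        # Stand-in for writing to a warehouse or data product.
        print(f"Loading {len(rows)} rows")

    load(transform(extract()))


example_etl()
```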

Data engineers have shifted from delivering data for internal analytics applications to building customer-facing data products. With that shift comes a whole new level of operational rigor needed to instill trust and confidence in the data. How do you hold data pipelines to the same standards as traditional software applications? Can you apply principles from the field of SRE to the world of data? In this talk, we'll explore how this evolution has played out across Astronomer's customer base and highlight best practices learned from the most critical data product applications we've seen. We'll hear from Astronomer's own data team about their transformation from analytics to data products. And we'll showcase a new product we're building to help data teams around the world solve exactly this problem!
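
The abstract itself contains no code, but the SRE framing can be made concrete. Below is a minimal sketch, under assumed names, of how SRE-style rigor might look in an Airflow pipeline: retries, an alerting callback, and an explicit data-quality gate that fails the run when a service-level objective is missed. This is not Astronomer's product, just one way to encode the idea.

```python
# A minimal sketch (assumptions, not Astronomer's product) of SRE-style
# rigor in a pipeline: retries, failure alerting, and a hard quality gate.
from datetime import datetime, timedelta

from airflow.decorators import dag, task
from airflow.exceptions import AirflowFailException


def notify_on_failure(context):
    # Stand-in for paging/alerting (PagerDuty, Slack, etc.).
    print(f"ALERT: {context['task_instance'].task_id} failed")


@dag(
    schedule="@hourly",
    start_date=datetime(2024, 1, 1),
    catchup=False,
    default_args={
        "retries": 2,
        "retry_delay": timedelta(minutes=5),
        "on_failure_callback": notify_on_failure,
    },
)
def data_product_slo():
    @task
    def load_rows() -> int:
        return 9_500  # stand-in for an actual load step

    @task
    def quality_gate(row_count: int) -> None:
        # Treat the pipeline like software: enforce an explicit SLO.
        # AirflowFailException fails immediately, bypassing retries.
        if row_count < 10_000:
            raise AirflowFailException(f"Row count {row_count} below SLO")

    quality_gate(load_rows())


data_product_slo()
```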

There are three certainties in life: death, taxes, and data pipelines failing. Pipelines fail for any number of reasons: you may run out of memory, your credentials may expire, an upstream data source may be unreliable, and so on. But there are patterns we can learn from! Join us as we walk through an analysis we've done on a massive dataset of Airflow failure logs. We'll show how we used natural language processing and dimensionality reduction methods to explore the latent space of Airflow task failures in order to cluster, visualize, and understand them. We'll conclude by covering mitigation methods for common task failure reasons and showing how we can use Airflow to build an MLOps platform that turns this one-time analysis into a reliable, recurring activity.
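
The talk's code is not reproduced here, but the general technique it describes (embed failure messages, reduce dimensionality, cluster the latent space) can be sketched with scikit-learn. The log lines below are invented stand-ins for the real dataset.

```python
# A minimal sketch of the general technique, not the talk's actual code:
# embed failure messages with TF-IDF, reduce dimensionality, then cluster.
from sklearn.cluster import KMeans
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical failure logs standing in for the real dataset.
logs = [
    "Task exited with return code Negsignal.SIGKILL",   # OOM kill
    "MemoryError: unable to allocate array",
    "401 Unauthorized: token has expired",
    "Credentials expired, re-authentication required",
    "Connection to upstream database timed out",
    "Upstream source unavailable: connection refused",
]

# Embed the logs as sparse TF-IDF vectors.
X = TfidfVectorizer(stop_words="english").fit_transform(logs)

# Project into a low-dimensional latent space (LSA here; UMAP or t-SNE
# are common alternatives for visualization).
Z = TruncatedSVD(n_components=2, random_state=0).fit_transform(X)

# Cluster the latent space into candidate failure modes.
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(Z)

for log, label in zip(logs, labels):
    print(label, log)
```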

Behind the growing interest in Generative AI and LLM-based enterprise applications lies an expanded set of requirements for data integration and ML orchestration. Enterprises want to use proprietary data to power LLM-based applications that create new business value, but they face challenges in moving beyond experimentation. The pipelines that power these models need to run reliably at scale, bringing together data from many sources and reacting continuously to changing conditions. This talk focuses on design patterns for using Apache Airflow to support LLM applications built on private enterprise data. We'll go through a real-world example of what this looks like, as well as a proposal to improve Airflow and add new Airflow Providers that make it easier to interact with LLMs such as OpenAI's (including GPT-4) and those on Hugging Face, while working with both structured and unstructured data. In short, we'll show how these Airflow patterns enable reliable, traceable, and scalable LLM applications within the enterprise.
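
As a concrete illustration of the pattern (not the proposed Providers themselves), here is a minimal Airflow DAG that embeds private documents by calling the OpenAI API inside a task. The document contents and target store are invented, and the `openai>=1.0` Python client is assumed.

```python
# A minimal sketch of the pattern, not the talk's proposal: calling an LLM
# API from inside Airflow tasks to process private enterprise data.
from datetime import datetime

from airflow.decorators import dag, task


@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False)
def embed_private_docs():
    @task
    def extract_docs() -> list[str]:
        # Stand-in for pulling proprietary documents from a source system.
        return ["Q3 revenue grew 12% ...", "Support ticket #123: login fails ..."]

    @task
    def embed(docs: list[str]) -> list[list[float]]:
        # Imported inside the task so DAG parsing stays lightweight.
        # Requires OPENAI_API_KEY in the environment.
        from openai import OpenAI

        client = OpenAI()
        resp = client.embeddings.create(model="text-embedding-3-small", input=docs)
        return [d.embedding for d in resp.data]

    @task
    def load(vectors: list[list[float]]) -> None:
        # Stand-in for writing to a vector store (pgvector, Weaviate, etc.).
        print(f"Stored {len(vectors)} embeddings")

    load(embed(extract_docs()))


embed_private_docs()
```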