talk-data.com


Activities & events

OpenLineage Meetup @ Google 2025-04-03 · 15:30

Note: this event is hybrid.

Data engineers and pipeline managers know that producing data lineage – end-to-end pipeline metadata instrumented at runtime or parsed at design time – is a heavy lift without a shared standard for lineage metadata. It requires duplication of effort across pipeline tooling, and deployment of new tools can break existing lineage workflows. Getting useful lineage can seem like a Sisyphean task.

Enter OpenLineage, an increasingly adopted open standard for lineage metadata collection. It defines a generic model of run, job, and dataset entities identified using consistent naming strategies. The core lineage model is extensible by defining specific facets to enrich those entities.
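
As a rough illustration of the model described above, the sketch below assembles a single lineage event as plain JSON using only the Python standard library. The field names follow the general shape of an OpenLineage run event as we understand it. The producer URI, namespaces, job name, and dataset names are hypothetical, and the sketch omits fields such as schema URLs that the full specification expects.

```python
import json
import uuid
from datetime import datetime, timezone

# Hypothetical producer URI, used only for illustration.
PRODUCER = "https://example.com/my-pipeline"


def dataset(namespace, name, facets=None):
    """A dataset is identified by a (namespace, name) pair."""
    return {"namespace": namespace, "name": name, "facets": facets or {}}


def run_event(event_type, run_id, job_namespace, job_name, inputs, outputs):
    """Assemble an event that ties one run of a job to its datasets."""
    return {
        "eventType": event_type,
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "producer": PRODUCER,
        "run": {"runId": run_id},
        "job": {"namespace": job_namespace, "name": job_name},
        "inputs": inputs,
        "outputs": outputs,
    }


# A facet enriches an entity; here, a schema facet on the output dataset.
schema_facet = {
    "schema": {
        "_producer": PRODUCER,
        "fields": [
            {"name": "order_id", "type": "BIGINT"},
            {"name": "total", "type": "DECIMAL"},
        ],
    }
}

event = run_event(
    "COMPLETE",
    str(uuid.uuid4()),
    "example_namespace",  # hypothetical job namespace
    "daily_orders",       # hypothetical job name
    inputs=[dataset("postgres://db.example.com:5432", "shop.public.orders")],
    outputs=[dataset("bigquery", "project.analytics.daily_orders", schema_facet)],
)

print(json.dumps(event, indent=2))
```

Because the job and each dataset carry the same (namespace, name) identity in every event, events emitted by different tools can be stitched into one lineage graph, while facets carry optional detail without changing the core model.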

Agenda:

  • 17:30 Arrival and welcome
  • 17:40 GCP Composer Migration to OpenLineage (Augusto Hidalgo, Google)
  • 18:05 Adding OpenLineage support to Airflow operators (Kacper Muda, GetInData, now Astronomer)
  • 18:30 Are they compatible? Standardized compatibility testing for OpenLineage (Tomasz Nazarewicz, GetInData)
  • 18:55 Break for Food
  • 19:40 Connecting OpenLineage within Observability Ecosystems (Maciej Obuchowski, Datadog)
  • 20:05 Exposing metrics through Spark OpenLineage connector (Paweł Leszczynski, GetInData)
  • 20:30 Adjourn

Please note:

  • First name and last name will be checked against ID before entrance to Google premises.
  • Registrations sent after March 31 will not be accepted.
OpenLineage Meetup @ Google 2023-11-29 · 16:30

Data engineers and pipeline managers know that producing data lineage – end-to-end pipeline metadata instrumented at runtime or parsed at design time – is a heavy lift without a shared standard for lineage metadata. It requires duplication of effort across pipeline tooling, and deployment of new tools can break existing lineage workflows. Getting useful lineage can seem like a Sisyphean task.

Enter OpenLineage, an increasingly adopted open standard for lineage metadata collection. It defines a generic model of run, job, and dataset entities identified using consistent naming strategies. The core lineage model is extensible by defining specific facets to enrich those entities.

Agenda:

  • Mary Idamkina: OpenLineage in GCP Dataplex
  • Paweł Leszczynski: Updates on the Spark Integration
  • Jakub Dardziński: "Extracting lineage from PythonOperator - how come this is possible?"
  • Paweł Leszczynski: "How to become spark-openlineage contributor in 5 steps"