talk-data.com

Event

Toronto OpenLineage Meetup at Airflow Summit

2023-09-18 – 2023-09-19 Meetup

Data engineers and pipeline managers know that producing data lineage (end-to-end pipeline metadata instrumented at runtime or parsed at design time) is a heavy lift without a shared standard for lineage metadata. Each pipeline tool has to duplicate the collection effort, and deploying a new tool can break existing lineage workflows. Getting useful lineage can seem like a Sisyphean task.

Enter OpenLineage, an increasingly adopted open standard for lineage metadata collection. It defines a generic model of run, job, and dataset entities, each identified using consistent naming strategies. The core model is extended by defining facets that enrich those entities.
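
To make the run, job, and dataset model concrete, here is a minimal sketch of a lineage event built as plain Python dictionaries. It is a simplified illustration, not the official client: the helper functions, the producer URI, and all namespace, job, and table names are hypothetical, and the `schemaURL` field and the per-facet `_producer` and `_schemaURL` metadata that the specification expects are left out for brevity.

```python
import json
import uuid
from datetime import datetime, timezone

# Hypothetical producer URI identifying the tool that emits the event.
PRODUCER = "https://example.com/my-pipeline-tool"


def dataset(namespace, name, facets=None):
    """A dataset entity, identified by a namespace plus a name."""
    return {"namespace": namespace, "name": name, "facets": facets or {}}


def run_event(event_type, run_id, job_namespace, job_name, inputs, outputs):
    """Tie one run of a job to the datasets it read and wrote."""
    return {
        "eventType": event_type,  # e.g. "START" or "COMPLETE"
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "producer": PRODUCER,
        "run": {"runId": run_id, "facets": {}},
        "job": {"namespace": job_namespace, "name": job_name, "facets": {}},
        "inputs": inputs,
        "outputs": outputs,
    }


# Hypothetical datasets; the output carries a schema facet as enrichment.
orders = dataset("postgres://db.example.com:5432", "analytics.public.orders")
summary = dataset(
    "postgres://db.example.com:5432",
    "analytics.public.daily_order_summary",
    facets={
        "schema": {
            "fields": [
                {"name": "order_date", "type": "DATE"},
                {"name": "order_count", "type": "BIGINT"},
            ]
        }
    },
)

event = run_event(
    "COMPLETE",
    str(uuid.uuid4()),
    "example-scheduler",
    "daily_order_summary_job",
    inputs=[orders],
    outputs=[summary],
)

print(json.dumps(event, indent=2))
```

Because every tool would describe the same table with the same namespace and name, events from different producers can be stitched into one lineage graph; the facet dictionaries are where extra metadata such as schemas is attached without changing the core entities.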

Time: 5 PM ET
Place: Canarts Media Studio, 600 Bay St. #410, Toronto, ON, M5G1M6
Venue phone: 416-805-2286

The tentative agenda:

  1. Intros
  2. Evolution of spec presentation/discussion (project background/history)
  3. State of the community
  4. Integrating OpenLineage with Metaphor (by special guests Ye & Ivan)
  5. Spark/Column lineage update
  6. Airflow Provider update
  7. Roadmap Discussion
  8. Action items review/next steps

Join our Slack community: https://bit.ly/lineageslack

Sessions & talks

Action items review/next steps

2023-09-18
talk

Airflow Provider update

2023-09-18
talk

Evolution of spec presentation/discussion (project background/history)

2023-09-18
talk

Roadmap Discussion

2023-09-18
talk

Spark/Column lineage update

2023-09-18
talk

State of the community

2023-09-18
talk