Data engineers and pipeline managers know that producing data lineage – end-to-end pipeline metadata instrumented at runtime or parsed at design time – is a heavy lift without a shared standard for lineage metadata. Without one, every pipeline tool duplicates the same instrumentation effort, and deploying a new tool can break existing lineage workflows. Getting useful lineage can seem like a Sisyphean task.
Enter OpenLineage, an increasingly adopted open standard for lineage metadata collection. It defines a generic model of run, job, and dataset entities identified using consistent naming strategies. The core lineage model is extensible by defining specific facets to enrich those entities.
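To make the run/job/dataset model concrete, here is a minimal sketch of what a lineage event might look like when assembled by hand in Python. It uses only the standard library and is not the official client. The field names follow our reading of the OpenLineage spec, the event is simplified (for example, it omits the schema URL), and the helper `make_run_event`, the producer URL, the namespaces, and the `exampleOwner` facet are all hypothetical names chosen for illustration. Check the spec before relying on any of it.

```python
import json
import uuid
from datetime import datetime, timezone


def make_run_event(job_namespace, job_name, inputs, outputs, event_type="COMPLETE"):
    """Build a dict shaped like an OpenLineage run event (illustrative sketch only)."""

    def dataset(namespace, name, facets=None):
        # A dataset is identified by a (namespace, name) pair; facets enrich it.
        return {"namespace": namespace, "name": name, "facets": facets or {}}

    return {
        "eventType": event_type,
        "eventTime": datetime.now(timezone.utc).isoformat(),
        # Hypothetical producer URI identifying the code that emitted the event.
        "producer": "https://example.com/hypothetical-producer",
        # A run is one execution of a job, identified by a UUID.
        "run": {"runId": str(uuid.uuid4()), "facets": {}},
        # A job is identified by a (namespace, name) pair.
        "job": {"namespace": job_namespace, "name": job_name, "facets": {}},
        "inputs": [dataset(*d) for d in inputs],
        "outputs": [dataset(*d) for d in outputs],
    }


event = make_run_event(
    "demo-scheduler",
    "daily_orders",
    inputs=[("postgres://db.example.com:5432", "public.orders")],
    outputs=[
        (
            "s3://example-bucket",
            "warehouse/orders.parquet",
            # Hypothetical custom facet extending the core model.
            {"exampleOwner": {"team": "analytics"}},
        )
    ],
)
print(json.dumps(event, indent=2))
```

The point of the sketch is the shape: one event ties a run to a job and to the datasets it read and wrote, and facets hang off each entity as optional, namespaced extensions.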
Time: 5 PM ET
Place: Canarts Media Studio, 600 Bay St. #410, Toronto, ON, M5G 1M6
Venue phone: 416-805-2286
The tentative agenda:
- Intros
- Evolution of spec presentation/discussion (project background/history)
- State of the community
- Integrating OpenLineage with Metaphor (by special guests Ye & Ivan)
- Spark/Column lineage update
- Airflow Provider update
- Roadmap Discussion
- Action items review/next steps
Join our Slack community: https://bit.ly/lineageslack