talk-data.com

Topic

Data Contracts

data_governance data_quality data_engineering

7 tagged

Activity Trend: 14 peak/qtr, 2020-Q1 to 2026-Q1

Activities

Filtering by: Jean-Georges Perrin

Building Data Products

As organizations grapple with fragmented data, siloed teams, and inconsistent pipelines, data products have emerged as a practical solution for delivering trusted, scalable, and reusable data assets. In Building Data Products, Jean-Georges Perrin provides a comprehensive, standards-driven playbook for designing, implementing, and scaling data products that fuel innovation and cross-functional collaboration, whether or not your organization adopts a full data mesh strategy. Drawing on extensive industry experience and practitioner interviews, Perrin shows readers how to build metadata-rich, governed data products aligned to business domains. Covering foundational concepts, real-world use cases, and emerging standards like Bitol ODPS and ODCS, this guide offers step-by-step implementation advice and practical code examples for key stages: ownership, observability, active metadata, compliance, and integration.

Design data products for modular reuse, discoverability, and trust
Implement standards-driven architectures with rich metadata and security
Incorporate AI-driven automation, SBOMs, and data contracts
Scale product-driven data strategies across teams and platforms
Integrate data products into APIs, CI/CD pipelines, and DevOps practices

When done well, governance is a growth engine. In this session, Jean-Georges Perrin will show how data contracts bring precision, trust, and accountability to your data and AI pipelines without creating bottlenecks. Using the Open Data Contract Standard (ODCS) from the Linux Foundation's Bitol project, you will discover how organizations can reduce downstream defects, accelerate AI model onboarding, lower compliance risk, and simplify incident management, often in just a few days.

When done right, governance is a growth engine. In this talk, Jean-Georges “jgp” Perrin will show how data contracts bring precision, trust, and accountability into your data and AI pipelines—without creating bottlenecks. Using the Open Data Contract Standard (ODCS) from the Linux Foundation’s Bitol project, you’ll see how organizations can cut downstream defects, accelerate AI model onboarding, lower compliance risk, and reduce firefighting—often in just days.

The Data Product Management In Action podcast, brought to you by Soda and executive producer Scott Hirleman, is a platform for data product management practitioners to share insights and experiences. We've released a special edition series of minisodes of our podcast. Recorded live at Data Connect 2024, our host Michael Toland engages in short, sweet, informative, and delightful conversations with five prevalent practitioners who are forging their way forward in data and technology.

About our host Michael Toland: Michael is a Product Management Coach and Consultant with Pathfinder Product, a Test Double Operation. Since 2016, Michael has worked on large-scale system modernizations and migration initiatives at Verizon. Outside his professional career, Michael serves as the Treasurer for the New Leaders Council, mentors with Venture for America, sings with the Columbus Symphony, and writes satire for his blog Dignified Product. He is excited to discuss data product management with the podcast audience. Connect with Michael on LinkedIn.

About our guest Jean-Georges Perrin: Jean-Georges “jgp” Perrin is the Chief Innovation Officer at AbeaData, where he focuses on developing cutting-edge data tooling. He chairs the Open Data Contract Standard (ODCS) at the Linux Foundation's Bitol project, co-founded the AIDA User Group, and has authored several influential books, including Implementing Data Mesh (O'Reilly) and Spark in Action, 2nd Edition (Manning). With over 25 years in IT, Jean-Georges is recognized as a Lifetime IBM Champion, a PayPal Champion, and a Data Mesh MVP. His expertise spans data engineering, governance, and the industrialization of data science. Outside of tech, he enjoys exploring Upstate New York and New England with his family. Connect with J-GP on LinkedIn.

All views and opinions expressed are those of the individuals and do not necessarily reflect their employers or anyone else. Join the conversation on LinkedIn. Apply to be a guest or nominate a practitioner.

Do you love what you're listening to? Please rate and review the podcast, and share it with fellow practitioners you know. Your support helps us reach more listeners and continue providing valuable insights!

Jean-Georges Perrin is a serial startup founder, currently co-founder of AbeaData [https://abeadata.com/], and co-author of "Implementing Data Mesh." He championed PayPal's data contract project, which is now part of Bitol and the Linux Foundation. In this episode, JGP speaks about building and maintaining open-source data contract solutions using open standards. He explains why and how he came to this work and the challenges of maintaining it while avoiding appropriation of the solution. JGP discusses how they balance the interests of different groups in developing a community around open data contract standards. More importantly, he shares how data contracts can positively change the life of every data engineer.

Check out JGP's LinkedIn.
Check out Bitol - Open Standards for Data Contracts and become a contributor.

Summary

There has been a lot of discussion about the practical application of data mesh and how to implement it in an organization. Jean-Georges Perrin was tasked with designing a new data platform implementation at PayPal and wound up building a data mesh. In this episode he shares that journey and the combination of technical and organizational challenges that he encountered in the process.

Announcements

Hello and welcome to the Data Engineering Podcast, the show about modern data management.
Are you tired of dealing with the headache that is the 'Modern Data Stack'? We feel your pain. It's supposed to make building smarter, faster, and more flexible data infrastructures a breeze. It ends up being anything but that. Setting it up, integrating it, maintaining it: it’s all kind of a nightmare. And let's not even get started on all the extra tools you have to buy to get it to do its thing. But don't worry, there is a better way. TimeXtender takes a holistic approach to data integration that focuses on agility rather than fragmentation. By bringing all the layers of the data stack together, TimeXtender helps you build data solutions up to 10 times faster and saves you 70-80% on costs. If you're fed up with the 'Modern Data Stack', give TimeXtender a try. Head over to dataengineeringpodcast.com/timextender where you can do two things: watch us build a data estate in 15 minutes and start for free today.
Your host is Tobias Macey and today I'm interviewing Jean-Georges Perrin about his work at PayPal to implement a data mesh and the role of data contracts in making it work.

Interview

Introduction
How did you get involved in the area of data management?
Can you start by describing the goals and scope of your work at PayPal to implement a data mesh?

What are the core problems that you were addressing with this project?
Is a data mesh ever "done"?

What was your experience engaging at the organizational level to identify the granularity and ownership of the data products that were needed in the initial iteration?
What was the impact of leading multiple teams on the design of how to implement communication/contracts throughout the mesh?
What are the technical systems that you are relying on to power the different data domains?

What is your philosophy on enforcing uniformity in technical systems vs. relying on interface definitions as the unit of consistency?

What are the biggest challenges (technical and procedural) that you have encountered during your implementation?
How are you managing visibility/auditability across the different data domains? (e.g. observability, data quality, etc.)
What are the most interesting, innovative, or unexpected ways that you have seen PayPal's data mesh used?
What are the most interesting, unexpected, or challenging lessons that you have learned while working on data mesh?
When is a data mesh the wrong choice?
What do you have planned for the future of your data mesh at PayPal?

Contact Info

LinkedIn Blog

Parting Question

From your perspective, what is the biggest gap in the tooling or technology for data management today?

Closing Announcements

Thank you for listening! Don't forget to check out our other shows. Podcast.init covers the Python language, its community, and the innovative ways it is being used. The Machine Learning Podcast helps you go from idea to production with machine learning.
Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.
To help other people find the show please leave a review on Apple Podcasts and tell your friends and co-workers.

Links

Data Mesh

O'Reilly Book (affiliate link)

The next generation of Data Platforms is the Data Mesh
PayPal
Conway's Law
Data Mesh For All Ages - US, Data Mesh For All Ages - UK
Data Mesh Radio
Data Mesh Community
Data Mesh In Action
Great Expectations

Podcast Episode

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA

Sponsored By: TimeXtender

TimeXtender is a holistic, metadata-driven solution for data integration, optimized for agility. TimeXtender provides all the features you need to build a future-proof infrastructure for ingesting, transforming, modelling, and delivering clean, reliable data in the fastest, most efficient way possible.

You can't optimize for everything all at once. That's why we take a holistic approach to data integration that optimises for agility instead of fragmentation. By unifying each layer of the data stack, TimeXtender empowers you to build data solutions 10x faster while reducing costs by 70%-80%. We do this for one simple reason: because time matters.

Go to dataengineeringpodcast.com/timextender today to get started for free!

Support Data Engineering Podcast