2025-11-07 at 22:35
Know Your Data(Frame) with Paguro: Declarative and Composable Validation and Metadata using Polars
Event:
PyData Seattle 2025
Description
Modern data pipelines are fast and expressive, but ensuring data quality is often less straightforward. This talk introduces Paguro, an open-source, feature-rich validation and metadata library built on top of the Polars DataFrame library. Paguro lets users validate single Data(Lazy)Frames as well as collections of Data(Lazy)Frames together, and it provides beautifully formatted terminal diagnostics that explain why and where validation failed. Attendees will learn how to integrate this lightweight, fast, and composable validation toolkit into their workflows, from exploration to production, using a familiar Polars-native syntax.