Ever been burned by a mysterious slowdown in your data pipeline? In this session, we'll reveal how a stealthy performance regression in the Polars DataFrame library was hunted down and squashed. Using git bisect, Bash scripting, and uv, we automated commit compilation and benchmarking across two repos to pinpoint the commit that degraded multi-file Parquet loading. The investigation challenged assumptions and prompted a rethink of performance monitoring for Polars.
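To illustrate the kind of automation the abstract describes, here is a minimal sketch of a benchmark that maps a timing result onto the exit codes `git bisect run` understands. It is not the speakers' actual tooling: the function names, the `data/*.parquet` path, and the threshold are hypothetical, and building each commit is left out.

```python
import statistics
import time
from typing import Callable

# Exit codes understood by `git bisect run`:
# 0 marks the commit good, 1-127 (except 125) marks it bad,
# and 125 tells bisect to skip a commit that cannot be tested.
GOOD, BAD, SKIP = 0, 1, 125


def median_runtime(workload: Callable[[], object], repeats: int = 5) -> float:
    """Return the median wall-clock time of `workload` in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        workload()
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


def classify(workload: Callable[[], object], threshold_s: float, repeats: int = 5) -> int:
    """Translate a benchmark result into a `git bisect run` exit code."""
    try:
        elapsed = median_runtime(workload, repeats)
    except Exception:
        # A commit that fails to import or run cannot be judged, so skip it.
        return SKIP
    return GOOD if elapsed <= threshold_s else BAD


def load_parquet_files() -> None:
    """Hypothetical workload: read several Parquet files in one call."""
    import polars as pl  # imported here so a broken build is reported as SKIP

    pl.read_parquet("data/*.parquet")  # assumed location of the test files
```

A script that ends with `raise SystemExit(classify(load_parquet_files, threshold_s=2.0))` could then be handed to `git bisect run`, with the threshold chosen to sit between the known-fast and known-slow timings. Using the median of several runs is one simple way to reduce the chance that a noisy measurement sends the bisection down the wrong branch.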
talk-data.com
Topic
Polars
data_manipulation
data_analysis
rust
Jeroen Janssens and Thijs Nieuwdorp join me to chat about all things Polars. We discuss the evolution of the Polars library, its advantages over pandas, and their journey of writing 'Python Polars: The Definitive Guide.'