talk-data.com

Topic

C#/.NET

programming_language microsoft software_development

Activities

Event: PyData Paris 2025

How to make public data more accessible with "baked" data and DuckDB

Publicly available data is rarely analysis-ready, which makes it hard for researchers, organizations, and the public to access the information these datasets contain. One way to address this shortcoming is to "bake" the data into a structured format and ship it alongside the code used to analyze it. For analytical work in particular, DuckDB provides a performant way to query the structured data in a variety of contexts.

This talk will explore the benefits and tradeoffs of this architectural pattern, using the design of scipeds (an open-source Python package for analyzing higher-education data in the US) as a case study.

No DuckDB experience is required; beginner-level Python and programming experience is recommended. This talk is aimed at data practitioners, especially those who work with public datasets.