talk-data.com

Event

PyData Amsterdam 2025

2025-09-24 – 2025-09-26 PyData

Activities tracked: 4

Filtering by: Data Science

Sessions & talks

Showing 1–4 of 4 · Newest first

Resource Monitoring and Optimization with Metaflow

2025-09-26
talk

Metaflow is a powerful workflow management framework for data science, but optimizing its cloud resource usage still involves guesswork. We have extended Metaflow with a lightweight resource tracking tool that automatically monitors CPU, memory, GPU, and more, then recommends the most cost-effective cloud instance type for future runs. A single line of code can save you from the cost of overprovisioning or from painful job failures!
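The monitor-then-recommend idea can be illustrated with a minimal standard-library sketch. It is not the speakers' tool and does not use Metaflow's API: the `track_resources` decorator, the `recommend_instance` helper, and the instance catalogue with its prices are all hypothetical, and only wall time, CPU time, and peak Python heap usage are measured.

```python
import time
import tracemalloc
from functools import wraps

# Hypothetical instance catalogue: (name, memory in GiB, hourly price in USD).
# These figures are made up for illustration, not real cloud prices.
INSTANCES = [
    ("small", 2, 0.02),
    ("medium", 8, 0.08),
    ("large", 32, 0.32),
]


def recommend_instance(peak_bytes, headroom=1.2, catalogue=INSTANCES):
    """Return the cheapest instance whose memory covers peak usage plus headroom."""
    needed_gib = peak_bytes * headroom / 2**30
    candidates = [inst for inst in catalogue if inst[1] >= needed_gib]
    if not candidates:
        return None  # nothing in the catalogue is big enough
    return min(candidates, key=lambda inst: inst[2])[0]


def track_resources(func):
    """Record wall time, CPU time, and peak Python heap usage of each call."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        tracemalloc.start()
        wall_start, cpu_start = time.perf_counter(), time.process_time()
        try:
            return func(*args, **kwargs)
        finally:
            _, peak = tracemalloc.get_traced_memory()
            tracemalloc.stop()
            wrapper.last_report = {
                "wall_seconds": time.perf_counter() - wall_start,
                "cpu_seconds": time.process_time() - cpu_start,
                "peak_bytes": peak,
                "recommended": recommend_instance(peak),
            }
    wrapper.last_report = None
    return wrapper


@track_resources  # the "single line" added to an existing step
def build_table(n):
    return [i * i for i in range(n)]
```

After calling `build_table(1000)`, the measurements and the suggested instance are available in `build_table.last_report`. A real tracker would also need to sample GPU and process-level memory, which this sketch leaves out.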

Data that Keeps Our Energy in Balance - From churn prediction with deep learning to real-time trading systems

2025-09-26
talk

This talk explores how data science helps balance energy systems in the face of demand volatility, generation volatility, and the push for sustainability. We’ll dive into two technical case studies: churn prediction using survival models, and the design of a high-availability real-time trading system on Databricks. These examples illustrate how data can support operational resilience and sustainability efforts in the energy sector.
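As a rough illustration of what a survival model for churn looks like, here is a minimal Kaplan-Meier estimator. This is a swapped-in textbook technique, not the model from the talk; the function name `kaplan_meier` and its inputs are assumptions made for the example.

```python
def kaplan_meier(durations, churned):
    """Return [(time, survival_probability)] for each time with a churn event.

    durations: observed customer lifetime, e.g. in months
    churned:   True if the customer churned at that time,
               False if still active when last observed (censored)
    """
    observations = sorted(zip(durations, churned))
    at_risk = len(observations)
    survival = 1.0
    curve = []
    i = 0
    while i < len(observations):
        t = observations[i][0]
        events = 0
        leaving = 0
        # Group all customers that share the same duration.
        while i < len(observations) and observations[i][0] == t:
            events += observations[i][1]
            leaving += 1
            i += 1
        if events:
            # Multiply by the fraction of at-risk customers who did not churn.
            survival *= 1 - events / at_risk
            curve.append((t, survival))
        # Churned and censored customers both leave the risk set.
        at_risk -= leaving
    return curve
```

Censored customers lower the number at risk without counting as churn events, which is what distinguishes this estimate from a plain churn rate.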

What Works: Practical Lessons in Applying Privacy-Enhancing Technologies (PET) in Data Science

2025-09-25
talk

Privacy-Enhancing Technologies (PETs) promise to bridge the gap between data utility and privacy, but how do they perform in practice? In this talk, we’ll share real-world insights from our hands-on experience testing and implementing leading PET solutions across various data science use cases. We explored tools including differential privacy libraries, homomorphic encryption frameworks, federated learning, and multi-party computation. Some lived up to their promise; others revealed critical limitations. You’ll walk away with a clear understanding of which PET solutions work best for which types of data and analysis, what trade-offs to expect, and how to set realistic goals when integrating PETs into your workflows. This session is ideal for data professionals and decision-makers who are navigating privacy risks while still wanting to innovate responsibly.
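To make the utility-versus-privacy trade-off concrete, here is a minimal sketch of one of the listed techniques, differential privacy, using the Laplace mechanism on a counting query. It is a teaching example under simple assumptions, not one of the libraries evaluated in the talk, and the function name `private_count` is hypothetical. Naive floating-point noise like this is not suitable for production use.

```python
import random


def private_count(values, predicate, epsilon, rng=None):
    """Return a noisy count of values matching predicate.

    Adds Laplace noise with scale 1/epsilon. A count changes by at most 1
    when one record is added or removed, so its sensitivity is 1.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    true_count = sum(1 for v in values if predicate(v))
    scale = 1.0 / epsilon
    # The difference of two independent exponential draws with mean `scale`
    # follows a Laplace distribution centred on 0 with that scale.
    noise = rng.expovariate(1 / scale) - rng.expovariate(1 / scale)
    return true_count + noise
```

A smaller `epsilon` gives stronger privacy but a noisier answer: the noise scale is `1/epsilon`, so halving `epsilon` doubles the typical error.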

Actionable Techniques for Finding Performance Regressions

2025-09-25
talk
Jeroen Janssens, Thijs Nieuwdorp (VodafoneZiggo)

Ever been burned by a mysterious slowdown in your data pipeline? In this session, we'll reveal how a stealthy performance regression in the Polars DataFrame library was hunted down and squashed. Using git bisect, Bash scripting, and uv, we automated commit compilation and benchmarking across two repos to pinpoint a commit that degraded multi-file Parquet loading. The hunt led us to challenge our assumptions and rethink performance monitoring for Polars.
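The search that git bisect performs can be sketched in a few lines of Python. This is an illustration of the underlying binary search, not the speakers' Bash tooling; the commit names, the timings in `LOAD_SECONDS`, and the 1.5x slowdown threshold are all made up for the example.

```python
def first_bad_commit(commits, is_bad):
    """Binary search for the first commit where is_bad(commit) is True.

    Assumes commits are ordered oldest to newest, the first one is good,
    the last one is bad, and every commit after the first bad one is bad.
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid
        else:
            good = mid
    return commits[bad]


# Hypothetical benchmark results: seconds to load a set of Parquet files.
# In a real hunt each value would come from building and timing that commit.
LOAD_SECONDS = {
    "c0": 1.00, "c1": 1.02, "c2": 0.99, "c3": 1.01,
    "c4": 1.00, "c5": 1.80, "c6": 1.79, "c7": 1.82,
}


def is_regression(commit, baseline=1.0, threshold=1.5):
    """Flag a commit as bad when it is more than `threshold` times slower."""
    return LOAD_SECONDS[commit] > baseline * threshold


culprit = first_bad_commit(list(LOAD_SECONDS), is_regression)
```

Because each step halves the range, eight commits need only three benchmark runs. With git itself, `git bisect run <script>` drives the same loop, treating exit code 0 as good, 125 as "skip this commit", and other codes from 1 to 127 as bad.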