talk-data.com

Event

PyData Eindhoven 2025

2025-12-09 – 2025-12-09 PyData

Activities tracked

55

Sessions & talks

Showing 26–50 of 55 · Newest first

Football is complex, but your code doesn’t have to be — meet DataBallPy and a practical deep dive into pressing

2025-12-09
talk

DataBallPy is an open-source Python package that helps you quickly start analysing a football-related question. In this talk, we will introduce the core features of DataBallPy using code examples with compelling visualisations. The second part of the talk will showcase a practical example of how the Royal Belgian Football Association (RBFA) has used components of DataBallPy to analyse the effectiveness and efficiency of pressuring the opponent in over 200 games. Taken together, this talk will give you a clear starting point for answering your own football-related questions.

From €1M License to In-House Success: How We Built a Real-Time Recommendation System and Saved Millions Doing It

2025-12-09
talk

When we at Bol decided to personalize campaign banners, we did what many companies do: bought an expensive solution. As a software engineering team with zero data science experience, we integrated a third-party recommender system for €1 million annually, built the cloud infrastructure, and waited for results. After our first season, the data told a harsh truth: the third-party tool wasn't delivering value proportional to its cost. We faced a crossroads: accept mediocrity or build our own solution from scratch, tailored to our requirements and architecture. We'll walk you through our journey of building a more intelligent and flexible recommendation system from the ground up, and how it saved us over a million euros per year. We will share the incremental steps that shaped this journey, alongside the valuable lessons learned along the way.

Coffee Break 5m

2025-12-09
talk

Finding trash in waste

2025-12-09
talk

In the Netherlands, plastic waste is often gathered in dedicated containers. These are collected by trucks, which are emptied at a transfer station. There, a visual inspection is done to identify items that do not belong in this waste stream. We have researched automating this inspection using cameras and vision foundation models. All items in an image of the pile are detected, segmented, and classified by whether they belong to this waste stream.

FootballBERT: Encoding player identity in vectors with Transformers.

2025-12-09
talk

FootballBERT introduces a new way of representing football players: not as static IDs or statistical aggregates that fluctuate wildly over short periods, but as contextual embeddings learned directly from match data. Built on a Transformer architecture and trained through a Masked Player Prediction (MPP) objective, FootballBERT captures how a player’s identity emerges from teammates, opponents, and coaches’ tactical demands, much like BERT learns word meaning from sentences. Openly released on Hugging Face, FootballBERT is a plug-and-play foundation model whose embeddings can be integrated into any downstream system, paving the way for player-aware analytics across performance modeling, recruitment and prediction.
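To make the masking idea concrete, here is a minimal sketch of how training pairs for a masked-prediction objective could be prepared. It is not FootballBERT's actual code: the `[MASK]` token, the `mask_players` helper, the 15% masking rate (borrowed from BERT's convention) and the player IDs are all assumptions for illustration.

```python
import random

MASK_TOKEN = "[MASK]"  # hypothetical placeholder, following BERT's convention


def mask_players(lineup, mask_prob=0.15, seed=None):
    """Return (masked_lineup, labels) for one match lineup.

    labels[i] holds the original player at masked positions and None
    elsewhere, so a model is only scored on what it had to reconstruct.
    """
    rng = random.Random(seed)
    masked, labels = [], []
    for player in lineup:
        if rng.random() < mask_prob:
            masked.append(MASK_TOKEN)
            labels.append(player)
        else:
            masked.append(player)
            labels.append(None)
    # Guarantee at least one prediction target per lineup.
    if all(label is None for label in labels):
        i = rng.randrange(len(lineup))
        labels[i] = lineup[i]
        masked[i] = MASK_TOKEN
    return masked, labels


lineup = [f"player_{i}" for i in range(1, 12)]  # made-up IDs
masked, labels = mask_players(lineup, seed=0)
```

A model trained on such pairs has to infer the hidden player from the surrounding lineup, which is the intuition behind learning identity from context.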

Scaling Python to thousands of nodes with Ray

2025-12-09
talk

Python is the language of choice for anything to do with AI and ML. While that has made it easy to write code for one machine, it's much more difficult to run workloads across clusters of thousands of nodes. Ray allows you to do just that. I'll demonstrate how to implement this open source tool with a few lines of code. As a demo project, I'll show how I built a RAG for the Wheel of Time series.

Lunch

2025-12-09
talk

Efficient Time-Series Forecasting with Thousands of Local Models on Databricks

2025-12-09
talk

In industries like energy and retail, forecasting often requires local models because each time series has unique behavior. However, training and managing thousands of such models presents scalability and operational challenges. This talk shows how we scaled local models on Databricks by leveraging the Pandas API on Spark, and shares practical lessons on storage, reuse, and scaling to make this approach efficient when it’s truly needed.
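The "one local model per series" pattern can be sketched in plain Python as below. This is only an illustration of the grouping idea, not the speakers' Databricks setup: it deliberately avoids Spark, and the `fit_local_models` and `forecast` helpers, the mean-level "model" and the store names are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def fit_local_models(rows):
    """Fit one trivial 'model' (the series mean) per series_id.

    rows: iterable of (series_id, value) tuples.
    """
    grouped = defaultdict(list)
    for series_id, value in rows:
        grouped[series_id].append(value)
    return {series_id: mean(values) for series_id, values in grouped.items()}


def forecast(models, series_id, horizon):
    """Repeat the fitted level for `horizon` future steps."""
    return [models[series_id]] * horizon


rows = [
    ("store_a", 10.0), ("store_a", 14.0),
    ("store_b", 100.0), ("store_b", 90.0), ("store_b", 110.0),
]
models = fit_local_models(rows)
```

Because each series is fitted independently, the per-group work is what a distributed engine would parallelise; storing and reusing the fitted models is the separate operational concern the abstract mentions.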

From Data Lake Entanglement to Data Mesh Decoupling: Scaling a Self-Service Data Platform

2025-12-09
talk

Our data platform journey started with a classic data lake — easy to ingest, hard to evolve. As domains scaled, tight coupling across source systems, pipelines, and data products slowed everything down. In this talk, we share how we re-architected toward a domain-oriented data mesh using PySpark, Delta Lake and DQX to achieve true decoupling. Expect practical lessons on designing independent data products, managing lineage and governance, and scaling self-service without chaos.

Identifying playstyles in football through spatial networks

2025-12-09
talk

Breaking away from traditional manual video analysis, this talk introduces a data-driven approach to automatically identify football playstyles in key moments before a shot on goal, using tracking and event data. By applying network science, which studies relationships and interactions within complex systems, we objectively analyze attacking and defensive strategies. Key spatial network metrics are used to reveal diverse playstyles through clustering techniques. The session concludes with insights into the results and possible applications of these findings in football analysis.
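As a rough illustration of what a spatial network metric can look like, the sketch below links players who stand within a distance threshold in one tracking frame and computes degree and density. The abstract does not say which metrics the speakers use, so the proximity rule, the 12-metre threshold, the helper names and the coordinates are all assumptions.

```python
import math
from itertools import combinations


def build_proximity_network(positions, max_distance):
    """Link every pair of players at most max_distance apart (metres)."""
    edges = {}
    for (a, pos_a), (b, pos_b) in combinations(positions.items(), 2):
        distance = math.dist(pos_a, pos_b)
        if distance <= max_distance:
            edges[(a, b)] = distance
    return edges


def degree(edges, player):
    """Number of links that involve the given player."""
    return sum(1 for pair in edges if player in pair)


def density(edges, n_players):
    """Share of all possible undirected links that are present."""
    possible = n_players * (n_players - 1) / 2
    return len(edges) / possible if possible else 0.0


# Made-up pitch coordinates for a single tracking frame.
positions = {"A": (0.0, 0.0), "B": (3.0, 4.0), "C": (30.0, 0.0), "D": (6.0, 8.0)}
edges = build_proximity_network(positions, max_distance=12.0)
```

Per-frame values like these could then be collected into feature vectors and clustered, which is the general shape of the approach the abstract describes.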

Coffee Break 5m

2025-12-09
talk

AI-Powered Web Scraping: From Data Collection to Strategic Insights

2025-12-09
talk

Companies today are hungry for external data to stay competitive, but actually getting and making sense of that data isn’t easy. Standard web scraping often produces messy or incomplete results, and modern anti-bot systems make reliable collection even tougher.

In this talk, I’ll share how pairing Python’s scraping frameworks (like Scrapy, Playwright, and Selenium) with AI/ML can turn raw, unstructured data into clear, actionable insights.

We’ll look at:

1) How to build scrapers that still work in 2025.

2) Ways to use AI to automatically clean, enrich, and classify data.

3) Real-world applications of sentiment analysis for reviews and social media.

4) Case studies showing how SMEs have used these pipelines to sharpen marketing and product strategies.

By the end, you’ll see how to design pipelines that don’t just gather data, but deliver real strategic value. The session will focus on practical Python tools, scalable deployment (Airflow, Kubernetes, cloud platforms), and key lessons learned from hands-on projects at the intersection of scraping and AI.

xReceiver: a GNN approach to the evaluation of the decision-making process of passing options in football

2025-12-09
talk

The process of decision-making in football is characterized by a complex interplay between spatial positioning, opponent pressure, and player intent. In this research, we introduce xReceiver, a real-time Graph Neural Network (GNN) framework designed to predict the optimal passing target by modeling on-field interactions as dynamic graphs. Each player is represented as a node with positional and contextual features, while potential passing lines form weighted edges characterized by distance, angle, and pressure metrics. We have developed a Message-Passing Neural Network (MPNN) that is trained using a combination of tracking data and event data from professional matches. Our model achieves 65.22% accuracy in identifying the actual chosen receiver and 95.65% accuracy within its top three suggestions. xReceiver further offers quantification of each option's likelihood, threat, and creativity, enabling performance analysts to evaluate over 1,000 passes in seconds.
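The two accuracy figures above are top-1 and top-3 accuracies. A minimal sketch of how such numbers are typically computed from per-pass receiver scores follows; the `top_k_accuracy` helper, the player labels and the scores are invented for illustration and are not taken from xReceiver.

```python
def top_k_accuracy(predictions, actual_receivers, k):
    """Fraction of passes whose actual receiver is among the k best-scored options.

    predictions: list of dicts mapping candidate receiver -> model score.
    actual_receivers: list with the receiver actually chosen for each pass.
    """
    hits = 0
    for scores, actual in zip(predictions, actual_receivers):
        ranked = sorted(scores, key=scores.get, reverse=True)
        if actual in ranked[:k]:
            hits += 1
    return hits / len(actual_receivers)


# Made-up scores for three passes.
predictions = [
    {"p7": 0.6, "p9": 0.3, "p10": 0.1},
    {"p7": 0.2, "p9": 0.5, "p10": 0.3},
    {"p7": 0.1, "p9": 0.2, "p10": 0.3, "p11": 0.4},
]
actual = ["p7", "p10", "p7"]

top1 = top_k_accuracy(predictions, actual, k=1)
top3 = top_k_accuracy(predictions, actual, k=3)
```

Top-3 accuracy is by construction at least as high as top-1, which is why the two figures are usually reported together.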

Coffee Break 15m

2025-12-09
talk

Beyond One Model: Scaling, Orchestrating & Monitoring

2025-12-09
talk

Training one model is fun. Running thousands without everything catching fire? That’s the real challenge. In this talk, we’ll show how we — two data scientists turned accidental ML engineers — scaled anomaly detection at Vanderlande. Expect a peek into our orchestration setup, a quick code snippet, a look at our monitoring dashboard and how we scale to a thousand models.

Developing a Nation-Wide Padel Rating System: A Data-Driven Approach

2025-12-09
talk

Padel has been one of the fastest-growing sports in the Netherlands in recent years. While it initially benefited from the rating facilities of its ‘big brother’ tennis, the KNLTB decided in 2024 to develop a dedicated, tailor-made rating system for padel, which has been in effect since 2025. The development process involved extensive analyses, simulations, and probability modeling on data from more than 300,000 padel matches, complemented by recommendations from the field.

In this presentation, the audience will be taken through the technical development process, as well as the unique characteristics of padel that were crucial in creating an effective rating system.
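The abstract does not describe the KNLTB's rating formula. Purely to illustrate the kind of probability modeling a rating system involves, here is a generic Elo-style update adapted to doubles, where a pair's strength is assumed to be the mean of the two partners' ratings. The functions, the `k` factor and the 400-point scale are standard Elo conventions, not the KNLTB's method.

```python
def expected_score(rating_a, rating_b, scale=400.0):
    """Elo win probability of side A against side B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


def update_doubles(team_a, team_b, a_won, k=20.0):
    """Return updated ratings for both pairs after one match.

    Assumption: a pair's strength is the mean of its two ratings, and
    both partners receive the same adjustment.
    """
    mean_a = sum(team_a) / len(team_a)
    mean_b = sum(team_b) / len(team_b)
    delta = k * ((1.0 if a_won else 0.0) - expected_score(mean_a, mean_b))
    return [r + delta for r in team_a], [r - delta for r in team_b]


# Two evenly matched pairs (made-up ratings); pair A wins.
new_a, new_b = update_doubles([1500.0, 1500.0], [1400.0, 1600.0], a_won=True)
```

Doubles is one of the padel-specific aspects any such system has to handle: a match result says something about a pair, while the rating belongs to an individual.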

Scaling Retail Planning at IKEA: Orchestrating Sales, Fulfillment and Capacity Assessment with Metaflow

2025-12-09
talk

At IKEA, retail planning is a complex chain of processes, from sales forecasting to fulfillment and capacity assessment, that involves multiple teams. Each team builds its own predictive models independently, yet their outputs depend on one another to ensure a coherent planning chain.

In this talk, we will show how IKEA uses Metaflow, an open-source framework for building and managing real-life ML, to orchestrate and connect the forecasting pipelines for more than thirty countries. We’ll discuss how Metaflow helps align independent teams, improve readability, and enable reproducible workflows and scale.

You will leave with practical approaches for an aligned team workflow and concrete patterns for orchestrating ML/AI pipelines.
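The dependency pattern described above, where each team's pipeline consumes the output of the previous one, can be sketched with Python's standard-library `graphlib`. This is not Metaflow code and not IKEA's setup; the step names, the toy step functions and the `run_pipeline` helper are hypothetical and only show how dependent steps can be ordered and wired together.

```python
from graphlib import TopologicalSorter

# Hypothetical planning steps: each key depends on the steps in its set.
dependencies = {
    "sales_forecast": set(),
    "fulfillment_plan": {"sales_forecast"},
    "capacity_assessment": {"fulfillment_plan"},
}


def run_pipeline(dependencies, steps):
    """Run every step after its dependencies, passing their results along."""
    results = {}
    for name in TopologicalSorter(dependencies).static_order():
        inputs = {dep: results[dep] for dep in dependencies[name]}
        results[name] = steps[name](inputs)
    return results


# Toy step implementations standing in for each team's model.
steps = {
    "sales_forecast": lambda inputs: 100,
    "fulfillment_plan": lambda inputs: inputs["sales_forecast"] * 2,
    "capacity_assessment": lambda inputs: inputs["fulfillment_plan"] + 10,
}

results = run_pipeline(dependencies, steps)
```

An orchestration framework adds what this sketch leaves out, such as versioned artifacts, retries and scaling out individual steps.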