Ever tried building a credit risk model when your data lives in Google Sheets and your loan statuses are about as reliable as weather forecasts? You'll learn practical data science lessons about surviving data quality issues, the critical importance of target variable definition, using genetic algorithms for feature selection, and how engineered transactional features can take your predictions from "probably fine" to "we actually know what we're doing." We'll show how classical ML approaches like logistic regression and XGBoost remain highly effective for binary classification problems, proving that sometimes the fundamentals beat the latest AI trends. Perfect for anyone who's ever wondered how machine learning works when your data isn't clean, your labels aren't perfect, and your stakeholders want results yesterday.
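As a taste of what "classical ML" means here, a minimal sketch comparing the two model families on a synthetic, imbalanced binary target. The dataset, hyperparameters, and metric below are illustrative stand-ins, not from the talk itself:

```python
# Baseline comparison: logistic regression vs. XGBoost on a binary
# (default / no-default) target. Synthetic data stands in for real
# credit features.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

# ~90% negative class, mimicking the typical rarity of defaults.
X, y = make_classification(n_samples=5_000, n_features=20,
                           weights=[0.9], random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=42)

for name, model in [
    ("logistic regression", LogisticRegression(max_iter=1_000)),
    ("xgboost", XGBClassifier(n_estimators=300, max_depth=4,
                              learning_rate=0.05, eval_metric="logloss")),
]:
    model.fit(X_train, y_train)
    auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
    print(f"{name}: ROC-AUC = {auc:.3f}")
```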
In this tutorial, we will explore a range of feature engineering techniques for time series forecasting using popular machine learning algorithms such as XGBoost, LightGBM, and CatBoost. We'll begin by transforming time series data into a tabular format and demonstrate how to create window and lag features, as well as features that capture seasonality and trends.
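As a rough illustration of that tabular transformation (the `sales` column name and the specific lag choices below are placeholders, not taken from the tutorial), here is a sketch of lag, window, and calendar features built with pandas:

```python
# Turn a univariate daily series into a tabular frame with lag,
# rolling-window, and calendar (seasonality/trend) features.
import pandas as pd

df = pd.DataFrame(
    {"sales": range(100)},
    index=pd.date_range("2023-01-01", periods=100, freq="D"),
)

# Lag features: values from previous time steps.
for lag in (1, 7, 14):
    df[f"sales_lag_{lag}"] = df["sales"].shift(lag)

# Window features: rolling statistics over *past* values only
# (shift(1) first, so the current target never leaks into its own features).
df["sales_roll_mean_7"] = df["sales"].shift(1).rolling(7).mean()
df["sales_roll_std_7"] = df["sales"].shift(1).rolling(7).std()

# Calendar features that let tree models pick up seasonality and trend.
df["dayofweek"] = df.index.dayofweek
df["month"] = df.index.month
df["time_idx"] = range(len(df))  # simple linear trend proxy

df = df.dropna()  # drop rows whose lags precede the start of the series
```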
We'll cover best practices for encoding categorical variables, decomposing time series, identifying outliers, and avoiding common pitfalls such as data leakage and look-ahead bias. Additionally, we’ll touch on more advanced topics like intermittency and hierarchical forecasting.
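One of those pitfalls lends itself to a quick sketch: a target-mean encoding of a categorical column has to be learned from the training window only, then mapped onto later data, or the model sees information from the very period it is predicting. The `store` column and values below are invented for illustration:

```python
# Leakage-safe target (mean) encoding: statistics come from the
# training window only and are merely applied to the future window.
import pandas as pd

train = pd.DataFrame({"store": ["a", "b", "a", "c"], "y": [10, 20, 12, 30]})
test = pd.DataFrame({"store": ["a", "b", "d"]})

# Mean of the target per category, learned from training data only.
means = train.groupby("store")["y"].mean()
global_mean = train["y"].mean()

train["store_enc"] = train["store"].map(means)
# Categories unseen in training (here "d") fall back to the global mean.
test["store_enc"] = test["store"].map(means).fillna(global_mean)
```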
The session will also delve into cross-validation, specifically backtesting techniques suited to time series data. We'll examine why traditional K-fold cross-validation is inappropriate for time-dependent datasets and highlight alternative approaches along with their trade-offs.
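For instance, an expanding-window backtest with scikit-learn's TimeSeriesSplit keeps every validation fold strictly after its training data, which is exactly the ordering that K-fold shuffling destroys (the series length and fold sizes below are arbitrary):

```python
# Expanding-window backtesting: each fold trains on the past and
# validates on the immediately following block.
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

y = np.arange(24)  # stand-in for 24 monthly observations
tscv = TimeSeriesSplit(n_splits=4, test_size=3)

for fold, (train_idx, val_idx) in enumerate(tscv.split(y)):
    print(f"fold {fold}: train ends at t={train_idx[-1]}, "
          f"validate on t={val_idx[0]}..{val_idx[-1]}")
```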
Finally, we’ll review best practices for evaluating model performance. This includes a comprehensive overview of error metrics, discussing their strengths, weaknesses, and the contexts in which each should be used.
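To make those trade-offs concrete, a small sketch of three common point-forecast metrics, hand-rolled for transparency (the example numbers are made up):

```python
# MAE is robust and stays in the target's units; RMSE penalises large
# errors more heavily; MAPE is scale-free but breaks down near zero
# actuals, which motivates variants like sMAPE and MASE.
import numpy as np

def mae(y_true, y_pred):
    return np.mean(np.abs(y_true - y_pred))

def rmse(y_true, y_pred):
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

def mape(y_true, y_pred):
    # Undefined if any y_true == 0.
    return np.mean(np.abs((y_true - y_pred) / y_true)) * 100

y_true = np.array([100.0, 120.0, 90.0])
y_pred = np.array([110.0, 115.0, 95.0])
print(mae(y_true, y_pred), rmse(y_true, y_pred), mape(y_true, y_pred))
```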