talk-data.com

Speaker

Dima Baranetskyi

Event: PyData Amsterdam 2025

Talks & appearances

Kafka Internals I Wish I Knew Sooner: The Non-Boring Truths

Most of us start with Kafka by building a simple producer/consumer demo. It just works — until it doesn’t. Suddenly, disk space isn’t freed up after data “expires,” rebalances loop endlessly during deploys, and strange errors about missing leaders clog your logs. In the panic, we dive into Kafka’s ocean of config options — hoping something will stick. Sound familiar?

This talk is a collection of hard-won lessons — not flashy tricks, but the kind of insights you only gain after operating Kafka in production for years. You’ll walk away with mental models that make Kafka’s internal behavior more predictable and less surprising.

We’ll cover:

- Storage internals: Why expired data doesn’t always free space, and how Kafka actually reclaims disk
- Transactions & delivery semantics: What “exactly-once” really means, and when it silently downgrades
- Consumer group rebalancing: Why rebalances loop, and how the controller’s hidden behavior affects them

If you’ve used Kafka — or plan to — these insights will save you hours of frustration and debugging. A basic understanding of partitions, replication, and Kafka’s general architecture will help get the most out of this session.
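
As a rough illustration of the storage-internals topic above: the abstract does not spell out the mechanism, but one well-known reason expired data can stay on disk is that time-based retention removes whole log segments, and a segment only becomes eligible once its newest record is older than the retention window. The sketch below is a simplified model under that assumption, not Kafka code; `Segment`, `deletable_segments`, and `reclaimable_bytes` are hypothetical names made up for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """Hypothetical stand-in for one log segment file of a partition."""
    base_offset: int        # offset of the first record in the segment
    max_timestamp_ms: int   # timestamp of the newest record in the segment
    size_bytes: int         # bytes the segment occupies on disk


def deletable_segments(segments: List[Segment], now_ms: int,
                       retention_ms: int) -> List[Segment]:
    """Segments a time-based retention pass could remove in this model.

    Assumption: eligibility is decided per segment from its newest record,
    so a single recent record keeps every older record in that segment.
    """
    return [s for s in segments if now_ms - s.max_timestamp_ms > retention_ms]


def reclaimable_bytes(segments: List[Segment], now_ms: int,
                      retention_ms: int) -> int:
    """Disk space that would be freed by deleting the eligible segments."""
    return sum(s.size_bytes
               for s in deletable_segments(segments, now_ms, retention_ms))
```

With a one-hour window, a segment whose newest record is 50 minutes old is kept in full, even if most of its records are far older than an hour. In this model, smaller or more frequently rolled segments would let space be reclaimed closer to the nominal retention time.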