Apache Kafka is the simplest possible reliable, horizontally scalable, low-latency storage system for commodity hardware. This is increasingly making it the backbone of analytics data collection stacks and event-bus-like architectures. Critical systems like this require very reliable operations. Because Kafka is both stateful and distributed, it brings traditional sysadmin problems as well as problems that require deep expertise. We will discuss the problems of CPU and disk capacity management, as well as how to define availability SLOs for a distributed stateful system. In a demo, we will also show some of the ways in which the Google Cloud Managed Service for Apache Kafka and lenses.io help solve these problems.
talk-data.com
Speaker
Kir Titievsky
Product Manager
Google
Sr PM for Managed Kafka at Google with a background in distributed messaging systems.
Bio from: Google NY Site Reliability Engineering (SRE) Tech Talks, 23 Sep 2025