Apache Kafka is the simplest possible reliable, horizontally scalable, low-latency storage system for commodity hardware. This is increasingly making it the backbone of analytic data collection stacks and event-bus-like architectures. Critical systems like this require very reliable operations. Kafka is both stateful and distributed, so it has traditional sysadmin problems as well as problems that require fairly deep expertise. We will discuss the problems of CPU and disk capacity management, as well as how to define availability SLOs for a distributed stateful system. In a demo, we will also show some of the ways in which the Google Cloud Managed Service for Apache Kafka and lenses.io help solve these problems.
talk-data.com
Speaker
Germain Cassis
Lead sales and alliances
lenses.io
Lead partnerships; co-presenter.
Bio from: Google NY Site Reliability Engineering (SRE) Tech Talks, 23 Sep 2025