The Kubernetes observability talk will cover how to monitor, trace, and troubleshoot applications in a Kubernetes environment. It will highlight key tools like Prometheus, Thanos, Grafana, and OpenTelemetry for tracking metrics, logs, and distributed traces. Topics include improving visibility into clusters and microservices, detecting anomalies, and ensuring reliability. The session will focus on best practices for proactive observability and efficient debugging to maintain the health of cloud-native applications.
talk-data.com topic: thanos (2 tagged)
Learn how Reddit uses a custom monitoring operator to manage Thanos and Prometheus, scaling its metrics deployment beyond 45 million samples per second and 600 million active series. To achieve this, Reddit runs thousands of Prometheus instances of varying sizes, managed by an internally developed Kubernetes controller. Thanos provides long-term storage and a global, single-pane-of-glass query layer across the whole deployment. The talk covers the operator, other tools the team has developed, and the challenges they have faced along the way.
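To put the quoted figures in perspective, here is a rough back-of-the-envelope sketch in Python. It treats the two numbers from the abstract as if they were exact (the abstract says "beyond", so they are really lower bounds) and assumes load is spread evenly. The instance count of 2,000 is purely hypothetical, since the abstract only says "thousands", and the function names are made up for illustration.

```python
# Figures quoted in the abstract (stated there as lower bounds).
SAMPLES_PER_SECOND = 45_000_000
ACTIVE_SERIES = 600_000_000


def implied_sample_interval_seconds(active_series: int, samples_per_second: int) -> float:
    """Average seconds between two samples of one series, assuming uniform load."""
    return active_series / samples_per_second


def per_instance_load(active_series: int, samples_per_second: int, instance_count: int):
    """Average (series, samples per second) per Prometheus instance, assuming an even split."""
    return active_series / instance_count, samples_per_second / instance_count


if __name__ == "__main__":
    interval = implied_sample_interval_seconds(ACTIVE_SERIES, SAMPLES_PER_SECOND)
    print(f"Implied average sample interval: {interval:.1f} s")

    # HYPOTHETICAL instance count; the abstract only says "thousands".
    hypothetical_instances = 2_000
    series, rate = per_instance_load(ACTIVE_SERIES, SAMPLES_PER_SECOND, hypothetical_instances)
    print(f"Per instance: {series:,.0f} series, {rate:,.0f} samples/s")
```

Because the abstract describes instances "of varying sizes", real per-instance load would differ widely from these averages; the sketch only illustrates the order of magnitude.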