talk-data.com
Activities & events
| Title & Speakers | Event |
|---|---|
|
Beyond Prometheus: pushing the boundaries of scalable monitoring with VictoriaMetrics
2025-10-15 · 18:00
Diana Todea
– VictoriaMetrics Developer Advocate
@ VictoriaMetrics
Prometheus has become the go-to standard for metrics-based monitoring, but as environments grow in complexity and scale, teams often find themselves hitting its operational limits, especially around cardinality and long-term storage. This talk explores how VictoriaMetrics builds on Prometheus fundamentals to offer a more scalable and efficient alternative for teams managing high-ingestion workloads and demanding retention needs, without abandoning the familiar Prometheus ecosystem. I’ll dive into how VictoriaMetrics supports Prometheus-compatible scrape configurations and exporters, allowing seamless integration with existing workflows. The session will showcase practical strategies for setting up and tuning scrape jobs, managing cardinality through label analysis and relabeling, and using VictoriaMetrics’ UI and tools to gain insight into metric usage patterns. This talk is tailored for advanced users eager to push the boundaries of Prometheus-based observability, demonstrating how the core philosophy of Prometheus can be extended and elevated through the integration of high-performance systems like VictoriaMetrics. |
|
|
A History of Automatic Aggregations
2025-10-15 · 18:00
Raphael Bizos
– Site Reliability Engineer
@ Criteo
At Criteo, we’ve relied on automatic aggregations for years. “Automatic aggregation” is the name we give to a system of recording rules that matches most metrics and removes certain dimensions, such as the instance emitting the metric, to reduce cardinality (i.e., the number of metrics) and thus make queries faster. What started as a workaround has become a key part of how we ensure backend stability and reliability at scale, with hundreds of millions of active metrics, all without requiring users to write a single recording rule. It also significantly reduces the cost of metrics storage. Internally, we call this approach zero-effort Observability, as most teams don’t have to write or maintain recording rules. In this talk, Raphael will explain how our approach to automatic aggregations has evolved over time and how we’ve adapted it to fit naturally into our Prometheus-based stack. He will share the different implementations we’ve tried, the lessons we’ve learned, and how our latest version takes advantage of recent improvements in Prometheus (the new type label). |
|
|
VictoriaLogs cluster architecture and typical use cases
2025-10-15 · 18:00
Aliaksandr Valialkin
– Co-Founder | CTO | Principal Architect
@ VictoriaMetrics
A single-node version of VictoriaLogs can handle hundreds of terabytes of logs. What if this isn't enough for you? Then the cluster version of VictoriaLogs comes to the rescue! It can scale to tens of petabytes of logs. This talk dives into the architectural details of the VictoriaLogs cluster, explaining how it achieves linear horizontal scalability for both the data ingestion and querying paths. There is no magic: the cluster architecture is clear and quite simple. The talk also covers typical use cases for the VictoriaLogs cluster when a single-node version isn't enough. |
|
|
GenAI for observability in the serverless world
2024-08-27 · 17:00
Explore the cutting-edge capabilities of Generative AI in enhancing observability within serverless environments. This webinar will cover how GenAI, driven by vector databases, machine learning, and transformers, can transform data ingestion and log analysis. Discover the benefits of integrating AI into observability frameworks and how it simplifies automation and improves search and data analysis. This webinar will highlight how implementing Generative AI technology can bring long-term benefits for companies dealing with data handling, retrieval, and processing. With the help of AI, automation is becoming simpler and easier to implement and use, and it pushes data ingestion and log analysis to near perfection. In this talk, we will cover:
After a 30-minute talk there’ll be a 15-minute Q&A, for which we encourage you to submit questions in advance. A webinar recording and related materials will be shared with all attendees after the event. Speaker: Diana Todea, Senior Site Reliability Engineer at EQS Group. Diana is a Site Reliability Engineer at EQS Group and she focuses on Observability. She is passionate about serverless, SecOps and machine learning. |
GenAI for observability in the serverless world
|
|
Serverless observability: where SLOs meet transforms
2024-04-16 · 17:00
This talk explores Service Level Objective (SLO) case studies and transforms whilst migrating to the serverless ecosystem. The talk starts by presenting the reasons why SLOs are important in the DevOps framework. It then analyses specific SLO use cases and tools for measuring SLO efficiency. Following this, I’ll present the main bottlenecks encountered when defining and adhering to SLOs in the process of migrating to a serverless ecosystem. The audience will learn how, for mature ecosystems, adopting efficient SLOs always presents a challenge. Solving it requires the right engineering decisions. To provide guidance I’ll explore the following:
After a 30-minute talk there’ll be a 15-minute Q&A, for which we encourage you to submit questions in advance. A webinar recording and related materials will be shared with all attendees after the event. Speaker: Diana Todea, Site Reliability Engineer at Elastic. Diana is a Site Reliability Engineer at Elastic focusing on observability. She is passionate about serverless, SecOps, and Machine Learning. |
Serverless observability: where SLOs meet transforms
|