Event

PyData Berlin 2025

2025-09-01 – 2025-09-03 PyData

Sessions & talks

Beyond Benchmarks: Practical Evaluation Strategies for Compound AI Systems

2025-09-02

talk

Oleh Kostromin, Iryna Kondrashchenko

AI/ML, LLM

Evaluating large language models (LLMs) in real-world applications goes far beyond standard benchmarks. When LLMs are embedded in complex pipelines, choosing the right models, prompts, and parameters becomes an ongoing challenge.

In this talk, we will present a practical, human-in-the-loop evaluation framework that enables systematic improvement of LLM-powered systems based on expert feedback. By combining domain expert insights and automated evaluation methods, it is possible to iteratively refine these systems while building transparency and trust.

This talk will be valuable for anyone who wants their LLM applications to handle real-world complexity, not just perform well on generic benchmarks.
