talk-data.com

Speaker

Sebastian Duerr

Event: PyData Seattle 2025

Talks & appearances

Evaluation is all you need

LLM apps fail without reliable, reproducible evaluation. This talk maps the open-source evaluation landscape, compares leading techniques (RAGAS, Evaluation-Driven Development) and frameworks (DeepEval, Phoenix, Langfuse, and Braintrust), and shows how to combine tests, RAG-specific evals, and observability to ship higher-quality systems. Attendees leave with a decision checklist, code patterns, and a production-ready playbook.