Break It 'Til You Make It: Building the Self-Improving Stack for AI Agents - Aparna Dhinakaran

Building and shipping an AI agent is just the beginning. In real-world systems, the hard work starts after deployment, when agents drift, fail silently, or underperform in edge cases no one anticipated.

This talk is about building the full monitoring and improvement stack that keeps agents reliable, efficient, and improving over time. We’ll walk through how to connect evals, tracing, observability, experimentation, and optimization into a virtuous cycle — one where agents not only perform, but learn and adapt in production.

Drawing on real-world deployments, I’ll cover:

- Composing evaluation layers that surface meaningful failure modes
- Tracing and instrumentation for deep visibility into agent behavior
- Running experiments that actually improve outcomes
- Closing the loop with feedback-driven optimization

People know to improve the agent application, but do they also know they need to improve their evals in tandem?

If you’re scaling agents beyond the prototype phase, this is the talk that helps you move from working once to working continuously.
