talk-data.com

Speaker

Samraj Moorjani

1 talk

Software Engineer, Databricks

Samraj is a software engineer working on the Agent Evaluation team and, previously, the Lakehouse Monitoring team. He graduated with a BS+MS in Computer Science from UIUC, advised by Professor Hari Sundaram, where he worked on controllable natural language generation to produce more appealing and interpretable science writing to combat the spread of misinformation. Previously, he worked with Professor Wen-mei Hwu on speeding up LLM inference via sparsification.

Bio from: Data + AI Summit 2025

Talks & appearances

1 activity · Newest first

Creating LLM Judges to Measure Domain-Specific Agent Quality

This session is repeated. Measuring the effectiveness of domain-specific AI agents requires specialized evaluation frameworks that go beyond standard LLM benchmarks. This session explores methodologies for assessing agent quality across specialized knowledge domains, tailored workflows, and task-specific objectives. We'll demonstrate practical approaches to designing robust LLM judges that align with your business goals and provide meaningful insights into agent capabilities and limitations.

Key session takeaways include:

- Tools for creating domain-relevant evaluation datasets and benchmarks that accurately reflect real-world use cases
- Approaches for creating LLM judges to measure domain-specific metrics
- Strategies for interpreting those results to drive iterative improvement in agent performance

Join us to learn how proper evaluation methodologies can transform your domain-specific agents from experimental tools into trusted enterprise solutions with measurable business value.
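The abstract describes LLM judges only at a high level. As a minimal illustrative sketch (not the framework presented in the session), a domain-specific judge typically pairs a rubric prompt with a scorer that parses the judge model's verdict; the rubric criteria and parsing format below are assumptions for illustration:

```python
import re

# Hypothetical rubric for a domain-specific agent (illustrative only;
# not the evaluation framework described in the session).
RUBRIC = """You are evaluating a customer-support agent's answer.
Score 1-5 for each criterion:
- groundedness: is the answer supported by the provided context?
- policy_compliance: does it follow the stated refund policy?
Reply with lines like `groundedness: 4`."""

def build_judge_prompt(question: str, answer: str, context: str) -> str:
    """Assemble the evaluation prompt to send to the judge LLM."""
    return f"{RUBRIC}\n\nQuestion: {question}\nContext: {context}\nAnswer: {answer}"

def parse_scores(judge_reply: str) -> dict:
    """Extract `criterion: score` lines from the judge's free-text reply."""
    return {m.group(1): int(m.group(2))
            for m in re.finditer(r"(\w+):\s*([1-5])\b", judge_reply)}

# Example: parsing a mock judge reply (no model call made here).
reply = "groundedness: 4\npolicy_compliance: 5"
print(parse_scores(reply))  # {'groundedness': 4, 'policy_compliance': 5}
```

In practice the reply would come from an LLM given the prompt from `build_judge_prompt`, and the parsed per-criterion scores would be aggregated across an evaluation dataset to track agent quality over iterations.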