Topic

red-teaming datasets

Activities: 1 tagged

Activity Trend: peak of 1 per quarter (2020-Q1 to 2026-Q2)

Top Events

Governing and Evaluating Generative & Agentic AI in Regulated Industries (1)

Top Speakers

David Talby (John Snow Labs and Pacific AI) (1)

Activities

1 activity

Governing and Evaluating Generative & Agentic AI in Regulated Industries

talk

by David Talby (John Snow Labs and Pacific AI)

chai helm iso langtest nist stress-test datasets

As generative and agentic AI systems move from prototypes to production, builders must balance innovation with trust, safety, and compliance. This talk covers evaluation gaps (multistep reasoning, tool use, domain-specific workflows; contamination and fragile metrics), bias and safety (demographic bias, hallucinations, unsafe autonomy with regulatory and legal obligations), continuous monitoring (MLOps strategies for drift detection, risk scoring, and compliance auditing in deployed systems), and tools and standards (open-source libraries like LangTest and HELM, stress-test and red-teaming datasets, and guidance from NIST, CHAI, and ISO).