talk-data.com

PyData talk 2025-12-10 at 14:45

Uncertainty-Guided AI Red Teaming: Efficient Vulnerability Discovery in LLMs

Speakers

Description

AI red teaming is crucial for identifying security and safety vulnerabilities in Large Language Models (LLMs), such as jailbreaks, prompt injection, and harmful content generation. However, manual and brute-force adversarial testing is resource-intensive and often spends time and compute exploring low-risk regions of the input space. This talk introduces a practical, Python-based methodology for accelerating red teaming using model uncertainty quantification (UQ).
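The description does not say how uncertainty is measured or used, so the following is only a minimal sketch of one plausible reading, not the speakers' method. It assumes each candidate prompt has been sent to the model several times and each response has been labeled (the labels "refuse" and "comply", the prompt names, and the helper functions `label_entropy` and `rank_by_uncertainty` are all hypothetical). Prompts whose outcomes disagree the most get the highest entropy and would be tested first.

```python
import math
from collections import Counter


def label_entropy(labels):
    """Shannon entropy (in bits) of the empirical distribution of labels."""
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def rank_by_uncertainty(samples):
    """Return (prompt, entropy) pairs sorted from most to least uncertain."""
    scores = {prompt: label_entropy(labels) for prompt, labels in samples.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical data: outcome labels from four sampled responses per prompt.
samples = {
    "prompt_a": ["refuse", "refuse", "refuse", "refuse"],  # consistent behavior
    "prompt_b": ["refuse", "comply", "refuse", "comply"],  # maximally split
    "prompt_c": ["refuse", "refuse", "refuse", "comply"],  # mostly consistent
}

ranking = rank_by_uncertainty(samples)
for prompt, score in ranking:
    print(f"{prompt}: {score:.3f} bits")
```

Under this assumption, a fixed testing budget would go to the top of the ranking, where the model's behavior is least stable, rather than being spread evenly across all prompts. Entropy over sampled labels is only one possible uncertainty signal; token-level probabilities or ensemble disagreement could fill the same role.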