talk-data.com

Topic: ai safety (2 tagged)

Activity Trend: 2020-Q1 to 2026-Q1 (peak: 1 activity/quarter)

Activities

2 activities · Newest first

AI safety discourse often splits into immediate-harm and catastrophic-risk framings. In this keynote, I argue that the two research streams would benefit from increased cross-talk and more synergistic projects. Framing attention and resources as zero-sum between the two communities is incorrect and serves neither side's goals. Recent theoretical work, including work on accumulative existential risk, unifies risk pathways between the two fields. Building on this, I suggest concrete synergies that are already in place, as well as opportunities for future collaboration.

I will discuss how shared research and monitoring infrastructure, such as the UK AI Safety Institute's Inspect evaluation framework, can benefit both areas; how methodological approaches from human behavioral science, currently used in immediate-harms research, can be ported into an AI behavioral science applied to existential-risk research; and how technical solutions from catastrophic-risk research can be applied to mitigate immediate societal harms. We share the goal of building a better, safer future for everyone. Let's work together!

Accelerate your professional development with hands-on training, talks, workshops, networking events, 10+ tracks, and more at the ODSC West AI Training Conference (San Francisco and virtual). More here - https://odsc.ai/

Abstract: We will navigate the alignment challenges and safety considerations of LLMs, addressing both their limitations and capabilities, with a particular focus on techniques related to instruction prefix tuning and their theoretical limitations with respect to alignment. Additionally, I will discuss fairness across languages in common tokenizers used in LLMs. Finally, I will address safety considerations for agentic systems, illustrating how seemingly minor changes, such as altering the desktop background, can be exploited to trigger a chain of sequenced harmful actions. I will also explore the transferability of these vulnerabilities across different agents.
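The cross-language tokenizer-fairness point can be made concrete with a stdlib-only sketch (the sample sentences and the byte-level framing are illustrative assumptions, not taken from the talk): byte-level BPE tokenizers of the kind commonly used in LLMs start from UTF-8 bytes, and non-Latin scripts cost more bytes per character, so comparable sentences in different languages begin tokenization from very different sequence lengths.

```python
# Illustrative sentences (assumed, not from the talk): roughly the same
# greeting in English and in Hindi (Devanagari script).
samples = {
    "English": "Hello, how are you?",
    "Hindi": "नमस्ते, आप कैसे हैं?",
}

def byte_cost(text: str) -> int:
    # Byte-level BPE starts from the UTF-8 byte sequence, so byte length
    # is the pre-merge sequence length the tokenizer must compress.
    return len(text.encode("utf-8"))

for lang, text in samples.items():
    ratio = byte_cost(text) / len(text)
    print(f"{lang}: {len(text)} chars, {byte_cost(text)} bytes "
          f"({ratio:.2f} bytes/char)")
```

ASCII characters take one byte each, while Devanagari characters take three, so before any merges the Hindi sentence already carries a much longer byte sequence than the English one; how well the learned merges close that gap depends on the tokenizer's training data, which is the fairness question at issue.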