talk-data.com

Description

Want to prevent outages before they happen? Traditional SRE methods focus on component failures, but a whole class of outages stems from unexpected system interactions. We found a solution. On our team, we use System-Theoretic Process Analysis (STPA) to identify and fix system-level vulnerabilities before they cause outages. By applying STPA during the design phase, we've prevented major incidents and saved countless engineering hours. This talk will show you how STPA can transform your approach to reliability. We'll share a real-world example where STPA caught critical design flaws that traditional methods missed, saving us months of costly rework. Don't wait for outages to happen. Learn how STPA can help you build more resilient systems and become a 1000x engineer.

Theo is a Senior Site Reliability Engineer for Google Maps. He is leading a program to improve road closure data safety. Previously, he led a program identifying risky dependencies within Google Maps. In his spare time, he hosts supper clubs.