talk-data.com
SRE Psychology: Understanding the Human Element in System Reliability
Description
"What if you have a beautiful SLO Dashboard and it's all red and no one cares?" The mission of Site Reliability Engineering (SRE) is to ensure the reliability, scalability, and performance of critical systems - a goal best achieved through strong collaboration with teams across the organization. We are exploring how SRE is embedded in an organization, how it interfaces with application owners, senior management, business stakeholders and external software/hardware vendors. In all these cases the success of SRE's mission hinges on the effectiveness of the relationships.
We will use plenty of examples from our past work of what worked, what failed, and why. Additionally, we will address funding challenges that can unexpectedly impact even well-established SRE teams.
Mike has built his career around driving performance and efficiency, specializing in optimizing the security, availability and speed of cloud applications, data and infrastructure. He developed the first currency program trading system for the Toronto Stock Exchange at UBS and later refined his expertise in optimizing trading systems and migrating core data to the cloud at Morgan Stanley and Transamerica. He is a founding member of the NYZH consultancy, focusing on AI and SRE. Based in Denver, Colorado, Mike is a pilot who enjoys desert racing and cycling, sharing adventures with his wife and three children.