October 12, 2017
1 PM EDT
Common approaches to post-incident reviews are often short-sighted in their focus and rarely bring about any real improvements to our overall systems.
This talk will provide insight into new ways teams are analyzing incidents in retrospect in order to continuously improve system uptime.
Many organizations have found great value in retrospective analysis following incidents that impact the reliability and availability of a service. In these exercises, commonly known as post-incident reviews or postmortems, companies routinely analyze what went wrong. This talk will point out the true value of a post-incident review and show how to perform one so that it exposes the most improvements to an organization’s people, process, and technology.
Let’s explore a deeper understanding of failure in complex systems and the key metrics used to consistently improve the availability and reliability of systems. Jason will point out common flaws in the way many organizations approach retrospective analysis of outages and service disruptions, and uncover areas often overlooked during a retrospective (such as what engineers were thinking when they made their decisions).
Drawing on the new O’Reilly Media book “Post-Incident Reviews: Learning From Failure for Improved Incident Response”, this talk will give the audience a broader understanding of the purpose of post-incident reviews and how to get started on a new path towards continuously improving the uptime of systems and services. Jason will also provide a template for audience members to take back to their teams as a starting point for a new approach.
Audience Challenges & Takeaways:
- What’s broken about current methods of incident retrospective exercises?
- Why is “the human element” important?
- What is the true purpose of a post-incident review?
- What are the key components of the exercise?
- How can we continuously improve this process?
Jason Hand - VictorOps DevOps Evangelist
Serving as a DevOps Champion and advisor to VictorOps (victorops.com), Jason Hand writes, presents, and coaches on the principles and nuance of DevOps and modern incident management practices. Named "DevOps Evangelist of the Year" by DevOps.com in 2016, Jason recently authored a book with O'Reilly Media on post-incident reviews (commonly known as postmortems), in addition to two books on ChatOps. A regular contributor to Wired.com, TechBeacon.com, and many other online publications, Jason is dedicated to following the latest trends in technology, sharing lessons learned, and helping people continuously improve. Jason also co-hosts "Community Pulse", a popular podcast on building communities in tech.