| Postmortem owner | |
| Affected systems | |
| Incident | |
| Priority | |

| Instructions | Report |
|---|---|
| Leadup: List the sequence of events that led to the incident. | |
| Fault: Describe how the change that was implemented didn't work as expected. If available, include relevant data visualizations. | |
| Impact: Describe how internal and external users were impacted during the incident. Include how many support cases were raised. | |
| Detection: Report when the team detected the incident and how they knew it was happening. Describe how the team could've improved time to detection. | |
| Recovery: Report how the user impact was mitigated and when the incident was deemed resolved. Describe how the team could've improved time to mitigation. | |
| Timeline: Detail the incident timeline using UTC to standardize for timezones. Include lead-up events, post-impact events, and any decisions or changes made. | |
| Five whys root cause identification: Run a 5-whys analysis to understand the true causes of the incident. | |
| Blameless root cause: Note the final root cause and describe what needs to change without placing blame to prevent this class of incident from recurring. | |
| Lessons learned: Describe what you learned, what went well, and how you can improve. | |
| Follow-up tasks: List the Jira issues created to