Commit 4282efc

peterbjohnson and Peter Johnson authored
added report (#28)
Co-authored-by: Peter Johnson <peterbjohnson@Peters-MacBook-Pro-3.local>
1 parent 51755d9 commit 4282efc

1 file changed: +36 -0 lines changed

docs/releases/status.md

Lines changed: 36 additions & 0 deletions
@@ -16,6 +16,42 @@ Severity:

The severity is used to decide how much we invest in preventative measures, detection, mitigation plans, and rehearsals.

## 2025 December 4th: Brief DB outage (Severity: LOW):

### Timeline (GMT/UTC)

10:58 DB was unavailable for a few seconds, affecting about 20 users (e.g. pages would not load)

11:14 Alerts automatically created

11:15 Developers responded

11:40 Decision and action

11:45 Incident over
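The gaps between the timeline entries above can be worked out with a little arithmetic. The following is a minimal sketch, assuming the five timestamps fall on the same day as listed; the names `TIMELINE`, `minutes_between`, and `intervals` are illustrative and do not come from the report or its tooling.

```python
from datetime import datetime

# Timeline entries copied from the report (GMT/UTC, same day).
# The short labels are paraphrases used for illustration only.
TIMELINE = [
    ("10:58", "DB unavailable"),
    ("11:14", "Alerts automatically created"),
    ("11:15", "Developers responded"),
    ("11:40", "Decision and action"),
    ("11:45", "Incident over"),
]


def minutes_between(start: str, end: str) -> int:
    """Whole minutes between two HH:MM timestamps on the same day."""
    fmt = "%H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)


def intervals(timeline):
    """Minutes elapsed between each consecutive pair of timeline entries."""
    return [
        (prev_label, label, minutes_between(prev_time, time))
        for (prev_time, prev_label), (time, label) in zip(timeline, timeline[1:])
    ]


for start_label, end_label, minutes in intervals(TIMELINE):
    print(f"{start_label} -> {end_label}: {minutes} min")

# Elapsed time from the first entry to the last.
total = minutes_between(TIMELINE[0][0], TIMELINE[-1][0])
print(f"Total: {total} min")
```

Under these assumptions, the longest gaps are between the outage and the first alert, and between the developers responding and the decision to act.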
### Analysis

The incident was caused by a developer's mistake in the DB configuration, which triggered a DB restart. The restart was successful, so issues arose only during the brief restart period.

The analysis concluded that the configuration needed to be reverted and the DB restarted again.

The DB connections to the app remained open during the configuration change, avoiding any need for users to re-authenticate. This minimised the impact of the incident, but meant the quickest and safest response required a second restart.
### Actions

We have implemented protections against destructive actions on the DB, raising the barrier to this type of event.

We have tightened the user security requirements for configuring the DB (this incident was not security related, but it was a useful prompt).

A second-developer review is now required before any DB configuration change is made.

Developers should only make configuration changes when fully aware of the consequences and able to handle the process.

We have documented the error messages that correspond to this issue, to make detection faster and more accurate in future.

N = 20, effect = 4, duration = 0.01. Severity = 0.008 (LOW)
## 2025 November 18th: Some evaluation functions failing (Severity: LOW):

Some evaluation functions returned errors.
