Risk | Bad minutes/year |
Overload results in slow or dropped requests during the peak hour each day. | 3559 |
A bad release takes the entire service down. Rollback is not tested. | 507 |
Users report an outage before monitoring and alerting notify the operator. | 395 |
There is a physical failure in the hosting location that requires complete restoration from a backup or disaster recovery plan. | 242 |
The wrong server is turned off and requests are dropped. | 213 |
Overload results in a cascading failure. Manual intervention is required to halt or fix the issue. | 150 |
An operator accidentally deletes the database; a restore from backup is required. | 129 |
Unnoticed growth in usage triggers overload; service collapses. | 125 |
A configuration mishap reduces capacity, causing overload and dropped requests. | 122 |
A new release breaks a small set of requests; not detected for a day. | 119 |
The operator is slow to debug and root-cause a bug because of noisy alerting. | 76 |
A daylight saving time bug drops requests. | 71 |
Restarts for weekly upgrades drop in-progress requests (i.e., no lame ducking). | 52 |
A leap year bug causes all servers to restart and drop requests. | 16 |
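
To put the figures in the table in context, the short sketch below adds them up and converts the total into an implied yearly availability. It is an illustration only: it assumes the risks are independent and that their bad minutes simply add (no overlap), and it assumes a 365-day year. The variable names are hypothetical and not part of the original catalog.

```python
# Bad minutes per year, copied from the table above in row order.
bad_minutes = [3559, 507, 395, 242, 213, 150, 129, 125, 122, 119, 76, 71, 52, 16]

# Assumption: a non-leap year of 365 days.
MINUTES_PER_YEAR = 365 * 24 * 60

# Assumption: the risks do not overlap, so their bad minutes are additive.
total = sum(bad_minutes)
implied_availability = 1 - total / MINUTES_PER_YEAR

# Share of all bad minutes contributed by the first row (peak-hour overload).
top_share = bad_minutes[0] / total

print(f"total bad minutes/year: {total}")                    # 5776
print(f"implied availability: {implied_availability:.2%}")   # 98.90%
print(f"share from peak-hour overload: {top_share:.1%}")     # 61.6%
```

Under these assumptions, the peak-hour overload risk alone accounts for well over half of the expected bad minutes, which is why it sits at the top of the list.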