metastable failures blog post
-
What is a metastable failure? distributed systems
Metastable failures occur in open systems with an uncontrolled source of load, where a trigger pushes the system into a bad state that persists even after the trigger is removed.
- the key to a metastable failure is the sustaining feedback loop, not the trigger
- grey failures - https://blog.acolyer.org/2017/06/15/gray-failure-the-achilles-heel-of-cloud-scale-systems/
-
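A toy simulation can make the definition concrete. The sketch below is not taken from any of the sources listed here; it assumes a single FIFO server, a constant arrival rate, clients that abandon a request after `timeout` ticks, and a five-tick outage as the trigger. The function name `simulate` and every parameter value are hypothetical, chosen only for illustration. The same trigger is run twice: once with clients that retry on timeout (the sustaining feedback loop) and once with clients that give up.

```python
from collections import deque


def simulate(ticks=60, arrivals=80, capacity=100, timeout=2,
             trigger=range(10, 15), retry=True):
    """Return useful work completed per tick (goodput) for a toy FIFO server."""
    queue = deque()  # buckets of [enqueue_tick, pending_count], oldest first
    goodput = []
    for now in range(ticks):
        # Sustaining loop: a copy that has waited `timeout` ticks is abandoned
        # by its client, who sends a fresh copy. The stale copy stays queued.
        retries = 0
        if retry:
            retries = sum(n for t, n in queue if t == now - timeout)
        queue.append([now, arrivals + retries])

        # Trigger: the server does no work during the outage window.
        budget = 0 if now in trigger else capacity
        useful = 0
        while budget > 0 and queue:
            bucket = queue[0]
            served = min(budget, bucket[1])
            budget -= served
            bucket[1] -= served
            if now - bucket[0] < timeout:  # the client is still waiting
                useful += served
            if bucket[1] == 0:
                queue.popleft()
        goodput.append(useful)
    return goodput


with_retry = simulate(retry=True)
without_retry = simulate(retry=False)
print("goodput at last tick, retrying clients:    ", with_retry[-1])
print("goodput at last tick, non-retrying clients:", without_retry[-1])
```

Under these assumed parameters the offered load (80 per tick) is below capacity (100 per tick), so both runs are healthy before the outage. After it ends, the run without retries drains its backlog and recovers, while the run with retries keeps spending all of its capacity on stale copies and never completes useful work again. The trigger is identical in both runs; only the feedback loop differs.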
Examples of metastable failure
-
supply chain crunch
- semiconductor
-
black start problems
-
traffic engineering problems
- Gazis, Denos C., and Robert Herman. “The Moving and ‘Phantom’ Bottlenecks.” Transportation Science 26, no. 3 (August 1992): 223–29. https://doi.org/10.1287/trsc.26.3.223.
-
thundering herd problems
-
Joyent PXE boot - https://www.youtube.com/watch?v=30jNsCVLpAE
- also black start
-
Rasmussen, Jens. “Risk Management in a Dynamic Society: A Modelling Problem.” Safety Science 27, no. 2–3 (November 1997): 183–213. https://doi.org/10.1016/S0925-7535(97)00052-0.
-
Brooker, Marc. “The Perils of Not Always Coordinating,” n.d., 39.
-