Antifragile systems

Yesterday I talked about mean time between failures (MTBF) and mean time to repair (MTTR) and their importance in building robustness and recoverability into our systems.

With these two metrics we still expect that a failure will cause an outage of our system that then has to be repaired. But what if we could build a system that absorbs failures? A system where a failure actually makes it stronger…

Nassim Taleb calls such systems antifragile systems.

From Wikipedia:

Antifragility is a property of systems in which they increase in capability to thrive as a result of stressors, shocks, volatility, noise, mistakes, faults, attacks, or failures. The concept was developed by Nassim Nicholas Taleb in his book, Antifragile, and in technical papers. As Taleb explains in his book, antifragility is fundamentally different from the concepts of resiliency (i.e. the ability to recover from failure) and robustness (that is, the ability to resist failure). The concept has been applied in risk analysis, physics, molecular biology, transportation planning, engineering, aerospace (NASA), and computer science.

Designing antifragile software systems requires a focus not only on the infrastructure and the application level but also on the outer world (the complete ‘system’). When building such a system, we observe its behavior, the impact of disturbances, and how we can apply corrective actions to keep it operational.

Remark: Testing and evolving such an antifragile system is a perfect candidate for chaos engineering.
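
To make the observe / disturb / correct loop a bit more concrete, here is a minimal sketch of a chaos-style experiment in Python. Everything in it is an assumption made for illustration: `ReplicaPool`, `inject_failure`, `reconcile` and `run_experiment` are hypothetical names, not part of any chaos engineering tool, and the 'system' is just a list of booleans. It only shows the shape of an experiment: inject a failure, check a steady-state hypothesis, apply a correcting action.

```python
import random


class ReplicaPool:
    """Toy stand-in for a service that runs several replicas (hypothetical)."""

    def __init__(self, desired: int) -> None:
        self.desired = desired
        self.healthy = [True] * desired

    def healthy_count(self) -> int:
        return sum(self.healthy)

    def inject_failure(self, rng: random.Random) -> None:
        """Disturbance: kill one randomly chosen healthy replica, if any."""
        alive = [i for i, ok in enumerate(self.healthy) if ok]
        if alive:
            self.healthy[rng.choice(alive)] = False

    def reconcile(self) -> int:
        """Correcting action: restart all failed replicas, return how many."""
        restarted = self.healthy.count(False)
        self.healthy = [True] * self.desired
        return restarted


def run_experiment(pool: ReplicaPool, rounds: int, min_healthy: int, seed: int = 0) -> bool:
    """Return True if the steady-state hypothesis held in every round."""
    rng = random.Random(seed)
    for _ in range(rounds):
        pool.inject_failure(rng)                # disturb the system
        if pool.healthy_count() < min_healthy:  # observe its behavior
            return False                        # hypothesis violated
        pool.reconcile()                        # apply the correcting action
    return True


if __name__ == "__main__":
    print(run_experiment(ReplicaPool(desired=3), rounds=10, min_healthy=2))
    print(run_experiment(ReplicaPool(desired=1), rounds=10, min_healthy=1))
```

With three replicas the pool keeps at least two healthy after every injected failure, so the hypothesis holds. With a single replica the first failure already violates it. A violated hypothesis is the useful outcome here: it points at a weakness we can fix before a real failure finds it, which is how the system as a whole gets stronger.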

In his book Cloud Strategy, Gregor Hohpe describes the path from fragile to antifragile systems using the following table:

	Robust	Resilient	Antifragile
Model	Prevent failure	Recover from failure	Invite failure
Motto	“Hope for the best”	“Prepare for the worst”	“Bring it on!”
Attitude	Fear	Preparedness	Confidence
Mechanism	Planning & verification	Redundancy & automation	Chaos Engineering
Scope	Infrastructure	Middleware/Application	Whole System

Remark: We cannot call these systems ‘not fragile’ because that would describe a robust system, which is not what we mean here. Hence the term ‘antifragile’.

Jones, K. H. (2014). "Engineering Antifragile Systems: A Change In Design Philosophy"
Resilience and Waste in Software Teams – Jessitron
