Skip to main content

Azure meets Chaos Monkey–Chaos Studio

Maybe you have heared about the Chaos Monkey and later the Simian Army that Netflix introduced to check the resiliency of their AWS systems. These tools are part of a concept called Chaos Engineering. The principle behind Chaos Engineering is a very simply one: since your software is likely to encounter hostile conditions in the wild, why not introduce those conditions while (and when) you can control them, and then deal with the fallout then, instead of at 3am on a Sunday.

Azure Chaos Studio

Time to introduce Azure Chaos Studio, a managed service that uses chaos engineering to help you measure, understand, and improve your cloud application and service resilience.

With Chaos Studio, you can orchestrate safe, controlled fault injection on your Azure resources. Chaos experiments are the core of Chaos Studio. A chaos experiment describes the faults to run and the resources to run against. You can organize faults to run in parallel or sequence, depending on your needs.

Let’s give Azure Chaos Studio a try and create our first chaos experiment…

  • Go to the Azure Chaos Studio product page and click on Try Chaos Studio.
  • You’ll be redirect to the Azure Portal and after logging in you should see the following:

Onboard a resource as a Chaos target

  • The first step is to select a target. Before we can target a specific resource, we first need to onboard this resource in Azure Chaos Studio. There are 2 possible target types:
    • Service-direct faults run directly against an Azure resource, without any installation or instrumentation. Examples include rebooting an Azure Cache for Redis cluster, or adding network latency to Azure Kubernetes Service (AKS) pods.
    • Agent-based faults run in VMs or virtual machine scale sets to do in-guest failures. Examples include applying virtual memory pressure or killing a process.
  • Put a checkbox next to the resource(s) you want to onboard.

  • Now click on Enable targets then Enable service-direct targets from the dropdown menu.

  • This will trigger a new deployment and update the target resource(s).
  • After the resource deployment has completed succesfully, we can set the capabilities that should be supported on the target. This can be controlled through Manage Actions.

  • In our example, we have enable only the service-direct target type for the VM. This gives us only one capability; Cloud Service Shutdown.

Create our first chaos experiment

  • Now that we have onboarded a resource, it is finally time to create our chaos experiment.
  • Click on Experiments and click on Create.

  • Fill in the Subscription, Resource Group, and Location where you want to deploy the chaos experiment. Give your experiment a Name. Continue by clicking on Next : Experiment designer >

  • In the Experiment designer we can specify the different steps and actions that should be taken. In this example, we keep it simple and only use one step and one branch. We click on Add action.
  • Now we can specify the fault that should be triggered and some extra related parameters depending on the chosen fault type. We choose Cloud Service Shutdown as this was the only capability that was supported by our target. Click on Next: Target Resources > to continue.

  • Select the target resource and click on Add.

  • Click on Review + Create to move to the last step.

  • Click on Create to complete the process.
    • Notice the remark. We need to assign the experiment identity to our resource before we can run the experiment.

Assign the experiment identity to our target resource

  • Go to our target resource in the Azure portal.
  • Go to Acces Control (IAM) and click Add – Add Role Assignment.

  • Select the correct role (we are lazy and use the Owner role) and click on Next.

  • Click on Select Members and search for the experiment name

  • Click Review + assign then Review + assign.

Run the experiment

  • In the Experiments view, click on our experiment, and click Start, then click OK.

Learn more

If you want to learn more, check out this episode of Azure Friday:

Popular posts from this blog

XUnit - Assert.Collection

A colleague asked me to take a look at the following code inside a test project: My first guess would be that this code checks that the specified condition(the contains) is true for every element in the list.  This turns out not to be the case. The Assert.Collection expects a list of element inspectors, one for every item in the list. The first inspector is used to check the first item, the second inspector the second item and so on. The number of inspectors should match the number of elements in the list. An example: The behavior I expected could be achieved using the Assert.All method:

Angular --deploy-url and --base-href

As long you are running your Angular application at a root URL (e.g. ) you don’t need to worry that much about either the ‘--deploy-url’ and ‘--base-href’ parameters. But once you want to serve your Angular application from a server sub folder(e.g. ) these parameters become important. --base-href If you deploy your Angular app to a subfolder, the ‘--base-href’ is important to generate the correct routes. This parameter will update the <base href> tag inside the index.html. For example, if the index.html is on the server at /angularapp/index.html , the base href should be set to <base href="/angularapp/"> . More information: --deploy-url A second parameter that is important is ‘--deploy-url’. This parameter will update the generated url’s for our assets(scripts, css) inside the index.html. To make your assets available at /angularapp/, the deploy url should

Azure DevOps/ GitHub emoji

I’m really bad at remembering emoji’s. So here is cheat sheet with all emoji’s that can be used in tools that support the github emoji markdown markup: All credits go to rcaviers who created this list.