Most organizations treat releasing and deploying as the same thing, while in fact they are two separate things:
- Deployment: A technical activity where a new version of the software is deployed to a specific environment
- Release: A business activity where the customers are informed that a new version of the software is available and can be used
When you combine these two activities into one, releasing becomes a risky business. The same moment you roll out the code on production, your users are eagerly waiting to start using the new features they so desperately needed. At that moment, you don’t want things to go wrong.
So what do most organizations do? They introduce long release cycles where an application has to go through multiple environments and test cycles before finally reaching production. And they try to reduce the risk by only going through this cycle once or twice a year.
But while they think that this limits the risk, it actually has the opposite effect. The moment Murphy kicks in (and he will), you’re in trouble. Why? Because you have to go step by step through this really long release cycle again before you can apply your patch or hotfix to production. By then this fix is not so ‘hot’ any more. And in the meantime error reports keep coming in…
Now this is the theory. What I see in practice (a lot!) is that when something is really wrong on production, the whole process is thrown out of the window and the fix is deployed immediately to production. Whoops, maybe not the best idea either?!
Could there be a better solution?
Of course! Otherwise I wouldn’t be writing this blog post. First of all, keep these two activities (deployment and release) separate. Deploy your feature to production as soon as it’s ready, but hide it from the users. Later on, you can do a ‘release’ and enable the feature on production just by toggling a configuration switch. Even better, you can gradually enable a feature for a subset of your users. The moment you notice that performance is going down or exceptions start to appear, you can easily disable the feature again and fix it without impacting the users. This is exactly the approach that companies like Facebook and Amazon are using.
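To make the idea of a configuration switch with gradual rollout a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `FeatureToggle` class, the `rollout_percent` setting and the `render_checkout` function are hypothetical names, and a real system would read the percentage from configuration rather than from an in-memory object.

```python
import hashlib


class FeatureToggle:
    """Hypothetical in-memory toggle; a real one would be backed by configuration."""

    def __init__(self, name: str, rollout_percent: int = 0) -> None:
        self.name = name
        # 0 = deployed but hidden, 100 = released to everyone.
        self.rollout_percent = rollout_percent

    def is_enabled_for(self, user_id: str) -> bool:
        # Hash the feature name and user id so each user lands in a stable
        # bucket between 0 and 99, independent of process restarts.
        digest = hashlib.sha256(f"{self.name}:{user_id}".encode("utf-8")).hexdigest()
        bucket = int(digest, 16) % 100
        return bucket < self.rollout_percent


def render_checkout(user_id: str, toggle: FeatureToggle) -> str:
    # Both code paths are deployed; the toggle decides which one a user sees.
    if toggle.is_enabled_for(user_id):
        return "new checkout"
    return "old checkout"


# Deployment: the new code is on production, but nobody sees it yet.
new_checkout = FeatureToggle("new-checkout", rollout_percent=0)

# Release: open the feature to roughly 10% of the users.
new_checkout.rollout_percent = 10

# Trouble in production? Switch it off again without redeploying.
new_checkout.rollout_percent = 0
```

Because the bucket is derived from a hash of the user id, a user who sees the feature at 10% keeps seeing it when you raise the percentage, which avoids users flipping back and forth during a rollout.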
One important recommendation I want to make is to go one step further and split the deployment itself into multiple steps:
- In step 1, you deploy your database changes. No code changes are deployed yet. Of course this means that database changes should happen in a non-breaking fashion. For example, if you add a new required database column, make it nullable first or provide a sensible default.
- In step 2, deploy your application change. When you’re ready, announce the release and enable the feature.
- If everything is working as expected, you can continue to the last (optional) step and update the database again. For example, for the new column you created before, you can now make it not nullable and remove the default value.
This gives you a fault-tolerant approach to handling releases and is a first step towards continuous delivery.
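The three deployment steps described above can be sketched with Python’s built-in `sqlite3` module. This is only an illustration under a few assumptions: the `customers` table, the `email` column and the placeholder value used for backfilling are all made up, and an in-memory database stands in for production. SQLite cannot change a column to NOT NULL in place, so the last step rebuilds the table; on many other databases this would be a single `ALTER TABLE` statement.

```python
import sqlite3


def step1_expand(conn: sqlite3.Connection) -> None:
    # Non-breaking change: the new column is nullable, so the old
    # application code, which does not know about it, keeps working.
    conn.execute("ALTER TABLE customers ADD COLUMN email TEXT")


def step3_contract(conn: sqlite3.Connection) -> None:
    # Assumption: rows written before the release get a placeholder value.
    conn.execute(
        "UPDATE customers SET email = 'unknown@example.com' WHERE email IS NULL"
    )
    # Rebuild the table with the stricter constraint (SQLite-specific workaround).
    conn.execute(
        "CREATE TABLE customers_new ("
        "id INTEGER PRIMARY KEY, name TEXT NOT NULL, email TEXT NOT NULL)"
    )
    conn.execute(
        "INSERT INTO customers_new (id, name, email) "
        "SELECT id, name, email FROM customers"
    )
    conn.execute("DROP TABLE customers")
    conn.execute("ALTER TABLE customers_new RENAME TO customers")
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO customers (name) VALUES ('Alice')")

# Step 1: deploy the database change only.
step1_expand(conn)

# The old application code still runs and can insert rows without an email.
conn.execute("INSERT INTO customers (name) VALUES ('Bob')")

# Step 2: deploy the new application code, which fills in the new column.
conn.execute(
    "INSERT INTO customers (name, email) VALUES ('Carol', 'carol@example.com')"
)

# Step 3 (optional): once everything works, tighten the schema.
step3_contract(conn)
```

The point of the ordering is that at every moment the schema is compatible with whichever version of the application is running, so each step can be rolled back on its own.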