This is a guest post written by Michael DeHaan, CTO at AnsibleWorks, a Rackspace Cloud Tools partner. AnsibleWorks provides IT orchestration solutions that simplify the way IT manages systems, applications, and infrastructure.
When developers and systems administrators work to automate the rollout of application updates, a common problem is automating web and SaaS architectures that span more than a single machine and, more importantly, managing those systems in a way that preserves uptime. This is especially critical in high-traffic web sites and services.
When looking at the automation model itself, it is not enough to model the actions that happen on one machine at a time, or even on a whole class of machines at once, because simultaneous updates can introduce outages.
Performing updates on live infrastructure is one of those problems that historically results in your IT team locked in a conference room late at night or on a Saturday, again and again, and it’s not a fun place to be. Not only is it an arduous process, but getting a step wrong means customers will experience problems (lost orders, dropped connections, etc.).
Ansible is a configuration, app deployment and orchestration solution that provides powerful tools to roll out multi-tier applications on either physical or cloud infrastructure. It does so with a serverless, agentless design (it just uses SSH) that gives fine-grained control over which operations happen on which machines, and in what order. That makes it particularly good at implementing those vital rolling updates and turning them into more or less a “push button” process.
Ansible makes this easy by having a push-based, explicitly ordered system that can talk to one group of hosts, talk to another on behalf of others, and then move on to other groups, to model all sorts of configuration, application deployment and rollout processes.
As shown in our ansible-examples repository, here’s an example of what that looks like for a simple HAProxy setup.
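A rolling-update playbook for this kind of setup might look roughly like the sketch below. It is an illustration rather than the exact listing from the repository: the inventory groups (`webservers`, `lbservers`, `monitoring`), the three role names, the HAProxy backend `myapplb`, the stats socket path and the use of Nagios for monitoring are all assumptions chosen for the example, and the module names assume a current Ansible install with the `community.general` collection available.

```yaml
---
# Illustrative sketch only: group, role, backend and socket names are assumptions.
- hosts: webservers
  serial: 1   # update one web server at a time

  # Runs before the roles: silence monitoring and drain the node.
  pre_tasks:
    - name: Disable monitoring alerts for this web server
      community.general.nagios:
        action: disable_alerts
        host: "{{ inventory_hostname }}"
        services: webserver
      delegate_to: "{{ item }}"          # run on each monitoring host
      loop: "{{ groups['monitoring'] }}"

    - name: Take this web server out of the HAProxy pool
      ansible.builtin.shell: echo "disable server myapplb/{{ inventory_hostname }}" | socat stdio /var/lib/haproxy/stats
      delegate_to: "{{ item }}"          # run on each load balancer
      loop: "{{ groups['lbservers'] }}"

  # The actual update: three configuration roles applied to the node.
  # Handlers they notify (such as service restarts) run before post_tasks.
  roles:
    - common
    - base-apache
    - web

  # Runs after the roles: verify the node, then put it back in service.
  post_tasks:
    - name: Wait for the web server to answer on port 80
      ansible.builtin.wait_for:
        host: "{{ inventory_hostname }}"
        port: 80
        state: started
        timeout: 80

    - name: Put this web server back into the HAProxy pool
      ansible.builtin.shell: echo "enable server myapplb/{{ inventory_hostname }}" | socat stdio /var/lib/haproxy/stats
      delegate_to: "{{ item }}"
      loop: "{{ groups['lbservers'] }}"

    - name: Re-enable monitoring alerts for this web server
      community.general.nagios:
        action: enable_alerts
        host: "{{ inventory_hostname }}"
        services: webserver
      delegate_to: "{{ item }}"
      loop: "{{ groups['monitoring'] }}"
```

The `delegate_to` keyword is what lets a play that targets the web servers run individual tasks on the load balancers and monitoring hosts on their behalf.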
In the above example, we’re removing the node from a pool, signaling a monitoring outage, updating it by applying three configuration roles and then putting it back into the pool and monitoring system. Ansible is also smart enough to restart all the required services before putting a node back in the pool, and to stop the update if a batch of systems fails, leaving you with the rest of your systems online.
The above example uses HAProxy, but this can easily be extended to work with other types of load balancers, whether physical or cloud-based. While the above example takes one node out of rotation at a time, if you had 500 servers and wanted to take 50 out of rotation at a time, simply set serial to “50” and you have a configurable rolling update policy. You can decide how much load capacity you can afford to lose in the middle of an update process. Some AnsibleWorks users use this system to do continuous updates every 15 minutes, all day long.
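As a small illustration of that batch size, the fragment below sketches a play header for the 500-server case. The `webservers` group and the `web` role are hypothetical names; only the `serial` keyword is the point here.

```yaml
---
# Hypothetical play: with 500 hosts in "webservers", this runs in 10 batches of 50.
- hosts: webservers
  serial: 50   # number of hosts taken out of rotation and updated per batch
  roles:
    - web
```

A smaller value keeps more capacity online during the update at the cost of a longer rollout; a larger value finishes sooner but removes more servers from rotation at once.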
Ansible can be used for things other than rolling updates, including configuration management, more basic application rollout or running shell commands on remote nodes. To learn more about Ansible, see our documentation section and be sure to check us out on GitHub. There is also a free two-hour quickstart training video. Thanks and happy Ansibiling! I think that’s a word :)