Why we use an Ansible manager

Ansible Tower was a pain to use daily. My colleagues and I had to learn a lot of tricks to use its buggy and non-ergonomic interface. Every time our developers saw us using it, they laughed at us.

Just one example of the ergonomic nightmare: job result history. When we stopped using Tower, it still wasn't possible to open jobs in browser tabs. So when we wanted to find a job result, we had to click on each job in the search results, check whether it was the right one and, if not, go back to the history page. And every time we went back, our search terms were forgotten.

Most of my friends only use Ansible CLI. At best, they use some sort of centralized concentrator to export and save job results. But that's not enough for our use case.

Ansible Tower provided major features that we use daily:

  • full job history
  • shared credentials
  • concurrency protection
  • user-friendly Ansible launcher (especially if you know nothing about Ansible or shell)
  • templates and launch survey
  • scheduled jobs

AWX

Following its open source philosophy, Red Hat open sourced Ansible Tower as AWX about two years after its purchase of Ansible. We briefly tested it and decided it could suit us well.

Using an open source project instead of Tower was motivated by the following:

  • We didn't need support
  • The Tower licence was far too expensive for what we thought it was really worth
  • We hoped it would be easier to correct simple things in the source code
  • We tried other projects (including Rudder) but nothing was really better

Deployment

AWX developers provide an Ansible role to deploy AWX using Docker. By default, the role deploys its own database inside a Docker container. In our environment, we use our own PostgreSQL cluster, so we didn't use this part.

AWX is composed of two containers:

  • awx_task contains Ansible and the workers in charge of launching Ansible jobs.
  • awx_web contains the uWSGI server, hosting the web UI.

AWX also needs a memcached server and a RabbitMQ server, hosted on the same server in adjacent containers.

The AWX deployment role is great, but for us, it's not production ready:

  • Containers are launched using an Ansible module, so you can't easily recreate them manually.
  • It uses the deprecated Docker links feature to connect the containers.
  • It's not easy to add additional volumes or personalize AWX containers.

We are very big (and old) Docker users. Usually, we start and manage our containers using systemd. I know, it's not perfect, but we can configure every aspect, and it's easy to reset and monitor each container.

Our latest modifications are on GitHub.

We didn't want to build our own AWX Docker image because we thought that it would be easier to upgrade if we stuck with the official one.

We use volumes to configure the operating system inside the AWX containers (Git config, CentOS repos, SSL certs, …). We use ExecStartPost commands to install the pip and yum packages needed by our Ansible roles.

We were unable to agree on whether we gained or lost time using this method instead of Docker build.

Here is the inventory file we use to install our servers. As you see, there isn't much to configure:

[awx]
master-ans01.blue-solutions.com
standby-ans02.blue-solutions.com

[all:vars]
dockerhub_base=ansible
dockerhub_version=1.0.7.2

postgres_data_dir=/tmp/pgdocker
host_port=80
docker_compose_dir=/var/lib/awx

pg_hostname=100.83.8.1
pg_username=awx_app
pg_password={{ encrypted_pg_password }}
pg_database=awx
pg_port=5432

default_admin_user=admin
default_admin_password={{ encrypted_default_admin_password }}
awx_secret_key={{ encrypted_awx_secret_key }}

http_proxy={{ http_proxy }}
https_proxy={{ https_proxy }}
no_proxy={{ no_proxy }}

docker_network_subnet=100.83.10.0/24
docker_network_gateway=100.83.10.1

custom_volumes=['/etc/tower:/etc/tower:ro', …]

use_systemd_units=True
awx_web_post_actions=['/bin/bash -c "…"']
awx_task_post_actions=['/bin/bash -c "…"']

Migration

We didn't want to lose everything we had in Tower during the switch to AWX. We tried to directly import the database from Ansible Tower to AWX, but the structure was different in many ways, and we didn't want to miss anything. While the Tower web UI is a mess, its API is really great. It's easy to use and you can do a lot of things with it. We decided to export all of our data using the Tower API, and import everything back using the AWX API.

We have been maintaining, for several years now, a Python interface to the Tower API. The script is now only compatible with AWX and is available in the awx-cli repository. It's a sort of tower-cli clone, rewritten from scratch, and personalized for our needs. Sorry for the poor code quality, I am not really a Python developer.

Using this script, we were able to save and reimport most Tower elements.
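To give an idea of what exporting through the API involves, here is a minimal sketch of the pagination loop. It is not taken from our script. It assumes the list format returned by the Tower and AWX REST APIs (a JSON object with a "results" list and a "next" URL), and both the function name collect_all_pages and the fetch_page helper are hypothetical. In real use, fetch_page would perform an authenticated HTTP GET and decode the JSON body.

```python
from typing import Callable, Dict, List, Optional


def collect_all_pages(first_url: str,
                      fetch_page: Callable[[str], Dict]) -> List[Dict]:
    """Follow the "next" links of a paginated listing and gather all results.

    fetch_page is a hypothetical helper: it takes a URL and returns the
    decoded JSON page, e.g. {"next": "...", "results": [...]}.
    """
    items: List[Dict] = []
    url: Optional[str] = first_url
    while url:
        page = fetch_page(url)
        items.extend(page.get("results", []))
        url = page.get("next")  # None on the last page, which ends the loop
    return items


if __name__ == "__main__":
    # Fake two-page API, used only to demonstrate the loop.
    fake_api = {
        "/api/v2/projects/": {"next": "/api/v2/projects/?page=2",
                              "results": [{"id": 1}, {"id": 2}]},
        "/api/v2/projects/?page=2": {"next": None,
                                     "results": [{"id": 3}]},
    }
    print(collect_all_pages("/api/v2/projects/", fake_api.__getitem__))
```

Passing the fetch function as a parameter keeps the loop independent of the HTTP library and of the authentication method you use.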

The migration took several steps:

Organizations

In Tower, Organizations allow you to segment everything into hermetic compartments. In our setup, Organizations are automatically created by our LDAP + SAML configuration. There is a direct link between LDAP groups and AWX groups. Organizations in AWX don't necessarily have the same ID as their counterpart in Tower, so we needed to change the IDs during reimport (sed was our friend). This is true for every object you import into AWX.
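The ID rewrite can also be sketched in Python instead of sed. This is only an illustration: it assumes each exported object carries an integer "organization" field and that you have built the old-to-new ID mapping yourself. The function name remap_organizations is hypothetical and not part of our script.

```python
from typing import Dict, List


def remap_organizations(objects: List[Dict],
                        id_map: Dict[int, int]) -> List[Dict]:
    """Return copies of exported objects with Tower organization IDs
    replaced by their AWX counterparts.

    Raises KeyError if an object references an organization missing from
    id_map, so nothing is silently attached to the wrong organization.
    """
    remapped = []
    for obj in objects:
        new_obj = dict(obj)  # shallow copy, the input list is left untouched
        old_id = new_obj.get("organization")
        if old_id is not None:
            new_obj["organization"] = id_map[old_id]
        remapped.append(new_obj)
    return remapped


if __name__ == "__main__":
    exported = [{"name": "prod", "organization": 4},
                {"name": "global", "organization": None}]
    print(remap_organizations(exported, {4: 2}))
```

Compared with a text substitution, this only touches the field that holds the organization ID, so another value that happens to contain the same number is left alone.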

Inventories

Inventories contain Ansible hosts and groups. Usually hosts are divided by environment (test, production, …) and/or similarity (application, geography, …).

Inventories were imported using the addInventory function over a text list.

Credential types

If you need special credential types, for example for your own sites and applications, you have to reimport them first.

Credentials

We didn't want to restore user passwords, so we mostly focused on system credentials (ESX, network devices, Git, …).

We used the exportAll function to get all credentials as a JSON file. We filtered them using jq (see README) and used the importAll function to import them in AWX.
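For readers who prefer Python to jq, the kind of filtering involved can be sketched as follows. This is not the filter from our README. It assumes the export is a list of JSON objects and that fields such as "id", "url", "related", "summary_fields", "created" and "modified" are generated by the server and should not be sent back. The function name strip_read_only is hypothetical.

```python
from typing import Dict, Iterable, List

# Fields assumed to be server-generated; adjust to what your export contains.
READ_ONLY_FIELDS = ("id", "url", "related", "summary_fields",
                    "created", "modified")


def strip_read_only(objects: Iterable[Dict]) -> List[Dict]:
    """Return copies of the objects without the server-generated fields."""
    return [
        {key: value for key, value in obj.items()
         if key not in READ_ONLY_FIELDS}
        for obj in objects
    ]


if __name__ == "__main__":
    exported = [{"id": 12, "name": "git-deploy", "credential_type": 2,
                 "created": "2018-01-01T00:00:00Z"}]
    print(strip_read_only(exported))
```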

Projects

Each project is linked to a Git repository and a specific branch. We reimported them using the same method we used for the credentials.

Hosts, groups and variables

Ansible Tower ships with an excellent inventory export function. We use it in the exportAnsibleInventory function, which is only 20 lines of code! You can quickly obtain a beautiful JSON file with all your hosts, their groups and, most importantly, all the variables you have saved in the inventory.

The tricky part is to reimport everything. In fact, it's not directly possible from either the interface or the API. You have to build a custom inventory script from the JSON file you recover.

The simplest method is to use a bash script:

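#!/bin/bash
# Minimal custom inventory script: Ansible calls it with --list or --host <name>.
if [ "$1" == "--list" ] ; then
# Print the whole inventory: paste the exported JSON in place of the placeholder.
cat << "EOF"
{
    JSONFILE EXPORTED
}
EOF
elif [ "$1" == "--host" ]; then
  # No per-host lookup: host variables are expected in the --list output.
  echo '{"_meta": {"hostvars": {}}}'
else
  echo "{ }"
fi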
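The same inventory script contract can be sketched in Python. This is an illustrative variant, not something our script generates: the INVENTORY content below is a made-up placeholder standing in for your exported JSON, and the render function is a hypothetical name. Unlike the bash version, it answers --host with the variables found under "_meta" for that host.

```python
#!/usr/bin/env python
import json
import sys

# Placeholder data: replace with the inventory exported from Tower.
INVENTORY = {
    "web": {"hosts": ["web01.example.com"], "vars": {"http_port": 80}},
    "_meta": {"hostvars": {"web01.example.com": {"rack": "a1"}}},
}


def render(argv):
    """Return the JSON text expected for the given command-line arguments."""
    if len(argv) >= 1 and argv[0] == "--list":
        return json.dumps(INVENTORY)
    if len(argv) >= 2 and argv[0] == "--host":
        # Look the host up in the variables already stored under "_meta".
        hostvars = INVENTORY.get("_meta", {}).get("hostvars", {})
        return json.dumps(hostvars.get(argv[1], {}))
    return json.dumps({})


if __name__ == "__main__":
    print(render(sys.argv[1:]))
```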

Hint

You can pass the "--bash" argument to the exportAnsibleInventory function to directly obtain a shell script instead of a JSON file.

You add this script in the "Inventory Scripts" section, "New Custom Inventory". Next, you can configure an inventory to use this script in the "Sources" menu directly in the "Inventories" section.

After the first sync, it will import everything: hosts, groups and vars.

Warning

You need at least Ansible 2.5 on the AWX server for nested groups support during the import.

The fun part is when you want to unlink your inventory and your custom script: that's not possible. If you remove the custom script, it deletes everything with it.

The only solution we found was to update the database manually:

UPDATE main_group
   SET has_inventory_sources = 'f' WHERE has_inventory_sources = 't';
UPDATE main_host
   SET has_inventory_sources = 'f' WHERE has_inventory_sources = 't';

TRUNCATE TABLE main_group_inventory_sources;
TRUNCATE TABLE main_host_inventory_sources;

After this, you can remove the source of the inventory, using the web UI, without also losing all hosts and groups.

Two things to know after the import:

  • Groups without members are not exported. You can have a list of these groups using the getLonelyGroups function, if you want to reimport them later.
  • All new hosts are activated. If you want to keep your disabled hosts that way, you can export them using the function exportAll disabledHosts and import the change using massChangeHostStatus.
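To illustrate the first point, here is a small sketch of how the groups missing from an export could be listed. It does not describe how getLonelyGroups works. It assumes you already have the full list of group names (for example from the API) and the exported inventory as a dictionary keyed by group name. The function name groups_missing_from_export is hypothetical.

```python
from typing import Dict, Iterable, Set


def groups_missing_from_export(all_group_names: Iterable[str],
                               exported: Dict) -> Set[str]:
    """Return the group names that do not appear in the exported inventory."""
    # "_meta" holds host variables, it is not a group.
    exported_groups = {name for name in exported if name != "_meta"}
    return set(all_group_names) - exported_groups


if __name__ == "__main__":
    exported = {"web": {"hosts": ["web01"]}, "_meta": {"hostvars": {}}}
    print(groups_missing_from_export(["web", "future-db"], exported))
```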

Job templates

Job templates are launch configurations and surveys for a role or playbook. They allow administrators to limit and configure the way a role is used.

You can use the exportAllJobTemplates and importJobTemplates functions to automatically manage this part. awx-cli exports surveys, schedules and extra credentials.

Hint

Since our migration, tower-cli has been updated with a means to export and import job templates. You may want to use it instead of our in-house script.

What we learned after 5 months in production

We put a lot of work into this migration, a lot more than anticipated. We lost several weeks because some AWX releases were so buggy that we couldn't use them in production. Since we often found these bugs only after migrating, we tried (and failed) several times before really switching to AWX.

The ugly

  • When we updated Ansible Tower, we were usually very cautious. Each version fixed bugs, but most also came with new ones. AWX is worse in this respect. We encountered a lot of bugs, some of which simply prevented us from using the software. For example, this inventory GitHub issue. The sad part was that the fix had already been developed, but it was only released a week later during a scheduled merge. In the meantime, we had to wait.
  • This upgrade issue is also a good example. AWX developers regularly modify old Django migrations! We managed to do it anyway by playing with SQL, but you need to understand these 4 facts if you plan to use AWX:
  1. There is no guarantee that AWX works. As said by maintainers, "AWX is not a product; it's a project. The supported product is Ansible Tower."
  2. AWX is not a community edition of Ansible Tower.
  3. Data migration is not supported between versions. You are supposed to start from scratch and import the data back (that's not how we proceeded, but our approach is not officially supported).
  4. AWX GitHub repository isn't used by the developers during development. They merge all changes periodically from a private repository downstream.

EDIT: In this issue, one of the AWX developers wrote the following statement:

The AWX development team is aware of the pain related to upgrades, and we're committed to making the process much smoother in the near future by versioning appropriately and using migrations to bridge the gap from one version to another whenever possible.

This is awesome, and we are looking forward to the next updates.

The bad

  • AWX isn't very stable. We have to restart the awx_task container every once in a while because the real-time job display stops working.
  • AWX JavaScript will eat all of your CPU and turn your laptop into a toaster. (EDIT: The new AWX 1.0.7 is a LOT better on this, yeaah!)

The good

  • I don't know what Ansible Tower looks like right now, but when we switched (2018-04-19), AWX was way better. You could open most pages in new tabs, use search filters that almost worked as expected, and the web UI was a little bit easier to use.
  • We took advantage of the migration to reset all user access and permissions. We even plugged AWX into our SAML SSO.
  • Lots of little new features and a clear improvement in performance.

Conclusion

We use AWX every day, and it matches our needs.

We can't say we love it, but it doesn't love us either, so it's fine :-)