Vagrant, Docker and Ansible revisited
In an earlier post I talked about the combination of Vagrant, Docker and Ansible. The final goal of that exercise was to have a Vagrantfile that brings up several Docker containers that are able to communicate with each other, and then provisions those containers using Ansible, all in one go. Using Vagrant, you can choose your own virtualisation method. Using Docker, you can virtualise using lightweight containers. And using Ansible, you can easily provision complicated setups. A marvellous combination, if it works.
First for Vagrant. In the previous post I used a pattern that resolves to:
machines = { "m1" => { <machine definition> }, "m2" => { <machine definition> }, ... N=machines.size config... <generic provider settings and generic provisioning> (1..N).each do |machine_id| machine = machines["m#{machine_id}"] config.vm.define machine['name' do |m| <machine specific provisioning> <docker provider settings> if machine_id == N <ansible provisioning> end end end end
This works great, save for one drawback: if you create 6 different machines, they all boot and/or build at the same time, but they do not finish at the same time. With 2 machines, chances are good that the last machine (the one carrying the Ansible provisioner) is also the last one to finish; with 6 machines those chances diminish rapidly. To compensate for that, I insert a shell provisioner just before the Ansible provisioner of the last machine:
config.vm.provision :shell do |shell|
  shell.inline = "echo 'Waiting 20 seconds to allow all containers to boot'; sleep 20"
end
This gives the other containers ample opportunity to finish their initial booting and installation. Twenty seconds is a rough but safe guess: a few seconds should suffice, but you never know with network lag on some machines.
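If you would rather not rely on a fixed delay, the same slot could hold a small polling loop instead. A minimal sketch, not part of the original setup, assuming the peer hostnames resolve inside the last container and netcat is installed there:

config.vm.provision :shell do |shell|
  # hypothetical alternative: wait until every peer answers on its SSH port
  # instead of sleeping a fixed 20 seconds
  shell.inline = <<-'SCRIPT'
    for host in <hostname 1> <hostname 2> <hostname 3>; do
      until nc -z "$host" 22; do
        echo "waiting for $host ..."
        sleep 2
      done
    done
  SCRIPT
end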
Then for Docker. In the previous post I settled on using the docker-compose plugin for Vagrant, creating a huge compose file inside the Vagrantfile that duplicated a lot of the settings already defined elsewhere. That works okay for 2 machines, but much less so for 6, so I decided to move away from it and solve the problem differently.
The underlying problem is that Docker only allows static IP addresses on user-defined networks, and the only safe way of defining such a network was through compose. However, we can cheat a little and just invoke a shell command to create a Docker network in the generic provisioning section of the Vagrantfile:
config.vm.provider "docker" do |dk, override|
  # create a docker client network
  Vagrant::Util::Subprocess.execute('bash', '-c',
    "(docker network list | grep '<your network name>') || \
     docker network create --attachable --driver bridge \
       --gateway <your gateway IP> --subnet <your subnet> <your network name>",
    :notify => [:stdout, :stderr]
  )
end
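After the first vagrant up you can verify from the host that the network really exists; the placeholder below is the same network name as above:

docker network ls
docker network inspect <your network name>   # shows the gateway, subnet and attached containers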
The network-creation command above only runs when the Docker provider is actually used (so Docker is installed); it tests for the presence of the Docker network of your choosing and creates it if it is missing. This greatly simplifies the original approach, where I had to create a container for each host, then drop all containers except the last and perform a compose trick on that last one, which would bring up all the previous containers again. That had the side effect of adding the very last container to the compose settings as well, effectively duplicating that container, but without a static IP or other settings. So instead of using the docker-compose plugin, I end up with code like this in the docker provider section:
m.vm.provider "docker" do |dk|
  dk.name = machinename
  dk.build_dir = "./docker"
  dk.build_args = ["-t", <your build tag>]
  dk.remains_running = true
  dk.has_ssh = true
  dk.create_args = [
    "-d", "-t", "-i",
    "--network", "<your network name>",
    "--ip", "#{machine['ip']}",
    # add host names
    "--add-host", "#{machines['m1']['hostname']}:#{machines['m1']['ip']}",
    "--add-host", "#{machines['m2']['hostname']}:#{machines['m2']['ip']}",
    ....
    # add options to get systemd to run properly
    "-v", "/sys/fs/cgroup:/sys/fs/cgroup:ro",
    "--tmpfs", "/run",
    "--tmpfs", "/tmp:exec" # need exec for vagrant
  ]
end
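As an aside, the repeated --add-host pairs do not have to be spelled out by hand; they can be derived from the machines hash itself. A minimal sketch, not taken from the original Vagrantfile, assuming every entry carries the 'hostname' and 'ip' keys used above:

# build one "--add-host hostname:ip" pair per machine from the machines hash
add_hosts = machines.values.flat_map do |mach|
  ["--add-host", "#{mach['hostname']}:#{mach['ip']}"]
end

dk.create_args = [
  "-d", "-t", "-i",
  "--network", "<your network name>",
  "--ip", "#{machine['ip']}"
] + add_hosts + [
  "-v", "/sys/fs/cgroup:/sys/fs/cgroup:ro",
  "--tmpfs", "/run",
  "--tmpfs", "/tmp:exec"
]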
This provider block is much cleaner and leaner than the compose-based approach. Note the single Dockerfile instead of the multiple versions in my previous post. Also note that these containers no longer run SSHD as the first process, because my client required systemd for service management. Although that is possible, it is strongly advised against: Docker is meant for single-process, shoot-and-forget services, not as a VMware-light replacement. But when did that ever stop anyone. However, there are concerns:
- systemd requires access to /sys/fs/cgroup of the host, plus some other magic, which makes it unusable out-of-the-box on debian:stretch
- a lot of sites suggest running the container in privileged mode, but that fails due to an AppArmor issue: in privileged mode, binaries inside the container try to use libraries on the host, and the AppArmor configuration of the host blocks that, because the container binary is not recognised as the correct one. This is only an issue if the host has the same service installed as the container, MySQL for example.
- some sites suggest adding only --cap-add=sys_admin; that failed for me, it just did not work
- we could run something like supervisord instead, but that would require a lot of changes throughout my client's Ansible code
- a few setups remove a lot of .wants files from the systemd installation; that cuts down on the number of enabled services, but it is not required to run systemd in a container
After looking high and low, I finally came to the realisation that it is very easy to run systemd in a Docker container, provided you use the systemd version from the stretch-backports repository. I bent my head back and forth over why my setup did not work, until I noticed dramaturg using the testing repository for systemd. But backports works just as well. My Dockerfile looks like:
FROM debian:stretch
MAINTAINER <your contact info>

# add the backports repository so we get the right version of systemd
ADD sources.list /etc/apt/sources.list

# clean out, update and install systemd from backports
RUN apt-get -y update && apt-get -y -t stretch-backports install systemd

VOLUME [ "/sys/fs/cgroup", "/run", "/run/lock", "/tmp" ]
CMD ["/lib/systemd/systemd"]
The sources.list file referenced is a standard version:
deb http://http.debian.net/debian stretch main
deb http://http.debian.net/debian stretch-updates main
deb http://http.debian.net/debian stretch-backports main
deb http://security.debian.org/ stretch/updates main
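Before handing the image to Vagrant, you can check that it really boots systemd by building and running it by hand with the same cgroup and tmpfs options used in create_args above; the container name here is just an example:

docker build -t <your build tag> ./docker
docker run -d --name systemd-test \
  -v /sys/fs/cgroup:/sys/fs/cgroup:ro \
  --tmpfs /run --tmpfs /tmp:exec \
  <your build tag>
# should report "running" (or "degraded") once systemd has finished booting
docker exec systemd-test systemctl is-system-running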
Now you can install services inside the container and have systemd start them at boot, SSHD for example:
RUN apt-get update
RUN apt-get install -y openssh-server python2.7 python3 python sudo
RUN systemctl enable ssh.service
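Once vagrant up has done its work, a quick check from the host shows whether sshd is indeed running under systemd inside a container; the container name is whatever dk.name was set to:

docker exec <container name> systemctl status ssh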