Faster, more secure, more maintainable. Three nice benefits we get from our new standard Drupal server architecture.
This year we're replacing our old "traditional" LAMP stack with an entirely less pronounceable LNDMPS version. We still use Linux, MariaDB and PHP, of course, but instead of Apache we've moved to Nginx, and we've added Docker and Salt.
If you're a technical person and haven't heard of Docker, you must have been offline for a couple of years. Docker is a system for managing Linux containers. You might think of Linux containers as a form of "cheap virtualization." However, the way the Docker community has come to use them is more like a chroot jail -- a way of isolating a single process in a container that protects the rest of the system if that process gets compromised, rather than a full operating system with multiple processes. So the best practice is to run a separate container for each necessary service, not a single container with the whole set.
Host layout, container linking
So far we have put PHP, MariaDB, Apache Solr, and a bunch of other supporting services into containers. On our production servers, we have kept Nginx, Postfix, and DNSMasq outside the containers and installed directly on the host.
When using Docker, each container has a different IP address bridged on the host, and can expose different TCP ports, or sockets in the filesystem, to allow connections. We configure DNSMasq on the host so that you can reach any container on the host by its name, and set each container to use the host's DNS. Using this pattern, we can link all the containers together by a simple hostname, without having to link specific containers together.
There are a couple gotchas with this approach:
- There is a lag before the host DNSMasq picks up new container addresses -- we currently have an update script that polls every 10 seconds for changes, and Nginx needs a reload to see a change in a container IP address.
- We ran into problems having a wildcard DNS entry set up (*.freelock.com) when the hostname was part of that domain -- most requests for a hostname that was not fully qualified resolved to the wildcard DNS entry instead of the local DNSMasq entry. We didn't end up solving this -- we just removed the wildcard DNS entry and then everything worked fine.
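To make the polling gotcha concrete, here is a minimal sketch of the bookkeeping such an update script might do on each pass. It is not our actual script: the function names are made up for illustration, and it assumes the name-to-IP mapping has already been collected (for example with "docker inspect") and that the rendered text goes into a hosts file DNSMasq reads, such as one loaded through its addn-hosts option.

```python
def render_hosts(containers):
    """Render hosts-file text: one 'IP name' line per container, sorted by name."""
    return "".join(f"{ip} {name}\n" for name, ip in sorted(containers.items()))


def changed_names(previous, current):
    """Names that were added, removed, or moved to a new IP since the last poll."""
    names = set(previous) | set(current)
    return sorted(n for n in names if previous.get(n) != current.get(n))


# Hypothetical container names and bridge addresses, for illustration only.
last_poll = {"php56": "172.17.0.2", "solr": "172.17.0.3"}
this_poll = {"php56": "172.17.0.5", "solr": "172.17.0.3", "mariadb": "172.17.0.4"}

if changed_names(last_poll, this_poll):
    # A real script would write this to disk, signal DNSMasq, and reload Nginx.
    print(render_hosts(this_poll), end="")
```

Only rewriting the file and reloading Nginx when something actually changed keeps a 10-second poll cheap.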
We also run a local Postfix instance on the host that relays mail to our mail server. Inside each container that might need to send email, we've added/configured SSMTP and pointed it at the host.
Also on the host: all data, databases, deployment tools (git, drush, composer, etc), cron jobs (using drush for Drupal cron).
Our philosophy on production Docker use is that containers are ephemeral -- they get destroyed and recreated all the time, and so there should not be anything in the container we care about losing.
Docker is still a pretty young technology, and it clearly opens up some new avenues of attack, which need to be carefully considered. For example, adding somebody to the "docker" group is quite effectively giving them full root access to the entire host. It is early enough days that I'm not entirely confident somebody gaining a shell on the host couldn't somehow attack the Docker process itself to gain root access and modify whatever they want on the host.
That said, I'm reasonably confident that Linux containers do an effective job of isolating processes running inside them to just what's visible to those processes. And the vast majority of attacked websites we see (5 so far this year) plant a malicious script in an executable environment and run it. So the first place to secure is the executable environment -- PHP itself.
In our former LAMP setup, we would use filesystem permissions to prevent the Apache web user from being able to write to the filesystem, anywhere other than the directories with assets Drupal manages on the disk (images, videos, aggregated CSS/JS, etc) and prevent execution in the directories where Drupal can write.
With Docker, we take this one step further: we mount the web root into the Docker container as a read-only volume -- so even if an attacker somehow gained root through the PHP interpreter itself, the attacker still cannot plant their executable code into the site.
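As a rough illustration of the read-only mount, the sketch below assembles a "docker run" command line with the site code mounted ":ro" and the assets directory mounted ":rw". The helper name, container name, paths, and image tag are all hypothetical; only the docker flags themselves (-d, --name, -e, -v) are real.

```python
def docker_run_args(name, image, site_root, assets_dir, env=None):
    """Build an argument list for `docker run` with code read-only, assets writable."""
    args = ["docker", "run", "-d", "--name", name]
    for key, value in sorted((env or {}).items()):
        args += ["-e", f"{key}={value}"]
    # Site code: PHP can read and execute it, but nothing in the container can modify it.
    args += ["-v", f"{site_root}:{site_root}:ro"]
    # Drupal-managed files (uploads, aggregated CSS/JS) stay writable.
    args += ["-v", f"{assets_dir}:{assets_dir}:rw"]
    args.append(image)
    return args


# Hypothetical values, for illustration only.
print(" ".join(docker_run_args(
    "php56",
    "registry.example.com/php:5.6",
    "/var/www/example",
    "/var/www/example/sites/default/files",
    env={"PHP_MEMORY_LIMIT": "256M"},
)))
```

Execution in the writable assets directory still has to be blocked separately, as in the old LAMP setup.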
A couple other big wins here: we support a few "legacy" installations of older Drupal and some non-Drupal sites -- we're now able to contain and isolate those from other sites on the same server, so a compromise on one site cannot hop over and infect another site. And we're able to easily run different versions of PHP on the same server.
Docker gives us no performance improvements whatsoever, but it doesn't penalize us either.
Our new architecture, running Nginx and PHP-FPM, is providing very noticeable performance improvements. Sites that formerly took 3 seconds to load now take 0.3 seconds for the HTML, and by moving away from mod_php, the webserver can handle many more simultaneous connections for downloading assets.
The speed improvements are especially noticeable for admin users working on their sites.
I keep hearing people rave about how much easier it is to manage upgrades with Docker. And I pretty much completely disagree. In a production environment, if you want to upgrade a container for a new PHP release, for example, it's quite a bit more complex than a simple "apt-get dist-upgrade." You have to:
- Build a new Docker image.
- Stop the old docker container.
- Remove the old container.
- Create and run a new replacement container based on the new image.
- Fix all the links with the other parts of the system.
This is not simple.
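Spelled out as commands, the manual steps above look roughly like the sketch below. The function name and arguments are hypothetical, and the last step in the list (fixing the links to the rest of the system) has no single command, so it only appears as a comment.

```python
def upgrade_commands(name, image, build_dir, run_args):
    """Return the docker commands, in order, for replacing a container by hand."""
    return [
        ["docker", "build", "-t", image, build_dir],   # build a new image
        ["docker", "stop", name],                      # stop the old container
        ["docker", "rm", name],                        # remove the old container
        # create and run the replacement from the new image
        ["docker", "run", "-d", "--name", name] + list(run_args) + [image],
        # ...then fix DNS entries, Nginx upstreams, and anything else that
        # pointed at the old container's IP address.
    ]


# Hypothetical container, image, and build directory, for illustration only.
for command in upgrade_commands("php56", "registry.example.com/php:5.6", "./php",
                                ["-e", "PHP_MEMORY_LIMIT=256M"]):
    print(" ".join(command))
```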
Fortunately, Salt can manage this entire process!
Salt is a configuration management tool, very similar in purpose to Puppet, Chef, Ansible, CFEngine, and others. I think people who think Docker is a useful deployment/configuration tool don't have experience with one of these far more powerful configuration management tools.
These days Ansible seems to be getting a lot more attention than Salt -- my impression is Ansible is easier to learn, but Salt is more powerful. (That sounds familiar!) We've built up some Salt states to manage both containers and sites in containers.
So to upgrade a container, our process now looks like:
- Build a new docker image.
- Push into our private Docker registry.
- Run salt state.highstate and let Salt do the rest of the work!
Salt pulls the new image down to each host, and if it detects that the image a container is based upon has changed, it stops/removes the container and starts a new one in its place. Then the "update-dns" script we deployed earlier (also via salt) detects the new IP address for that container's name, and reloads Nginx.
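The decision Salt makes here can be pictured with a small sketch. This is not how Salt's Docker states are implemented; it only illustrates the comparison described above, with a hypothetical function name and placeholder image IDs.

```python
def plan_container(running_image_id, latest_image_id):
    """Decide which actions bring a container in line with the latest pulled image.

    running_image_id is the image the existing container was created from,
    or None if the container does not exist yet.
    """
    if running_image_id is None:
        return ["create"]
    if running_image_id != latest_image_id:
        return ["stop", "remove", "create"]
    return []  # already based on the latest image: nothing to do


# Placeholder image IDs, for illustration only.
print(plan_container("sha256:aaa", "sha256:bbb"))
```

Because an unchanged image produces an empty plan, running highstate repeatedly does not restart containers needlessly.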
This process has not been flawless, and so at this point we're running the "highstate" command that applies these updates manually, so we can address any issues that might arise -- so far we've had two failures, both of which I chalk up to the relative immaturity of Docker:
- Docker container filesystem type -- this is configured when Docker is installed. On an Ubuntu host, it sounds like the current recommendation is the older AUFS filesystem; on other systems, Devicemapper seems to be the current standard. Our original systems ran AUFS, and AUFS upgrades have gone smoothly. However, the systems we deployed with Salt ended up using Devicemapper, and that broke multiple times when Docker itself came out with a new release, taking all containers down with it. We eventually switched all our hosts to AUFS and haven't seen further issues.
- docker-py unable to pull from a V2 registry, errors on container creation -- You can (and should) run your own Docker registry to store Docker images. This allows you to build an image once and distribute it across your infrastructure. We only started using Docker a few months ago, so we never bothered to deploy a version 1 registry and went straight to V2. However, the Python docker-py library, which Salt and Ansible use to manage Docker, has lagged behind the Docker API, and after various upgrades it has suddenly stopped working, sometimes at the worst possible time.
We're nearly ready to turn the automatic "highstate" back on, but we want to go through a couple more upgrade cycles to make sure this goes smoothly first. As Docker and the tools mature, I'm sure these issues will be far less frequent.
The biggest improvement
Docker containers are cool. Docker is fun to work with. It's great to be able to quickly roll back to an earlier version of a container. Particularly for some of the one-off servers we support, Docker makes us feel far more confident in the environment and in being able to quickly roll back the entire server to an older version.
But the biggest improvement we're seeing is in the process of creating the docker containers in the first place: the Dockerfile and the startup scripts. None of this is something that we couldn't do before -- it's just that we didn't do it before, we didn't map out the steps of creating our production environment. We had a bunch of ad-hoc scripts and miscellaneous Salt states to assist with adding sites to a server, tuning MariaDB, etc. But this was all jumbled, messy, and hard to maintain.
Now we're getting a much cleaner and well-defined separation of a standard environment build, and the run-time configuration. And that starts with the Dockerfile.
You don't have to use a Dockerfile to create a docker image -- you can just start a container based on somebody else's image, make some changes, and "commit" it to have an image. But then you end up with an image that's hard to replicate if something upstream changes.
I've read about people using their configuration management system to update their containers, but this is backwards -- you need to install more software inside your container to make it capable of being managed!
The Dockerfile is a simple recipe for creating an image, and because "docker build" is baked into Docker itself, it's incredibly easy to use.
We see the Dockerfile not just as the recipe for creating a Docker image, but also a self-documenting map of the configuration itself. You do need to think about what variables need to change in different container instances -- for example, our PHP images can be tweaked by passing in different variables for max ram, max clients, and max execution time, and we declare these variables in the Dockerfile so it's easy to see what parameters you can pass at runtime, even though they're not necessary to build the image.
Even though we typically run Ubuntu servers, most of our Docker images end up based on Debian Jessie or other derivative images.
So we have a git repository of Dockerfiles for all our images. When doing an update, we "docker build" the new image and push it into our registry. This means we've been able to move a lot of the build information out of Salt into Docker.
Runtime configurations in Salt, Startup scripts
Inside a Docker container, you generally don't run any kind of init system -- you just run the service itself, in the foreground. So most images need a way to set up the environment before starting the service, and we end up using a hand-built bash script for this. Most often, this script simply replaces values in configuration files with the values of environment variables provided to the container when it's started, and then starts the service.
You do need to consider that this script will get called whenever the container is started -- both the very first time it's launched, and also if the host gets rebooted or the container stopped/started for any other reason. Other than that, the startup scripts are very straightforward, because you don't need to consider shutdowns, status checks, or any of the other things you typically need in an init script.
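Our startup scripts are bash, but the substitution step they perform can be sketched in a few lines of Python. The variable names, default values, and template below are hypothetical stand-ins for whatever an image declares; the directives are ordinary PHP-FPM pool settings. Rendering from a pristine template each time, rather than editing the live file in place, is what makes it safe to run on every container start.

```python
from string import Template

# Hypothetical runtime variables and defaults, mirroring what a Dockerfile might declare.
DEFAULTS = {
    "PHP_MAX_CHILDREN": "10",
    "PHP_MEMORY_LIMIT": "128M",
    "PHP_MAX_EXECUTION_TIME": "30",
}

POOL_TEMPLATE = Template(
    "pm.max_children = ${PHP_MAX_CHILDREN}\n"
    "php_admin_value[memory_limit] = ${PHP_MEMORY_LIMIT}\n"
    "php_admin_value[max_execution_time] = ${PHP_MAX_EXECUTION_TIME}\n"
)


def render_config(environ):
    """Fill the template from the environment, falling back to the defaults."""
    values = dict(DEFAULTS)
    values.update({k: v for k, v in environ.items() if k in DEFAULTS})
    return POOL_TEMPLATE.substitute(values)


# In a real startup script this would be os.environ, followed by exec'ing the service.
print(render_config({"PHP_MEMORY_LIMIT": "256M"}), end="")
```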
The catch is, you can't change any of these environment variables after the container is created. So that means if you do need to change them, you need to create a new container and replace the old.
If you have a dozen variables to set at container creation, you need to keep track of those somewhere. We started by just putting them in our project management system, and cutting and pasting the startup line. Docker is developing "docker compose" for this purpose, to let you store these variables in a config file for easy startup and to orchestrate multiple containers.
But we've found Salt able to handle that task very, very well. We've built out a set of salt "states" that automatically provision the necessary containers, in a very elegant way, with the run-time values stored in salt "pillar".
Because we're primarily managing Drupal sites, we've set up pillar data to make it easy to focus on one layer at a time:
- sites/sitename.sls -- contains information about a particular site: drush alias, git alias, public URLs, database credentials, site root path, assets path
- server/servername.sls -- defines which containers to create on a particular host, based on which images, along with any changes to the runtime defaults and a list of sites to mount into that container. It also includes each site state file to provision on that server, and designates which container it runs in.
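To show how the two layers fit together, here is a sketch that models the pillar data as plain Python dictionaries and derives the volume mounts for one container. Real pillar files are YAML, and every key name, path, and image tag below is a hypothetical stand-in rather than our actual schema.

```python
# Stand-in for sites/sitename.sls: one entry per site.
SITES = {
    "example": {
        "urls": ["www.example.com"],
        "root": "/var/www/example",
        "assets": "/var/www/example/sites/default/files",
    },
}

# Stand-in for server/servername.sls: containers on one host and the sites in each.
SERVER = {
    "containers": {
        "php56": {
            "image": "registry.example.com/php:5.6",
            "env": {"PHP_MEMORY_LIMIT": "256M"},
            "sites": ["example"],
        },
    },
}


def container_volumes(server, sites, container):
    """List (path, mode) mounts for a container: code read-only, assets read-write."""
    volumes = []
    for site_name in server["containers"][container]["sites"]:
        site = sites[site_name]
        volumes.append((site["root"], "ro"))
        volumes.append((site["assets"], "rw"))
    return volumes


print(container_volumes(SERVER, SITES, "php56"))
```

Keeping site facts separate from server facts means moving a site to another host only touches the server layer.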
Once these pillars have been populated with appropriate data, Salt now ensures that a whole bunch of configuration is done on the server:
- Latest site code is checked out of git
- Permissions are set correctly for the site code and assets
- Containers are running with the latest images
- Site code is mounted as a read-only volume, and assets mounted as read-write inside the appropriate container
- Nginx has a site configuration for each site, pointing to the appropriate PHP container
- The Drupal settings file is written with appropriate database credentials and any other variables specific to the production environment
... and there are a couple of easy next steps we haven't quite gotten to: scheduling the Drupal cron job on the host, and setting up the user account inside the database.
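As one example of what gets generated, the sketch below renders a bare-bones Nginx server block that hands PHP requests to a container by name. It is far from a complete Drupal configuration, the function name is made up, and port 9000 is an assumption (it is PHP-FPM's usual default, not something stated above). The container name resolves through the host's DNSMasq setup described earlier.

```python
def render_nginx_site(urls, root, php_container, port=9000):
    """Render a minimal server block passing *.php requests to a PHP-FPM container."""
    lines = [
        "server {",
        f"    server_name {' '.join(urls)};",
        f"    root {root};",
        "    location ~ \\.php$ {",
        "        include fastcgi_params;",
        "        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;",
        # The container is addressed by hostname, not by IP address.
        f"        fastcgi_pass {php_container}:{port};",
        "    }",
        "}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical site values, for illustration only.
print(render_nginx_site(["www.example.com", "example.com"], "/var/www/example", "php56"), end="")
```

Nginx resolves that hostname when it loads its configuration, which is why a container IP change requires a reload.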
Other than that, completing a move of a site into this architecture consists of importing the production database, copying over the site assets from wherever they are, running "highstate" again to fix the permissions, and updating DNS to point to the new server!
From a disaster recovery standpoint, this is huge. We used to spend a couple of hours dialing in the environment on a new server, working from checklists and error messages as we built it up. Now we simply copy the container configuration to the pillar for a new server, run highstate, import a backup of the site database and assets, and we're off and running.
This is a bit of a long and rambling post, but I hope it illuminates the big picture of how we use all these different systems to deliver what's really becoming a great result -- a fast, secure, replicable environment for running Drupal sites. We're still new to Docker, but happy to share our experiences further, and if you have any suggestions, questions, or obvious things we're missing, please comment below!