Our branch strategy based on Git Flow did not survive. It was getting a bit long in the tooth, but the final blow was automation.
At Freelock, we've been hard at work building out automation so we can handle the maintenance on hundreds of websites with better test coverage and more confidence than ever before. Exciting news! It's all coming together, and we have it working across the board on ALL of our projects, now.
However, we've had to change our processes to support it. The new processes really simplify what we need to do to roll out a tested release, so our partners and customers who work in our git repositories need to understand how things work now.
We still use git for deployment. Because of the nature of most of the Drupal sites we manage, we test against various copies of the production database. That means the normal "continuous integration" (CI) and "continuous deployment" (CD) practice of building up a clean site just isn't very relevant: the stuff that breaks is all related to things in the production database, so we have to test against copies of that.
Some of the tests we run use Selenium, PhantomJS, and direct access to resources like the various databases we have internally.
Based on these needs, we've decided to have two copies of every site running in our environment, mainly for testing purposes, in addition to the production site:
- dev -- the development site is where all new development happens, and most of the Behavior Driven Development (BDD) tests are run against this copy. Sometimes it gets stale -- generally we update this database from the stage copy before starting a chunk of work.
- stage -- the stage site is where we test the actual deployment process, and run screenshot comparisons ("Visual Regression Testing"). It gets a fresh copy of the production database after every release, so its content should never be that far behind. The database gets sanitized to prevent accidental spamming of users and other mishaps.
- prod -- Prod is what we call the main production site, and is compared to stage on each release candidate.
Site Release State and Versions
We've set up our continuous deployment so that every site always has a particular release state. What you can do depends upon the current release state of the site:
- Released/Clean -- All code has been released to production. No changes currently exist on the dev site that have not been deployed.
- Dev -- There are changes on the dev copy of the site that have not been added to a release. This is the state when there is work currently being done on the site.
- Stage -- At least one "Release Candidate" has been created, and is being tested or is waiting to get deployed to production.
The CI system maintains a specific "version" for each site, and this also varies based on the release state:
- Released: "final" version, for example: 7.1.5
- Clean or Dev: "beta" version -- 7.1.6-beta.1
- Stage: "rc" version -- 7.1.6-rc.2
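As a rough illustration, the mapping from release state to version string could be sketched as below. This is not the actual CI code: the function name, the lowercase state labels, and the counter argument are all hypothetical, and the sketch assumes the "rc" suffix is joined with a hyphen in the same way as the "beta" suffix.

```python
def format_version(base: str, state: str, counter: int = 1) -> str:
    """Build a version string for a site from its release state.

    base    -- the plain version number, e.g. "7.1.6"
    state   -- "released", "clean", "dev" or "stage" (hypothetical labels)
    counter -- sequence number used for beta and rc versions
    """
    if state == "released":
        # A released site carries the "final" version with no suffix.
        return base
    if state in ("clean", "dev"):
        return f"{base}-beta.{counter}"
    if state == "stage":
        return f"{base}-rc.{counter}"
    raise ValueError(f"unknown release state: {state!r}")


if __name__ == "__main__":
    for state, counter in (("released", 1), ("dev", 1), ("stage", 2)):
        print(state, format_version("7.1.6", state, counter))
```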
On Drupal 6 and 7, the CI system writes the RC and Final versions to the settings.php file, in a $conf setting ("site_version"). This makes it easy to see what's actually running on various sites using "drush vget site_version". We are still working out how best to do this in Drupal 8 -- Drupal 8 does not care much for config objects that are not owned by a module...
We keep a branch that corresponds to each site copy. This makes it possible to do emergency releases that bypass current development if necessary, among other benefits.
- feature/xxx -- These are arbitrary branches that the CI system entirely ignores. When working on something that is going to take more than a day to complete, the work should happen on a feature branch. If the Dev site is on a feature branch, it will block tests. Local development should be done on a feature branch, and merged into "develop" when ready to test and merge with other development.
- develop -- This is the "home" branch for the Dev site. Whenever anything is pushed to develop, if the Dev site is otherwise clean the CI system will check out the develop branch on the Dev site, and merge in all changes from master and release branches, and kick off the Behat (BDD) tests on the Dev site. However, if the release state is currently "stage", or if the Dev site is on a feature branch, this won't happen -- the intent is to make us finish what we're doing before kicking off the next step.
- release -- This corresponds to the Stage site. A push to the release branch will change the release state to "Stage", deploy everything to the Stage site, and kick off the visual regression tests. Generally if fixes need to happen, we check out the release branch on the Dev site, make the appropriate changes, and push them back in for a new deployment to stage.
- master -- This corresponds to the Prod site. We restrict access to push here to designated release managers and the CI system.
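The branch rules above boil down to a small decision table. The sketch below is only a model of that table, assuming hypothetical action names ("ignore", "skip", "test-dev", "deploy-stage", "restricted") and a simplified view of the site state; it leaves out the check that the Dev site is otherwise clean.

```python
def on_push(branch: str, release_state: str, dev_site_branch: str) -> str:
    """Return a (hypothetical) action name for a push to the given branch.

    release_state   -- "clean", "dev" or "stage"
    dev_site_branch -- the branch currently checked out on the Dev site
    """
    if branch.startswith("feature/"):
        # Feature branches are entirely ignored by the CI system.
        return "ignore"
    if branch == "develop":
        # Tests are blocked while a release candidate is in progress, or
        # while the Dev site is sitting on a feature branch.
        if release_state == "stage" or dev_site_branch.startswith("feature/"):
            return "skip"
        # Check out develop on Dev, merge master and release, run Behat.
        return "test-dev"
    if branch == "release":
        # Set state to "stage", deploy to Stage, run visual regression tests.
        return "deploy-stage"
    if branch == "master":
        # Only release managers and the CI system may push here.
        return "restricted"
    return "ignore"
```

For example, `on_push("develop", "stage", "develop")` gives `"skip"`, which matches the intent of finishing the current release before kicking off the next round of tests.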
The basic workflow we use is:
feature/xxx -> develop -> release -> master (prod) -> new development
We have a bot in a chatroom that facilitates moving stuff through this process, and triggering deployments and tests on demand.
When we're building out something major, work is done either on a local copy or on the Dev copy of the site. If the Dev copy is on a feature branch, it blocks all tests -- the assumption being we don't expect them to pass here.
When we're deploying updates, or merging in completed features, we set the Dev site to the develop branch, and/or push the develop branch (and have our bots set the dev site for us). This kicks off the Behat tests so we can see if anything breaks.
Site-specific behat tests are stored in tests/ under the root of the site. If there's no behat.yml file present, it runs a set of base tests we use across the board.
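A minimal sketch of that fallback, assuming the site's behat.yml sits inside tests/ and that the shared base tests have their own config file somewhere else (the function name and the `base_config` argument are hypothetical):

```python
from pathlib import Path


def behat_config(site_root: Path, base_config: Path) -> Path:
    """Pick the Behat config to run for a site.

    Uses the site's own tests/behat.yml when it exists, and otherwise
    falls back to the shared base config.
    """
    site_config = site_root / "tests" / "behat.yml"
    return site_config if site_config.is_file() else base_config
```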
When the behat tests have passed to our satisfaction, and/or we're ready to roll stuff out, we kick off a release. Currently we do this by switching the Dev site to the release branch, merging in the develop branch, and pushing the release. We are thinking about having our bot do this for us on demand.
Whenever code is pushed to the release branch, it gets deployed on the stage copy. All features get reverted (in D6 and D7 sites) and configuration applied (D8). For Drupal 8 sites, we also set the "site:mode" to production, turning on all caching functionality and disabling development/debugging settings.
After it's all deployed, our visual regression testing framework kicks in, using Wraith to compare Stage to Prod. The list of paths it compares is stored either in a .paths.yaml or a tests/paths.yaml file in the site, where we can easily update URLs to test.
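Locating the path list could look something like the following. The two file names come from the description above; the function name and the order of precedence (checking .paths.yaml first) are assumptions made for the sketch.

```python
from pathlib import Path
from typing import Optional


def find_paths_file(site_root: Path) -> Optional[Path]:
    """Return the first paths file found in the site, or None if neither exists."""
    candidates = (
        site_root / ".paths.yaml",
        site_root / "tests" / "paths.yaml",
    )
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None
```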
Launching to production, for most of our sites, is easier than ever! A simple command to our bot will trigger the deployment on most of our sites. First the release code gets merged into master, and the final version number is applied and a matching git tag is created. Then the CI system looks up the deployment policy in ci/production, which can indicate whether to put the site in maintenance mode before deploying, whether this site needs manual deployment (e.g. on Acquia or Pantheon), if it should skip applying configuration, and other exceptions.
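To make the role of the deployment policy concrete, here is a sketch of how policy flags could shape the release steps. The real format of ci/production is not shown here, so the `DeployPolicy` fields, the `plan_release` function, and the step descriptions are all hypothetical stand-ins for the behavior described above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DeployPolicy:
    """Hypothetical model of the per-site deployment policy."""
    maintenance_mode: bool = False  # put the site in maintenance mode first
    manual_deploy: bool = False     # e.g. hosted on Acquia or Pantheon
    skip_config: bool = False       # do not apply configuration


def plan_release(policy: DeployPolicy, version: str) -> List[str]:
    """List the release steps, in order, for a given policy."""
    steps = [
        "merge release into master",
        f"set version {version}",
        f"tag {version}",
    ]
    if policy.manual_deploy:
        # The bot stops here and reports that a person has to finish the job.
        steps.append("report manual deployment needed")
        return steps
    if policy.maintenance_mode:
        steps.append("enable maintenance mode")
    steps.append("deploy master to prod")
    if not policy.skip_config:
        steps.append("apply configuration")
    if policy.maintenance_mode:
        steps.append("disable maintenance mode")
    return steps
```

Keeping the exceptions in a per-site policy means the chat command stays the same for every site, and only the resulting plan differs.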
If it all goes cleanly, our bot reports the result back in the chat room. If any manual steps need to be done, it reports that.
On most of our sites, on most of our releases, we can now deploy with a single command in a chat room!
After a release is deployed, there's a cleanup phase to reset for new development. Currently this is manual, but our plan is to have the cleanup/prep for the next round of development wait until the next production database backup has run, to lessen the impact on the production server.
To clean up, this is what happens:
- Version "patch" level gets bumped and set to "beta" - e.g. 7.1.4 gets set to 7.1.5-beta.1
- New copy of the production database gets sanitized/imported on Stage
- Dev copy gets reset to develop branch, all commits from release and master merged in, and for Drupal 8, site:mode gets set back up to the "develop" settings
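The version bump in the first cleanup step can be sketched as below. The function name is hypothetical, and the sketch assumes a released version is always three dot-separated integers.

```python
import re


def next_beta(final_version: str) -> str:
    """Bump the patch level of a final version and mark it as the first beta."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)", final_version)
    if match is None:
        raise ValueError(f"not a final version: {final_version!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return f"{major}.{minor}.{patch + 1}-beta.1"


if __name__ == "__main__":
    print(next_beta("7.1.4"))  # 7.1.5-beta.1
```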
This all certainly has gotten complex... and yet we're finding it's making our results far more reliable and consistent, and as a result, we're now able to focus on the quality of what we're doing. Now we're able to drop new Behat tests into place and have them run every time. Now we have systems to let us know when releases sit undeployed, allowing us to handle more sites, with more consistency than we've ever had before.
How do you manage your CI/CD projects? Anything obvious we've missed, or insight you've gained? Please comment below! And let us know if we can help you manage your Drupal sites...
Thanks for your comment!
Our "release" branch is volatile, as you describe -- something we are free to force-push or delete/recreate at any point, as is the stage site we have it associated with.
Otherwise we do our best to avoid the problem you describe by pushing small features relatively frequently.
I suspect your approach is quite useful when you have many developers working simultaneously on a project with a longish release cycle. At most, we have 2 or 3, and we keep our sprints/releases as short as possible.
One thing missed: pulling non-passing features from a release - especially when a release could have multiple features that have gone through bug fixes that get merged back into the release branch at different times. This is a messy (complex, hard-to-read reverts) problem to begin with when using a static dev-staging-prod deployment pipeline. It is important to be able to pull features from a release so these features do not become blockers and delay features that are already finished.
Two things that help alleviate this messiness: