Yesterday Drupal.org got hacked, and potentially all the password hashes on the site fell into malicious hands. According to the security team's announcement, the attack was not a result of a Drupal vulnerability, but of other, as yet undisclosed, software on the server.
Drupal has long had one of the best security track records among open source CMSs. The security team does a great job of tracking down even the smallest exploits, often removing modules whose maintainers choose not to fix them. The vast majority of fixes and security updates we see protect against "privilege escalation" -- vulnerabilities that can only be exploited by users who already have some level of administrative access.
For example, a webform update yesterday closed a hole that allowed somebody who already had permission to create or edit a webform to gain full administrative access. We use webforms on a huge number of sites, but we have never set up a configuration that gives an untrusted user the power to create or edit webforms. And yet on a large, community-driven site, you might want to give some people the ability to create a survey without any further access. This kind of strict, detailed review leads to a project with a high level of security baked in. It's very rare that we see the more dangerous kinds of exploits -- SQL injection, cross-site scripting (XSS), or remote code execution.
This incident highlights that there is more to security than just the software. In this case, something else in the hosting environment provided a weakness that allowed an attacker to break in. What was it? So far they haven't said, but we can speculate on some possibilities:
The really common mistakes:
* A database access tool like phpMyAdmin, left improperly secured or running a vulnerable version
* A control panel with vulnerabilities
* A WYSIWYG editor with file upload capabilities not managed by Drupal
* A compromised user account on the server, or a weak password protecting a server login
* Other software installed for evaluation purposes and not properly managed
Any of the above can make it pretty easy for an attacker to gain access to a server. Some recently discovered privilege escalation vulnerabilities in the Linux kernel itself can then give the attacker a way to gain control of the entire box, and from there, you're done -- you've been 0wned.
Less likely to be the source of Drupal's problem, but other common attack vectors:
* FTP in use
* Another user on the shared system
* Vulnerable software stored on another account on the same server
Drupal is big enough that it runs on not just one but several dedicated servers. But we see people running CMSs on shared servers all the time, using FTP to upload files. This is horribly insecure in the first place, and you're asking for trouble if you use a shared hosting account for anything important to your organization -- it's far too easy for far too many people to gain access to your site.
Some scarier, much more dangerous vectors:
* A configuration management tool like Puppet, Chef, CFEngine, or Salt
* An attack on SSH itself
* A compromised web server -- there are a few server exploits active in the wild, such as Darkleech and Cdorked, where security researchers still have not fully figured out how the attackers broke in.
If even security researchers can't figure out how an attacker broke in, it indicates we've got a pretty large problem out there.
What can you do about it?
First off, if you had an account at Drupal.org, or any of the many sites that have been compromised recently, and you use the same password on other sites, stop doing that. While what leaked from Drupal.org were "hashed" passwords rather than your actual passwords, it's pretty easy for an attacker to crack your password from the hash, often even if you thought you were using good techniques to generate it.
The best practice is either to use a very long passphrase (5-7 fully random words), or to use a password generator to create a long (10 or more characters) password of complete gibberish with no patterns, and then store it in a secure password manager program like KeePass.
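As an illustration of both approaches, here's a minimal Python sketch using the standard library's `secrets` module. The word list and lengths are just examples for illustration, not a recommendation of specific values:

```python
import secrets
import string

def random_passphrase(wordlist, n_words=6):
    """Join several words chosen uniformly at random -- strength comes
    from length and true randomness, not clever substitutions."""
    return " ".join(secrets.choice(wordlist) for _ in range(n_words))

def random_gibberish(length=16):
    """A fully random mix of letters, digits, and punctuation, meant to
    live in a password manager rather than in your memory."""
    alphabet = string.ascii_letters + string.digits + string.punctuation
    return "".join(secrets.choice(alphabet) for _ in range(length))
```

The point of `secrets` over `random` is that it draws from the operating system's cryptographically secure source, so the output is not predictable from earlier output.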
Ars Technica just published a great article about how easy it is to crack passwords from their hashes -- the untrained, inexperienced author was able to crack nearly half of the passwords in a database in just a few hours, while an experienced security expert recovered more than 90% of them in less than a day.
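To see why a leaked hash offers so little protection, here's a toy Python sketch of a dictionary attack against an unsalted MD5 hash. The word list and "leaked" hash are invented for illustration; real crackers use multi-million-entry dictionaries, mutation rules, and GPUs:

```python
import hashlib

# A tiny, hypothetical word list standing in for the huge dictionaries
# real cracking tools use.
WORDLIST = ["password", "letmein", "dragon", "monkey", "sunshine"]

def crack(target_hash, candidates):
    """Hash each candidate and compare; return the first match, if any."""
    for word in candidates:
        if hashlib.md5(word.encode()).hexdigest() == target_hash:
            return word
    return None

# An unsalted MD5 hash "leaked" from a made-up breach: md5("monkey").
leaked = hashlib.md5(b"monkey").hexdigest()
print(crack(leaked, WORDLIST))  # prints "monkey"
```

Because hashing a guess is so cheap, the attacker's cost scales with the size of the dictionary, not with anything the site operator controls -- which is why a long, truly random password is the only reliable defense.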
Use a different password for each site, and use a secure password manager to store your passwords.
If your site is important to your business, don't put it on a shared account at a mass host. We do provide shared hosting on our servers, but we don't allow anybody other than Freelock staff to access the server, we don't run any control panels, and we pay close attention to the security on our servers.
We've also used configuration management to create a standard configuration, make sure our tools get deployed everywhere, and have the ability to add/revoke user accounts quickly and easily. If you don't have a strong system administration partner to make sure these things are being done, we'd be happy to manage your web server for you using our configuration tools.
How do you keep a server secure, when there are attacks that even security experts haven't figured out? The answer is, you can't.
How do you deal with this? We think there are two critical things you can do:
1. Do your best to detect a problem.
2. Make sure you have good backups.
To detect a problem, there are several different tools and approaches available. Without disclosing our actual practices, here are some techniques we may or may not use:
* Configuration Management. By putting server configurations in a configuration management tool like Puppet or Chef, you can easily see if they have been modified, and replace any modifications with a single command. We use Salt, and it's extremely powerful for this purpose. In some ways, though, this approach does open you up to more risk -- if your configuration management master gets compromised, you've given away the keys to the kingdom.
* Code management. Using automated deployment tools like Jenkins, you can make sure production sites are running known good code, and that changes only get pushed up after review. Using a source management tool like git can guarantee that the code running your web site cryptographically matches the known-good version.
* Server binary signatures. Even the attacks that researchers can't track down leave traces, in the form of binary files that have been changed in some way. Software like Tripwire can create a cryptographically signed list of hashes for every program on the server, and alert you if one gets changed.
* Intrusion Detection. Software like Snort can analyze attacks on your network, and can trigger tools to block access to people who are obviously attempting to break in.
* Log analysis. Most attacks on a server leave traces in a log that can help you identify what happened. Going into a server on a regular basis and reviewing the logs for things that look suspicious can reveal an attack. Tools like Splunk can greatly help with this.
* Central logging. One problem with log analysis is that attackers can easily edit out traces of their attack, if they succeed in gaining full access to your server. By using a central logging facility that sends all logs to another server as they are written, you're much more likely to get useful information about how an attacker gained access to your server.
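As a rough illustration of the binary-signature idea above, here's a minimal Python sketch of a Tripwire-style integrity check. This is only the core mechanism -- a real tool would also sign the baseline and store it off the server, so an intruder can't simply rewrite it:

```python
import hashlib
import os

def hash_file(path):
    """SHA-256 digest of one file, read in chunks so large binaries are fine."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def build_baseline(root):
    """Record a digest for every file under root -- the 'known good' state."""
    baseline = {}
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            baseline[path] = hash_file(path)
    return baseline

def find_changes(baseline):
    """Re-hash everything in the baseline and report files that have
    changed or disappeared since it was recorded."""
    suspicious = []
    for path, digest in baseline.items():
        try:
            if hash_file(path) != digest:
                suspicious.append(path)
        except FileNotFoundError:
            suspicious.append(path)
    return suspicious
```

Run `build_baseline()` on a freshly installed server, then run `find_changes()` on a schedule; any unexpected path in the result is worth investigating.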
And then the final line of defense is simply having a good disaster recovery plan. Disasters will happen -- are you prepared to deal with them? A good backup is the cornerstone of disaster recovery, but it's not all there is to it. You have to know how to recover when you need to, and you have to know that you have a good backup -- if an attack goes undetected for days or weeks, are your backups already compromised?
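To make the "do you actually have a good backup?" question concrete, here's a Python sketch of two checks you could run automatically against each backup file. The 26-hour freshness window is an assumed policy for nightly backups, not a standard:

```python
import hashlib
import os
import time

# Assumed policy: nightly backups, so anything older than ~26 hours is stale.
MAX_AGE_HOURS = 26

def backup_is_fresh(path, max_age_hours=MAX_AGE_HOURS):
    """True if the backup file was written recently enough."""
    age_seconds = time.time() - os.path.getmtime(path)
    return age_seconds < max_age_hours * 3600

def backup_is_intact(path, expected_sha256):
    """Compare the backup against a checksum recorded when it was taken,
    catching silent corruption before the day you need the restore."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest() == expected_sha256
```

Checks like these don't replace an actual periodic test restore -- the only real proof a backup works is restoring from it -- but they catch the most common failure, a backup job that silently stopped running.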
If we lost an entire production server today, here's what our recovery process would look like:
1. Provision a new virtual machine at host of choice.
2. Install Salt, identify the server's roles, and tell it to apply its "highstate", which configures everything necessary to make the server ready to host web sites, in half an hour or so.
3. Clone the git repositories for each of the sites being hosted.
4. Run the site provisioning script for each site.
5. From our backup server, restore the previous night's copies of the databases (or earlier ones, if the latest are known to be bad).
6. Restore all the user generated files from the backup server.
7. Update DNS if necessary to point at the new server.
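The steps above can be captured as a runbook in code, so the order is never in question when you're recovering under pressure. This Python sketch just renders a numbered checklist; the step names are illustrative, not our actual tooling:

```python
# Illustrative step names; the real process uses Salt, git, and the
# backup server described above.
RECOVERY_STEPS = [
    ("provision", "create a fresh virtual machine at the host of choice"),
    ("highstate", "apply configuration management to make the box a web host"),
    ("clone", "git clone each hosted site's repository"),
    ("provision-sites", "run the per-site provisioning script"),
    ("restore-db", "restore last known-good database dumps from backup"),
    ("restore-files", "restore user-generated files from backup"),
    ("dns", "point DNS at the new server if its address changed"),
]

def recovery_runbook(steps=RECOVERY_STEPS):
    """Render the ordered steps as a numbered checklist."""
    return [f"{i}. {name}: {desc}" for i, (name, desc) in enumerate(steps, 1)]
```

Even a checklist this simple is worth keeping under version control alongside the configuration it describes, so the plan evolves with the infrastructure instead of going stale in a wiki.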
We have faster recovery methods using snapshot images of the server at our hosts, but if we were slow to detect a problem, we know that this process will restore us to a known-good state. As it is, for a server hosting a dozen sites, we should be able to restore most of the sites within a couple of hours, and all user data over the following few hours, depending on its size.
Do you have a recovery plan for when something goes wrong? Do you know that you have proper site backups in place to protect your business-critical web site? If not, drop us a line! Have some other security tips? Please add your comment below!