Client accidentally deleted their server. We recovered in less than an hour!

By John Locke on May 31, 2017

This is why you want to be on our maintenance plans. Our number one priority is recoverability, from just about any risk. And today, we had a client that needed this, in a very bad way!

It all started with an alert on my phone saying a production site was down. Frequently these are temporary Internet blips that are gone when I actually try to pull up the site. Not this time! No site, and I couldn't even reach the server.

I pulled up the client's AWS credentials and logged in to see if there was some sort of problem with the actual server. And yes, it was showing as "Terminated" -- Amazon's term for permanently deleted. What?!?!? Had the client discontinued service without letting us know?

There were several new servers in the account, but none looked like a replacement site. We looked in the users section and found a user we weren't aware of, who had created these servers... had this person deleted the web server? Time to get on the phone.

When we reached our client contact, they had just become aware their site was down, and were gratified that we were aware and had called them. We identified the user who had been making changes in their account, and our contact reached him and confirmed that yes, indeed, this person may have accidentally deleted their server. Bingo! Now we knew what had happened. Next step: what to do about it?

This is where our recovery plans kicked in. First up, assess what we have to work with. The server was irrevocably deleted, along with its root disk. However, the data disk with the actual site and assets was available to be re-attached, and we had a disk snapshot of the root volume from 9 hours before the deletion.
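Picking the right snapshot is the kind of step that is easy to fumble under pressure, so here is a minimal sketch of the selection logic. It is an illustration only, not the tooling used in this recovery: the snapshot IDs and timestamps are made up, and the records are assumed to be shaped like entries in the EC2 describe-snapshots output (`SnapshotId`, `State`, `StartTime`).

```python
from datetime import datetime, timezone


def latest_snapshot_before(snapshots, cutoff):
    """Return the newest completed snapshot started at or before `cutoff`.

    Returns None when no snapshot qualifies.
    """
    candidates = [
        s for s in snapshots
        if s["State"] == "completed" and s["StartTime"] <= cutoff
    ]
    return max(candidates, key=lambda s: s["StartTime"], default=None)


# Hypothetical data: the deletion time and all snapshot IDs are invented.
deleted_at = datetime(2017, 5, 31, 15, 0, tzinfo=timezone.utc)
snapshots = [
    {"SnapshotId": "snap-older", "State": "completed",
     "StartTime": datetime(2017, 5, 30, 6, 0, tzinfo=timezone.utc)},
    {"SnapshotId": "snap-nightly", "State": "completed",
     "StartTime": datetime(2017, 5, 31, 6, 0, tzinfo=timezone.utc)},
    # An unfinished snapshot is skipped even though it is newer.
    {"SnapshotId": "snap-partial", "State": "pending",
     "StartTime": datetime(2017, 5, 31, 14, 55, tzinfo=timezone.utc)},
]

chosen = latest_snapshot_before(snapshots, deleted_at)
age = deleted_at - chosen["StartTime"]
print(chosen["SnapshotId"], age)  # prints: snap-nightly 9:00:00
```

The age of the chosen snapshot is the window of root-disk changes that have to be recreated some other way, which is why the site data lived on a separate volume.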

I spun up a brand new instance, re-attached the IP address, attached the data volume, and attached a new volume created from the previous snapshot.
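For readers who want to see what those steps could look like in code, here is a rough sketch. It is not the procedure we actually ran, which was done by hand. It assumes the `ec2` argument behaves like a boto3 EC2 client, that the IP address is an Elastic IP identified by an allocation ID, and that every ID and device name shown is a placeholder. The `DryRunEC2` class is a hypothetical stand-in so the example runs without touching AWS.

```python
def rebuild_server(ec2, *, image_id, instance_type, zone,
                   allocation_id, data_volume_id, root_snapshot_id):
    """Launch a replacement instance and reattach what survived."""
    # 1. Launch in the same availability zone as the surviving data
    #    volume, since an EBS volume can only attach within its own zone.
    reservation = ec2.run_instances(
        ImageId=image_id, InstanceType=instance_type,
        MinCount=1, MaxCount=1,
        Placement={"AvailabilityZone": zone},
    )
    instance_id = reservation["Instances"][0]["InstanceId"]
    ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])

    # 2. Point the existing IP address at the new instance.
    ec2.associate_address(AllocationId=allocation_id, InstanceId=instance_id)

    # 3. Reattach the data volume (device name is a placeholder).
    ec2.attach_volume(VolumeId=data_volume_id, InstanceId=instance_id,
                      Device="/dev/sdf")

    # 4. Build a volume from the old root snapshot and attach it as a
    #    secondary disk so files can be copied off it.
    restored = ec2.create_volume(SnapshotId=root_snapshot_id,
                                 AvailabilityZone=zone)
    ec2.get_waiter("volume_available").wait(VolumeIds=[restored["VolumeId"]])
    ec2.attach_volume(VolumeId=restored["VolumeId"], InstanceId=instance_id,
                      Device="/dev/sdg")
    return instance_id


class _NoWait:
    def wait(self, **kwargs):
        pass


class DryRunEC2:
    """Hypothetical stand-in that records call names instead of calling AWS."""

    def __init__(self):
        self.calls = []

    def get_waiter(self, name):
        return _NoWait()

    def __getattr__(self, name):
        def record(**kwargs):
            self.calls.append(name)
            if name == "run_instances":
                return {"Instances": [{"InstanceId": "i-dryrun"}]}
            if name == "create_volume":
                return {"VolumeId": "vol-dryrun"}
            return {}
        return record


client = DryRunEC2()
new_id = rebuild_server(
    client, image_id="ami-placeholder", instance_type="t2.medium",
    zone="us-west-2a", allocation_id="eipalloc-placeholder",
    data_volume_id="vol-data-placeholder",
    root_snapshot_id="snap-placeholder",
)
print(new_id)
print(client.calls)
```

Against a real account you would pass `boto3.client("ec2")` instead of the dry-run object. The two waits matter: attaching a volume to an instance that is still starting, or attaching a volume that is still being created, will fail.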

I installed our configuration management client (the "salt-minion"), copied over the database and SSL certificate, and applied the configuration. After approving a couple of credentials in our system, the site was back up and running! Total time of the outage: 50 minutes, of which the first 15 were spent trying to reach our client to determine why the server had been deleted.

For incidents like these, we do charge for recovery. However, with our maintenance plan in place, this was a pretty straightforward recovery -- largely because of the planning work we do ahead of time and the solid, reliable backups we set up to recover from a variety of risks. If we had not set up scripts that automatically back up their AWS servers, and did not have other redundant backups available, the story might have had a much worse ending.
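To give a feel for what an automated backup script can look like, here is a minimal sketch that snapshots every attached volume in an account. It is an assumption-laden illustration rather than our production script: it expects an object that behaves like a boto3 EC2 client, it ignores pagination and retention, and the `StubEC2` class with its volume IDs is invented so the example runs on its own.

```python
from datetime import datetime, timezone


def snapshot_attached_volumes(ec2, now=None):
    """Request a snapshot of every attached volume; return the new IDs."""
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y-%m-%d %H:%M UTC")

    # Only volumes attached to an instance are of interest here.
    response = ec2.describe_volumes(
        Filters=[{"Name": "attachment.status", "Values": ["attached"]}]
    )

    snapshot_ids = []
    for volume in response["Volumes"]:
        volume_id = volume["VolumeId"]
        snap = ec2.create_snapshot(
            VolumeId=volume_id,
            Description=f"Automated backup of {volume_id} at {stamp}",
        )
        snapshot_ids.append(snap["SnapshotId"])
    return snapshot_ids


class StubEC2:
    """Hypothetical stand-in so the sketch runs without AWS access."""

    def describe_volumes(self, Filters):
        return {"Volumes": [{"VolumeId": "vol-root"}, {"VolumeId": "vol-data"}]}

    def create_snapshot(self, VolumeId, Description):
        return {"SnapshotId": "snap-for-" + VolumeId}


made = snapshot_attached_volumes(
    StubEC2(), now=datetime(2017, 5, 31, 6, 0, tzinfo=timezone.utc)
)
print(made)
```

Run on a schedule, something along these lines is what leaves a recent root-volume snapshot waiting when a server disappears. A real script would also prune old snapshots and alert when a snapshot fails.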

You can read more about our server maintenance, Drupal site maintenance, or WordPress site maintenance plans, and if we can help prevent any disasters with your site, get in touch!
