What is SEO?

SEO stands for Search Engine Optimization. According to Moz.com, SEO is “a marketing discipline focused on growing visibility in organic (non-paid) search engine results”. They continue that it includes the “technical and creative elements required to improve rankings, drive traffic, and increase awareness in search engines.”

The technical components they are referring to include ensuring that pages are linked correctly, that pictures and pages have been “tagged” with information about them, and that the site can be found and understood by search engines like Google and Bing.

The creative components involve using certain terms throughout the website that signal to the search engines that this specific page has the answers searchers are looking for.

However, to be truly effective in the world of SEO, a firm cannot just put terms on its pages, tag pictures, and be done with it: it has to walk a mile in its customers’ shoes. Proper SEO management makes this happen.

Step 1: Walk a mile in your customer’s shoes

Your business has offerings, whether they be products or services: you are providing something that your customers want. But is it what customers are actually searching for? How can you be sure? One way is to track the search volume of keywords relevant to your brand to see how often each is actually searched. If you are an auto parts store and you aren’t sure whether to put “used transmissions” or “old transmissions” on your website, track both keywords to see which is searched more often; whichever wins is the phrasing to use on your website. This ensures that more people will find your “used transmissions”, even if we all know they are “old transmissions.”
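If you want to make this concrete, most keyword tools let you export monthly volumes. Here is a toy sketch of the comparison in PHP, assuming a hypothetical export file (the file name and columns are made up for illustration):

<?php
// Toy sketch: pick the higher-volume phrasing from a keyword-tool export.
// Assumes a hypothetical CSV "keyword-volumes.csv" with rows like:
//   used transmissions,1900
//   old transmissions,320
$volumes = [];
foreach (array_map('str_getcsv', file('keyword-volumes.csv')) as $row) {
    [$keyword, $monthlySearches] = $row;
    $volumes[$keyword] = (int) $monthlySearches;
}
arsort($volumes); // sort by monthly searches, highest first
echo "Use this phrasing: " . array_key_first($volumes) . "\n";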

Step 2: Now walk a mile in your competitor’s shoes

After you have done your research on the keywords with the most potential for you, it’s important to google them and see whether your competition comes up. If they don’t, you’ve got an opportunity. If they do, then you have to figure out whether this battle is worth fighting and how you want to fight it, or how to differentiate yourself.

Step 3: Content is King

In today’s market, the most effective businesses are adding new content to their websites every day. They do this because fresh, relevant content improves their search engine rankings. Once they rank at the top, they are front and center before new potential customers, with a recommendation to solve those customers’ problems, which is not a bad place to be.

Step 4: A Website Visit Does Not Put Any Money in the Till

After you have created a steady flow of potential customers to your website, it is time to start thinking about conversion. The most effective way to think through conversion is, once again, to walk a mile in your customer’s shoes: think about what they searched for, what they clicked on, and what would be a logical next step in their mind. Does it make sense at this point to “Schedule an Appointment”, or does it make more sense to “Click here to Buy”? The answer is going to be different for every customer, but getting that next step right is how this traffic grows your business.

Step Forever: Use Data to Learn What Can be Improved

For all of its promise, one thing is certain about the digital marketing world: it is the easiest place to test new ideas and see which ones attract attention. The ones that do not should be amended or deleted. The ones that do should be amplified. A truly savvy 21st-century organization uses its website as a laboratory to win new customers every day.

Game on!

Paid Search or SEO: Which to Use?

A common question of companies wanting to drive traffic to their website is “Should we advertise in the search engines? Or conduct a search engine optimization project? Neither? Both?” Based on 20 years of helping people with marketing in search engines, here are a few thoughts.

First, some definitions. Search Engine Optimization is a process that includes work on various factors that Google, Bing, Yahoo and other search engines use to decide the order of sites included in the results of a query. Engines do not charge for these “organic” listings. Search Engine Advertising is almost always “pay per click”, where companies pay the search engine for each click on their ad. Ads are supposed to be labeled on a search results page, and there are usually two to six ads per page.

Where to begin when considering optimization or ads? Analytics. Take a look at existing data about your website traffic to see if there are already visitors coming from the search engines. There are many analytics options available. Google Analytics is a popular package, but there are also open-source and paid alternatives for every budget.

So, Paid or Organic? Each has its strengths. Many companies use both options.

Paid works under the following conditions:

  • There’s sufficient traffic to actually advertise on a phrase.
    • Google’s minimum is 10 searches per month. If a phrase has fewer than 10 searches, Google will state “low search volume” and you can’t advertise on that phrase.
  • Your company’s employees or designees can create a page on your site that is an appropriate starting point.
    • Creating a new specific landing page is usually a good idea. Home pages are often too general. Other pages on your site may not be a good starting point for first-time visitors.
  • Your employees or designees have the time to test ad texts, adjust bids and otherwise monitor the ad campaigns.
    • If you “set and forget”, there’s a good chance you’ll spend a lot of money and have very little to show for it.

If you can meet these conditions, then ads in the search engines can be very helpful. These ads have many strengths:

  • Companies can quickly change messages or the first page visitors see on the site.
  • Companies have full control over costs.
  • Companies can target different messages by geography, time of day, and other factors.
  • Companies can “remarket”, showing a specific message to people who have previously visited their web site.

So, why bother with optimization?

  • The long tail of terms.
    • A very high percentage of the total traffic to any web site comes from the “long tail” of phrases that get just one or two visits in a month, or even in a quarter. This is counterintuitive: we tend to think of a single word, or a combination of two or three words, as being the most important. But in reality, variations on those one to three words usually make up the majority of traffic. Advertising directly on these variations is rarely possible.
  • Optimization costs may be included in a web site redesign, and the incremental ongoing costs may be quite low unless there’s a major shift in the search engine algorithms.
  • You’re already optimized for something. You might as well optimize for what you value.
    • Almost always, you are already listed in Google, Bing and Yahoo. The question is whether your listings are helpful or harmful.
    • For a local business, does Google have your correct hours, address, photo and other key information?
    • For all businesses, does Google include pages you might not want first-time visitors to start on?
    • For all businesses, is the text in the search results helpful, or harmful?

There are plenty of horror stories about what Google includes in the search results for a company. Links from other web sites are a very important factor in the Google search engine results. So, if there are a lot of links pointing to a software company’s bug reports page, or a company’s product recall page, or information that is simply out of date, that may well be where visitors start. And end. They may not take the time to look any further on your web site.

To conclude, sometimes both search engine optimization and ads in the search engines can be worthwhile. At other times, though, neither optimization nor ads may be worthwhile. And in some situations, one may definitely be the better solution. But in any event, I highly recommend starting by taking a look at your analytics to see if you’re already getting traffic.

-Guest blogger Stuart Jenner of Marketek Consulting Group is based in Seattle. He assists companies in a wide range of industries with search engine optimization and paid search guidance.

Drupalgeddon2: Should I worry about critical security updates?

No, you should not. You should let us worry about them, and go back to your business.

Seriously, we're getting questions from all kinds of people about whether this matters. I'm a bit surprised that there is any question about that. Would you be concerned if your top salesperson was selling for somebody else? If your cashiers were jotting down credit card numbers when they charged a card? If your office became a well-known spot for illicit drug or gun dealers? If your office had a bunch of scammers squatting and running a pyramid scheme? If your confidential client information could be revealed as easily as using a Bic pen on an old Kryptonite lock?

Bic Pen vs Kryptonite Lock

We've seen some variation of every single one of those scenarios. And all of them are possible with a remote code execution flaw in a web application, like yesterday's Drupal security vulnerability.

And yet people still don't take website security seriously.

If you have any question about whether it's important to keep your site up-to-date, review that second paragraph again. And then give us a call. Today.

Back up a second. What's remote blah-di-dy-blah blah code execution?

A "Remote Code Execution" vulnerability is a flaw that allows somebody bad to run whatever code they want inside your website. If this happens to you, it is like having a bad employee with keys to anything in your business. Only you don't even know who the employee is, and they are probably sitting in an entirely different country.

Remote because you don't need to have authorized access to the site or server -- they can get in without a key.

Code is what runs your Drupal or WordPress website, and you can find code snippets to do huge amounts of things without even having to write that code -- code can intercept data going in or out of the website, can attack other sites, can change data, can even plant your favorite celebrity's face on a porn video or generate crypto-cash.

Execution means the bad guy can run the code they put up there remotely, and potentially execute your company. Via lawsuits. From your customers whose data you failed to protect. Or from the people you defrauded (or rather, the people that the squatters who now control your website defrauded).

Drupalgeddon 2 - Wait, didn't they learn the first time?

Yesterday we patched all of our Drupal maintenance customers for a new security vulnerability, dubbed Drupalgeddon 2, within a few hours of the disclosure.

Not only did we patch all the sites, but we did so automatically, with full backups taken immediately before the patch was applied, across over a dozen different hosting companies, 3 different Drupal versions, scads of different clients. We rocked it -- we had only 3 failures across the entire portfolio, all of which were easily dealt with manually. We can easily handle 4 times the number of sites we currently manage.

We did learn from the first DrupalGeddon, at least how to do it automatically. But then, DevOps is what we do best. Enough bragging. Let's look at the vulnerabilities.

DrupalGeddon the First happened in October 2014. That was a remote code execution vulnerability that at its heart had to do with form parameters posted with malicious keys. When you post data from a form, each field has a key (like username or password) and a value (like "John Doe" and "MyPassword123"). Web developers have long known not to trust any of the values coming from browsers, because those are very easy to fake. Keys? Most web applications look for specific keys and ignore keys they don't recognize.

A huge number of attacks target forms of one sort or another, simply because that's one place where you're meant to put stuff into an application. There are a lot of ways to inject stuff the developer doesn't expect, and if an attacker can figure out some way of getting past the protections used by the web developer, she can trick the application into doing something it's not supposed to do. Like doing things to your data (SQL Injection), adding malicious Javascript (Cross-Site-Scripting, or XSS), getting an admin to submit a form for you (Cross-site request forgery, or CSRF), or a variety of other attacks.
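As a generic illustration of the first of those attacks -- this is plain PHP with PDO, not Drupal's actual code -- compare an injectable query with a parameterized one:

<?php
// Generic illustration of SQL injection -- not Drupal's actual code.
// Assumes $pdo is an existing PDO database connection.

// VULNERABLE: the form value is pasted straight into the SQL string,
// so a username like  ' OR '1'='1  rewrites the query's meaning.
$username = $_POST['username'];
$rows = $pdo->query("SELECT * FROM users WHERE name = '$username'")->fetchAll();

// SAFE: a parameterized query keeps data separate from the SQL,
// which is essentially what a framework's database layer does for you.
$stmt = $pdo->prepare('SELECT * FROM users WHERE name = :name');
$stmt->execute([':name' => $_POST['username']]);
$rows = $stmt->fetchAll();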

Drupal provides a "Form API" that blocks most of these attacks by design, using a lot of security "best practices".

Drupal has automatic "sanitization" of all form values, if the developer uses the core "Form API" it provides. The Form API is extremely powerful, but adds a fair amount of complexity under the hood -- and complexity means more places a vulnerability can creep in and get overlooked. The Form API uses arrays, nested inside other arrays, nested inside more arrays. Going back to the fundamental web form, this means many keys in the form themselves use array notation, such as "name[0]" and "name[1]". The "name" part of that gets correctly sanitized (cleaned to prevent any possible injection) -- but what's inside the brackets is a bit harder (because the PHP language itself helpfully converts these to array keys).

With the first DrupalGeddon, they discovered that an attacker could inject nasty stuff into the brackets, and own the site, along the lines of "name[--put bad stuff here]".

This time around, the attack is on the Form API itself. The Form API tracks lots of stuff in arrays with keys that start with "#". Of particular concern: "#validate", which designates what function to run to validate the form when it is submitted, and which happens to share the same function signature as "#submit" -- which gets all the other form values, but assumes the form has already been validated. Let's see. What form might I be able to hijack by bypassing validation? What other function might I be able to run that would load my code instead of what's expected? (This is left as an exercise for the reader...)
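To make those "#" keys concrete, here is a minimal sketch of a form array in the Drupal 7 style -- the function names are hypothetical:

<?php
// Minimal sketch of a Drupal-style Form API array (function names are
// made up). Keys starting with "#" are properties the Form API
// interprets; everything else is a nested form element.
function example_contact_form($form, &$form_state) {
  $form['name'] = [
    '#type' => 'textfield',
    '#title' => 'Your name',
  ];
  $form['submit'] = [
    '#type' => 'submit',
    '#value' => 'Send',
  ];
  // These are exactly what an attacker wants to control: whoever sets
  // '#validate' or '#submit' decides which functions get called.
  $form['#validate'][] = 'example_contact_form_validate';
  $form['#submit'][] = 'example_contact_form_submit';
  return $form;
}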

This has now happened twice to Drupal. Should I move to WordPress?

Gawd, no. Compared to Drupal, WordPress is a shit-show.

WordPress doesn't even have a Form API. Which means there is no standard way for developers to create forms. Which means that if you use a hand-rolled form from a developer who isn't extremely competent at blocking CSRF, XSS, SQL injection, and other far "easier" attacks, chances are your WordPress site has more security holes than Swiss cheese.

WordPress core has a reasonably competent security team watching over it. But the > 54,000 plugins available in the WP repository? You're on your own. The security team might work with a plugin author if they get notified about a security issue, but they're not out there poring over plugin code looking for vulnerabilities. And with countless proprietary plugins in widespread use getting little to no external review, it's no wonder we see so many hacked WordPress sites.

There are even some accounts of WordPress sites getting broken into using the same exploit code used in Drupalgeddon the First!

If you have a WordPress site at a regular webhost, and don't know what to do if you went to it one day only to find a blank screen, get ahold of us now so we can make sure you're properly backed up and have a recovery plan! You are at far greater risk than you know...

If you're on a reasonably well-built Drupal site, have this vulnerability patched, have reasonably secure hosting, and strong passwords, you're way better off than most WordPress sites. Drupal has a really strong security record, and the vast majority of security issues reported (and fixed) are "privilege escalation" bugs. The typical security update fixes a problem like this: a janitor who has keys to your whole office could find the combination to your safe written in a folder labeled "Insurance contracts." First off, you would need a sketchy janitor up to nefarious things (why would you hire somebody like that?). And then they would need to go searching in all the most boring places imaginable.

Is this a Zero-day exploit?

No. And that is exactly why Drupal has such a strong reputation for security.

A "Zero-day exploit" is when there is code that exploits a vulnerability before there is a patch for it -- the code to fix a vulnerability was released after it was getting used in the wild. At worst, this is a One-day exploit -- it's been one day since the fix was released, and we haven't yet seen it exploited in the wild -- but that could change today.

Drupal Security coverage shield

This vulnerability was discovered by a developer doing security research and audits on Drupal code. At this writing, it is still considered "theoretical" but that is not going to last long.

Disclosing possible issues, and patching them before they are exploited in the wild, is a hallmark of a project that takes security seriously.

Unlike WordPress, Drupal's security team covers not just Drupal core but also thousands of contributed modules hosted on Drupal.org -- look for the security badge on any module you use, and make sure it has security team coverage if you're concerned.

 

Meltdown notes

The Meltdown vulnerability leaked out into public news a full week before patches were available for many distributions. When patches did become available, sometimes the patch caused further trouble.

Our vulnerable systems

Before patches were available, we downloaded the proof-of-concept exploit code, compiled it, and tested it on a variety of the environments we work in or have in production.

Here's a quick run-down of what we found affected, and what was not:

Environment | Affected? | Notes
Local 16.04 workstation | Yes | Exploit ran quickly and reliably.
Dev server (14.04 virtual machine, 10-year-old hardware) | Yes, but slow | Exploit ran and revealed information -- but unlike our workstations, the information dripped out a character at a time, with some errors.
Amazon AWS servers | Yes, but slow | Similar to our own virtual servers: exploit code ran and revealed secrets, but slowly and with errors. Amazon had already patched the underlying hosts.
Google Cloud Engine | No | Google's Project Zero team was one of the groups that discovered the exploit, and Google has deployed something on their infrastructure that seems to completely foil this attack. The attack printed a bunch of garbage characters, no actual clear text.
Digital Ocean | Yes | The exploit ran perfectly, and very quickly, within our Digital Ocean guests.

We did not attempt to exploit other guests on the same hardware -- all our testing was exploiting Meltdown within a single virtual (or dedicated) host.

What happened when we patched

Most of our infrastructure uses Ubuntu LTS (Long-Term Support) releases. Ubuntu published patches for Meltdown on Tuesday January 9, the original coordinated disclosure date. We updated our older 14.04 servers to use the 16.04 HWE kernel, and deployed Ubuntu's 4.4.0-108-generic pretty much across the board, aside from some hosts that used the AWS-specific kernel. We installed these updates on Tuesday afternoon, and rebooted all our hosts that evening.

For the most part, everything went very smoothly. However, we had 2 incidents:

  • One of our continuous integration workers failed to boot into the new kernel. This was dedicated hardware, in our office, and we did not have a remote console available -- which essentially made all our overnight scheduled maintenance jobs fail. This was fixed by the following day's kernel release, 4.4.0-109-generic.
  • Our configuration management server (Salt) ended up getting extremely high loads whenever a git commit got pushed.

Meltdown is an attack on how the CPU speculatively schedules its work, and the patches for it essentially disable processor shortcuts designed to speed up computing. Most sources suggest a 5% - 30% degradation in CPU speed after patching for Meltdown -- highly dependent on workload, with system-call-heavy workloads hit hardest.
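You can get a rough feel for that overhead yourself: the page-table-isolation patches add cost to every system call, so timing a syscall-heavy loop before and after patching makes the gap visible. A quick sketch (the path is arbitrary):

<?php
// Rough micro-benchmark: hammer a cheap system call and time it.
// Run it before and after patching; absolute numbers are machine-specific.
$start = microtime(true);
for ($i = 0; $i < 1000000; $i++) {
    clearstatcache();             // defeat PHP's stat cache...
    file_exists('/etc/hostname'); // ...so each check is a real stat() syscall
}
printf("1M stat() calls took %.2f seconds\n", microtime(true) - $start);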

For the most part, we're not noticing big slowdowns, with the one exception of our Salt events.

Salt event.fire_master performance devastation

We've spent a lot of time automating our systems, and have a variety of triggers hooked up. Once we decide upon a trigger, we will often publish events in various systems, so that if we decide to use them at some point in the future, they're already there and flowing. One of those is a git post-update hook: whenever anyone pushes git commits to our central git server, we publish an event in several different systems that any other system can subscribe to and act upon.

In our SaltStack configuration management system, our bot uses "salt-call event.fire_master" to publish a system-wide Salt event. At the moment, we have a "Salt Reactor" listening for these on a few of our repositories, but for the most part these events end up entirely ignored. Yet our Salt Master was ending up with a load north of 20 - 30, with a bunch of these event triggers stacked up.

When you run the event command in a shell, it normally fires and returns within a second or so. However, with the kernel patched for Meltdown, the same exact command would take 2 - 3 minutes before the shell prompt re-appeared -- even for repositories that had nothing subscribed to the event! Worse, our bot uses node.js to trigger these events, and in that environment it was taking more like 15 - 20 minutes before it timed out and cleaned up the process. And with commits happening every minute or two, the CPU load quickly started climbing and triggering all sorts of monitoring issues.

 

New Year, New Website!

It's only taken two years since the release of Drupal 8 for us to get our own site updated... Cobbler's children and all. But finally, we are proud to unveil our shiny new site!

But wait, don't you tell your clients you don't need a new site?

Take a look around... all our old content is here, most of it in the same place it has always been. In fact, we fixed some things that were missing from our last site -- several pages that had broken videos are now fixed. All in all, our site has right around a thousand pages -- and somewhere between a quarter and half of our clients use it to pay their invoices, so Commerce is a critical piece. It turns out our site has a lot more going on than many of our clients' sites, with some content going back over 18 years.

We see our content and the website as our biggest asset. Much of our new business comes through our website -- less now than in the past, but it still plays a vital role in helping new visitors get to know us, learn our strengths, and ultimately develop enough trust to become a client.

In the past couple years, we have added WordPress to our maintenance arsenal, providing regular maintenance, updates, enhancements and improvements. Working with WordPress is easy -- it's like working with a toy, because it really does not do much for you. This frees up a web designer to do whatever they want with the look and feel, and the system does not get in the way.

But we're not web designers -- we are developers, system administrators, business consultants, and information architects -- and we think Drupal 8 is the best system out there. We love working with it -- it's great for marketing people, it's great for developers, it's great for managing data and information, it's great for integrating with other systems, and it's great for the future. So that's what we chose for our site.

A simple example: most WordPress sites we've been seeing have somewhere between a dozen and 50 database tables. This site has over 800 tables (ok yeah, maybe we experiment a bit much) and most of our Drupal sites have somewhere between 100 and 500 database tables. That's just one indication of how much more Drupal is doing for you. Overkill? Maybe, if you just want a blog. But if you're doing e-commerce, customer management, membership management, complex layouts, scheduling, event registration, publishing workflows, you end up with a lot more sophistication under the hood.
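If you're curious how your own site compares, counting tables is a quick query against information_schema (the credentials and schema name below are placeholders):

<?php
// Count the tables in a database -- credentials and the schema name
// ('my_drupal_db') are placeholders for your own.
$pdo = new PDO('mysql:host=localhost;dbname=information_schema', 'user', 'pass');
$stmt = $pdo->prepare('SELECT COUNT(*) FROM tables WHERE table_schema = ?');
$stmt->execute(['my_drupal_db']);
echo $stmt->fetchColumn() . " tables\n";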

Migration

The initial migration was easy. We sucked over all our content very quickly, right from the start. But... that's just getting content into the site. There ended up being tons of issues to resolve afterward. Things like:

  • Converting embedded media to the new Drupal media format
  • Finding the current location of videos that were no longer where they used to be
  • Consolidating tags into a cleaner, more current set
  • Customer payment profiles, to continue charging our clients who auto-pay their bills as seamlessly as possible
  • Supporting images/media that were previously migrated into Drupal 7

Part of the complexity for us was that our site has gone through many versions. First it was entirely custom PHP. Then it was Joomla. Then it was Drupal 6 -- and we folded in a separate MediaWiki site. Then it was Drupal 7, and we folded in a WordPress site. And without a person dedicated to going through the old content and bringing it up to date, we've simply accumulated that content and brought it forward, fixing the issues so it continues to look OK (actually, better than it ever has before!).

The more we looked around nearing launch, the more stuff we found that needed fixing, so it was a huge push in the week before we pulled the trigger to get it all squared away.

Commerce

We're really impressed with Drupal Commerce 2, in Drupal 8. It seems very robust, and so much of it "just works" out of the box with very little configuration. We had to create two plugins -- one to support our payment gateway, and one for Washington State tax collection -- and we had to do some tweaks to get migrations from our old Ubercart 7 store for customers, products, and orders -- but otherwise we spent very little time on the Commerce. And we had a new customer successfully make a payment the very next day after we turned it all on!

We did write another custom module to support our payment links. Way back in Ubercart 6, we became early adopters of "cart links", which allows us to send a link to a customer that populates their cart with what they are buying. This sets up our automatic payments for hosting with tax properly calculated, and our monthly maintenance plans. Our bookkeeping system also sends out a payment link that allows people to pay invoices through our site.

We created a custom Product Variation for invoice payments that makes this process easier, and while we were at it, we simplified our cart links to make them easier to figure out on the fly (just sku and qty, and for invoices, invoice # and amount). We also made them "idempotent" (a computer science term meaning you can click the link over and over and get the same result -- it won't keep adding more items to the cart).
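As a sketch of the idea -- this is not our actual module, and the names and parameters are illustrative -- an idempotent cart-link handler boils down to setting quantities instead of adding them:

<?php
// Hypothetical sketch of an idempotent cart-link handler -- not our
// actual module. It reads ?sku=...&qty=... and makes sure repeated
// clicks leave the cart in the same state instead of stacking items.
function cart_link_handle(array $query, array &$cart) {
  $sku = $query['sku'];
  $qty = (int) ($query['qty'] ?? 1);
  // Idempotent: SET the quantity for this SKU rather than adding to it,
  // so clicking the same link twice yields the same cart.
  $cart[$sku] = $qty;
  return $cart;
}

// Clicking the same link twice produces the same cart contents:
$cart = [];
cart_link_handle(['sku' => 'HOSTING-MONTHLY', 'qty' => 1], $cart);
cart_link_handle(['sku' => 'HOSTING-MONTHLY', 'qty' => 1], $cart);
// $cart is ['HOSTING-MONTHLY' => 1], not 2.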

Front End

Yes, it's Bootstrap. (Caution: that link is not exactly... kind... or appropriate for work.) Bootstrap seems to be what everybody wants these days: it's a decent-looking theme that we use on almost everything (contributing to that problem!).

The thing is, it looks nice, it works great on mobile, and it lets us focus on what we want to get across -- our content -- rather than spending much time on design. And frankly, that's what we advise for our clients, too: start with your content, what you're trying to get across, what makes you special. If design is your thing, great! Go out and get a really top-notch custom design. But if it's not... well, just use Bootstrap. And try to use some unique photography; the difference between a great Bootstrap site and one that's meh is often just the photography.

It's not that we don't think design is important. The key point here is that design should be directed to support some goal you have for the website -- and if your goal is a company brochure or any of a number of different purposes, well, Bootstrap has solved a lot of those basic design problems. Spend your time on your content.

Once you have a really clear idea of what you want your users to do on your site, then bring in a designer to optimize for those goals.

With all that said, we're really excited to have our site current so we can start experimenting with entirely new web UIs. We've particularly been delving into Vue.js, React, and GraphQL, and have some demos we've built and integrated into a couple of sites that we can't wait to roll out here!

Here's to 2018!

We did launch the site early. There are still layout glitches here that we're quickly fixing in between client work (if you're on Safari, sorry!). But we feel a huge sense of relief to be fully up-to-date on a new site, which gives us so many opportunities to try out new things for ourselves, and then share what works with our clients.

Need a web partner to bring you up to date? Let us know how we can help!

The Spectre of a Meltdown

The news was supposed to come out Tuesday, but it leaked early. Last week we learned about three variations of a new class of attacks on modern computing, before many vendors could release a patch -- and we came to find out that the root cause may be entirely unpatchable, and can only be fixed by buying new computers.

Today Microsoft released a patch -- which they had to quickly pull when they discovered that it crashed computers with AMD chips.

Essentially, Spectre and Meltdown demonstrate a new way of attacking your smartphone, your laptop, your company's web server, your desktop, maybe even your TV and refrigerator.

[Animation: Meltdown in action]

This all sounds dreadfully scary. And it is... but don't panic! Instead, read on to learn how this might affect you, your website, and what you can do to prevent bad things from getting worse.

How will this affect you?

All of these attacks fall into a class of "Information Disclosure." A successful Spectre attack can reveal information you want to keep secret -- mainly your passwords and the security keys widely used to protect information and identity.

Have any Bitcoin lying around? Your wallet could get compromised by this type of attack. Visit any SSL sites? A server's SSL certificate might have its private key stolen and incorporated into a fake certificate -- which would make "man in the middle" attacks from wifi hotspots a lot more effective: a phisher could set up a fake copy of your bank's website, and there would be no way to tell it apart from the real one, because it has a copy of the real certificate. Use a password manager to keep track of all those different passwords you need for each site? Spectre can make those secrets -- not so secret. This is far worse than Heartbleed.

Over the coming months and years, this will be a headache. Applying security updates promptly, on your phone and all of your computers, will be more important than ever before -- because a new 0-day attack could give the attacker the keys to your online kingdom.

The good news is, browsers have already updated with fixes that make a Spectre attack from remote Javascript much more difficult, and Meltdown is nearly patched.

The bad news is, patching for Meltdown means slowing down your computers substantially -- reports suggest by somewhere between 5% and 30%, depending on the type of computing being done. And there isn't really a way of patching Spectre -- it's a design flaw in how the processor caches what it's working on while using something called "speculative execution" to speed up its work. Fully preventing a Spectre attack means deploying new hardware processors that manage their low-level caching in a different way.

So preventing Spectre attacks falls more into the realm of antivirus work -- blocking specific attacks, rather than removing the vulnerability entirely, at least as I understand the problem. For more, ZDNet has a pretty understandable explanation of the vulnerabilities.

How can they attack me?

To exploit any of these attacks, an attacker needs to get you to run malicious code. How can they do this? Well, for some Spectre attacks, through Javascript running in your browser. Firefox and Safari released updates that make the Javascript timer less accurate -- precise timing, used to detect the difference in speed when loading particular caches, is a critical part of how the currently identified attacks work. But it's scary that this level of attack could be embedded in Javascript on any website you visit...

Browsers are changing faster than ever, though, and I wonder if this will set back some proposed browser features like WebAssembly, which could be a field day for attackers wanting to deliver nasty code to you through innocuous web pages. It's relatively easy for a browser maker to make the Javascript execution environment fuzzy enough to defeat the precision needed to carry out these attacks. WebAssembly? The entire goal of that is to get programmers "closer to the metal," which is going to make it easier to get creative with exploiting side-channel vulnerabilities.

Browser extensions, mobile apps -- anything you download and install now has far more opportunity to steal your secrets than ever before.

How will this affect your website?

Your website's host is almost certainly vulnerable. If you are not hosting on dedicated hardware, Spectre basically means that somebody else hosting on the same physical hardware can now possibly gain access to anything in your hosting account.

There are basically 3 "secrets" in nearly every website that's built on a CMS (like Drupal or WordPress) that might be a target:

  1. Your FTP, SSH, or other hosting account logins -- these could give an attacker full access to your site, allowing them to upload malicious code, steal data, damage your site -- whatever they want.
  2. The private key for your SSL certificate -- this would allow them to create a fake SSL certificate that looks legitimate to anybody visiting their copy of the site. This is particularly a problem for financial institutions, but it could happen to anyone -- it can lead to fake sites being set up under your domain name and, combined with a "man in the middle" attack, used to phish other people, smear your reputation, or do a variety of other things.
  3. Any secrets in your CMS -- your login, your passwords, any passwords of users that log into your site, ecommerce data, whatever there is to steal.

If you're on a "virtual private server" or a "shared hosting account", there will be exploits coming for years, until we've all replaced all the computer hardware that has come out in the last 20 years -- and another tenant on your hardware can potentially attack your site.

And those are just the universally available targets. You may have other things of value to an attacker, unique to you.

"Meltdown" got its name because it melts down the security boundaries between what you can do as a user, and the underlying system that has full access to everybody.

Meltdown does have patches available, and these are getting deployed -- at the cost of disabling CPU features built to make processors perform quickly. That means if you're nearing the limits of what your currently provisioned hosting provides, patching for Meltdown may push you over, and force you into more costly hosting.

What should you do now to make things better?

What you can do now really isn't much different than it was a month ago -- but the consequences of failing to use best security practices have gotten a lot higher. You could stop using technology altogether, but who is really going to do that? And besides, plenty of organizations already have your data, and they might get compromised anyway.

We think there are two main things to think about when it comes to this type of security planning:

  1. Make sure you are doing all you can to avoid an attack, and
  2. Have a plan for what to do if you fall victim to an attack.

Avoid an attack

To avoid an attack, a little paranoia can go a long way. Get something in email you don't recognize, with an attachment? Don't open the attachment. On a public wifi network? Don't log into anything sensitive, like a banking website -- wait until you get home or onto a network you trust.

Apply all security updates promptly, and verify that you're getting them from the real source. Pay attention to anything that looks suspicious. Expect more phishing attacks for the foreseeable future (as if we didn't have enough already...). Regularly check that any sites or servers you have do not show signs of unexpected activity, changed files, etc.

It might be hard to detect an intrusion, because if they've hacked you, they will likely be connecting as you -- so set up any 2-factor authentication you can, consider getting security dongles/hardware tokens, and just think about security before clicking that link.

Plan for disaster

Nobody can be perfect. There is no such thing as "secure" -- there's always a way in. The practice of security is really one of risk management -- identifying what the consequences of a given security breach are, what the costs to avoid that breach are, and finding a balance between the cost of securing something and the cost of suffering a breach.

That equation varies for everybody -- but some key values in that equation just shifted: a minor breach can now lead to a much bigger problem than before. Or, perhaps more accurately, we now know about some ways to make these breaches worse, and thus the likelihood of them happening has become higher.

When it comes to a website, the three main risks to consider are:

  • Loss of service (Denial of service -- your website goes away)
  • Loss of data (you lose access to something you need -- e.g. ransomware, hardware failure without sufficient backups, etc)
  • Information Disclosure (revealing secrets that can cost you something)

What has changed now is that these new information disclosure attacks can reveal your keys and passwords, after which an attacker can conduct the other kinds of attacks while impersonating you. It used to be that information disclosure was a bigger concern for data you stored in a database, because the operating system takes such special care of your passwords and keys -- but now we've learned that the operating system protections can be bypassed with an attack on the CPU, and that this has been the case for the past 20 years.

Do you have a disaster recovery plan that outlines what steps to take if you discover you've been hacked? If not, we can probably help, at least for your website and server. We've written several disaster recovery plans for ourselves and our clients -- reach out if we can help you create one. We can also do the day-to-day work on your Drupal or WordPress site and server to keep them fully up-to-date, with a full testing pipeline to detect a lot of things that can break in an update.

Let us know if we can help!

Getting hands on with Drupal Commerce 2 - Onsite payments and Sales Tax

We're nearing launch of two new Drupal Commerce sites, one of them being this one. It turns out Freelock.com has some relatively sophisticated commerce needs: some taxable products, some non-taxable products. Recurring subscriptions. Arbitrary invoice payments.

We previously blogged about Commerce 2 Price Resolvers. Now, let's get into some of the details of payment gateways and taxes.

This post is going to be more high-level, less code, because there are some critical concepts fundamental to how Commerce 2 currently works that site owners need to understand.

Onsite Payment Gateways always store credit card info

Payment gateways have different capabilities. Some support storing a user's card in the gateway. Others have an "offsite" flow, sending you to their site to collect card info and then returning you back to the commerce site when done.

An "Onsite" payment gateway is one that keeps the visitor on the site, collects the card numbers and posts it from the site to the gateway. In Drupal Commerce 2, the payment flow for onsite gateways is completely different than in Commerce 1, Ubercart, or many other shopping carts: it now stores the credit card in the gateway first, and creates a "Payment Method" associated with the user that contains the token from the gateway. It can then use that token for any future charges (as long as the gateway continues to honor it).

This is a fundamental change that makes recurring transactions and "card on file" functionality pretty much built into Commerce 2!

This means that the Commerce EPN Gateway module we created has full support for recurring functionality out of the box.

One other thing of note: each gateway can declare its capabilities in its code, and Commerce then automatically exposes that functionality to store admins. One thing that confused me slightly was "Voids" -- in most payment gateways, you can void a transaction up until the batch closes (which usually happens once per day). In Commerce 2, however, "Void" is used to cancel an authorization, not void a transaction. If you use "authorization only" transactions, and capture the payment after the order is fulfilled, then you can use Void to cancel the authorization (or Capture to process it). In gateway plugin code, that pairing looks roughly like the sketch below.
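
Here's a rough sketch of how authorize/capture/void map onto gateway plugin methods. It's hypothetical and trimmed -- the actual remote API calls are elided, the class is abstract so remaining required methods can be omitted, and a real plugin would also declare the corresponding capability interfaces so Commerce exposes Capture and Void to admins:

<?php

use Drupal\commerce_payment\Entity\PaymentInterface;
use Drupal\commerce_payment\Plugin\Commerce\PaymentGateway\OnsitePaymentGatewayBase;
use Drupal\commerce_price\Price;

abstract class ExampleAuthCaptureGateway extends OnsitePaymentGatewayBase {

  public function createPayment(PaymentInterface $payment, $capture = TRUE) {
    // ... call the gateway to charge (or just authorize) the stored token ...
    // With $capture = FALSE, the payment sits in the "authorization" state.
    $payment->setState($capture ? 'completed' : 'authorization');
    $payment->save();
  }

  public function capturePayment(PaymentInterface $payment, Price $amount = NULL) {
    // Capture the authorized amount (or less) once the order is fulfilled.
    $amount = $amount ?: $payment->getAmount();
    // ... call the gateway's capture API with $payment->getRemoteId() ...
    $payment->setAmount($amount);
    $payment->setState('completed');
    $payment->save();
  }

  public function voidPayment(PaymentInterface $payment) {
    // In Commerce 2, "Void" cancels the authorization -- it does not
    // reverse a settled transaction (that would be a refund).
    // ... call the gateway's void API ...
    $payment->setState('authorization_voided');
    $payment->save();
  }

}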

The Tax system needs help to recognize taxable vs non-taxable products

Out of the box, the tax system is extremely smart about zones and tax registrations. You can define, per store, which taxes you need to collect, and Commerce will automatically charge tax to customers in a taxable zone. However, there is no built-in distinction between taxable and non-taxable products.

There are tax rules coming, but in the meantime, if you have both taxable and non-taxable sales, you need a custom tax resolver to do the right thing. These are relatively easy to create. We borrowed heavily from another tax module for the configuration settings that let you define, on a tax type, which product variation(s) it applies to. A bare-bones resolver is sketched below.
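
As a sketch of the general shape (this is not our actual module -- the field_taxable field is hypothetical, and you'd register the class as a tagged service so it runs ahead of the default resolver), a custom tax rate resolver can simply decline to return a rate for non-taxable items:

<?php

use Drupal\commerce_order\Entity\OrderItemInterface;
use Drupal\commerce_tax\Resolver\TaxRateResolverInterface;
use Drupal\commerce_tax\TaxZone;
use Drupal\profile\Entity\ProfileInterface;

class TaxableVariationResolver implements TaxRateResolverInterface {

  public function resolve(TaxZone $zone, OrderItemInterface $order_item, ProfileInterface $customer_profile) {
    $variation = $order_item->getPurchasedEntity();
    // field_taxable is a hypothetical boolean field on the variation.
    if ($variation && $variation->hasField('field_taxable') && !$variation->get('field_taxable')->value) {
      // Stop the resolver chain: no tax applies to this item.
      return TaxRateResolverInterface::NO_APPLICABLE_TAX_RATE;
    }
    // Return NULL to let the next resolver (e.g. the default) decide.
    return NULL;
  }

}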

One other gotcha we found while testing recurring payments: if you use the "Include tax in product price" option, recurring billing adds the tax again each time the purchase recurs. Be sure to uncheck those boxes and keep the tax separate, and then it works fine.

If you're in Washington State and need to collect sales tax, this module works fine today -- check it out here. However, note that it's not yet possible to create a tax report -- we do store the location data with each transaction, and will be sure to have this report available by April, when we need it!

 

Working with Drupal Commerce 2 is really refreshing. It feels like you're working on a robust, well-oiled system with a very nice user experience right out of the gate. It is still missing some bits and pieces around the edges, and customizing it does involve turning to code more quickly than in previous versions, but overall we're finding it really straightforward to work with. Stay tuned for our coming sites, and if you need a Commerce site, reach out to us!

Another Wednesday, another round of security updates

Drupal security updates generally come out on Wednesdays, to make everybody's update planning more predictable. WordPress security notices come out... well, whenever whichever feed you subscribe to bothers to announce something.

Today's notices showed some striking differences between the two communities.

Drupal module vulnerabilities

There were 4 Drupal contributed modules flagged with security bulletins. Of these 4, 3 were not fixed -- the module code was yanked from Drupal.org, and now any site that uses any of these modules gets a big red warning: "Unsupported module in use!" These were all modules I had never heard of; they are not in widespread use, and have now been clearly marked as dangerous.

Unsupported Release message

The 4th security update, it turns out, had actually been fixed over 2 years ago, but the fix had never been included in a "stable" release. The vulnerability did look like a ridiculously easy-to-exploit, dangerous chunk of code, and it only affected the module in Drupal 7.

Searching our sites, I found we did not have any Drupal 7 sites using this module, but we did have 2 Drupal 6 sites that actively used it. So I rolled up my sleeves and looked at the code, only to find that the affected sections were not present in the Drupal 6 version, which solved the problem in a different way.

WordPress vulnerabilities

Unlike Drupal, WordPress has no single source of notifications for plugin vulnerabilities. There are multiple companies that do security assessments, each with their own conclusions, and none able to signal to the wider WordPress community about a problematic plugin.

We subscribe to several of these community feeds -- and the tale we get is full of drama, conflicting stories, firms calling out each other for misleading information... and basically you're on your own when it comes to determining whether you're using a safe set of plugins.

But today took the cake:

We recommend that you uninstall the Captcha plugin immediately from your site. Based on the public data we’ve gathered, this developer does not have user safety in mind and is very likely a criminal actor attempting yet another supply chain attack.

... that's from a WordFence blog post outlining a security vulnerability they've highlighted in a plugin that was removed from the WordPress Plugin Repository -- not due to the security hole, but rather because of a trademark infringement issue. The WP Vulnerability Database shows it had a back door in it until release 4.4.5. WordFence is apparently unconvinced that the plugin remains trustworthy at all -- because the new maintainers have included similar backdoors in other plugins they manage.

Just to clarify what these backdoors do: they allow anybody who knows the "secret handshake" to knock on your WordPress site's door, log in as a site admin, and remove the evidence that they have done so.

It does get worse... Another of our regular sources, Plugin Vulnerabilities, has found over 6000 current installations of an actively attacked plugin that was removed from the Plugin Registry a year and a half ago, and abandoned 4 years ago.

No wonder WordPress has such a bad security reputation!

If you're running a WordPress site, and not actively using a security service or watching the security lists, your site is at risk! Our WordPress protection plan is a must for making sure site updates get done, with testing on development copies, and with solid backup management so we can recover if you have any issues.

Of course, we have a Drupal plan too...

A slick migration trick - convert columns to multi-value field with subfields

In the previous post on A custom quantity price discount for Drupal Commerce, we created a compound field for price breaks, composed of two subfields: one for a quantity threshold, the other for the price at that threshold.

That post covered everything needed to get quantity discounts working within Drupal Commerce, but for this particular project, we also had to find a way to populate these price breaks through the integration with their Sage accounting system.

Source Data

We're starting with their existing CSV exports, and one of them is a file that provides up to 5 pricebreaks for each product. These files get copied up to the webserver each hour, and then the migration into Drupal gets triggered.

Here's what the data looks like in the source CSV file:

Pricebreak CSV Data

This file also contains other pricing data, for customer-specific pricing, so our migration skips those rows. The key thing is that there are 5 pairs of columns for the threshold and price, called "BreakQuantity1" - "BreakQuantity5" and "DiscountMarkup1" - "DiscountMarkup5". The export uses a very large number (99999999) for the largest tier.

So our migration needs to match these up into pairs, and add each pair as a delta in the price break field for that product variation.

Migration Configuration

Setting up migrations is covered elsewhere, as is importing/exporting configurations. For a project like this, we simply copy an existing migration configuration to a new file, edit it with a text editor, and import it into the site's configuration. For this migration, that was all we needed -- there is no supporting code whatsoever, the entire migration is handled with just a YAML file.

We're going to name this migration after the source file, im_pricecode, so the basic setup goes like this:

  1. Export the site configuration.
  2. Copy another CSV migration file to a new "migrate_plus.migration.im_pricecode.yml" file.
  3. Inside the file, delete the uuid line, and set the id to "im_pricecode". When the configuration is imported, Drupal will assign a new UUID, which will get added when you next export the configuration, and the id needs to match the filename.
  4. Add migration tags/group as appropriate.

That's the basic setup. There are 3 crucial parts to a migration configuration: source, process, and destination. All of the heavy lifting here is in the process block, so we'll tackle that last.

Here's our source section:

source:
  plugin: csv
  path: /path/to/IM_PriceCode.csv
  delimiter: ','
  enclosure: '"'
  header_row_count: 1
  keys:
    - ItemCode
    - PriceCodeRecord
  track_changes: true

This section points at the data source. (We already have plans to swap this for a webservice call down the road.) In this case, we're using the "csv" source plugin provided by the Migrate Source CSV module. Of particular note here:

  • header_row_count here indicates that the first line of the file contains column headers, which will be automatically available on the source $row object in the migration, for processing.
  • keys need to specify a field or set of fields that uniquely identify the row in the CSV file. In our case, the file also contains customer-specific pricing data -- this makes the ItemCode appear multiple times in the file, so we need to add the second column to get a unique key.
  • track_changes stores a hash of the source row in the migration table. On future migration runs, the source row is checked against this hash to determine if there's a change -- if there is, the row will get updated; if not, it will be skipped.

Next, the Destination:

destination:
  plugin: 'entity:commerce_product_variation'
  destination_module: commerce_product
  default_bundle: default
migration_dependencies:
  required:
    - ci_item_variation

The destination is pretty self-explanatory. However, there's one crucial thing here: we are running this migration on top of entities that have already been created by another migration, "ci_item_variation". We need to declare that this migration runs after ci_item_variation by using migration_dependencies.

Finally, we get to our "process" block. Let's take this a field at a time:

process:
  variation_id:
    -
      plugin: migration_lookup
      migration: ci_item_variation
      source: ItemCode
      no_stub: true
    -
      plugin: skip_on_empty
      method: row

variation_id is the primary key for commerce_product_variation entities. We are running this through two plugins: migration_lookup (formerly known as just migration) and skip_on_empty.

The migration_lookup plugin loads up the existing product_variation by looking up the variation_id from the previous ci_item_variation migration. We use "no_stub" to make sure we don't create any new variations from possible bad data or irrelevant rows. The second plugin, skip_on_empty, is another safety check to be sure we're not migrating bad data in.

pricing_method:
  -
    plugin: static_map
    source: PricingMethod
    map:
      O: O
    default_value: ''
  -
    plugin: skip_on_empty
    method: row

PricingMethod is what indicates the type of pricing each row in the source file contains. We don't need to migrate this field, but we do want to use it as a filter -- we only want to migrate rows that have this set to "O". So we chain together two more plugins to achieve this filtering, and we assign the result to a dummy field that won't ever get used.

The first static_map plugin maps the "O" rows to "O", and everything else gets set to an empty string.

The next plugin, skip_on_empty, simply skips rows that have the empty string set -- which is now everything other than rows with an "O" in this column.

Now we get to the fun stuff:

break1:
  plugin: get
  source:
    - BreakQuantity1
    - DiscountMarkup1
break2:
  plugin: get
  source:
    - BreakQuantity2
    - DiscountMarkup2
break3:
  plugin: get
  source:
    - BreakQuantity3
    - DiscountMarkup3
break4:
  plugin: get
  source:
    - BreakQuantity4
    - DiscountMarkup4
break5:
  plugin: get
  source:
    - BreakQuantity5
    - DiscountMarkup5

The whole trick to collapsing the 10 columns into up to 5 instances of our quantity price break field, each with two sub-fields, is to rearrange the data into a format that matches the field. This involves creating two layers of dummy fields. The first layer is these 5 dummy fields, break1 - break5, which group the BreakQuantityN and DiscountMarkupN columns together. The result is 5 dummy fields that each hold a BreakQuantity and a DiscountMarkup.

One crucial point here is the order -- these are now set as an array with two values. If we take a look at the source data, the row for ItemCode 23311 will have an array that looks like:

$row->break1 = [
  0 => "9",
  1 => "5.1",
];
$row->break2 = [
  0 => "99999999",
  1 => "4.85",
];
$row->break3 = [
  0 => "0",
  1 => "0",
];
...

Remember that each of these is an indexed array, with 0 and 1 as keys.

Now we create our second layer -- a dummy field that combines the other 5 dummy fields:

breaks:
  plugin: get
  source:
    - '@break1'
    - '@break2'
    - '@break3'
    - '@break4'
    - '@break5'

Prefixing a fieldname with "@" means use the row from the destination (the one we're building up) and not the original source row -- this lets us access the dummy fields we just created. The YAML parser requires them to be quoted.

So now we have a value built up that pretty much matches the entity field structure:

$row->breaks = [
  0 => [
    0 => "9",
    1 => "5.1",
  ],
  1 => [
    0 => "99999999",
    1 => "4.85",
  ],
  2 => [
    0 => "0",
    1 => "0",
  ],
  ...
];

Now we assign this to our real destination field:

field_qty_price_breaks:
  plugin: iterator
  source: '@breaks'
  process:
    threshold:
      plugin: skip_on_empty
      method: process
      source: '0'
    price: '1'

For the actual field values, we need to map the values to the actual subfields -- and we want to skip the empty pricebreak columns. The iterator plugin lets us loop through the breaks, and then assign the subfields with further processing.

The threshold sub-field is 0 if this break is unused, so we can use "skip_on_empty" to skip it -- this time we use the "process" method to only skip this field, and not the entire row. Because the threshold is the first value in the indexed array for each pricebreak, this is the "0" source (and needs to be quoted to be interpreted as an index).

The price sub-field is simply the second value in each of the pricebreak fields, with index "1".

Migrate is cool!

We love working with the migrate module in Drupal 8 because of how powerful it is, and how easy it is to customize. We routinely create process plugins to quickly do some custom transformations, or source plugins to connect to custom data sources, and I thought I would have to do one of those to solve this problem -- but it turns out some YAML and dummy fields worked with no additional code, pretty much on the first try! For flavor, a minimal process plugin skeleton is sketched below.
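
Here's roughly what one of those custom process plugins looks like (a hypothetical example -- this one just trims whitespace, but the skeleton is the point):

<?php

namespace Drupal\mymodule\Plugin\migrate\process;

use Drupal\migrate\MigrateExecutableInterface;
use Drupal\migrate\ProcessPluginBase;
use Drupal\migrate\Row;

/**
 * Trims whitespace from a source value.
 *
 * @MigrateProcessPlugin(
 *   id = "example_trim"
 * )
 */
class ExampleTrim extends ProcessPluginBase {

  /**
   * {@inheritdoc}
   */
  public function transform($value, MigrateExecutableInterface $migrate_executable, Row $row, $destination_property) {
    // Whatever gets returned here becomes the value of the destination
    // property this plugin is mapped to in the migration YAML.
    return is_string($value) ? trim($value) : $value;
  }

}

With that in a custom module, the YAML process section can use it like any core plugin: field_name: { plugin: example_trim, source: SomeColumn }.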

Hat tip to MTech for the basic technique! If you have an improvement or suggestion for other cool migrate tricks, leave a comment below! Or if you have a data integration you need done, reach out...