There are definite "gotchas" when migrating content from Drupal 6 to Drupal 8 once you take away the assumption that the ids throughout the system will remain the same. We hit another one on a recent launch: after the node migrations had started assigning new node ids through a migration map, the URL aliases imported from Drupal 6 were not rewritten to use those new node ids.

The issue: when imported content refers to a node that now has a new nid, you can normally just add a "process" plugin that runs the value through the migration map to get the correct nid in Drupal 8.

However, the upgrade_d6_url_alias migration does not deal in nids. It simply adds a "/" character to the start of the src and dst values it finds in the Drupal 6 url_alias table, and inserts the results into the corresponding url_alias table in Drupal 8.

Here are some example rows from a Drupal 6 url_alias table:

+------+-----------+---------------------------------------------+
| pid  | src       | dst                                         |
+------+-----------+---------------------------------------------+
| 2208 | node/1850 | wisconsin-state-refund-policy               |
| 2210 | node/1852 | what-does-it-cost-0                         |
| 2213 | node/1855 | indonesian-phrases-study-tour-students      |
| 2214 | node/1856 | aromatic-indonesia-study-tour-what-pack     |
| 2215 | node/1857 | dos-and-donts-travel-indonesia              |
+------+-----------+---------------------------------------------+
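
As described above, the stock alias migration only prepends a slash to each column. The following is a minimal plain-PHP sketch of that effect on two of the rows; the `$rows` array and the `prefix_slash()` helper are illustrative stand-ins, not Drupal APIs.

```php
<?php

// Two rows copied from the Drupal 6 url_alias table above.
$rows = [
  ['src' => 'node/1850', 'dst' => 'wisconsin-state-refund-policy'],
  ['src' => 'node/1852', 'dst' => 'what-does-it-cost-0'],
];

// Hypothetical helper that mimics concatenating a "/" constant with a column.
function prefix_slash(string $value): string {
  return '/' . $value;
}

$migrated = [];
foreach ($rows as $row) {
  $migrated[] = [
    'source' => prefix_slash($row['src']),
    'alias' => prefix_slash($row['dst']),
  ];
}

// The Drupal 6 nid is carried over untouched, so each alias points at
// whatever node (if any) happens to have that nid on the Drupal 8 site.
print_r($migrated);
```

Nothing in this step consults the node migration map, which is why the aliases end up attached to the wrong nodes once nids change.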

In our D6 -> D8 migration, the url_alias migration is pretty simple. All migrations specify a source and destination -- the part we're most interested in is the "process" section, which looks like this:

process:
  source:
    plugin: concat
    source:
      - constants/slash
      - src
  alias:
    plugin: concat
    source:
      - constants/slash
      - dst
  langcode:
    plugin: d6_url_alias_language
    source: language


This section essentially processes 3 fields: source, alias, and langcode. It's the source field we need to rewrite with the correct nid mapped from the node migrations.

Fortunately, in our case no users or taxonomy terms had been added on the Drupal 8 site before the final migration, so their ids still matched and we only needed to focus on nodes.

We could create our own process plugin to rewrite this all according to whatever rules we want -- but for expedience, we found in the Drupal 8 migration documentation there was a Callback process plugin that might make this a breeze.

This process plugin does not allow any extra arguments to be passed to the callback, which receives only the value being processed, but it does allow you to use a static method on a class.
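
In plain PHP terms, this is the same calling convention as `call_user_func()` with a class and method pair. The sketch below illustrates only that convention; the `ExampleUtil` class and its `addSlash()` method are made up for the example and are not part of Drupal or of our module.

```php
<?php

// Hypothetical helper class, for illustration only.
class ExampleUtil {

  // Declared static because the callable is invoked without an instance.
  public static function addSlash($value) {
    return '/' . $value;
  }

}

// A [class, method] pair is a valid PHP callable for a static method.
// It has the same shape as the two-item "callable" list in migration YAML.
$callable = ['ExampleUtil', 'addSlash'];

// The value being processed is the only argument the callable receives.
$result = call_user_func($callable, 'node/1850');

echo $result, PHP_EOL;
```

Because only one argument arrives, any extra context the function needs has to be hard-coded into the method or handled by a separate step in the process pipeline.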

The solution? Add a class file inside the module, in src/. PhpStorm and Drupal Console made this quick to do.

Next up: How do we skip the rows we don't care about, such as the taxonomy/term and user/ paths? We only care about nodes. By viewing the source code of some other migration plugins, I found the MigrateSkipRowException exception, which you can throw to skip processing of the current row.

So now we know how to get a function that will process the src URL, skip all the rows we don't care about, and return the nid so we can apply the migration plugin. After that, we need to rewrite the new nid into the new source pattern.

Unfortunately, I didn't see any way to use the previous Concat plugin to concatenate text with the nid returned by the migration plugin, so the answer was another Callback plugin, a new function I added to my little helper class.
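
The three steps can be sketched in plain PHP, outside of Drupal, to show how the value flows. Everything here is an assumption made for illustration: `SkipRow` stands in for MigrateSkipRowException, the `$nid_map` array with its new nids stands in for the lookup against the node migration map tables, and the taxonomy/term/12 and node/9999 inputs are invented test values.

```php
<?php

// Stand-in for Drupal's MigrateSkipRowException so the sketch runs on its own.
class SkipRow extends Exception {}

// Hypothetical old-nid => new-nid map. In the real migration this lookup is
// done by the migration plugin against the node migrations' map tables.
$nid_map = [1850 => 2301, 1852 => 2302];

// Step 1: keep only node paths and pull out the Drupal 6 nid.
$extract_nid = function (string $src): int {
  if (!preg_match('#^node/(\d+)#', $src, $match)) {
    throw new SkipRow('Not a node path');
  }
  return (int) $match[1];
};

// Step 2: translate the old nid to the new one (NULL if never migrated).
$lookup = function (int $old_nid) use ($nid_map): ?int {
  return $nid_map[$old_nid] ?? NULL;
};

// Step 3: rebuild the internal path, skipping rows with no mapped node.
$build_path = function (?int $new_nid): string {
  if (empty($new_nid)) {
    throw new SkipRow('No mapped value');
  }
  return '/node/' . $new_nid;
};

$results = [];
foreach (['node/1850', 'taxonomy/term/12', 'node/9999'] as $src) {
  try {
    $results[$src] = $build_path($lookup($extract_nid($src)));
  }
  catch (SkipRow $e) {
    $results[$src] = 'skipped: ' . $e->getMessage();
  }
}

print_r($results);
```

In the real migration, each step is one entry in the process pipeline, and the output of one entry becomes the input of the next.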

Once I re-loaded that config, I reran the migration with the --update flag, and it worked perfectly!

The new YAML for the source process plugin:

process:
  source:
    -
      plugin: callback
      callable:
        - '\Drupal\site_migrate\MigrateUtil'
        - filterint
      source: src
    -
      plugin: migration
      migration:
        - upgrade_d6_node_alumni_in_action
        - upgrade_d6_node_banner
        - upgrade_d6_node_blog
        - upgrade_d6_node_course
        - upgrade_d6_node_course_group
        - upgrade_d6_node_event
        - upgrade_d6_node_header_graphic
        - upgrade_d6_node_homepage_slide
        - upgrade_d6_node_job_post
        - upgrade_d6_node_landing_page
        - upgrade_d6_node_mock_panel
        - upgrade_d6_node_news
        - upgrade_d6_node_page
        - upgrade_d6_node_panel
        - upgrade_d6_node_program
        - upgrade_d6_node_quote
        - upgrade_d6_node_story
        - upgrade_d6_node_webform
    -
      plugin: callback
      callable:
        - '\Drupal\site_migrate\MigrateUtil'
        - nodepath

... And the class, in modules/custom/site_migrate/src/MigrateUtil.php:

<?php

namespace Drupal\site_migrate;

use Drupal\migrate\MigrateSkipRowException;

/**
 * Static helpers used as callables by the callback process plugin.
 */
class MigrateUtil {

  /**
   * Extracts the nid from a url_alias.src value such as "node/1850".
   *
   * @param string $value
   *   The Drupal 6 system path.
   *
   * @return string
   *   The Drupal 6 nid.
   *
   * @throws \Drupal\migrate\MigrateSkipRowException
   *   If the path is not a node path.
   */
  public static function filterint($value) {
    // Require "node/" followed by digits, so that user/ and taxonomy/term/
    // paths, and lookalikes such as nodequeue/, are skipped.
    if (!preg_match('#^node/(\d+)#', $value, $match)) {
      throw new MigrateSkipRowException('Skip row that is not a node path');
    }
    return $match[1];
  }

  /**
   * Builds an internal node path from a nid.
   *
   * @param string|int|null $value
   *   The Drupal 8 nid returned by the migration lookup.
   *
   * @return string
   *   The path '/node/[nid]'.
   *
   * @throws \Drupal\migrate\MigrateSkipRowException
   *   If no mapped nid was found.
   */
  public static function nodepath($value) {
    // The lookup returns an empty value when the node was never migrated.
    if (empty($value)) {
      throw new MigrateSkipRowException('Skip row if no mapped value');
    }
    return '/node/' . $value;
  }

}

And with just a bit of YAML and a helper class, we've fixed all the broken aliases from the previous migration!
