We decided to migrate our old 7sabores blog from Drupal 7 to Markdown files to be consumed by GatsbyJS.
The first question was, is there any contributed module available to do that easily?
The answer was no. Even for D7, finding a module with this behaviour is tricky. The only solution was a combination of PHP scripts, the D7 bootstrap, and a Markdown library.
Let’s get started. First, download the scripts from here. Copy user-migrate.php and node-migrate.php into your D7 root, then install the dependencies.
Regarding dependencies, we need a library to convert the HTML body into Markdown. The league/html-to-markdown package (https://github.com/thephpleague/html-to-markdown) does exactly that.
The following command installs the package and updates composer.json:
composer require 'league/html-to-markdown'
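Once the package is installed, a quick smoke test can confirm that the converter loads before the migration scripts are run. The sketch below assumes Composer placed its autoloader at vendor/autoload.php next to the script; the helper name html_to_markdown_or_null is made up for this illustration, and it returns null when the library cannot be found.

```php
<?php
// Hypothetical helper (not part of the migration scripts): convert an HTML
// fragment to Markdown, or return null when the library is not installed.
function html_to_markdown_or_null($html)
{
    // Assumption: Composer installed its autoloader next to this script.
    $autoload = __DIR__ . '/vendor/autoload.php';
    if (file_exists($autoload)) {
        require_once $autoload;
    }
    if (!class_exists('League\HTMLToMarkdown\HtmlConverter')) {
        return null;
    }
    $converter = new League\HTMLToMarkdown\HtmlConverter();
    return $converter->convert($html);
}

$markdown = html_to_markdown_or_null('<p>Hello <strong>world</strong></p>');
echo ($markdown === null)
    ? "league/html-to-markdown is not installed.\n"
    : $markdown . "\n";
```

If the message about the missing library appears, check that composer require was run in the same directory as the scripts.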
Let’s break down each major component required to migrate our Drupal 7 data to Markdown files. In the following examples, we export only users and nodes from Drupal.
Users
The following script generates an author.yaml file from the Drupal users in a predefined directory.
<?php
// Load the Drupal bootstrap.
define('DRUPAL_ROOT', getcwd());
$_SERVER['REMOTE_ADDR'] = '127.0.0.1';
require_once DRUPAL_ROOT . '/includes/bootstrap.inc';
drupal_bootstrap(DRUPAL_BOOTSTRAP_FULL);
// Configure directories.
$destination_dir = '../static';
$user_dir = $destination_dir . '/user/';
$file_name = 'author.yaml';
if (!file_exists($destination_dir)) {
    mkdir($destination_dir, 0777, true);
}
// Load every user, oldest first.
$uids = db_select('users', 'u')
    ->fields('u', array('uid'))
    ->orderBy('u.created', 'ASC')
    ->execute()
    ->fetchCol();
$users = user_load_multiple($uids);
// Start from an empty user directory.
if (file_exists($user_dir)) {
    delete_directory($user_dir);
}
mkdir($user_dir, 0777, true);
$text = '';
$user_count = 0;
foreach ($users as $user) {
    // Skip accounts without a name (such as the anonymous user).
    if ($user->name) {
        $username = clean_text($user->name);
        echo "- $username\n";
        $text .= "- id: $username\n";
        // Full name: first name plus last name, when present.
        $name = '';
        if (!empty($user->field_nombre['und'][0]['value'])) {
            $name = $user->field_nombre['und'][0]['value'];
        }
        if (!empty($user->field_apellido['und'][0]['value'])) {
            $name = trim($name . ' ' . $user->field_apellido['und'][0]['value']);
        }
        if ($name) {
            $name = clean_text($name);
            $text .= "  name: $name\n";
        }
        if (!empty($user->field_pais['und'][0]['value'])) {
            $country = clean_text($user->field_pais['und'][0]['value']);
            $text .= "  country: $country\n";
        }
        if (!empty($user->field_estado_provincia['und'][0]['value'])) {
            $province = clean_text($user->field_estado_provincia['und'][0]['value']);
            $text .= "  province: $province\n";
        }
        $user_count++;
    }
}
write_file($user_dir . $file_name, $text);
echo "\n$user_count users were exported.\n";
/*
 * Recursively delete a directory and everything inside it.
 */
function delete_directory($dir)
{
    if (!file_exists($dir)) {
        return true;
    }
    if (!is_dir($dir) || is_link($dir)) {
        return unlink($dir);
    }
    foreach (scandir($dir) as $item) {
        if ($item == '.' || $item == '..') {
            continue;
        }
        if (!delete_directory($dir . "/" . $item)) {
            // Retry once after relaxing the permissions.
            chmod($dir . "/" . $item, 0777);
            if (!delete_directory($dir . "/" . $item)) {
                return false;
            }
        }
    }
    return rmdir($dir);
}
/*
 * Write $text to $file_name, replacing any existing content.
 */
function write_file($file_name, $text)
{
    $my_file = fopen($file_name, "w") or die("Unable to create file!");
    fwrite($my_file, $text);
    fclose($my_file);
}
/*
 * Wrap strings of two or more words in single quotes.
 */
function clean_text($text)
{
    if (count(explode(' ', $text)) >= 2) {
        $text = "'$text'";
    }
    return $text;
}
Customizing the script
The script is tailored to the user fields defined on our site (field_nombre, field_apellido, field_pais, field_estado_provincia). Before running it, review the user structure of your Drupal site and adjust the field names so that they match.
The user script does not use the Markdown converter; it is needed only by the node script described below.
As for the exported directory structure, an author.yaml file is created inside a directory called “user”. The destination directories and the file name are configurable in the first lines of the script.
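When adapting the field checks to another site, a small accessor can cut down on the repeated ['und'][0]['value'] lookups. The following is a minimal sketch, not part of the original scripts: field_value is a hypothetical helper, and the stdClass object only stands in for a loaded Drupal user so the example runs without Drupal.

```php
<?php
// Hypothetical helper: read the first value of a Drupal 7 field, or return
// a default when the field is missing or empty.
function field_value($entity, $field_name, $default = '')
{
    // 'und' is the value of Drupal 7's LANGUAGE_NONE constant.
    if (!isset($entity->{$field_name}['und'][0]['value'])) {
        return $default;
    }
    $value = trim($entity->{$field_name}['und'][0]['value']);
    return ($value === '') ? $default : $value;
}

// Stand-in for a loaded user object, so the sketch runs without Drupal.
$user = new stdClass();
$user->name = 'johndoe';
$user->field_nombre = array('und' => array(array('value' => 'John')));
$user->field_apellido = array('und' => array(array('value' => 'Doe')));
$user->field_pais = array();

$name = trim(field_value($user, 'field_nombre') . ' ' . field_value($user, 'field_apellido'));
echo "name: $name\n";
echo "country: " . field_value($user, 'field_pais', 'n/a') . "\n";
```

Because isset() is used, a field that does not exist on the site simply yields the default instead of raising a notice.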
Running user exporting script
Using PHP CLI you can execute the script:
php user-migrate.php
Output (static/user/author.yaml):
- id: admin
  name: 'Admin Admin'
  country: US
- id: johndoe
  name: 'John Doe'
  country: CR
  province: Heredia
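The clean_text function only quotes values that contain a space, so a value with an apostrophe or a colon could still produce invalid YAML. As a rough sketch of a stricter approach, the hypothetical yaml_scalar function below quotes any value with characters that YAML may interpret and doubles embedded single quotes. It is an assumption-laden illustration, not an exhaustive YAML escaper (it does not handle reserved words such as yes or null).

```php
<?php
// Hypothetical stricter variant of clean_text(): quote any value that plain
// YAML could misread, and escape embedded single quotes by doubling them.
function yaml_scalar($text)
{
    // Collapse line breaks so the value stays on one line.
    $text = trim(str_replace(array("\r", "\n"), ' ', $text));
    // Risky: empty values, whitespace, YAML indicator characters, or a
    // leading "-" or "?".
    $risky = ($text === '')
        || preg_match('/[\s:#\'"\[\]{},&*!|>%@`]|^[-?]/', $text);
    if (!$risky) {
        return $text;
    }
    return "'" . str_replace("'", "''", $text) . "'";
}

foreach (array('admin', "John O'Brien", 'Drupal: a CMS') as $value) {
    echo '- id: ' . yaml_scalar($value) . "\n";
}
```

Swapping this in for clean_text would change how some existing values are quoted, so it is worth comparing the generated files before and after.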
Nodes
This script is a little more complex than the user script. The content type is passed as an argument; if the content type has no custom fields, the script exports only the common node fields, e.g. title, date, author, path, and summary/body.
<?php
// Load the Drupal bootstrap.
define('DRUPAL_ROOT', getcwd());
$_SERVER['REMOTE_ADDR'] = '127.0.0.1';
require_once DRUPAL_ROOT . '/includes/bootstrap.inc';
drupal_bootstrap(DRUPAL_BOOTSTRAP_FULL);
$destination_dir = '../static';
if (!file_exists($destination_dir)) {
    mkdir($destination_dir, 0777, true);
}
// Load the HTML converter through Composer's autoloader.
requireAutoloader();
$converter = new League\HTMLToMarkdown\HtmlConverter();
// The content type is a required argument.
if (!isset($argv[1])) {
    echo "error: A content type is required.\n";
    exit(1);
}
$content_type = $argv[1];
// Query the published nodes of that type, oldest first.
$nids = db_select('node', 'n')
    ->fields('n', array('nid'))
    ->fields('n', array('type'))
    ->condition('n.status', 1)
    ->condition('n.type', $content_type)
    ->orderBy('n.created', 'ASC')
    ->execute()
    ->fetchCol();
$nodes = node_load_multiple($nids);
// Start from an empty directory for this content type.
$content_type_dir = $destination_dir . '/' . $content_type . '/';
if (file_exists($content_type_dir)) {
    delete_directory($content_type_dir);
}
mkdir($content_type_dir, 0777, true);
$count_nodes = 0;
if ($nodes) {
    foreach ($nodes as $node) {
        // Common fields, written as YAML front matter.
        $title = clean_text($node->title);
        echo "- $title\n";
        $text = "---\n";
        $text .= "title: $title\n";
        $date = date('Y-m-d', $node->created);
        $text .= "date: $date\n";
        $user_author = user_load($node->uid);
        $author = clean_text($user_author->name);
        $text .= "author: $author\n";
        $path = drupal_get_path_alias('node/' . $node->nid);
        $text .= "path: $path\n";
        // Blog custom fields
        if ($content_type == 'blog') {
            // Term load example
            if (!empty($node->field_tema['und'])) {
                $text .= "topics:\n";
                foreach ($node->field_tema['und'] as $topic) {
                    $taxonomy = taxonomy_term_load($topic['tid']);
                    $taxonomy_name = clean_text($taxonomy->name);
                    $text .= "  - $taxonomy_name\n";
                }
            }
            // Getting a file URI
            if (!empty($node->field_cover['und'])) {
                $uri = clean_media($node->field_cover['und'][0]["uri"]);
                $text .= "cover: $uri\n";
            }
        }
        // Lesson custom fields
        if ($content_type == 'lesson') {
            if (!empty($node->field_vimeo_free['und'][0]['vimeo'])) {
                $vimeo = $node->field_vimeo_free['und'][0]['vimeo'];
                $text .= "vimeo: $vimeo\n";
            }
            if (!empty($node->field_tema['und'])) {
                $text .= "topics:\n";
                foreach ($node->field_tema['und'] as $topic) {
                    $taxonomy = taxonomy_term_load($topic['tid']);
                    $taxonomy_name = clean_text($taxonomy->name);
                    $text .= "  - $taxonomy_name\n";
                }
            }
        }
        if (!empty($node->body['und'][0]['summary'])) {
            $summary = clean_media(strip_tags($converter->convert($node->body['und'][0]['summary'])));
            $text .= "summary: $summary\n";
        }
        $text .= "---\n\n";
        // The body follows the front matter, converted to Markdown.
        if (!empty($node->body['und'][0]['value'])) {
            $body = clean_media(strip_tags($converter->convert($node->body['und'][0]['value'])));
            $text .= "$body\n";
        }
        $filename = file_name($path);
        write_file($content_type_dir . $filename, $text);
        $count_nodes++;
    }
    echo "\n$count_nodes nodes ($content_type) were exported.\n";
}
/*
 * Load the league/html-to-markdown library.
 */
function requireAutoloader()
{
    $autoloadPaths = array(
        __DIR__ . '/vendor/autoload.php',
        __DIR__ . '/../../../autoload.php',
    );
    foreach ($autoloadPaths as $path) {
        if (file_exists($path)) {
            require_once $path;
            break;
        }
    }
}
/*
 * Recursively delete a directory and everything inside it.
 */
function delete_directory($dir)
{
    if (!file_exists($dir)) {
        return true;
    }
    if (!is_dir($dir) || is_link($dir)) {
        return unlink($dir);
    }
    foreach (scandir($dir) as $item) {
        if ($item == '.' || $item == '..') {
            continue;
        }
        if (!delete_directory($dir . "/" . $item)) {
            // Retry once after relaxing the permissions.
            chmod($dir . "/" . $item, 0777);
            if (!delete_directory($dir . "/" . $item)) {
                return false;
            }
        }
    }
    return rmdir($dir);
}
/*
 * Write $text to $file_name, replacing any existing content.
 */
function write_file($file_name, $text)
{
    $my_file = fopen($file_name, "w") or die("Unable to create file!");
    fwrite($my_file, $text);
    fclose($my_file);
}
/*
 * Return the file name: the second segment of the path alias plus ".md"
 * (blog/my-post gives my-post.md).
 */
function file_name($path)
{
    $explode_path = explode('/', $path);
    return $explode_path[1] . '.md';
}
/*
 * Rewrite Drupal media URLs so that they point to the assets folder.
 */
function clean_media($str)
{
    $asset_dir = '../assets/';
    $str = str_replace('public://', $asset_dir, $str);
    $str = str_replace('/sites/default/files/styles/large/public/', $asset_dir, $str);
    $str = str_replace('sites/default/files/styles/medium/public/', $asset_dir, $str);
    return $str;
}
/*
 * Wrap strings of two or more words in single quotes.
 */
function clean_text($text)
{
    if (count(explode(' ', $text)) >= 2) {
        $text = "'$text'";
    }
    return $text;
}
Customizing the script
The script can be adapted to any content type structure: add a block like the blog and lesson ones for each content type that has custom fields.
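As the number of content types grows, the repeated string concatenation becomes harder to maintain. One possible approach, sketched here with a hypothetical build_front_matter helper that is not part of the original script, is to collect the fields in an array and render them in one place. The sketch assumes the values were already quoted where necessary (for example with clean_text).

```php
<?php
// Hypothetical helper: render an associative array as YAML front matter.
// Nested arrays become YAML lists; empty values are skipped.
function build_front_matter(array $fields)
{
    $text = "---\n";
    foreach ($fields as $key => $value) {
        if ($value === null || $value === '' || $value === array()) {
            continue;
        }
        if (is_array($value)) {
            $text .= "$key:\n";
            foreach ($value as $item) {
                $text .= "  - $item\n";
            }
        } else {
            $text .= "$key: $value\n";
        }
    }
    return $text . "---\n\n";
}

echo build_front_matter(array(
    'title' => "'Hello world'",
    'date' => '2013-11-06',
    'topics' => array('Drupal', 'Gatsby'),
    'cover' => '', // Skipped because it is empty.
));
```

With such a helper, each content type block only has to add its own keys to the array before the file is written.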
Running the script
php node-migrate.php blog
Output (static/blog/abrir-carpeta-sublime-text-consola.md):
---
title: 'Como abrir una carpeta con Sublime Text desde consola en mac'
date: 2013-11-06
author: johndoe
path: blog/abrir-carpeta-sublime-text-consola
topics:
  - Programación
---
Assets
Regarding images, copy the /sites/default/files folder into the new static directory under the name assets. The destination directory and the name of the assets directory are configurable. The clean_media function is responsible for converting the default Drupal URLs to the new ones in the assets directory.
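The copy can be done by hand or scripted. Below is a minimal sketch of a recursive copy in plain PHP; copy_directory is a hypothetical helper, and the demo runs against a throwaway temporary directory rather than a real Drupal site.

```php
<?php
// Hypothetical helper: recursively copy $source into $destination and
// return the number of files copied.
function copy_directory($source, $destination)
{
    if (!is_dir($destination)) {
        mkdir($destination, 0777, true);
    }
    $count = 0;
    foreach (scandir($source) as $item) {
        if ($item == '.' || $item == '..') {
            continue;
        }
        $from = $source . '/' . $item;
        $to = $destination . '/' . $item;
        if (is_dir($from)) {
            $count += copy_directory($from, $to);
        } elseif (copy($from, $to)) {
            $count++;
        }
    }
    return $count;
}

// Demo on a throwaway directory so the sketch runs without a Drupal site.
$base = sys_get_temp_dir() . '/assets-demo-' . uniqid();
mkdir($base . '/files/field/image', 0777, true);
file_put_contents($base . '/files/field/image/cover.png', 'fake image data');
$copied = copy_directory($base . '/files', $base . '/static/assets');
echo "$copied file(s) copied.\n";
```

On a real site, and assuming the script is run from the D7 root with the default directories used above, the call would look like copy_directory('sites/default/files', '../static/assets').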
Results
The exported directory structure contains one *.md file per node, named after its path, with the files grouped into one directory per content type.
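The file_name function in the node script takes the second segment of the path alias, which works for aliases such as blog/my-post but not for aliases with one segment or more than two. As a hedged alternative, the hypothetical alias_to_filename function below uses the last segment instead; the index fallback for an empty path is an assumption made for this sketch.

```php
<?php
// Hypothetical alternative to file_name(): build the file name from the
// last segment of the path alias, whatever its depth.
function alias_to_filename($path)
{
    // Drop empty segments caused by leading, trailing, or double slashes.
    $segments = array_filter(explode('/', $path), 'strlen');
    $last = end($segments);
    // Assumption: fall back to "index" when the path has no segments.
    return ($last === false ? 'index' : $last) . '.md';
}

foreach (array('blog/abrir-carpeta-sublime-text-consola', 'about', 'a/b/c') as $alias) {
    echo $alias . ' -> ' . alias_to_filename($alias) . "\n";
}
```

Note that two aliases ending in the same segment would map to the same file name, so this only suits sites where the last segment is unique within a content type.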
This example supports only nodes and users from Drupal. If you need to export permissions, the list of modules, the taxonomies themselves, or anything else, you must modify the scripts or create new ones.
Now you are ready to start a new Gatsby project using the content from a D7 site.
Enjoy your coding!