How to export your Drupal 8 content to Markdown files

 

This is the second post in the series Migrate to Markdown files. The first post is How to migrate a Drupal 7 site to Markdown files.

We've been asked numerous times how to export Drupal 8 content to Markdown files. We tackled this task with Drupal Console snippets, and it was an amazing experience.

As for dependencies, one library is needed to convert the HTML body into Markdown: league/html-to-markdown (https://github.com/thephpleague/html-to-markdown).

composer require 'league/html-to-markdown'

Let's get started. First, download the scripts from here and paste the user-migrate.php and node-migrate.php scripts into a snippet directory inside the console directory, like console/snippet/.

Users

The following script generates an author.yaml file from the Drupal users in a predefined directory.

<?php

$destination_dir = 'static';
$user_dir = $destination_dir . '/user/';
$file_name = 'author.yaml';

if (!file_exists($destination_dir)) {
    mkdir($destination_dir, 0777, true);
}

$users = \Drupal\user\Entity\User::loadMultiple();

if (!file_exists($user_dir)) {
    mkdir($user_dir, 0777, true);
} else {
    delete_directory($user_dir);
    mkdir($user_dir, 0777, true);
}

$text = '';
$user_count = 0;
foreach ($users as $user) {
    if ($user->getUsername()) {
        $username = clean_text($user->getUsername());
        $text .= "- id: $username\n";

        echo "- $username\n";

        //Add custom fields here....
        $name = '';
        if ($user->get('field_user_name')->getString()) {
            $name = $user->get('field_user_name')->getString();
        }
        if ($user->get('field_user_last_name')->getString()) {
            $name .= ($name) ? ' ' . $user->get('field_user_last_name')->getString() : $user->get('field_user_last_name')->getString();
        }
        if ($name) {
            $name = clean_text($name);
            $text .= "  name: $name\n";
        }

        if ($user->get('field_user_website')->getString()) {
            $website = clean_text($user->get('field_user_website')->getString());
            $text .= "  website: $website\n";
        }

        if ($user->get('field_user_facebook')->getString()) {
            $facebook = clean_text($user->get('field_user_facebook')->getString());
            $text .= "  facebook: $facebook\n";
        }

        if ($user->get('field_user_twitter')->getString()) {
            $twitter = clean_text($user->get('field_user_twitter')->getString());
            $text .= "  twitter: $twitter\n";
        }

        if ($user->get('field_user_github')->getString()) {
            $github = clean_text($user->get('field_user_github')->getString());
            $text .= "  github: $github\n";
        }

        if ($user->get('field_user_drupal')->getString()) {
            $drupal = clean_text($user->get('field_user_drupal')->getString());
            $text .= "  user_drupal: $drupal\n";
        }

        if ($user->get('field_github_avatar')->getString()) {
            $field_github_avatar = clean_text($user->get('field_github_avatar')->getString());
            $text .= "  github_avatar: $field_github_avatar\n";
        }

        //Image
        if ($user->get('user_picture')->getString()) {
            $user_picture_url = clean_media($user->user_picture->entity->url());
            $text .= "  user_picture: $user_picture_url\n";
        }

        $user_count++;
    }
}

echo "\n$user_count users were exported.\n";

write_file($user_dir . $file_name, $text);

/*
 * Delete all directories from a passed directory.
 */
function delete_directory($dir)
{
    if (!file_exists($dir)) {
        return true;
    }
    if (!is_dir($dir) || is_link($dir)) {
        return unlink($dir);
    }
    foreach (scandir($dir) as $item) {
        if ($item == '.' || $item == '..') {
            continue;
        }
        // Retry once after relaxing permissions if the first attempt fails.
        if (!delete_directory($dir . "/" . $item)) {
            chmod($dir . "/" . $item, 0777);
            if (!delete_directory($dir . "/" . $item)) return false;
        }
    }
    return rmdir($dir);
}

/*
 * Write file function.
 */
function write_file($file_name, $text)
{
    $my_file = fopen($file_name, "w") or die("Unable to create file!");
    fwrite($my_file, $text);
    fclose($my_file);
}

/*
 * Convert media url linked to asset folder.
 */
function clean_media($str)
{
    $asset_dir = '../assets/';

    $str = str_replace('public://', $asset_dir, $str);
    $str = str_replace('default/sites/default/files/', $asset_dir, $str);

    return $str;
}

/*
 * Wrap strings of two or more words in single quotes.
 */
function clean_text($text)
{
    if (count(explode(' ', $text)) >= 2) {
        $text = "'$text'";
    }

    return $text;
}

 

Customizing the script 

The script can be adapted to your own user structure. Before running it, review the user fields defined in Drupal and check that they match the fields the script reads.
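One way to make that adaptation easier is to replace the repeated if blocks with a single field map. The following is only a sketch: the function user_fields_to_yaml and the $field_map array are hypothetical names that are not part of the original script, and the field machine names are the ones from this example site.

```php
<?php

// Stand-in for the script's clean_text(), so this sketch also runs on its own.
if (!function_exists('clean_text')) {
    function clean_text($text)
    {
        return (count(explode(' ', $text)) >= 2) ? "'$text'" : $text;
    }
}

// Hypothetical map: Drupal field machine name => key written to author.yaml.
$field_map = [
    'field_user_website' => 'website',
    'field_user_twitter' => 'twitter',
    'field_user_github'  => 'github',
];

/*
 * Build the YAML lines for every mapped field that exists and has a value.
 */
function user_fields_to_yaml($user, array $field_map)
{
    $text = '';
    foreach ($field_map as $field_name => $yaml_key) {
        // Skip fields that are not defined on this site instead of failing.
        if (!$user->hasField($field_name)) {
            continue;
        }
        $value = $user->get($field_name)->getString();
        if ($value) {
            $text .= "  $yaml_key: " . clean_text($value) . "\n";
        }
    }

    return $text;
}
```

Inside the foreach loop of the user script, the simple text fields could then be handled with `$text .= user_fields_to_yaml($user, $field_map);`, so that adapting the export to another site mostly means editing the map.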

The Markdown converter is not used in this script. It is needed later, for the node script.

From the command line, you can execute the script with Drupal Console:

drupal snippet --file=console/snippet/user-migrate.php

 

Output (static/user/author.yaml):

- id: omers
  name: 'Omar Aguirre'
  twitter: https://twitter.com/omers
  github: https://github.com/omero
  github_avatar: https://avatars.githubusercontent.com/u/1909779?v=3&s=140
  user_picture: ../assets/pictures/2015-11/profile.jpg
- id: jmolivas
  name: 'Jesus Manuel Olivas'
  website: http://jmolivas.com/
  twitter: https://twitter.com/jmolivas
  github: https://github.com/jmolivas
  github_avatar: https://avatars.githubusercontent.com/u/366275?v=3&s=140
  user_picture: ../assets/pictures/2015-12/avatar-medium.jpg

As for the exported directory structure, the author.yaml file is created in a directory called user. The destination directories and the filename are configurable in the first few lines of the script.
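A note on quoting: clean_text wraps a value in single quotes only when it contains at least two space-separated words, and it does not escape quotes inside the value. A value such as What's new would therefore produce invalid YAML. The sketch below shows one possible alternative that always quotes. The function name yaml_quote is hypothetical and not part of the original scripts.

```php
<?php

/*
 * Hypothetical alternative to clean_text(): always emit a YAML
 * single-quoted scalar.
 */
function yaml_quote($text)
{
    // Remove HTML tags and collapse line breaks so the value stays on one line.
    $text = trim(preg_replace('/\s+/', ' ', strip_tags((string) $text)));

    // Inside a single-quoted YAML scalar, a quote is escaped by doubling it.
    return "'" . str_replace("'", "''", $text) . "'";
}

echo yaml_quote("Jesus Manuel Olivas"), "\n";
echo yaml_quote("Drupal: what's new"), "\n";
```

Always quoting makes the output slightly noisier, since single-word values such as usernames get quotes too, but it avoids having to decide which plain values YAML might misread.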

Nodes

Customizing the script 

The script can be adapted to your content type structure. Add the machine names of the content types to export to the $content_types array beforehand. Then, set up your custom fields for each content type.

Certain predefined functions can be used to get a specific type of field: 

  • get_drupal_term_references

  • get_drupal_node_references

  • get_drupal_user_references

  • get_drupal_value

  • get_drupal_image_uri

  • get_drupal_link
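The three reference helpers in the script below (terms, nodes and users) share the same output rule: a single value is written inline, and several values become a YAML list. As a sketch of that rule in isolation, here is a small standalone function. The name format_front_matter_values is hypothetical and does not appear in the script.

```php
<?php

/*
 * Format one front matter entry: inline for a single value,
 * a YAML list for several, and an empty string for none.
 */
function format_front_matter_values($key, array $values)
{
    if (count($values) === 0) {
        return '';
    }
    if (count($values) === 1) {
        return "$key: " . reset($values) . "\n";
    }

    $text = "$key:\n";
    foreach ($values as $value) {
        $text .= "  - $value\n";
    }

    return $text;
}

echo format_front_matter_values('tags', ['Drupal', 'drupal8', 'DrupalPlanet']);
echo format_front_matter_values('maintainer', ['jmolivas']);
```

If you add a helper for a new field type, following the same rule keeps the front matter consistent across content types.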

 


<?php

use Drupal\user\Entity\User;

$destination_dir = 'static';
if (!file_exists($destination_dir)) {
    mkdir($destination_dir, 0777, true);
}

//HTML Converter autoload
requireAutoloader();
$converter = new League\HTMLToMarkdown\HtmlConverter();

$node_manager = \Drupal::entityTypeManager()->getStorage('node');
$paragraph_manager = \Drupal::entityTypeManager()->getStorage('paragraph');

$content_types = ['article',
    'page',
    'documentation',
    'event'];

foreach ($content_types as $content_type) {
    $nids = \Drupal::entityQuery('node')->condition('type', $content_type)->execute();
    $nodes = $node_manager->loadMultiple($nids);

    $content_type_dir = $destination_dir . '/' . $content_type . '/';
    if (!file_exists($content_type_dir)) {
        mkdir($content_type_dir, 0777, true);
    } else {
        delete_directory($content_type_dir);
        mkdir($content_type_dir, 0777, true);
    }

    // Initialized outside the if block so the summary below also works with no nodes.
    $count_nodes = 0;
    if ($nodes) {
        echo "------------------------------------------------------\n";
        foreach ($nodes as $node) {
            $text = "---\n";
            $title = clean_text($node->getTitle());
            $text .= "title: $title\n";

            echo "- $title\n";

            $date = date('Y-m-d', $node->get('created')->getString());
            $text .= "date: $date\n";

            $user_author = User::load($node->getOwnerId());
            $author = clean_text($user_author->getUsername());
            $text .= "author: $author\n";

            $path = $node->toUrl()->toString();
            $text .= "path: $path\n";

            if ($content_type == 'article') {
                if ($node->get('field_components_article')->getString()) {
                    $p_value = $node->get('field_components_article')->getValue();

                    //release - resource
                    $paragraph = $paragraph_manager->load($p_value[0]['target_id']);

                    if ($paragraph->getType() == 'release') {
                        $text .= get_drupal_value($paragraph, 'field_contributors', 'release_contributors');
                        $text .= get_drupal_value($paragraph, 'field_new_features', 'release_features');
                        $text .= get_drupal_value($paragraph, 'field_required_changes', 'release_required_changes');
                        $text .= get_drupal_value($paragraph, 'field_version', 'release_version');
                    } elseif ($paragraph->getType() == 'resource') {

                        $text .= get_drupal_term_references($paragraph, 'field_source', 'resource');
                        $text .= get_drupal_link($paragraph, 'field_link', 'resource_link');
                    }
                }

                $text .= get_drupal_image_uri($node, 'field_image', 'image');
                $text .= get_drupal_node_references($node, 'field_related_articles', 'related_articles');
                $text .= get_drupal_term_references($node, 'field_tags', 'tags');
            }

            if ($content_type == 'documentation') {
                $text .= get_drupal_link($node, 'field_epub', 'epub');
                $text .= get_drupal_value($node, 'field_language_code', 'language_code');
                $text .= get_drupal_image_uri($node, 'field_language_image', 'language_image');
                $text .= get_drupal_link($node, 'field_mobi', 'mobi');
                $text .= get_drupal_link($node, 'field_pdf', 'pdf');
                $text .= get_drupal_link($node, 'field_online', 'online');
                $text .= get_drupal_user_references($node, 'field_maintainer', 'maintainer');
            }

            if ($content_type == 'event') {
                $text .= get_drupal_value($node, 'field_event_start_date', 'start_date');
                $text .= get_drupal_value($node, 'field_event_end_date', 'end_date');
                $text .= get_drupal_value($node, 'field_event_page', 'page');
                $text .= get_drupal_value($node, 'field_event_latitude', 'latitude');
                $text .= get_drupal_value($node, 'field_event_longitude', 'longitude');
                $text .= get_drupal_user_references($node, 'field_event_attendee', 'attendee');
            }

            $text .= "---\n\n";

            //Body
            if ($node->get('body')->getString()) {
                $body_value = $node->get('body')->getValue();
                $body = clean_media(strip_tags($converter->convert($body_value[0]['value'])));

                $text .= "$body\n";
            }

            $filename = file_name($path);
            write_file($content_type_dir . $filename, $text);
            $count_nodes++;
        }
    }

    echo "\n$count_nodes nodes ($content_type) were exported.\n";
}

/*
 * Loading league/html-to-markdown library.
 */
function requireAutoloader()
{
    $autoloadPaths = array(
        __DIR__ . '/vendor/autoload.php',
        __DIR__ . '/../../../autoload.php',
    );
    foreach ($autoloadPaths as $path) {
        if (file_exists($path)) {
            require_once $path;
            break;
        }
    }
}

/*
 * Get the name for taxonomy term entity.
 */
function get_drupal_term_references($entity, $field_name, $name)
{
    $term_manager = \Drupal::entityTypeManager()->getStorage('taxonomy_term');
    $references = '';

    if ($entity->get($field_name)->getString()) {
        $names = [];

        foreach ($entity->get($field_name)->getValue() as $target) {
            $entity_load = $term_manager->load($target['target_id']);
            $names[] = $entity_load->getName();
        }

        if (count($names) == 1) {
            $references = "$name: $names[0]\n";
        } else {
            $references = "$name:\n";
            foreach ($names as $term_name) {
                $references .= "  - $term_name\n";
            }
        }

    }

    return $references;
}

/*
 * Get the path for node entity.
 */
function get_drupal_node_references($entity, $field_name, $name)
{
    $node_manager = \Drupal::entityTypeManager()->getStorage('node');
    $references = '';

    if ($entity->get($field_name)->getString()) {
        $paths = [];

        foreach ($entity->get($field_name)->getValue() as $target) {
            $entity_load = $node_manager->load($target['target_id']);
            $paths[] = $entity_load->toUrl()->toString();
        }

        if (count($paths) == 1) {
            $references = "$name: $paths[0]\n";
        } else {
            $references = "$name:\n";
            foreach ($paths as $path) {
                $references .= "  - $path\n";
            }
        }

    }

    return $references;
}

/*
 * Get the name for an user entity.
 */
function get_drupal_user_references($entity, $field_name, $name)
{
    $references = '';

    if ($entity->get($field_name)->getString()) {
        $user_names = [];

        foreach ($entity->get($field_name)->getValue() as $target) {
            $user = User::load($target['target_id']);
            $user_names[] = $user->getUsername();
        }

        if (count($user_names) == 1) {
            $references = "$name: $user_names[0]\n";
        } else {
            $references = "$name:\n";
            foreach ($user_names as $user_name) {
                $references .= "  - $user_name\n";
            }
        }
    }

    return $references;
}

/*
 * Get the value for a common text field.
 */
function get_drupal_value($entity, $field_name, $name)
{
    $value = '';

    if ($entity->get($field_name)->getString()) {
        $field_value = $entity->get($field_name)->getValue();
        $clean_field_value = clean_text($field_value[0]['value']);
        $value = "$name: $clean_field_value\n";
    }

    return $value;
}

/*
 * Get the URI for a media image.
 */
function get_drupal_image_uri($entity, $field_name, $name)
{
    $image = '';

    if ($entity->get($field_name)->getString()) {
        $uri = clean_media($entity->get($field_name)->entity->getFileUri());
        $image = "$name: $uri\n";
    }

    return $image;
}

/*
 * Get the link URL for a link field.
 */
function get_drupal_link($entity, $field_name, $name)
{
    $link = '';

    if ($entity->get($field_name)->getString()) {
        $clean_link = clean_text($entity->get($field_name)->getString());
        $link = "$name: $clean_link\n";
    }

    return $link;
}

/*
 * Delete all directories from a passed directory.
 */
function delete_directory($dir)
{
    if (!file_exists($dir)) {
        return true;
    }
    if (!is_dir($dir) || is_link($dir)) {
        return unlink($dir);
    }
    foreach (scandir($dir) as $item) {
        if ($item == '.' || $item == '..') {
            continue;
        }
        // Retry once after relaxing permissions if the first attempt fails.
        if (!delete_directory($dir . "/" . $item)) {
            chmod($dir . "/" . $item, 0777);
            if (!delete_directory($dir . "/" . $item)) return false;
        }
    }
    return rmdir($dir);
}

/*
 * Write file function.
 */
function write_file($file_name, $text)
{
    $my_file = fopen($file_name, "w") or die("Unable to create file!");
    fwrite($my_file, $text);
    fclose($my_file);
}

/*
 * Return the file name.
 */
function file_name($path)
{
    $explode_path = explode('/', $path);

    if (count($explode_path) == 3) {
        $file = $explode_path[2];
    } else {
        $file = $explode_path[1];
    }

    return $file . '.md';
}

/*
 * Convert media url linked to asset folder.
 */
function clean_media($str)
{
    $asset_dir = '../assets/';

    $str = str_replace('public://', $asset_dir, $str);
    $str = str_replace('sites/default/files/', $asset_dir, $str);

    return $str;
}

/*
 * Strip HTML tags and wrap strings of two or more words in single quotes.
 */
function clean_text($text)
{
    $text = strip_tags($text);
    if (count(explode(' ', $text)) >= 2) {
        $text = "'$text'";
    }

    return $text;
}

 

From the command line, you can execute the script with Drupal Console:

drupal snippet --file=console/snippet/node-migrate.php

Output (example article file):

---
title: 'Add DrupalConsole to a project using Acquia Lightning distribution'
date: 2016-11-28
author: jmolivas
path: /articles/add-drupalconsole-to-a-project-using-acquia-lightning-distribution
image: ../assets/2016-11/drupal-console-lightning-blog-header_0.png
tags:
  - Drupal
  - drupal8
  - DrupalPlanet
---

Lightning is a base distribution maintained by Acquia. In this short blog post, you will learn how to fix the dependency conflicts when trying to add DrupalConsole to a project using the Lightning distribution.

Lightning distribution at Drupal project https://www.drupal.org/project/lightning

### Download Lightning distribution

As you can see we are passing the `--no-install` flag, this will skip installation of the package dependencies, and it is required to avoid the conflict between dependencies.

### Download DrupalConsole

According to project instructions at [readme](https://github.com/hechoendrupal/DrupalConsole/#download-as-new-dependency) file.

This command will add the DrupalConsole dependency to the composer.json file and then resolve and download Lightning + DrupalConsole.

 

The exported directory structure contains one Markdown file (*.md) per node, named after the node's path, with the files grouped in one directory per content type.
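One caveat about the naming: file_name in the node script handles paths with one or two segments. For a deeper path such as /docs/en/intro it falls into the else branch and uses the first segment, so the file would be named docs.md. If your site has deeper path aliases, a variant based on the last segment may help. This is a sketch, and the function name file_name_from_path is hypothetical.

```php
<?php

/*
 * Hypothetical variant of file_name(): use the last path segment,
 * whatever the depth of the path alias.
 */
function file_name_from_path($path)
{
    // Keep only the path part, dropping any query string or fragment.
    $path = (string) parse_url($path, PHP_URL_PATH);

    // Split on "/" and discard the empty segments.
    $segments = array_values(array_filter(explode('/', $path), 'strlen'));

    // Fall back to a fixed name for the front page path "/".
    $file = $segments ? end($segments) : 'index';

    return $file . '.md';
}

echo file_name_from_path('/articles/my-first-post'), "\n";
echo file_name_from_path('/docs/en/intro'), "\n";
```

Two nodes of the same content type whose paths end in the same segment would still overwrite each other, so check your aliases before relying on this.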

Assets

As for images, copy the /sites/default/files folder into the new static directory under the name “assets”. The destination directory and the name of the assets directory are configurable. The clean_media function converts the default Drupal URLs to the new ones in the assets directory.

This example only supports Drupal nodes and users. If you need to export permissions, the list of modules, the taxonomies themselves, etc., you must modify the script or create another one.

Now you are ready to start a new amazing Gatsby project using the content from a D8 site.
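The copy itself can be done by hand or with a small script. The following sketch uses PHP's standard directory iterators. The function name copy_directory and the two paths in the usage note are assumptions for illustration, not part of the original scripts.

```php
<?php

/*
 * Recursively copy $source into $destination, creating directories as needed.
 * Returns the number of files copied.
 */
function copy_directory($source, $destination)
{
    if (!is_dir($destination)) {
        mkdir($destination, 0777, true);
    }

    $items = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($source, FilesystemIterator::SKIP_DOTS),
        RecursiveIteratorIterator::SELF_FIRST
    );

    $copied = 0;
    foreach ($items as $item) {
        // getSubPathname() is the current path relative to $source.
        $target = $destination . '/' . $items->getSubPathname();

        if ($item->isDir()) {
            if (!is_dir($target)) {
                mkdir($target, 0777, true);
            }
        } elseif (copy($item->getPathname(), $target)) {
            $copied++;
        }
    }

    return $copied;
}
```

Assuming the script runs from the Drupal root, a call such as `copy_directory('sites/default/files', 'static/assets');` would place the files where the ../assets/ links written by clean_media expect them, relative to each content type directory.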

Enjoy your coding!
