DITA: Author in Markdown, publish with DITA

Markdown drastically simplifies the authoring process. As long as you don't need conref and other DITA-specific elements (which aren't available in Markdown), you can use this super simple authoring process.

This Markdown to DITA process is intended to support a workflow where you begin drafting documentation in Markdown as you gather content. When you're mostly finished authoring content, you can transform it all to DITA and then start working with it there.

Prerequisites

These instructions are intended for Mac users, but it shouldn't be too hard to do the same on a PC.

Process Markdown to HTML

To automate the conversion from Markdown to HTML:

  1. Create a folder to store your project files. I'll call mine ditaqrg.

  2. Create a Markdown file and add it to this folder. Make sure it uses the .md extension.
  3. In this same folder, create a file named ditaqrg.sh (you can choose whatever name you want) and add this code:

    multimarkdown -f -b *.md
    cd /Applications/oxygenAuthor_161/frameworks/dita/DITA-OT/plugins/h2d
    ant -Dargs.input=/Users/tjohnson/projects/ditaqrg -Dargs.output=/Users/tjohnson/projects/ditaqrg
    cd /users/tjohnson/projects/ditaqrg

    Change “ditaqrg” to whatever project name you chose to use. You will need to customize the Oxygen h2d path and your input and output directories to match your specific locations. I explain this code in more detail below.

  4. Open Terminal and cd to your project folder.

  5. Type the following to change the permission to give read/write/execute to the file:

    chmod ugo+rwx ditaqrg.sh

    chmod changes the file permissions. ugo stands for “user, group, others” (the three classes of users that can access the file), and rwx means “read, write, execute.” The last argument is the file we want to apply these permissions to (ditaqrg.sh).

  6. Type ls -l ditaqrg.sh. Verify that the file now has rwxrwxrwx permissions. (The -l adds more detail in the response.)

  7. Type ./ditaqrg.sh (or whatever you named your script) to run the file.

If you look in the folder, you will see your .md files now have corresponding .html and .dita files.
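If running the script fails with a "permission denied" error instead, the permission change from steps 5 and 6 is the first thing to recheck. Here is a minimal sketch that repeats those two steps on a throwaway file so you can see what a successful result looks like. The temporary folder and the demo.sh name are placeholders of mine, not part of the workflow.

```shell
#!/bin/sh
# Repeat the permission steps on a scratch file (demo.sh is a placeholder name).
workdir=$(mktemp -d)
printf '#!/bin/sh\necho "script ran"\n' > "$workdir/demo.sh"

# Same command as step 5: give user, group, and others read/write/execute.
chmod ugo+rwx "$workdir/demo.sh"

# Same check as step 6: the first ten characters of ls -l are the mode string.
perms=$(ls -l "$workdir/demo.sh" | cut -c1-10)
echo "$perms"

# The shell's own test for "can I execute this file?"
if [ -x "$workdir/demo.sh" ]; then executable="yes"; else executable="no"; fi
echo "executable: $executable"
```

Strictly speaking, only the execute bit for your own user is needed to run the script, so chmod u+x would also work.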

Note that each time you run the script, the DITA files are regenerated from the Markdown files, overwriting any changes you made on the DITA side. If you don't want this automated process to keep overwriting a DITA file, remove the corresponding Markdown file.
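If you have already started editing the generated DITA files, another way to guard against the overwrite is to copy them aside before rerunning the script. This is only a sketch: the dita_backup folder name and the sample file are my own placeholders, and the scratch folder stands in for your real project folder.

```shell
#!/bin/sh
# Sketch: copy existing .dita files into a dated backup folder before rerunning.
# project_dir and dita_backup are placeholder names, not part of the original script.
project_dir=$(mktemp -d)          # stand-in for your real project folder
echo "<topic/>" > "$project_dir/sample.dita"

backup_dir="$project_dir/dita_backup/$(date +%Y%m%d-%H%M%S)"
mkdir -p "$backup_dir"

copied=0
for f in "$project_dir"/*.dita; do
    [ -e "$f" ] || continue       # skip the unexpanded glob when no .dita files exist
    cp "$f" "$backup_dir/"
    copied=$((copied + 1))
done
echo "Backed up $copied file(s) to $backup_dir"
```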

Code explanation

The heart of this transformation lies with this script:

multimarkdown -f -b *.md
cd /Applications/oxygenAuthor_161/frameworks/dita/DITA-OT/plugins/h2d
ant -Dargs.input=/Users/tjohnson/projects/ditaqrg -Dargs.output=/Users/tjohnson/projects/ditaqrg
cd /users/tjohnson/projects/ditaqrg

What is this code doing? I'll go through it line by line.

multimarkdown -f -b *.md

First, the multimarkdown line processes your Markdown files into HTML using MultiMarkdown (written by Fletcher Penney). The -f parameter says to create a full header in the HTML document. This means the HTML files will begin with the following:

<!DOCTYPE html>
<html>
<head>
    <meta charset="utf-8"/>
</head>
<body>

This is necessary for the ant scripts to run. The -b runs a batch conversion, so it will process all files in the folder. The *.md restricts the conversion to any files with an .md extension, which is the traditional file extension for Markdown files.
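Because the ant step depends on that full header, it can be worth checking the generated HTML before moving on. The sketch below uses grep -L, which lists files that do not contain a pattern. The two sample files are made up here purely so the check has something to run against; in practice you would run the grep line in your project folder.

```shell
#!/bin/sh
# Sketch: flag HTML files that lack the full header produced by `multimarkdown -f`.
html_dir=$(mktemp -d)             # stand-in for your project folder

# Two sample files: one with a full header, one that is only a fragment.
printf '<!DOCTYPE html>\n<html>\n<head>\n<meta charset="utf-8"/>\n</head>\n<body>\n<p>ok</p>\n</body>\n</html>\n' > "$html_dir/full.html"
printf '<p>fragment only</p>\n' > "$html_dir/fragment.html"

# grep -L prints the names of files with no matching line.
missing=$(cd "$html_dir" && grep -L '<!DOCTYPE html>' *.html)
echo "Files without a full header: $missing"
```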

cd /Applications/oxygenAuthor_161/frameworks/dita/DITA-OT/plugins/h2d

Now the script changes to the h2d directory. In this directory, you have a build.xml file supplied by OxygenXML that will convert your content from HTML to DITA. Customize this path to match your own OxygenXML installation directory.

ant -Dargs.input=/Users/tjohnson/projects/ditaqrg -Dargs.output=/Users/tjohnson/projects/ditaqrg

On this line, ant runs the HTML to DITA transformation, with arguments that specify the input and output directories.

Note that you must have an HTML file in the input directory for the script to work. If you don't, you'll see a message that says “build failed - could not create temp directory.” You don't need to manually create the temp directory – just put an HTML file in the input directory.
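A small guard near the top of the script can turn that confusing build failure into a clearer message. This is a sketch under my own assumptions: has_html is a hypothetical helper name (it is not part of DITA-OT or Oxygen), and the temporary folders only exist to demonstrate both outcomes.

```shell
#!/bin/sh
# Sketch: check for at least one .html file before calling ant.
# has_html is a hypothetical helper, not part of DITA-OT or Oxygen.
has_html() {
    for f in "$1"/*.html; do
        [ -e "$f" ] && return 0   # found a real file
    done
    return 1                      # the glob matched nothing
}

empty_dir=$(mktemp -d)            # no HTML files in here
ready_dir=$(mktemp -d)            # one HTML file in here
touch "$ready_dir/topic.html"

if has_html "$empty_dir"; then empty_status="ok"; else empty_status="no html"; fi
if has_html "$ready_dir"; then ready_status="ok"; else ready_status="no html"; fi
echo "empty: $empty_status, ready: $ready_status"
```

In the real script, you would call has_html on your input directory just before the ant line and exit with your own message if it fails.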

If your HTML file were in this h2d folder, and your Terminal location was at the h2d folder path, you would now only need to type ant in the command line, and the build.xml file also located in this same folder would run the h2d transform to convert the HTML file to a DITA topic type. That's how ant works – when you type ant on the command line, it looks for a build.xml file in that same folder and executes the build file using the default arguments of the build.xml file.

However, it's unlikely that you want to store all your Markdown files and HTML files in this deeply nested Oxygen installation directory. In this example, we add some arguments to the ant command to specify a different input and output directory from the default settings.

The -Dargs.input specifies the directory containing the files you want to transform. If you point to the directory, all HTML files in that directory (the ones generated from your Markdown) will be processed. If you point to a specific file, only that file will be processed.

The -Dargs.output specifies the output directory for the transformed files. In this case, I want them to be output to the same directory as the Markdown files.

There's another argument you might want to add: -Dargs.infotype. This argument specifies the topic type to convert the HTML into. Options here include topic, task, concept, or reference. The default (if you don't specify the option) is topic. So you actually don't need this argument if you're just converting to topics, which is what I recommend.

There's one more argument to be aware of: -Dargs.include.subdirs=yes. This argument says to look in subdirectories for files. I don't use subdirectories to organize my DITA files (except for images), so this argument isn't that relevant to me. If you don't specify this argument, the default value is “no”.
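Putting these arguments together, here is a sketch of how the ant line might look with the optional arguments spelled out. It only prints the command rather than running it, so you can inspect it first. The variable names are mine; the paths are the ones from this post and need to be changed to match your setup.

```shell
#!/bin/sh
# Sketch: assemble the ant command with the optional arguments, then print it.
# Variable names are illustrative; the path is the example used in this post.
project_dir="/Users/tjohnson/projects/ditaqrg"
infotype="topic"          # topic, task, concept, or reference
include_subdirs="no"      # "yes" to look in subdirectories

ant_cmd="ant -Dargs.input=$project_dir -Dargs.output=$project_dir"
ant_cmd="$ant_cmd -Dargs.infotype=$infotype"
ant_cmd="$ant_cmd -Dargs.include.subdirs=$include_subdirs"

# Print instead of executing; run it from the h2d directory once it looks right.
echo "$ant_cmd"
```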

Two other arguments include the default language and the XSL file. The arguments are defined in Migrating HTML to DITA with Ant script. Note that in that article, it shows the following:

ant -Dargs.input={file|direcotry} -Dargs.output={direcotry} -Dargs.infotype={topic|concept|task|reference}

The "direcotry" misspellings are not mine. Also, the article never explains that "{file|directory}" means you can enter either a file or a directory, or that you leave out the braces themselves.
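To make the placeholder syntax concrete, here are the two ways the input argument could be filled in, with the braces dropped. This is a sketch that just prints both forms: topic.html is a hypothetical file name, and the directory is the example path from this post.

```shell
#!/bin/sh
# Sketch: the {file|directory} placeholder expanded both ways, braces removed.
# topic.html is a hypothetical file name; the directory is this post's example path.
input_dir="/Users/tjohnson/projects/ditaqrg"
input_file="$input_dir/topic.html"

# Directory form: point args.input at the whole folder.
dir_form="ant -Dargs.input=$input_dir -Dargs.output=$input_dir"
# File form: point args.input at one specific file.
file_form="ant -Dargs.input=$input_file -Dargs.output=$input_dir"

echo "$dir_form"
echo "$file_form"
```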

cd /users/tjohnson/projects/ditaqrg

After the script runs, we return to the same directory we were in at the beginning. You can type ls to list the contents of the directory. You will then see a .dita file corresponding to each HTML file in the directory. If you group your files by kind, you won't see the files in triplicate as you browse them.
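Rather than eyeballing the listing, you could have the shell confirm that every Markdown file picked up a DITA counterpart. A minimal sketch, assuming each .dita file keeps the same base name as its .md source, and using made-up file names in a scratch folder:

```shell
#!/bin/sh
# Sketch: report Markdown files that have no matching .dita file.
# Assumes the .dita output shares the base name of the .md source.
check_dir=$(mktemp -d)            # stand-in for your project folder
touch "$check_dir/intro.md" "$check_dir/intro.html" "$check_dir/intro.dita"
touch "$check_dir/setup.md" "$check_dir/setup.html"   # setup.dita is missing on purpose

unconverted=""
for md in "$check_dir"/*.md; do
    [ -e "$md" ] || continue      # skip the unexpanded glob when no .md files exist
    base=${md%.md}                # strip the .md extension
    if [ ! -e "$base.dita" ]; then
        unconverted="$unconverted $(basename "$md")"
    fi
done
echo "Not yet converted:$unconverted"
```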

Limitations

You can't use DITA markup within Markdown and have it survive the transforms. For example, if you put a note element in your Markdown file, it will not survive the conversion to DITA.
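Before converting, you could scan your Markdown for DITA markup that will not make it through. The pattern below is just an illustration that looks for note elements and conref attributes; extend it to whatever markup you tend to reach for. The two sample files are invented for the demonstration.

```shell
#!/bin/sh
# Sketch: list Markdown files containing DITA markup that will be lost.
scan_dir=$(mktemp -d)             # stand-in for your project folder
printf '# Plain topic\n\nJust text.\n' > "$scan_dir/plain.md"
printf '# Mixed topic\n\n<note>Will not survive.</note>\n' > "$scan_dir/mixed.md"

# grep -l prints only the names of matching files; -E enables the | alternation.
flagged=$(cd "$scan_dir" && grep -lE '<note[ >]|conref=' *.md)
echo "Files with DITA markup: $flagged"
```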

About Tom Johnson

I'm an API technical writer based in the Seattle area. On this blog, I write about topics related to technical writing and communication — such as software documentation, API documentation, AI, information architecture, content strategy, writing processes, plain language, tech comm careers, and more. Check out my API documentation course if you're looking for more info about documenting APIs. Or see my posts on AI and AI course section for more on the latest in AI and tech comm.
