Exploring Markdown in Collaborative Authoring to Publishing Workflows

One of the pains in tech comm is figuring out a good collaborative authoring to publishing workflow.

When you’re authoring content, you’re usually in the figure-it-out mode. You add pieces here and there as you learn how a system works. You gather feedback from subject matter experts, who also add comments, delete or add content, and so forth as they review the content.

Sometimes the content originates from the subject matter expert and you’re the one editing the content. Regardless, during this development phase, content is in a constant flux — increasing in words, decreasing in words, changing shape and form with each iteration. Sections get added and deleted. Steps are constantly altered, rearranged, inserted and reordered, and so on.

During content development, I need a format that’s flexible and easy to work with. I also need the content on a web platform where it’s easy to share access so that I can collaborate with others.

I’ve found that Google Docs works great for collaborative authoring. Subject matter experts can highlight a word and add comments in the sidebar. Their comments appear right next to the highlighted words. I can also reply to their comments or resolve them. Resolved comments are hidden in an archive that I can always open and view later. Subject matter experts also find no problem at all originating new content in Google Docs.

I love how easy it is to carry on conversations about specific parts of the document in Google docs.

Best of all, Google Docs is free. SMEs don’t need to download special software or pay extra fees to access, write, and edit content.

Wikis are also good content development tools. Although sometimes adding comments in a system like Mediawiki isn’t quite as intuitive (there’s a separate tab on each page for comments), the wiki format has proven useful across the web for collaborative content needs of large-scale teams.

The Problem

Here’s the problem. When I’m done authoring the content, I want to port the content into another system for publishing. In my case, Drupal. Others have different publishing needs. Maybe you have a company website where the help material goes, or a help center portal, or some other platform. How do you get your content out of one system and into another, especially when that first system is the easygoing collaborative scratchpad that doesn’t have a lot of structure to pull on intelligently?

If you try a copy-and-paste route with Google Docs, you end up with a ton of span tags and other inline formatting in your code. Every paragraph has a left-to-right class associated with it. The HTML export from Confluence can be problematic as well. It all depends on just how much your authors have been using those rich text editor buttons. The more they use rich text editing, the worse the code looks.
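
To make the cleanup problem concrete, here is a minimal Python sketch of the kind of scrubbing a pasted Google Doc would need. It uses only the standard library. The sample markup, the `clean_paste` name, and the lists of tags and attributes to drop are my own assumptions for illustration, not anything Google Docs documents.

```python
from html import escape
from html.parser import HTMLParser

# Assumption: these wrappers carry no meaning, so drop the tag but keep its text.
DROP_TAGS = {"span", "font"}
# Assumption: these attributes are presentation-only in pasted content.
DROP_ATTRS = {"style", "class", "dir", "id"}


class PasteCleaner(HTMLParser):
    """Re-emit HTML without span wrappers or presentational attributes."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag in DROP_TAGS:
            return
        kept = [(k, v) for k, v in attrs if k not in DROP_ATTRS and v is not None]
        attr_text = "".join(f' {k}="{escape(v, quote=True)}"' for k, v in kept)
        self.out.append(f"<{tag}{attr_text}>")

    def handle_endtag(self, tag):
        if tag not in DROP_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Text arrives unescaped, so escape it again on the way out.
        self.out.append(escape(data, quote=False))


def clean_paste(html_text):
    """Return html_text with the noisy wrappers and attributes removed."""
    parser = PasteCleaner()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)


# Made-up sample of what a rich-text paste might look like.
pasted = (
    '<p dir="ltr" class="c1"><span style="font-weight:400">Click </span>'
    '<a href="http://yoursite.com/help" class="c2"><span>Help</span></a>.</p>'
)
print(clean_paste(pasted))
# prints: <p>Click <a href="http://yoursite.com/help">Help</a>.</p>
```

Even a small script like this is one more thing to maintain, and it only covers the tags I thought to list, which is part of why I’d rather start from a format that never picks up the cruft in the first place.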

What I need is a universal format that allows content to be transported from one system to another without any reformatting.

The need for porting content from one system to another was exactly why I was looking into DITA, because you can do this kind of system porting with DITA. But all the workflows seemed problematic. If I authored in an XML editor and output to HTML, and then copied and pasted the HTML output into Google Docs, I’d still be spending a lot of time tagging content, because the source in Google Docs would change as the content gets edited and updated by SMEs and other authors. I’d have to copy and paste the Google Doc content back into my XML editor and restructure everything with the right tags.

Vendor-provided collaborative platforms for structured content offer solutions to this problem but usually recreate the collaborative tools within their own system. And these online platforms aren’t cheap, especially if you have a lot of users.

What I want is a lightweight system for porting content around. In my brief exploration into feasible and inexpensive methods, I stumbled across Markdown. Markdown is a shorthand syntax for HTML. It’s pretty similar to Mediawiki syntax but simpler, and its main job is to transform the syntax to HTML.

But here’s the cool thing about Markdown: the syntax is readable in its syntactical form. Writers and editors can easily work with the content without having to render it to the HTML format to read it in a more practical way.

When I say readable, here are a few examples:

## Heading 2

### Heading 3

A numbered list:

1. first item
2. this is my second item
3. this is my third item.

A bulleted list:

* first item
* second item
* third item

A code example:

    <?php echo "<p>Hello World</p>"; ?>

And so on. You can read it pretty easily even though it’s in a code-like syntax. In contrast, the same content in a markup language is going to be much more cumbersome to read through. The same content in HTML would look like this:

<h2>Heading 2</h2>
<h3>Heading 3</h3>

<p>A numbered list:</p>
<ol>
<li>This is my first item.</li>
<li>This is my second item.</li>
<li>This is my third item.</li>
</ol>

<p>A bulleted list:</p>
<ul>
<li>This is my first item.</li>
<li>This is my second item.</li>
<li>This is my third item.</li>
</ul>

<p>A code example:</p>

<pre><code>&lt;?php echo "&lt;p&gt;Hello World&lt;/p&gt;"; ?&gt;</code></pre>
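
To show how mechanical the Markdown-to-HTML step is, here is a rough Python sketch that handles only the pieces used in the examples above: headings, numbered lists, bulleted lists, and plain paragraphs. The `mini_markdown` function is a made-up name and a toy, not how any real Markdown processor (including the Drupal module) is implemented. It ignores code blocks, links, nesting, and inline formatting.

```python
import re
from html import escape


def mini_markdown(text):
    """Convert headings, simple lists, and paragraphs to HTML."""
    html = []
    open_list = None  # "ol", "ul", or None when no list is open

    def close_list():
        nonlocal open_list
        if open_list:
            html.append(f"</{open_list}>")
            open_list = None

    for line in text.splitlines():
        line = line.strip()
        heading = re.match(r"(#{1,6})\s+(.*)", line)
        ordered = re.match(r"\d+\.\s+(.*)", line)
        bullet = re.match(r"[*+-]\s+(.*)", line)

        if heading:
            close_list()
            level = len(heading.group(1))
            html.append(f"<h{level}>{escape(heading.group(2))}</h{level}>")
        elif ordered or bullet:
            kind = "ol" if ordered else "ul"
            if open_list != kind:
                # Switching list type (or starting one) opens a new list.
                close_list()
                html.append(f"<{kind}>")
                open_list = kind
            item = (ordered or bullet).group(1)
            html.append(f"<li>{escape(item)}</li>")
        elif line:
            close_list()
            html.append(f"<p>{escape(line)}</p>")
        # Blank lines are skipped, so a list continues across them.

    close_list()
    return "\n".join(html)


source = """## Heading 2

A numbered list:

1. first item
2. second item

* a bullet
"""
print(mini_markdown(source))
```

In practice I’d rely on the Markdown filter in the publishing platform rather than anything homegrown. The sketch is only meant to show that the source stays readable while the conversion to HTML stays simple.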

What’s neat is that you can still write Markdown in Google Docs. Just leave that rich text editor alone. Google Docs won’t render the Markdown into HTML or anything. If it did, that would be wonderful. But it doesn’t.

The question is whether Markdown is sufficiently readable for SMEs who have no knowledge of Markdown syntax to quickly understand and work in Markdown syntax, rather than resorting to the rich text editor. Time will tell on that one — I haven’t yet piloted this little Markdown experiment.

Lots of people on the web, however, have been using Markdown for years (especially programmers). For example, if you don’t use Google Docs, you could also use Markdown with Mediawiki. The Alternate Syntax Parser Extension converts the Mediawiki syntax to Markdown syntax.

And Drupal has a Markdown filter module that allows you to use Markdown instead of HTML as the syntax.

What does this mean? Whether you’re authoring content in Google Docs, Mediawiki, Drupal, WordPress, or some other platform where Markdown is supported, you can simply copy and paste the content from one web platform to another without reformatting it. Markdown is a common language among platforms.

Granted, copy and paste kind of sucks as a best practice, but it beats reformatting the content by hand.

Also, this authoring to publishing workflow would by no means fit a large publishing house’s need, such as publishing a 200 page manual or handling translation. But if you’re in an agile environment, where you’re publishing only a handful of new articles a week to an existing web platform (which you keep adding and adding to), it just might be enough structure to simplify the authoring to publishing workflow.

By Tom Johnson

I'm a technical writer working for the 41st Parameter in San Jose, California. I'm interested in topics related to technical writing, such as visual communication, API documentation, information architecture, web publishing, JavaScript, front-end design, content strategy, Jekyll, and more. Feel free to contact me with any questions.

18 thoughts on “Exploring Markdown in Collaborative Authoring to Publishing Workflows”

  1. Neal

    That’s a nifty idea. You don’t have to justify a new purchase (possibly a challenge in a small company) or try to convince your collaborators to use a new tool.

    How are you dealing with images? Or tables?

    And do you include links in the review versions, or add them later when you’re working in the authoring/publishing tool?

    1. Tom Johnson

      So I’m still figuring out the best process. But Markdown has a syntax for both images and links.

      A link is like this:

      [link text](http://yoursite.com/page)

      An image is like this:

      ![Eiffel Tower](http://yoursite.com/eiffeltower.jpg)

      I’m planning to upload images into Drupal and grab the URLs. Then insert those URLs into my Google doc.

      I also realized that I can apply heading and bold styling to the Google doc. The formatting gets stripped out when I paste it into a plain text editor (which is what I want). But the extra formatting makes the Markdown syntax a bit more readable.

      I’ve been thinking about this simplified workflow more. DITA is great if you have a cruise ship you’re trying to steer. But it doesn’t seem like the right fit for my environment. 2 writers publishing maybe 10 articles a week to a website? Uh, this process is so much simpler.

      Also, I’ve found that engineers actually review content in Google docs. The collaborative platform makes life so much easier.

      I have to wonder how other agile shops publish content. If you’re in an agile environment, you’re publishing on a regular basis rather than storing everything up for a six month release. I may publish 5-10 pages a week, but that comes out to about 200 pages a year. The traditional waterfall method would publish all 200 pages at once, and doing that via copy and paste would be really tedious and impractical. But at the rate I’m publishing, it’s not burdensome at all.

      Re links to other non-published help content, I haven’t really explored it but I’ll probably create unpublished placeholders in Drupal while I flesh out the content in Google docs. I can link to the “node” and the link is stable. The node link remains the same regardless of the title.

      1. Neal

        That looks pretty reasonable. I’m currently writing in and publishing to one tool (Zendesk knowledge base), but using google docs would mean that reviewers wouldn’t have to log in to the site to review content.

        I might also want to pull the content into another system if I start creating online or in-product help. But that’s a more complicated issue.

  2. Ellis Pratt

    There is a bug, unfortunately, in Google Docs that means comments added via Firefox disappear. If your SMEs are going to be adding comments, you’ll probably need to tell them to use a different browser.

    You can export Confluence content to DocBook XML, and that should give you cleaner code. Sarah Maddox covered it in her post “DocBook export and import round trip with Confluence wiki” http://ffeathers.wordpress.com/2012/02/19/docbook-export-and-import-round-trip-with-confluence-wiki/

    My biggest concern with all of this is, what would happen if you left the company? Would your SMEs be faced with an authoring environment that was a black box no-one understood, resulting in the documentation being left to acquire digital cobwebs? With Google Docs, Word and wiki-like environments (also Flare, if you customise the UI) people are generally willing to add content.

    Did you try HTML Tidy on the Google Docs output? I wonder if that would result in clean enough HTML code? http://www.w3.org/People/Raggett/tidy/

    1. Tom Johnson

      I saw the Docbook export from Confluence, but that seems a bit heavy for my needs. Plus I haven’t found a Docbook import into Drupal, so I’d probably have to convert it to HTML and then import it into Drupal.

      Re cobwebs, I think my process here is the plainest and most appealing to developers. People already use Google docs for quite a bit of content. The Markdown syntax is extremely simple and many programmers already know a little about it. Copying and pasting the Markdown copy from Google docs into Drupal is as simple as it gets.

      In contrast, I just got an email today from someone at my previous organization who is trying to access the content I’d written in Flare. First they struggled to find license information. Then they didn’t understand the folder structure. If they ever get clean content out I’ll be amazed.

  3. John Tait

    There is always org-mode.

    Org gives you plain-text Markdown-style syntax, with really cool tables. You can publish to HTML, LaTeX, and many, many other formats directly from Emacs.

    You can assemble topics using the #+INCLUDE: construction.

    #+INCLUDE: "./pages/conref_tutorial.org"

    You can filter topics for export using :tags: on the headlines with these:

    #+EXPORT_SELECT_TAGS: colours

    Combining these gives you lots of control.

    A new feature in org-mode 8 (one I requested on the mailing list!) is the ability to group tags into a simple taxonomy, which gives you a bit of “subject scheme” control.

    My website is written in org-mode: you can see the top page source here: http://www.johntait.org/index.org

    A more typical page is: http://www.johntait.org/pages/conref_tutorial.org

    Org has lots of other features not related directly to publishing, such as its amazing agenda.

    1. Tom Johnson

      org-mode looks pretty powerful. I can’t believe I’d never heard of it. Thanks for the tip. I guess I’m more interested in the collaborative aspect of tools, so they need to be online and readily accessible. If you write documentation in org-mode, how do you get engineers to review it? That’s the key point I’m after.

  4. John Tait

    I should have made this explicit, but the huge advantage with org-mode (and Markdown) is that the source code is completely readable as plain text, unlike XML. (The “top level” page has the complex options, but the content pages don’t have to.)

    Markdown famously copies the email format, and org-mode is similar – you can develop drafts in an email itself. Plus you can use git, etc.

    1. Mark Baker

      I love how things have come full circle. SGML allowed you to specify markup that was completely readable as plain text. The XML folk came along and said, that’s old hat, everything is going WYSIWYG, no one will ever see the markup, we don’t need that capability anymore.

      And now here we are, with a growing consensus that WYSIWYG does not cut it any more, and a growing popularity of plain text markup forms like Markdown.

      What we have lost is that with SGML, you could have markup that was both fully structured and fully readable as plain text.

      That said, it is worth pointing out what, apart from structure, XML gives you that Markdown and its various cousins, such as wiki markup, can’t: validation. XML fully separates markup from data, so if the markup is invalid, it can always be detected. With Markdown and wiki markup, you can’t validate the markup because anything that is not detected as markup is treated as plain text. That means it can’t be validated by machine. The only validation of the markup is that it looks correct when viewed in WYSIWYG mode.

      The same was true, to a certain extent, when SGML was configured to use plain text style markup. The parser could no longer recognize all markup errors, though it might still be able to recognize structure errors.

      So it is important to ask, depending on the application, whether you need markup that can be mechanically validated.

  5. Sarah Reese

    Thanks for sharing Markdown – it’s something that’s a great solution for many people, but most aren’t aware of its existence. I didn’t have a clue what it was until late last year. I’ve been using a combination of Markdown and HTML on Drupal for several months, and I would say it has its pros and cons. We’ve integrated the MultiMarkdown extension on Drupal for our Markdown input format, as well.

    The amazing thing about Markdown is that it’s intuitive – if you move from step one to step three, Markdown fixes the issue for you on the front end (although your content in the text editor will still skip step two). It’s pretty dummy-proof and is very easy to learn – this is a great benefit because it requires little to no commitment from those SMEs who are reviewing your documentation. That’s a lifesaver. Plus, you can do a great portion of what you can do in HTML in the Markdown syntax. And if you can’t use markdown for something, no biggie (at least in Drupal) because you can choose Markdown as the input format and use bits of HTML within your Markdown-formatted content (as long as you don’t use Markdown within your HTML tags). So it’s a pretty nifty little tool.

    The downfall of Markdown (for me) is that it’s intuitive. I realize I said that’s why it’s amazing, but it’s really a double-edged sword. I find that when I add pictures, nest an unordered list within an ordered list, or even add tables, Markdown will “intuitively” begin renumbering and there’s no good way to fix this. If for some reason it doesn’t renumber itself, then you may wind up with funky breaks between your numbered items. While this doesn’t impact the efficacy of the documentation, it just doesn’t look pretty. For me, pretty content is a HUGE part of my development process. I’m all about consistent breaks and spacing. Also, any tags that one might use in HTML to format things like tables (borders, table headers filled in a different color, etc.) are not possible. From what I understand, this all has to be done in CSS styling, but once they’re done, they’re done, so it’s not a huge concern. These aren’t major issues, but they are enough to make me not use Markdown 100% of the time.

    I’ve come to use a blend of Markdown and HTML syntax based on what I’m developing. If I know that I will be doing a lot of nesting, adding images (which I don’t do often) or tables, I use HTML. If I’m doing just basic numbered lists, as most of our content requires, I’ll stick with Markdown.

    1. Tom Johnson

      Sarah, thanks for sharing your experience with Markdown. I’m excited to connect with someone else who actually uses Markdown for tech comm. Re the quirks you mentioned, this will be good to know. I’m still in my early pilot phase of using Markdown, so maybe I’ll run into the same quirks. But really the CSS styling should handle the majority of things, such as image spacing, tables, and more.

      I experimented with inserting an image between list items and didn’t see the issue you noted. Care to send me the actual code that produces the quirk?

      Re tables, I was playing around with ways to facilitate tables yesterday. You can actually apply some formatting in Google docs to facilitate visuals for SMEs (such as tables). The trick is to ensure everything comes over when you copy and paste it into plain text.

      For example, a table in markdown is like this:

      field | description
      ----- | -----------
      stuff 1 | description 1
      stuff 2 | description 2
      stuff 3 | this is a really long description that wraps to the next line down
      stuff 4 | short description

      Maybe a SME wants a more readable looking table. Fine, you can insert a general table in Google docs and just put the text in there. Make sure you keep the pipes. When you copy and paste the table into plain text, the table formatting drops out and you only have the markdown left.

      Your CSS style then renders each element: table, th, tbody, tr, td, etc.

      1. Sarah Reese

        I don’t have samples of the syntax that I was using, as this was several months ago… but I was able to view a reverted version of the end-user view in Drupal and for the example that I sent to a coworker (who is also a Markdown advocate) it was specifically an example using a table. So I went 1-4, with step 4 having the table (using Markdown syntax similar to yours above) and then what I numbered as step 5 in my editor wound up being automatically renumbered to 1, so it went 1, 2, 3, 4, 1. Combined, we spent quite a bit of time trying to troubleshoot this issue with no luck, so I decided to just move it over to HTML and save myself the frustration.

        I really like the idea of using Google docs to use for feedback – I will have to check that out. Also, if I’m able to find more examples of quirks, easy fixes, etc. within Markdown, I’ll be sure to let you know. We’ve completely revamped our knowledge base within the last year, so I’ve touched a massive amount of documents in that time and still have MANY more to go. I’m sure I’ll come across more as I’m going back through polishing everything up.

        On a side note, I’m really excited that I’ve found this site. I’m the only technical writer at my company (where I previously worked at what one might equate to Goliath National Bank) and it helps to be part of a community where I can learn from others and share ideas… so thank you!

  6. Mica Semrick

    Markdown is awesome. I write almost everything in it now. You’ll certainly want to check out pandoc (http://johnmacfarlane.net/pandoc/) if you haven’t already. It will transform your markdown into many document types.

    I’ve been trying to introduce people I collab with to git. It can be a bit much at first, but the basic commit & push is easy to understand. Github has a great graphical client now, and there are many other GUIs available for git that make its day-to-day use quite friction-less.

    1. Tom Johnson

      Mica, thanks for the tip on Markdown. I think Pandoc looks pretty amazing, though I haven’t tried it out yet. What do you use to write Markdown? Do you use a particular editor, like Sublime? I’ve been writing Markdown using Google docs, combining it with some rich text formatting solely to increase readability. When I copy over the Google docs content into a text editor, all the rich text formatting strips out but the Markdown syntax stays. So far it works pretty great.

      I have run into one limitation: links. How do you know what links will be before the pages are even published?
