Structured Authoring By For And Or Nor With In the Web
It's always fun / makes my stomach turn to wake up to a newsletter that starts out saying, "Tom Johnson's post Structured Authoring Versus the Web triggered a wave of responses across the tech comm community."
I've been thinking about that post and the discussion that followed (1, 2, 3). A lot of people have made excellent arguments in response and called me out on being short-sighted. One person noted that I should nuance my opinions with more notes and caveats for different situations.
I must say … you're right. I'm sorry. In many situations, a tech comm model that starts in DITA and publishes to a website makes sense, and I should have acknowledged that more fully.
It's the English major in me that naturally gravitates toward controversy. I like to explore alternative views and figure out what they mean for my own authoring situation. In this post, I'll make a few concessions and notes in favor of structured authoring and the web.
First, I regret using the term "structured authoring." I've since updated the post to say "structured authoring like DITA," because by many definitions I'm already doing structured authoring and wouldn't dream of anything different, especially on the web.
I mentioned that I'm using Drupal. In Drupal you can define different content types, and then create vocabularies and terms for each content type (based on your taxonomy). Using those controlled/structured vocabularies, you can leverage the content for different views, navigations, facets, and other manipulation -- all based on the semantics of your content. There's no way to do this without a controlled vocabulary and some structure.
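To make the idea of vocabulary-driven views concrete, here is a minimal sketch in plain Python. It is not Drupal's API. The `Article` class, the `facet_counts` and `filter_by_term` helpers, and the sample vocabularies ("product", "audience") are all hypothetical names invented for illustration. The point is only that once every article carries terms from a controlled vocabulary, facets and filtered views fall out of the data almost for free.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Article:
    """A hypothetical content item: a content type plus tagged terms."""
    title: str
    content_type: str  # e.g. "tutorial" or "reference"
    # Maps a vocabulary name to the set of terms applied from it.
    terms: dict[str, set[str]] = field(default_factory=dict)


def facet_counts(articles, vocabulary):
    """Count how many articles carry each term of one vocabulary."""
    counts = defaultdict(int)
    for article in articles:
        for term in article.terms.get(vocabulary, ()):
            counts[term] += 1
    return dict(counts)


def filter_by_term(articles, vocabulary, term):
    """Return the titles of articles tagged with a given term."""
    return [a.title for a in articles if term in a.terms.get(vocabulary, ())]


# Sample data, invented for this sketch.
ARTICLES = [
    Article("Install the plugin", "tutorial",
            {"product": {"widget"}, "audience": {"admin"}}),
    Article("API endpoints", "reference",
            {"product": {"widget"}, "audience": {"developer"}}),
    Article("Reset a password", "tutorial",
            {"product": {"portal"}, "audience": {"admin"}}),
]

print(facet_counts(ARTICLES, "audience"))
print(filter_by_term(ARTICLES, "product", "widget"))
```

Without the controlled vocabulary, the same filtering would have to guess at meaning from free text, which is exactly the problem structure avoids.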
And I very much dislike WYSIWYG tools. In the content area of an article, I usually work in source view, that is, without WYSIWYG, if I'm even using the content editing area at all. Usually I work in another tool altogether because content authoring in Drupal is a really poor experience.
Although many people equate structured authoring with DITA or DocBook, doing so is shortsighted. Mark Baker noted that "DITA may have a lot of currency in Tech Comm right now, but compared to the prevalence of microdocument architectures across the Web, it is a small sideshow."
The web actually provides a lot of opportunity for structured authoring because you can easily put a form to content, providing a more controlled way to wrap content with semantic tags. I am all for this kind of structure and like the directions that some leaders like Karen McGrane champion with baking semantics into web authoring. (See also this post on What you see is SMUC for more discussion.)
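As a rough illustration of "putting a form to content," the sketch below takes the fields an author might fill in on a web form and wraps each one in a semantic element. The element names (`task`, `prerequisite`, `steps`, `step`) and the `task_from_form` function are assumptions made up for this example. They are loosely task-shaped but are not valid DITA or any particular CMS's schema.

```python
import xml.etree.ElementTree as ET


def task_from_form(form):
    """Wrap form fields in semantic tags and return the XML as a string.

    `form` is a plain dict standing in for submitted form data.
    """
    task = ET.Element("task")
    ET.SubElement(task, "title").text = form["title"]
    ET.SubElement(task, "prerequisite").text = form["prerequisite"]
    steps = ET.SubElement(task, "steps")
    for step_text in form["steps"]:
        ET.SubElement(steps, "step").text = step_text
    # encoding="unicode" returns a str rather than bytes.
    return ET.tostring(task, encoding="unicode")


# Hypothetical form submission.
FORM = {
    "title": "Reset a password",
    "prerequisite": "Admin access",
    "steps": ["Open Settings.", "Click Reset."],
}

markup = task_from_form(FORM)
print(markup)
```

The author never types a tag. The form decides which field becomes which element, so the semantics come from the form's design rather than from the author's discipline.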
Another concession: I readily agree that you need a common language to move content from one system to another. It makes sense to write in structured content like DITA because through that structure you can plug the content into another system. Connectors either exist for moving content from platform to platform, or you can pay someone to build a connector, or you can learn programming and probably do it yourself -- as long as you have structure.
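Here is a small sketch of what a do-it-yourself connector might look like, assuming a simplified topic format. The sample XML, the `topic_to_payload` function, and the payload field names (`slug`, `title`, `paragraphs`) are all hypothetical. A real connector would target whatever import format the destination system actually documents.

```python
import json
import xml.etree.ElementTree as ET

# A simplified, DITA-like topic invented for this sketch.
TOPIC_XML = """<topic id="reset-password">
  <title>Reset a password</title>
  <body>
    <p>Open Settings.</p>
    <p>Click Reset.</p>
  </body>
</topic>"""


def topic_to_payload(xml_text):
    """Map a structured topic onto a dict another system could import."""
    root = ET.fromstring(xml_text)
    return {
        "slug": root.get("id"),
        "title": root.findtext("title"),
        # An empty <p/> has text None, so fall back to an empty string.
        "paragraphs": [p.text or "" for p in root.findall("./body/p")],
    }


payload = topic_to_payload(TOPIC_XML)
print(json.dumps(payload, indent=2))
```

The mapping is only this short because the source has predictable structure. With unstructured HTML or Word content, the same function would need heuristics to guess which text is the title and which is the body.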
Given how poor the authoring experience is in a web CMS -- in fact, even though this post is in WordPress, I'm typing it in Evernote instead; sometimes I use Sublime -- I feel the pain that usually gives rise to component content management systems. It makes a lot of sense to separate your authoring from your delivery platforms. I mentioned this a bit in my post on Authoring to Publishing Workflows.
Another concession: If you have a large authoring team, with a lot of writers working together on content, then yeah, you will probably want a structured authoring model like DITA to enable this collaboration with your content. And especially if you are translating and versioning this content, you need something like DITA to manage and publish it all.
And another concession: Noz Urbina pointed out that content re-use has more in-depth angles than merely reusing the same topic in multiple places. Check out this lengthy discussion on the content strategy group for different angles to explore here. Thanks, Noz.
Now for a few reflections.
I wonder why this post stirred up so many responses. Whenever people get upset, there's a bit of truth at the core. Sure, maybe people are dismayed that someone with web influence would question a DITA-to-website-publishing model.
But maybe people are upset for a different reason. Maybe publishing to a website is somewhat of a sore point. Maybe it is a bit more gnarly and complicated than it should be.
I know that whatever authoring tool you're using, getting the output into a website (not pushing out a webhelp file) is tricky business. When I used Flare for authoring, our organization's web publishing group (another department) wanted to pull the content into the org's main website. They were going to transform the XML output and pull it in -- but then no one pursued that much. So many questions cropped up, like how do you delete outdated articles once the content is on the website, how can we control/edit/update the content once it passes to the web (the gatekeepers wouldn't allow us access to their tools), how do you make the TOC match up with the web navigation, and such. In the end they wanted us to deliver Word or HTML files to them so they could manually add it to a custom web CMS with forms that wrapped the content with the right tags.
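One of those questions, how to delete outdated articles once the content is on the website, comes down to comparing what the authoring tool currently exports with what the site already holds. The sketch below is one possible way to compute that difference, not what our web group built. The `plan_sync` function, the article IDs, and the version strings are hypothetical.

```python
def plan_sync(source, published):
    """Work out what to create, update, and delete on the website.

    Both arguments map an article ID to a version marker (for example
    a content hash or a revision string).
    """
    source_ids = set(source)
    published_ids = set(published)
    return {
        # In the authoring tool but not yet on the site.
        "create": sorted(source_ids - published_ids),
        # On both sides, but the version markers differ.
        "update": sorted(
            article_id
            for article_id in source_ids & published_ids
            if source[article_id] != published[article_id]
        ),
        # Still on the site but gone from the authoring tool.
        "delete": sorted(published_ids - source_ids),
    }


# Hypothetical state of each side.
SOURCE = {"install": "v2", "api-endpoints": "v1", "faq": "v1"}
PUBLISHED = {"install": "v1", "api-endpoints": "v1", "old-release-notes": "v3"}

plan = plan_sync(SOURCE, PUBLISHED)
print(plan)
```

Computing the plan is the easy part. The harder questions in our case were organizational: who is allowed to run the delete, and in whose tool.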
I think that many people who do create content in DITA and publish out to websites often do it via a component content management system and some custom publishing workflows -- an expense that a large enterprise could probably justify. Or they build their own custom tools to accommodate these workflows. (Companies like Google, Apple, Twitter, etc., often do this.) Other custom scripts and connectors are also available.
So when I initially raised the question of connecting structured content with website publishing, it kind of exposed a key issue. I think tech writers feel somewhat on their own when it comes to getting content out of their tools and into (not as a link on) their company's website. We want to maintain our own toolsets, and we often end up running the whole authoring-to-publishing workflow. The result is a content set that is often siloed from other org content -- a little webhelp frame in its own separate system. Bridging this gap in an easy way remains a strong need in our discipline.
I hope the discussions will continue on this topic. And by all means, when I'm wrong, let me know in the comments, even if it makes my stomach turn. My views evolve and change. I can be persuaded and dissuaded on a great many topics. I admit that I often champion the underdog, and I love the possibility of open source movements over expensive vendor solutions. I'll try to make more balanced concessions in future posts. In the meantime, thanks to Larry, Ellis, Noz, Rahel, Scott, John, Michael, Don, Connie, Neal, and dozens of others who have chimed in on this discussion.
About Tom Johnson
I'm an API technical writer based in the Seattle area. On this blog, I write about topics related to technical writing and communication — such as software documentation, API documentation, AI, information architecture, content strategy, writing processes, plain language, tech comm careers, and more. Check out my API documentation course if you're looking for more info about documenting APIs. Or see my posts on AI and AI course section for more on the latest in AI and tech comm.
If you're a technical writer and want to keep on top of the latest trends in tech comm, be sure to subscribe to email updates below. You can also learn more about me or contact me. Finally, note that the opinions I express on my blog are my own points of view, not those of my employer.