How to Search Engine Optimize (SEO) Your Help Documentation
Continuing with my ongoing analysis of why people can't find answers in help, we come to #4:
The user searches for the answer, but the help's poor SEO prevents the answer from surfacing.
This reason received 81 votes -- the third most voted reason.
I've written about search engine optimization (SEO) in the past, but this time I want to focus my attention more on SEO of help material rather than web pages in general. You can get info on SEO from a near infinite number of sources online, but few of them address SEO in the context of documentation.
Tech writers don't have much time for SEO
First, although I know that basic SEO involves analyzing your users' keywords and then optimizing your pages with those keywords, as well as figuring out clever ways to get links back to your pages (preferably from external sites), few technical writers have time to do this.
Most of us are working in agile environments with rapid publishing cycles. Just getting help ready for the next release consumes most of our time. We're not worrying about whether we're using the right keywords and if we're repeating them in optimal places on the page. Heck no! We're hoping that we've covered all the information bases and that the material is clear, accurate, and integrated with the rest of the documentation before release.
Given that we focus more on information than search engine optimization (SEO), how can we still adequately address SEO with help material?
Publishing platforms do matter
While technical writers may not have time to address SEO in their material, the publishing architecture of documentation is something that can be controlled. If your publishing platform is optimized for search, then you can minimize the time needed to SEO your content and instead focus on content creation.
I suspect that many help platforms have poor SEO. In a MadCap Software webinar earlier this week, Justin Brock noted how he addressed SEO with his Flare output. He shifted away from iframes in his output by changing the structure of the URLs (using /content/index.htm rather than /Default.htm). He also incorporated Twitter Bootstrap and his own navigation system based on snippets in the output.
Justin noted a few other tips in his webinar, like paying attention to the URL hierarchy, but his point about iframes being bad news for SEO seemed most salient. If your help platform uses iframes and you're counting on Google to surface your content, your SEO is probably minimal. (By the way, check out Justin's Flarestrap project if you're using Flare.)
iframes are better than framesets, but both have a negative impact on search visibility. An iframe basically pulls content from another source into your page dynamically. When you open up a help topic, right click, and view the source, if you can't find the text of the page you're reading, then chances are the publishing mechanism uses an iframe for the content.
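The view-source check described above can be roughly automated. Here's a minimal sketch using only Python's standard library. The sample page, the file path in it, and the function name audit_page are all made up for illustration; this is not how any particular help authoring tool or search engine works.

```python
from html.parser import HTMLParser


class IframeFinder(HTMLParser):
    """Collects the src attribute of every <iframe> or <frame> tag."""

    def __init__(self):
        super().__init__()
        self.frame_sources = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names before calling this method.
        if tag in ("iframe", "frame"):
            self.frame_sources.append(dict(attrs).get("src"))


def audit_page(html_source, visible_phrase):
    """Report whether a phrase is in the raw source and which frames exist."""
    finder = IframeFinder()
    finder.feed(html_source)
    return {
        "phrase_in_source": visible_phrase.lower() in html_source.lower(),
        "frame_sources": finder.frame_sources,
    }


# Hypothetical help page: the topic text lives in a separate file,
# so the phrase a reader sees on screen is absent from this source.
sample = """
<html><body>
  <div id="toc">Contents</div>
  <iframe src="Content/install.htm"></iframe>
</body></html>
"""

report = audit_page(sample, "To install the product")
```

If phrase_in_source comes back False and frame_sources isn't empty, the text you're reading is probably being loaded from one of those frame sources rather than from the page itself.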
iframes are valid techniques for HTML, but I think Google tries to rank the original source of information -- following back to the original source rather than the embedded instance. If the original source leads to some dark corner of your publishing files, one that is stripped of context, pulling the content out on its own may appear odd.
Why do help tools use iframes? I think many help authoring tools use iframes in order to accommodate the table of contents (TOC) in the sidebar. To enable a TOC that gives you rapid navigation through folders, allowing you to expand and collapse and jump from one article to the next, the TOC is usually decoupled from the main content window.
While this may make documentation more navigable via a TOC, we must remember that just as many people, if not more, use search instead of the TOC to find content. As such, avoid any iframes that hide information from search if you want your content to be findable there.
Use web platforms to maximize web visibility
If you want to maximize the visibility of your content on the web, use an established web platform that doesn't hide content in iframes. Platforms like MediaWiki, WordPress, Drupal, and myriad other web platforms do a great job at ensuring maximum visibility in Google.
Some of these platforms even take into consideration things like "canonical" links. In WordPress, you can get to the same content a number of different ways -- through the most recent posts on the homepage, date-based archives, tags, categories, search, and the single post view. When Google indexes your site, it knows which view to prioritize through a canonical link. The canonical link identifies which content makes it into the canon, which is your primary source for the content.
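To see which URL a page declares as canonical, you can look for the link element with rel="canonical" in the page's head. The following is a small sketch with Python's standard library; the two sample pages and the example.com URLs are invented, and real platforms may emit the tag with different attributes or ordering.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Remembers the href of the first <link rel="canonical"> tag."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        rel_values = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel_values and self.canonical is None:
            self.canonical = attr.get("href")


def find_canonical(html_source):
    """Return the canonical URL declared in the page, or None."""
    finder = CanonicalFinder()
    finder.feed(html_source)
    return finder.canonical


# Two hypothetical views of the same post: a category archive and the
# single post view. Both point to the single post as the canonical URL.
archive_view = '<head><link rel="canonical" href="https://example.com/install-guide/" /></head>'
single_view = '<head><link rel="canonical" href="https://example.com/install-guide/"></head>'

same_canon = find_canonical(archive_view) == find_canonical(single_view)
```

When every view of the same content reports the same canonical URL, a search engine has a clear hint about which address to treat as the primary one.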
Changing links isn't good for Google indexing and ranking
Another benefit of web platforms is that links are more stable and fixed. This may seem like a disadvantage at first. You may like how your help authoring tool updates all of your links automatically, or changes the names of your URLs dynamically when you update a cross-reference, and such. But here's the problem. If your links constantly change, you have to reindex your site on Google with the new links. In my experience, Google doesn't like it when you're constantly changing the URL of a page.
For example, at my last job, for a couple of years I had all my content on MediaWiki. Then I switched it to Flare and published to HTML5 webhelp and changed the server location. I did all the things necessary to optimize the SEO for the new webhelp -- creating a sitemap, pointing links to the new content from regularly indexed sites, decommissioning the old site content, adding the sitemap to Google Webmaster Tools, etc.
And then I waited. After a few weeks, the content ranked on the fourth or fifth page of results on Google, while the old MediaWiki pages consistently remained at the top. I think it took more than a year for Google to move the new content to the first page of results, and even now the rankings don't outperform basic blog articles I'd written about the product on Joomla.
The lesson I learned is that while Google may index your site, if you constantly change the URLs and location of the help, it may take a long time (i.e., months) for Google to rank the content appropriately. It's better to stick with the same page URL and location for your help content.
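For reference, a sitemap like the one mentioned in this example is just an XML file listing your page URLs. Here's a minimal sketch that builds one with Python's standard library, following the sitemaps.org format. The help.example.com URLs are placeholders, and a real sitemap would usually be written to a sitemap.xml file with an XML declaration at the top.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML string with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Placeholder URLs for a hypothetical help site.
sitemap = build_sitemap([
    "https://help.example.com/install/",
    "https://help.example.com/configure/",
])
```

The sitemap only tells Google which pages exist. As the experience above suggests, it doesn't carry over the ranking that the old URLs had earned.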
Web platforms usually don't make it easy to update links across your entire site -- and that's a good thing. If you want to update a URL, it's a manual process, with no guarantee that you won't break all other links pointing to that page.
Some web platforms are actually considerate of old links in an SEO friendly way. When you update a page on Drupal with a new URL, Drupal automatically creates a 301 redirect of the previous URL to the new URL. The 301 redirect lets Google know where the new content is.
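The idea behind a 301 redirect can be sketched as a lookup table from old paths to new ones. This is only an illustration of the concept, not Drupal's implementation; the paths, the respond function, and the redirect table below are all hypothetical.

```python
from http import HTTPStatus

# Hypothetical mapping of retired URLs to their replacements.
REDIRECTS = {"/help/Default.htm": "/help/content/index.htm"}
PAGES = {"/help/content/index.htm"}


def respond(path, redirects, existing_pages):
    """Return (status, headers) for a request path."""
    if path in redirects:
        # 301 tells browsers and crawlers the move is permanent.
        return HTTPStatus.MOVED_PERMANENTLY, {"Location": redirects[path]}
    if path in existing_pages:
        return HTTPStatus.OK, {}
    return HTTPStatus.NOT_FOUND, {}


old_status, old_headers = respond("/help/Default.htm", REDIRECTS, PAGES)
new_status, _ = respond("/help/content/index.htm", REDIRECTS, PAGES)
```

The point is that the old URL keeps answering with a pointer to the new location instead of a "not found" error.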
Few people link to help because help is unsexy
While we're on the topic of links, I should also mention that Google's algorithm is heavily based on backlinks, that is, the number of links pointing to a particular page. In fact, backlinks are the original genius of Google's search. While the first search engines ranked pages by keyword frequency and location, Google thought to weight pages based on how many other pages linked to the page, and judged the page's content partly by the text and context of those links.
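To make the backlink idea concrete, here's a toy sketch that counts how many distinct pages link to each page in a tiny, made-up link graph. This is a deliberate simplification: it only counts inbound links and is not Google's actual ranking algorithm, which weighs many more signals.

```python
from collections import Counter


def inbound_link_counts(link_graph):
    """Count distinct pages linking to each target, ignoring self-links."""
    counts = Counter()
    for source, targets in link_graph.items():
        for target in set(targets):
            if target != source:
                counts[target] += 1
    return counts


# Hypothetical pages and the pages they link to.
links = {
    "blog/post": ["help/install", "help/faq"],
    "forum/thread": ["help/install"],
    "help/install": ["help/install"],  # a self-link, which is not counted
}

counts = inbound_link_counts(links)
```

In this toy graph, the install topic has two backlinks and the FAQ has one, so a backlink-weighted ranking would favor the install topic.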
This is a problem for documentation, since few people actually link to help material. There are a few reasons for this. First, if your help output uses frames and doesn't show unique links for each page, users may not be able to create unique links to topics at all.
Second, help material usually isn't sexy enough to link to. Unless there is some excitement or tremendously useful information on the page, why would anyone link to it? Help is simply too boring to merit a link.
Third, many of the topics in a help file are small and part of a larger context of a book-like paradigm. You don't see many of the standalone, Smashing Magazine-type articles full of rich content, visual eye candy, and useful hacks. Commonly, the help topic is a plain text task that consists of simple steps and nothing more.
In short, most technical writing styles don't encourage people to link to help content. If no one links to the help content, the help content's SEO suffers significantly in Google.
Article titles that include product names can help with SEO
Another way that web platforms outperform tech pub outputs is with heading titles. With a web platform, all of your content is on the same site. This means you can't have vague topic titles like "Introduction." Instead, if you have multiple products on your site, you usually need to preface the title with something like "ACME Product: Introduction" in order to separate information about one product from another. I wrote about this technique previously here: Subpage Titles on Wikis: Challenges, Conventions, and Compromises.
Although I initially thought this was a disadvantage, I've now come to see it as an advantage of web platforms in search. An article that has the product name in the title is so much more findable than one with a generic title.
With a help tool that might produce half a dozen outputs for each user role, you don't have to be so specific with topic titles. "Introduction" might be the name of the lead-off topic for every single output. How is Google supposed to know which introduction belongs to which product and role?
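A quick way to spot this problem is to list the titles across all of your outputs and flag the ones that repeat. The sketch below does that and shows the product-prefix convention described earlier. The product names and the helper names (qualified_title, ambiguous_titles) are invented for the example.

```python
from collections import Counter


def qualified_title(product, topic):
    """Prefix a topic title with its product name."""
    return f"{product}: {topic}"


def ambiguous_titles(titles):
    """Return titles that appear more than once, sorted alphabetically."""
    counts = Counter(titles)
    return sorted(title for title, n in counts.items() if n > 1)


# Hypothetical topics from two product outputs, as (product, topic) pairs.
topics = [
    ("ACME Product", "Introduction"),
    ("ACME Reports", "Introduction"),
    ("ACME Product", "Installing"),
]

before = ambiguous_titles(topic for _, topic in topics)
after = ambiguous_titles(qualified_title(p, t) for p, t in topics)
```

Before prefixing, "Introduction" shows up twice. After prefixing, every title is unique, which gives a search engine something to tell the pages apart.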
Google has a lot more content than a help file
Perhaps the most challenging part of SEO is simply finding results for user keywords. In Your Content Isn't the Internet, and Your Search Page Isn't Google, Ben Minson points out that one reason people find information on Google but not in help files is because Google has trillions of pages in its index, whereas a help file just has a few hundred pages.
If a user searches for "how to build a didgeridoo" in your help file, but you don't have anything about didgeridoos, the user won't find any content. In contrast, if a user searches for "how to build a didgeridoo" on Google, and Google has indexed blog pages that explain how to build didgeridoos, then the user will find content.
This doesn't mean that content is more findable on Google. It means more content is findable when it exists in the place that a user searches for it. If your help file had information on building didgeridoos, the user would likely locate it through search as well.
Although this seems a simple concept, few users seem to get this point. It doesn't mean you should put all your content on Google, but you should be strategic about using Google. If your product has widespread commercial applicability and many users do post help information from their sites, then yeah, your help needs to be in this same space because users will most likely also search Google for answers to related questions, and you want to be found there.
But if your product is not generally available and users aren't posting tips, then does it really matter if your product's help is on Google? Users who come up empty handed on Google will turn to more specialized sources for information.
Of course it's always best to be on Google, for a host of marketing reasons. People who search Google and don't find answers may assume the answers don't even exist in other sources (because Google is, after all, the master repository of all known information). But sometimes products and their information assets are proprietary and confidential, so they aren't publicly available.
You're probably stuck with your platform, so make the best of it
Regardless of the viability of your platform's architecture, you're probably stuck with what you've got, so how can you make the best of it? One easy way to make your help more Google-like is to incorporate Google Site Search. You can see how Justin Brock has implemented Google Site Search here. At the very least, when you incorporate Google Site Search, you start to see how Google sees your pages, because you will most likely use the site search to look for content.
Another step would be to start looking at search phrases and keywords within your search analytics. Just what are users looking for? How you capture the keywords and other search strings depends on your platform, but getting the information and letting it inform how you write is key. (See Looking at Search Analytics to Improve Search for more information.)
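As a rough illustration of what you might do with search analytics, the sketch below tallies the most frequent queries and the queries that returned nothing. The log format (a query paired with a result count) is an assumption; what your platform actually exports will differ.

```python
from collections import Counter


def summarize_searches(log, limit=3):
    """Return (most frequent queries, most frequent zero-result queries)."""
    # Normalize case and whitespace so variants count as the same query.
    normalized = [(query.strip().lower(), hits) for query, hits in log]
    all_queries = Counter(query for query, _ in normalized)
    empty_queries = Counter(query for query, hits in normalized if hits == 0)
    return all_queries.most_common(limit), empty_queries.most_common(limit)


# Hypothetical search log: (what the user typed, number of results shown).
search_log = [
    ("reset password", 4),
    ("Reset Password ", 4),
    ("reset password", 4),
    ("export csv", 0),
    ("export csv", 0),
    ("didgeridoo", 0),
]

top_queries, zero_result_queries = summarize_searches(search_log)
```

The zero-result list is the interesting one: it shows the words users reach for that your help doesn't currently answer.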
Another step would be to consider a solid web platform for publishing your help content. When you're shopping for documentation tools, don't just consider whether the tool has HTML output. Look at whether the pages have unique URLs, whether the content uses iframes, whether you can generate site maps, whether it uses word-friendly URLs or some random number in the URL, how the title tags get rendered, and more.
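One item on that checklist, word-friendly URLs, is easy to spot-check across a list of published pages. Here's a rough heuristic, and only a heuristic: it treats a URL as word-friendly if the last path segment contains at least one run of three or more letters. The function name and the sample URLs are made up.

```python
import re
from urllib.parse import urlparse


def looks_word_friendly(url):
    """Heuristic: does the last path segment contain a readable word?"""
    segments = [s for s in urlparse(url).path.split("/") if s]
    if not segments:
        return False
    # Drop a trailing file extension such as .htm before checking.
    last = re.sub(r"\.\w+$", "", segments[-1])
    return bool(re.search(r"[a-z]{3,}", last.lower()))


# Hypothetical URLs from two different help outputs.
readable = looks_word_friendly("https://help.example.com/acme/install-the-agent.htm")
numeric = looks_word_friendly("https://help.example.com/topic/48213.htm")
query_only = looks_word_friendly("https://help.example.com/?id=48213")
```

A check like this won't tell you whether the words in the URL are the right keywords, only whether there are words at all.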
Like, SEO is so "2009" anyway
Finally, realize that with social media trends, it turns out SEO is becoming less of a big deal after all. According to The Guardian,
… a recent Forrester report on how consumers found websites in 2012 shows that social media is catching up with search, accounting for 32% of discoveries versus 54% for search, according to the US respondents, up from 25% in 2011. The trend towards localised results delivered to mobile users, perhaps via an app rather than a web page, is another reason why traditional SEO is decreasingly important. (See SEO is dead. Long live social media optimization.)
In other words, Twitter, Facebook, LinkedIn, and dozens of other social networks are becoming the go-to places for users to find information. Results from these social network databases don't appear on Google. In fact, more and more real estate in Google's search results is taken up by paid ads, making the organic element (and hence SEO) less apparent, especially on mobile devices.
As such, one way to improve your findability is by listening and interacting on social networks. Anne Gentle has a good post on this here: The Big Shift from Search to Social.
Here are some related links for more information on SEO:
- The “Home Depot Model” of Findability, or, Social Search
- Figuring Out Search Engine Algorithms
- Increasing Search Engine Optimization in Flare
- Looking at Search Analytics to Improve Search
If you have tips for search engine optimizing your help content, let me know in the comments below.
About Tom Johnson
I'm an API technical writer based in the Seattle area. On this blog, I write about topics related to technical writing and communication — such as software documentation, API documentation, AI, information architecture, content strategy, writing processes, plain language, tech comm careers, and more. Check out my API documentation course if you're looking for more info about documenting APIs. Or see my posts on AI and AI course section for more on the latest in AI and tech comm.