
    Implementing Faceted Classification/Search with a Help Authoring Tool [Organizing Content 7]

    May 21st, 2010 | Posted in blog | 23 Comments

    This entry is part 7 of 51 in the series Findability

    In my last post, I presented faceted classification and faceted search as an alternative method of organization for help content. While faceted navigation systems are common on the web, implementing one for help content with a common help authoring tool, such as Flare, RoboHelp, Author-it, or Doc-to-Help, is more challenging.

    Faceted Browsing

    According to Tony Self, one of the strengths of a help authoring tool (HAT) is the table of contents (TOC) feature. Through the TOC, you can easily create a quick system of navigation.

    It’s common to use this TOC to create a system of topic-based containers for users to navigate, but there’s no reason why you couldn’t instead dedicate the folders to facets. In the prototype below, each of the “Browse by …” books in the left pane represents a facet by which users can navigate the help content.

    Faceted browsing system showing alternative methods of navigation

    Users can browse the content by the following facets:

    • Browse by Topic. Topics are arranged in the traditional topic-based, hierarchical containers.
    • Browse by Role. Topics are arranged by role (super agent and regular agent), and then broken down by topic containers.
    • Browse by Skill Level. Topics are divided into two books, Advanced and Beginner, and then most likely broken down by topic containers.
    • Browse by Popularity. Only the top 20 most popular topics are listed.
    • Browse by Concept or Task. Topics are divided into concept and task groupings, and then probably broken down by topic containers.
    • Browse by Status. Topics are organized by status. (In Swordfish, statuses play a prominent role for operations. Depending on the status of the operation, you can perform certain tasks, so this provides a way to organize the tasks.)
    • Browse by Help Format. Topics are divided into video, diagram, and FAQ groupings.
    • Browse by Problem. Topics are arranged by problem, somewhat like a troubleshooting grid.
    • Browse by Screen. Topics are arranged by screen. These are the context-sensitive help topics.

    Almost every HAT allows you to create multiple TOCs. Each of these facets is simply a secondary TOC that is integrated into the master TOC.

    One of the problems with these facets is that they don’t entirely get away from the topic-based containers that I am resisting. With 200 topics, I can’t simply divide all topics in the help into two groupings, such as Advanced and Beginner. A second-tier facet is also necessary, and the only second-tier facet that makes sense is the topic container.

    However, some inclusion of topic containers isn’t necessarily a bad thing. The main reason topic containers fail is the abundance of topics. In small help systems, the topic containers provide easy navigation and aren’t as frustrating for users. The first facet, such as Advanced or Beginner, breaks down this large number of topics into a smaller subset, which makes the navigation more feasible.
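
    The two-tier structure described above (a facet book first, topic containers second) can be sketched as a small data structure. This is only an illustration in Python, not how any HAT stores its TOC files; the topic titles, container names, and function names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical topic records: a title, a topic container, and two facet values.
TOPICS = [
    {"title": "Create an operation", "container": "Operations",
     "role": "Regular Agent", "level": "Beginner"},
    {"title": "Approve an operation", "container": "Operations",
     "role": "Super Agent", "level": "Advanced"},
    {"title": "Add an informant", "container": "Informants",
     "role": "Regular Agent", "level": "Beginner"},
]


def build_facet_toc(topics, facet):
    """Group topics by facet value, then by topic container (the second tier)."""
    toc = defaultdict(lambda: defaultdict(list))
    for topic in topics:
        toc[topic[facet]][topic["container"]].append(topic["title"])
    # Convert to plain dicts so the result is easy to print and compare.
    return {value: dict(containers) for value, containers in toc.items()}


def build_master_toc(topics, facets):
    """Merge one secondary TOC per facet into a master TOC of 'Browse by' books."""
    return {
        f"Browse by {label}": build_facet_toc(topics, key)
        for label, key in facets.items()
    }


master = build_master_toc(TOPICS, {"Role": "role", "Skill Level": "level"})
```

    Note that the same topic appears once under every facet book, which is the point of the approach and also the source of the duplication issues that come up with TOC syncing.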

    Faceted Search

    When a user clicks the search tab and searches for an item, the user can narrow the search results by using similar facets, as shown in the following image.

    Faceted Search

    I’m not sure how it would be implemented in other HATs, but in MadCap Flare, you can tag each topic with keywords (this is called inserting Concepts). You can then add this set of tags/concepts below the search box by adding a “Search Filter Set” to your help.

    The challenge here is to provide the same facets that users get by browsing. Tags work a bit differently than TOCs, but the idea is the same. You can limit the search results by the following:

    • Regular Agent Role
    • Super Agent Role
    • Advanced Level
    • Beginner Level
    • Conceptual Topics
    • Step-by-Step Topics
    • Most Asked About
    • Video Format
    • Diagram Format
    • FAQ Topic

    Not all of the TOC facets end up as tags. The Browse by Role facet has two roles, so I can tag topics within these books as either a Regular Agent Role or Super Agent Role.

    The Browse by Skill Level facet has two levels, Beginner and Advanced, so I can tag the topics within this facet as Beginner Level or Advanced Level.

    For the Browse by Concept or Task facet, I can tag the topics as Conceptual Topics or Step-by-Step Topics.

    The Browse by Popularity facet is tagged as Most Asked About (or Most Popular).

    Finally, the Browse by Help Format facet allows me to tag the topics as Video Format, Diagram Format, or FAQ Topic (these are the only three main formats in the help material apart from the concept/task distinction).
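
    To make the idea concrete, here is a minimal sketch of tag-based filtering in Python. It is not how Flare implements Search Filter Sets; it is an in-memory illustration that assumes each topic carries a set of concept tags. The topic data and the search function are hypothetical, and the tag names follow the list above.

```python
# Hypothetical topics, each tagged with concepts from the filter list.
TOPICS = [
    {"title": "Run a black operations campaign",
     "body": "Steps for planning and launching a campaign.",
     "concepts": {"Super Agent Role", "Advanced Level", "Step-by-Step Topics"}},
    {"title": "What are operations?",
     "body": "An overview of operations and their statuses.",
     "concepts": {"Regular Agent Role", "Beginner Level", "Conceptual Topics"}},
    {"title": "Watch: coordinating informants",
     "body": "A short video on working with informants.",
     "concepts": {"Regular Agent Role", "Beginner Level", "Video Format"}},
]


def search(topics, query, filters=()):
    """Return titles that contain every query word and carry every selected tag."""
    words = query.lower().split()
    required = set(filters)
    results = []
    for topic in topics:
        text = (topic["title"] + " " + topic["body"]).lower()
        matches_query = all(word in text for word in words)
        matches_filters = required <= topic["concepts"]  # subset test
        if matches_query and matches_filters:
            results.append(topic["title"])
    return results


all_hits = search(TOPICS, "operations")
beginner_hits = search(TOPICS, "operations", filters=["Beginner Level"])
```

    Selecting more than one filter narrows the results further, because a topic must carry every selected tag to stay in the list.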

    Shortcomings of Faceted Search

    One of the most difficult problems with setting up faceted navigation is identifying facets for information topics. With merchandise or other products, you can often identify a clear set of attributes that define the product. With shoes, you can classify and filter the shoes by brand, style, color, cost, gender, sport, size, and other qualities.

    With Google, you can classify the content by a plethora of information formats, including news, books, videos, images, maps, discussions, shopping, blogs, Twitter (“updates”), and more.

    In a library, you can classify books by author, publication date, genre, period, subject, book type, etc.

    Help topics don’t seem to have a set of clearly identifiable facets. Some of the facets that you could classify a help topic with — for example, length, format, reading level, information type, corresponding screen, etc. — aren’t that helpful to users.

    Help content is, however, rich in topics. But using topic tags as the attributes seems like the same game as the topic-based, hierarchical containers. For example, look at the faceted search filters that I’ve circled in the following image.

    Topic facets on a search for a topic seem redundant and confusing to me

    If a user searches the help for “run a black operations campaign,” what benefit is there in limiting the results to the facet “Related to black ops”? Won’t the results already contain black ops topics?

    Further, what if a user searches for “coordinate informants for black ops”? Which facet would I then choose to limit the results — Related to Black Ops or Related to Informants? It’s the same problem as the topic-based, hierarchical containers.

    Searching for a topic and then limiting the results by topics seems redundant to me.
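
    One way to check this redundancy is to count how many of the current results carry each facet tag: a tag carried by every result cannot narrow the list any further. The sketch below is a hypothetical illustration in Python; the result titles, tag assignments, and function name are assumptions, not data from the actual help system.

```python
def narrowing_power(results, facet_tags):
    """For each facet tag, count how many of the current results carry it."""
    return {
        tag: sum(1 for result in results if tag in result["concepts"])
        for tag in facet_tags
    }


# Hypothetical results for the query "run a black operations campaign".
RESULTS = [
    {"title": "Run a black operations campaign",
     "concepts": {"Related to Black Ops", "Step-by-Step Topics"}},
    {"title": "Black operations overview",
     "concepts": {"Related to Black Ops", "Conceptual Topics"}},
    {"title": "Cancel a black operations campaign",
     "concepts": {"Related to Black Ops", "Step-by-Step Topics"}},
]

counts = narrowing_power(
    RESULTS,
    ["Related to Black Ops", "Step-by-Step Topics", "Conceptual Topics"],
)

# A tag that every result already carries adds nothing as a filter.
redundant = [tag for tag, count in counts.items() if count == len(RESULTS)]
```

    In this made-up result set, the topic tag matches everything the query already returned, while the non-topic tags (concept versus task) still split the list.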

    Moreover, if these facets are simply topics, there’s a good chance that you’ll end up with dozens of them. Too many topics in the Search Filter Set will bloat the number of search facets beyond what is usable. If you look at the search filter set in Flare’s online help, you’ll see about 80 different topics, which seems excessive.

    It’s clear that faceted classification and faceted search have benefits for many product websites, as well as for many information-heavy websites with many types of content. But the lack of a clear set of facets for help material makes it more challenging to implement in a way that is clearly beneficial.


    23 Responses to “Implementing Faceted Classification/Search with a Help Authoring Tool [Organizing Content 7]”

    1. wade courtney says:

      I would love to figure out a way to do this with AIT.

      • Tom Johnson says:

        Sorry, I’m not an Author-it user, so I can’t comment. But from what I hear, Author-it is an incredibly powerful tool. Surely you can tag the content in some way or another.

    2. Larry Kunz says:

      Tom, it seems like there are two questions here: How do we organize the help topics in a way that makes them useful, and how can we get the HATs to do that for us?

      For the first question, I’m reminded of the talk Rachel Lovinger gave at the STC Summit about the semantic web. It seems to me that the same ideas that people are using to build the semantic web — especially setting up an ontology, or a standard way of creating classifications and relationships for topics — could help us solve the problems you describe here. I know way too little to be able to suggest specifics. But it seems promising.

      I see that Rachel has posted her slides on Slideshare:
      http://www.slideshare.net/rlovinger/theres-no-semantic-web-without-content-and-data-3987048

      • Tom Johnson says:

        That’s a great PowerPoint presentation on the semantic web. Thanks for pointing me to the link. I flipped through all 101 slides. I’m not an expert on the semantic web either, but my guess is that it applies more to the internet than a help file. Here’s what I basically understand. The semantic web tags content with identifiers that describe the meaning of the content, so that when people search for “stars in the sky,” results that contain titles like “new actor stars in movie” don’t appear. The meaning of some words can be ambiguous (“stars” is one example in her slides).

        I don’t see that much ambiguity in the terms in a help file, but there is definitely application here. Of course there is some ambiguity, because that was my whole contention with topic containers. But is it ambiguity of the same kind? I think in the instance of help, users are searching for “form new group” when the help topic may be titled “initiate operations team.” Would the semantic web solve those kinds of problems? I have no idea.

        The Dublin Core seems like an interesting and ambitious initiative, but do its 15 characteristics (contributor, coverage, creator, date, description, format, identifier, language, publisher, relation, rights, source, subject, title, type) provide useful facets for users to navigate help? Perhaps this metadata could be useful to help in other ways.

        I’m going to ping Rachel to see if she has any comments on this thread.

        • Although an ontology is in a broad sense like a taxonomy, it’s a much more sophisticated system of knowledge representation. It’s organized using object orientation (classes and subclasses). The foundation of this organizational method uses description logic, which can make the vocabulary and its relationships machine readable.

          Our current help tools aren’t sophisticated enough to handle the processing necessary to manage a true ontology. Handling the relationships requires specialized software. When I developed information products for the National Cancer Institute’s Vocabulary Services group, the editors used Protégé, an open-source, OWL-based editor, to manage ontologies and thesauri. The tool uses RDF schema to manage resource locations.

          I’m no expert on this, but if you’re interested in a great overview of the semantic web, read Explorer’s Guide to the Semantic Web by Thomas B. Passin.

    3. Bill Albing says:

      Finally, we are using the Help Authoring Tool as it can be used — not limiting ourselves to the book (hardcopy) pattern. Now we’re getting somewhere. The tool is just a tool; we have to provide the intelligence so that the help system is truly useful and usable.

    4. Ben M says:

      I don’t know about Flare, Author-It, or Doc-to-Help, but RoboHelp allows multiple references to the same topic within the same TOC. This makes multiple organizations of information within the TOC possible. The top-level folders/books would be labeled by facet. Does every facet (book/folder) need to contain every topic? I don’t think so, but I haven’t thought about it for more than a few seconds here.

      One problem with doing it this way, though, is that I don’t think the TOC will sync correctly when navigating among topics via links because it wouldn’t know which instance of the topic to highlight. Breadcrumbs in topics probably wouldn’t fit, either.

      • Ben. You are spot on. In RoboHelp (I can’t speak with any authority about other HATs) it always goes to the first instance in the TOC. That is one reason why I avoid doing this. Search results are another.

    5. Tom, this is a great post. I really think you’re on to something. I have recently moved into a new role as Managing Editor/Content Strategist for a government website, and one of my goals is to overhaul the help content. We don’t use help authoring tools, but your ideas have given me much food for thought. I thank you for sharing them.

      • Tom Johnson says:

        If you find this topic and series interesting, check out the recorded presentations from the IA Summit. If you use iTunes, just subscribe to the Boxes and Arrows podcast and download them from there. Every talk is interesting and insightful and engaging. Kind of makes me think I should be going to that conference rather than the STC conference. If I keep this thread up long enough, that’s where I’ll find myself.

    6. Terrific exploration, clear analysis, helpful examples, good challenge. Thanks, Tom.

      You make a couple of problem statements that particularly intrigue me:

      (1) “I think introducing topic facets to narrow what is most likely a topic search presents the same issues as topic-based navigation.”

      and

      (2) “If these facets are simply topics, there’s a likely chance that you’ll include dozens of topics. Adding too many topics in the Search Filter Set will bloat the number of search facets beyond what is usable.”

      The indexer in me responds to #1 with this thought: Wouldn’t we want these to be, not topic facets, but content facets that are cross-topic in nature? That kind of facet would add value and present a true alternative to the topic-based TOC.

      The indexer in me has to leave the room when it comes to #2. An indexer wants to give readers every word they might have in their heads at every level of granularity. A faceter (time for a new word!) would have to be highly selective. Cross-topic facets might be grouped under a heading like “Browse by Main Theme,” but preferably something less English-teachery (time for another new word).

      Naming contest, anyone?

    7. Patrick Warren says:

      I see that Rachel has posted her slides on Slideshare[...]

      The only (rather important) thing missing from her slide deck is a mention of RDF Schema (RDFS), which is used to extend the limited set of elements that comprise the RDF vocabulary.

      “The RDFS vocabulary builds on the limited vocabulary of RDF.”
      Source: http://en.wikipedia.org/wiki/RDFS

      “I’m not an expert on the semantic web either, but my guess is that it applies more to the internet than a help file.”

      (Tongue-in-cheek) Aren’t there now individuals arguing that it is all content? To a geek like me it is actually all data; the only difference is that some is human-readable, some is machine-readable, and some is both. A help system is just content intended and built for a specific purpose, and I’ve been building intranet sites that serve this purpose for years now: providing help on an application, process, or system.

      “The Dublin Core seems like an interesting and ambitious initiative, but do its 15 characteristics (contributor, coverage, creator, date, description, format, identifier, language, publisher, relation, rights, source, subject, title, type) provide useful facets for users to navigate help? Perhaps this metadata could be useful to help in other ways.”

      With a contemporary web based CMS and authoring system such as the one I now use, Drupal 6, I can leverage and use the existing vocabularies, taxonomies and ontologies such as DC (Dublin Core), RDF, RDFS, SKOS, etc. (Which by the way is exactly what you want to do.)

      Much of this is currently achieved through the use of additional but mostly free 3rd party modules for Drupal 6, but for the next release version 7, some of the RDF and Semantic Web functionality will be moved into Drupal Core. Implementers and developers will no longer need to use the additional modules, the functionality will be immediately available post installation of Drupal.

      Among the ‘search’ related modules there is also an individual faceted search module for Drupal 6: http://drupal.org/project/faceted_search For larger ‘knowledge’ projects there is Lucene, Nutch and Solr: http://groups.drupal.org/lucene-and-nutch

      So what exactly does this have to do with this thread?

      It opens up a question that no one seems to have asked yet. Do I really need to use a specialized help development tool for help (knowledge, point-of-need learning) related content development and management?

      For me the answer is an emphatic no.

      • Tom Johnson says:

        Patrick, thanks for commenting. (By the way, when you have several links in the comment, it gets filtered and I have to approve it, but that’s not my default setting for comments.) I love your passion for the semantic web. I feel ignorant with all the acronyms you’re using, and I want to explore this further. Would you like to write a guest post about how to leverage the semantic web in a help file?

        Re the last question, that’s actually one that I’m getting to. I asked this question in the last Scriptorium Trends webinar. Web platforms continue to move forward with innovative technologies — for example, faceted searching/classifications, auto-suggest, dynamically related content, social media integration, etc. In contrast, tech writers are using tools that are years behind this level of innovation. For example, the search feature in most HATs works about the same as it did 15 years ago. Users can’t comment, there aren’t RSS feeds, no jQuery, etc. Will tech writers abandon the HAT vendors and move to the latest web tools to keep up with trends and stay competitive? Are tech writers trapped in the past and hopelessly outdated because of their limited toolset?

        It will be years before a common HAT comes out with the same faceted search module that you linked to. I have been eager to move my help content to WordPress for about 2 years now, but our internal architecture at work doesn’t support PHP or MySQL yet. Still, it’s an idea that’s always on my horizon.

        The problem with web tools, though, is that they don’t have all the features that help authors need. While they have more web-based features, they sometimes fail with the rendering of online content into printed PDF guides. You can’t really single source the material that well. On the other hand, the long printed manual is dying, and many times the single source rendering from online to print leaves the reading experience disjointed and choppy, which is what I noted in my Things Fall Apart post.

        By the way, when I asked my question in the Trends webinar, Tony Self agreed with me, noting that the main advantage of a HAT was the TOC feature. Ellis Pratt said many HATs are moving forward with more web-like technologies. And Sarah O’Keefe said that an XML structure provides you with the flexibility to do whatever you want with your content.

        • Patrick Warren says:

          First I promise to limit links this time, lol!

          “Would you like to write a guest post about how to leverage the semantic web in a help file?”

          Hmmm… I’d certainly be open to writing a guest post, but I’m trying to grasp your definition of ‘a help file’, literally as in a .chm or figuratively as in anything that can be ‘classified’ as help? We should probably discuss this further offline for now, but I’m up to it!

          Sorry if I jumped the gun on your next question, bad habit of mine, tends to ruin other people’s punch lines too. Anyways it’s not just HATs that are light years behind, but most monolithic software AND as important the development and publication processes also.

          Next, and I think you already alluded to this, print and PDF are losing ground to mostly online publication, which again IMO provides numerous advantages. As early as the late 90s I was already working with and on very high-end electronic manufacturing devices that had built-in interfaces with complete technical and maintenance manuals, which in turn were connected to the manufacturer’s web site via the Internet for auto-updating purposes.

          Help Outputs?
          chm, pdf, html, xml, xhtml?
          chm is outdated and susceptible to virus insertion, and I believe I read something about Microsoft eventually dropping it. Not exactly cross-platform friendly either.

          PDF was geared towards two things: electronic transfer and the ability to easily print the information if so desired by the end user. I think and feel PDF is and will continue to be useful, but less and less for its printing purposes and features. A more useful feature of PDF is its ability to include embedded metadata, but unfortunately that adds yet another layer to the pub process and many don’t know about it or use it.

          HTML
          Overly extended, too much vendor interference too, time for the next best thing. HTML 5 I’m still undecided about. Microformats/Microdata may be useful in some instances.

          XML
          Now we are getting somewhere! Structure!

          XHTML
          XHTML IS XML

          What’s next?
          RDFa in XHTML files or RDF/XML auto-generated via PHP, Perl or Python. Combined structure AND semantics, WOW what a concept!

          “It will be years before a common HAT comes out with the same faceted search module that you linked to. I have been eager to move my help content to WordPress for about 2 years now, but our internal architecture at work doesn’t support PHP or MySQL yet. Still, it’s an idea that’s always on my horizon.”

          I can feel your pain on this. I’ve had a distinct advantage being both an engineer and a technical writer (among other things) on projects, processes and products. On my 2nd to last assignment I just had one of the server admins set me up with a virtual instance. I kept it small at first till I showed management (and others) what it could do. I’ve used the same approach for introducing SharePoint also to those unfamiliar with it. Not sure how well WordPress would lend itself to ‘help development’, but only because I’m less familiar with it than Drupal. WordPress is the only other ‘web tech’ that I have found forging ground in the RDF/Semantic web arena, but more on the social side of it; SIOC.

          “The problem with web tools, though, is that they don’t have all the features that help authors need. While they have more web-based features,[..]”

          This is rapidly changing. With Drupal it can still be a bit of a challenge to set up PDF outputs, but it is possible. So is the use of DocBook and other standard DTDs or XML schemas. What I’ve been seeing is that if there is a strong enough need, someone soon develops a method and add-on for it. With TikiWiki, the tech that Rick Sapir of the Carolina Chapter of STC is using, it is easy; in fact, it is a standard feature out of the box.

          “By the way, when I asked my question in the Trends webinar, Tony Self agreed with me, noting that the main advantage of a HAT was the TOC feature. Ellis Pratt said many HATs are moving forward with more web-like technologies. And Sarah O’Keefe said that an XML structure provides you with the flexibility to do whatever you want with your content.”

          A TOC is a printed matter construct and feature. It mattered when we had printed page numbers, not as applicable to web based content, in fact I think a standard TOC would be rather limiting for web based knowledge. I think the equivalent is a site map of a site, minus the page numbers of course. Ditto on what Sarah said! And with RDFa/XHTML or RDF/XML serialization you gain semantics and inferencing too.

          • Tom Johnson says:

            Nice tour through the standard help formats and their limitations. Re Drupal, I know it’s a more robust platform than WordPress. I’m glad to see that it has some of this functionality that I’m describing here. Too bad it also runs on PHP and MySQL; otherwise it would be more popular as an enterprise tool. I’ll get you those questions for the guest post now. Look for them via email.

    8. [...] Implementing Faceted Classification/Search with a Help Authoring Tool [Organizing Content 7] [...]

    9. Enjoyed the article, and great discussion.
      1. I disagree that TOC is solely a print construct.
      Properly, headings and TOC give the user major clues as to the terminology and structure of the thing you’re writing about. It tells them some of the terms that are used in the document. Ditto the Index, if such a thing is provided. The index provides (among other things) a map from the user’s terms to the document’s terms and pointers to the points of use.
      Yes, a good side map is very similar to a TOC. Not at all irrelevant.
      2. Selecting and naming facets is a biggie. That could be a discussion all by itself. Also, display of facets/narrowing terms can be important to user success.
      3. In WordPress (for example) there are Topics, Categories, and Tags. I haven’t figured out how to use Tags as narrowers for Categories, but it seems like that would give some of the benefits of faceted search?

    10. dang- that’s “a good site map”

    11. I couldn’t find the IA recordings – another hint, perhaps?
      Short of facets, categories and tag clouds can help navigation.
      Here’s a good article on facets and design:
      http://www.uie.com/articles/faceted_search/

      • Tom Johnson says:

        Thanks for the link, Jay. It’s somewhat ironic about the lack of findability with the IA recordings. But in case you’re still looking for them, just subscribe to the Boxes and Arrows podcast on iTunes. That’s where they are.

    12. It would be easy for a user to get confused if duplicate topics exist in the TOC, search, or elsewhere. I started to think you could apply conditional build tags to topics for the different “facets” and build completely separate outputs for each. The issue then would be how to make the content accessible whilst preventing duplicate results.

      Shooting from the hip, it may be best to take control of this away from RoboHelp (or any other HAT) to a degree. My thinking is that a redirect topic could be used to direct a user of a particular facet to the relevant output. The redirect topic could be indexed to include the relevant keywords, which would also be used as part of the search, at least in RoboHelp. I can’t speak for other HATs.
