What Does “Every Page Is Page One” and “Include It All, Filter It Afterward” Mean?

One of the more memorable presentations I attended at Lavacon in Portland was Mark Baker’s “Include It All, Filter It Afterward” presentation. You can view the slides from his presentation here. I also embedded them from Slideshare below.

Because I liked the presentation so much, I want to explore the ideas a bit more, as well as integrate the discussion into my ongoing theme about organizing content.

“Every Page Is Page One”

Mark used David Weinberger’s Too Big to Know book as a foundation for some of his arguments. Mark explained that before the Internet, knowledge had a much higher cost to access. By cost to access, I mean you had to travel to the library and search out books from card catalogs, or something similarly painful. When you finally found the book, usually after a great deal of effort, you treasured its contents because it required so much to acquire.

Additionally, the book was laborious for the writer and publisher to produce, involving multiple drafts, peer reviews, editorial reviews, and specialized layouts and formats for printing and distribution. It took a lot of effort to get that book to you.

As a result, you tended to read the book more thoroughly. You wouldn’t just read a page and toss it aside, because you couldn’t get another book without expending a lot more time, money, and effort. Knowledge from books was precious and expensive.

With the Internet, however, the cost to access knowledge is almost nothing — the click of a mouse. You search for a term, and if you don’t like the result, there are dozens more to click and read, all within a few seconds. It’s easy to throw away an ineffective result. You may skim a few paragraphs and, if the information isn’t readily visible and consumable, you can click back and try another. Clicking different links in Google’s search results doesn’t cost you anything, so why should you spend time on an uninformative site? There’s no investment, so you can discard it as fast as you find it.

The ease of publication also lowers the value of knowledge. With two million new blog posts published a day, why should anyone have much investment in a single post? The whole act of writing, publishing, and distributing probably took less than an hour. As such, the value of that knowledge decreases. Discarding one blog post in favor of another is common and practical. When I move through my RSS feed, I give titles an average of three seconds before moving on.

Mark says that because of the low cost to access knowledge, it’s much more likely that someone reads 1 page from 10 sources than 10 pages from 1 source. The web has reduced the cost of knowledge so much that users tend to jump from source to source, reading only a page or so before moving on to somewhere else.

Because users may view only one page, you have to present enough information on that page for it to make sense on its own, so that it can deliver complete value no matter where the topic sits in your help system. For the user, every page is the beginning and the end, or “every page is page one.”

Mark lists several characteristics of an Every Page Is Page One topic. They are self-contained, establish their own context, conform to a common type, and link richly.

“Include It All. Filter It Afterward.”

The argument about “Every Page Is Page One” transitions to another, related argument: “Include It All. Filter It Afterward.”

The days when you carefully organized content into groups that made sense, arranging content into books or into a table of contents in the online help to be processed in a fixed order, are over, because the user no longer encounters and interacts with a book. The paradigm has changed. The user either interacts with a single page from your site or, more ideally, interacts with information as he or she has chosen to organize it, such as a grouping of all articles that share a specific hashtag.

Mark explains:

It is no longer the writer’s job to filter and organize content for the reader. In the book world, the physics and economics of paper meant that the writer had to act as a filter, carefully selecting a small and highly organized set of information to provide to the reader. But on the web, the power to filter and organize information passes into the hands of the reader. Rather than seeking out content silos and then searching within them, readers prefer to Google the entire Web and then select and filter the results they receive. In the words of David Weinberger, their preferred strategy is, “include it all, filter afterward”.

Despite this, writers tend to approach the web as simply another publishing medium, where they will make filtered and ordered content available to readers in a form that assumes the reader is looking at their content in isolation. The reality is that most readers are encountering their content as just one item in a set of search results — they are including everything and filtering afterwards. To better serve readers who seek information this way, writers need to change from creating content that is filtered and ordered to creating content that is easy for readers to filter and order for themselves.

Rather than spending time organizing and arranging content for the user, Mark says it’s better to provide filtering mechanisms — even if it’s just search results — for users to organize the content themselves.

Beyond search engines, the Internet is rich with sites that provide alternative organization tools. For example, on Twitter, users add hashtags to the tweets to sort and organize the content. I add #techcomm to my tweets, and look for other tweets with the same #techcomm hashtag. Twitter didn’t come up with this hashtag. The users did. Twitter just provided the tool.

When you search on Google, the search results appear in an unordered list, but Google provides tools for you to filter the results. You can filter by images, maps, shopping, books, videos, news, places, blogs, flights, discussions, recipes, and patents. In other words, Google gives you tools to organize the mass of content for yourself, not only sorting by keyword but also by type.

With Digg and Reddit, rather than organizing the content for readers, the site instead provides voting tools that allow users to vote articles up or down. The most popular articles surface to the top, which increases their visibility.

Pinterest also doesn’t provide an explicit organization of content for users. Instead, Pinterest provides a pinboard that allows users to pin items on boards with specific topics. You can then view the most popular pins, or the most popular pins based on others in your social network. This allows you to view content based on what your friends find interesting.

LinkedIn is another example of a site that provides a tool to allow users to organize the content themselves, rather than organizing it for the users. Mark says, “LinkedIn contains massive amounts of information from thousands of users, and gives you the tools to filter what you see.” You can see updates from other people that you link to, and you can also join groups and participate in discussions of topics that you select. As a user, you control the kind of information you see on the site.

Weinberger’s ideas about filtering are not so different from the ideas in his previous book, Everything Is Miscellaneous. In that book, Weinberger argued that we should attach metadata to everything and then let users select the metadata they want in order to view all objects that share it.

In Weinberger’s model, you write as much relevant information as you can, add appropriate metadata to the content, and then let the user filter and organize it through the tools you provide, whether those tools are search, tags, faceted filters, dynamic aggregators, voting mechanisms, pinboards, friend updates, or other tools.
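Weinberger’s model boils down to a simple mechanic: content carries metadata, and the reader chooses the facets. A minimal sketch in Python (the topic titles and tags here are invented for illustration):

```python
# "Include it all, filter it afterward": every topic carries metadata,
# and the reader (not the writer) picks which facets to filter on.

topics = [
    {"title": "Sharing a calendar", "tags": {"calendar", "sharing"}},
    {"title": "Importing events",   "tags": {"calendar", "import"}},
    {"title": "Managing contacts",  "tags": {"contacts"}},
]

def filter_topics(topics, wanted_tags):
    """Return every topic that carries all of the requested tags."""
    wanted = set(wanted_tags)
    return [t for t in topics if wanted <= t["tags"]]

print([t["title"] for t in filter_topics(topics, {"calendar"})])
# → ['Sharing a calendar', 'Importing events']
```

The writer’s job shifts from deciding the one right hierarchy to tagging thoroughly, so that any facet a user cares about can become a view of the content.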

Forgetting about a predefined organization is an exciting, liberating proposition. All that effort spent grouping and sorting your topics doesn’t matter in the end. Focus instead on the information itself, and provide a way for users to filter it themselves.

Relating “Every Page Is Page One” to “Include It All. Filter It Afterward”

Exactly how does the idea of “Every Page Is Page One” connect to the other idea of “Include It All. Filter It Afterward”? In both cases, we’ve moved away from the book paradigm to the single page paradigm. In the first case, users are viewing one page on ten different sites due to the low cost of information access. Viewed in isolation, the pages aren’t browsed in a larger table of contents. To the user, each page is the first and only page viewed.

In the second example, “Include It All,” the individual pages of information are also disconnected from a predefined grouping. The users (not necessarily the content creators) define the way the content is grouped, sorted, and organized. In this case, every page is page one as well because a predefined sequential ordering has been replaced by a dynamic, user-driven method for filtering content. Every page has to be retrievable as an independent object so that it can be reordered, sorted, and organized in various user-defined orders. Again, every page is page one to the user.

Analyzing Assumptions

Overall, I really enjoyed this argument. I find it deep and insightful. There’s a lot of truth to it, especially for Internet behavior. However, I think some of the assumptions, when viewed in context of help material, push the analogy a little too far. I’ll examine several assumptions and consider a few counterarguments.

Assumption #1: Bounce Rates and Next Page Flow

(12/8/2012 Note: Based on the comments on this post, I updated this section on bounce rates to be more accurate with the Every Page Is Page One argument.)

I originally thought the Every Page Is Page One style meant that a user jumped from site A to site B to site C, but Mark later explained that it’s not just jumping among sites, it’s jumping around within a single site as well. For example, users tend to jump non-linearly from one site section to another in the same way a user might jump around in a book, moving from page 3 to page 62 to page 15 to page 99, and so forth.

Essentially, any non-sequential, non-linear movement through your help content creates an Every Page Is Page One experience, because with each new page, the reading experience resets. The reader doesn’t bring over the knowledge and context from the previous page.

Whether jumping from one domain to another, or jumping around within the same domain, I think bounce rates might be a relevant web analytic to analyze. A bounce means a visitor viewed just one page on your site before leaving. On my blog, 79.89% of the time, readers check out just one page before bouncing to some other site.

My bounce rate is about 80%

This means that for 80% of my readers, the idea of “Every Page Is Page One” holds true. The other 20% dig deeper, sometimes visiting several pages of my site on their visit. Since I don’t have a linear sequence to my blog posts, the movement from one blog post to another aligns with the Every Page Is Page One experience as well.
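For clarity, the bounce rate is just the share of sessions with exactly one page view. A rough sketch of how it falls out of raw session data (the sessions here are made up for illustration):

```python
# A "bounce" is a session where the visitor viewed exactly one page
# before leaving. Bounce rate = bounced sessions / total sessions.

sessions = [
    ["home"],                          # bounce
    ["home", "calendar", "sharing"],   # dug deeper
    ["faq"],                           # bounce
    ["search", "topic-j"],             # dug deeper
    ["home"],                          # bounce
]

bounces = sum(1 for pages in sessions if len(pages) == 1)
bounce_rate = bounces / len(sessions)
print(f"{bounce_rate:.0%}")  # → 60%
```

Analytics packages compute this for you, but seeing the arithmetic makes it obvious why a one-page-and-gone reading pattern pushes the number up.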

Non-blog websites often have a lower bounce rate. Kissmetrics has an infographic showing bounce rates for different types of websites.

Bounce Rates infographic from Kissmetrics

As you can see, the bounce rates vary by the type of site. A service site (self-service or FAQ site) supposedly has only a 10-30 percent bounce rate.

Blast, a marketing and analytics site, provides some additional bounce rates for different genres of content:

  • 40-60% Content websites
  • 30-50% Lead generation sites
  • 70-98% Blogs
  • 20-40% Retail sites
  • 10-30% Service sites
  • 70-90% Landing pages
(Most notably, their 70-98% bounce rate estimate for blogs makes me feel a bit better about my 80% bounce rate.)

But what about a help system? I use Omniture rather than Google to track hits on some of my help material. Omniture doesn’t have a bounce rate percentage, but it does have a related feature called “Next Page Flow.” Here’s the Next Page Flow report for one of my help products. The bounce rate (or percentage of people who exited the site after viewing the first page) was 46%.

Next Page Flow

The graph also shows the next path users took from the landing page, which can help in determining the content most important to users.

I don’t have a linear sequence that users are supposed to follow through the help material. Because I write in an Every Page Is Page One style by default, my topics tend to be large, self-contained units of information. I arrange about a dozen options in the table of contents, and there really isn’t an established order or sequence to move through them (for example, see my Calendar help), so just looking at the next page flow doesn’t tell me much. But it does point out that the path away from the homepage varies quite a bit. 16% of users clicked the yellow path, 9% clicked the green path, 6% clicked the red path, 23% clicked the orange path (“Other”), and 46% exited the site. There is no clearly defined path that users take through help content.

If I had burst a FrameMaker book into a sequential reading experience online, then I could analyze the Next Page Flow to see whether readers really proceeded to the next page they were supposed to. But since I don’t follow this practice (what Mark calls “Frankenbooks”), I can’t analyze the behavior.

I bring up bounce rates to point out how frequently users bounce around online, especially from one site to another. The bounce rates for my help products average 45 to 65 percent. Clearly, readers do not read sequentially online. Not only do they move from site to site, they also move in a variety of paths within a site.

The Next Page Flow can be helpful, however, in determining the related topics to present in a specific help topic. For example, in the Next Page Flow above, if a lot of users clicked to topic R after viewing topic J, then I might put some links on topic J pointing to topic R and vice versa. Such a situation might occur if topic R and J have some close relationships or confusingly similar terminology.
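Mining next-page-flow data for related-link candidates can be sketched simply: count page-to-page transitions, then surface the most common next pages for a given topic. The paths and topic names below are invented for illustration:

```python
# Count which page users viewed immediately after each page, then
# suggest the most frequent follow-on pages as related links.

from collections import Counter

paths = [
    ["topic-j", "topic-r"],
    ["topic-j", "topic-r", "topic-k"],
    ["topic-j", "topic-k"],
    ["topic-r", "topic-j"],
]

transitions = Counter()
for path in paths:
    for src, dst in zip(path, path[1:]):
        transitions[(src, dst)] += 1

def related_links(topic, n=2):
    """Most common pages viewed immediately after `topic`."""
    nexts = Counter({dst: c for (src, dst), c in transitions.items() if src == topic})
    return [dst for dst, _ in nexts.most_common(n)]

print(related_links("topic-j"))  # → ['topic-r', 'topic-k']
```

If topic R keeps surfacing after topic J, that is a strong hint to cross-link the two, exactly the situation described above.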

Assumption #2: Users Want Tools to Order Help Content Their Own Way

Many of the sites mentioned as examples that allow users to organize content their own way — Twitter, Digg/Reddit, LinkedIn, Pinterest, and so on — don’t focus on help material. Their content is more entertaining, social, and fun. As such, users are more inclined to engage by tagging, sharing, pinning, and posting content.

In contrast, help material is less pleasant. About the only help site I know of that incorporates a social tool (other than search) is Stack Overflow. On Stack Overflow, some users ask questions and others provide answers. You can vote the most useful answer to the top. This method works well, but by and large people viewing help are in an entirely different state of mind than those using social tools. In help systems, users are angry, frustrated, impatient, struggling to understand, etc. Do they really want to pin some of their favorite instructions on a pinboard designed for your product?

I remember seeing a feature in Flare’s webhelp skin that I thought was amusing. You could bookmark your favorite help topics so that if you wanted to quickly access them, you could easily view all the topics you had bookmarked (at least until you cleared your browser’s cache). I think I used that feature maybe once or twice out of novelty, even though I’ve used Flare’s help many times.

As long as help remains instructional material, it probably will never succeed in the social/interactive/gaming/playing space. But that doesn’t mean we can’t invent new tools and find user-driven organization systems that work for help scenarios.

Voting on a help topic might be a good idea. And aggregating the most popular topics would also be helpful. Showing the most recent comments can increase engagement, especially if you respond to the users’ questions. Providing faceted filters in search results can also be powerful. For search, it might be more effective to integrate a Google Custom Search. We are still so young in exploring these tools, but the options are there and will continue to expand as the web matures.

In the absence of any other tools to assist with organization, some form of guided organization is probably better than none at all.

Assumption #3: This Philosophy Doesn’t Address the Learning Situation

When we talk about users visiting 1 page on 10 different sites instead of 10 pages on 1 site, we’re probably not talking about someone who wants to learn a tool. If someone is in find mode, searching for a specific answer in a sea of information, it makes sense for the user to constantly search from site to site rather quickly for the information.

For example, my wife was recently trying to turn on iTunes’ Home Sharing for her new computer so she could hear her playlist from her iPhone. She said she looked briefly in iTunes’ help (which, she noted, wasn’t helpful) and then searched on Google but didn’t find anything. Then she gave up.

Contrast this behavior with someone who says, “I would like to learn Adobe Illustrator better.” This latter person, with a goal of learning, may want to progress through a series of tutorials or sequential instructions that start simple and become more advanced.

One of the most popular elearning sites — Lynda.com — does exactly this. The online instructors have about 20-30 tutorials for a specific software application, and many of the ideas build on each other in the course. You can skip around as you learn a new software tool, but you don’t usually hop from Lynda.com to Youtube.com to Vimeo.com looking for ways to learn Illustrator. If you want to learn, you become accustomed to a particular site and watch or view multiple topics from a course list or table of contents.

In learning scenarios, every page is not page one unless the tutorial is particularly bad. That said, the e-learning tutorials should allow users to skip around, moving to topics that interest them rather than being forced into a specific order. The way Lynda.com tutorials are broken up into small segments (5 minute videos) does allow users to jump around in the order they want.

Assumption #4: Not All Help Content Is Ubiquitous and Online

When help information yields a lot of search results on the web, the access cost of the help is undeniably low. For example, if you have a question involving WordPress, google it and you’ll probably find a similar answer on 20 different sites (including mine). Click one, and if you don’t find the answer in 20 seconds, try another, and another. There’s no loyalty to any particular site or author.

However, if you have a question about a more specialized application that doesn’t have a lot of competing websites with similar information, the value of the help is much higher. In many companies, the applications don’t have any help material other than what the company provides. For example, my colleague writes help for an application that manages a supply chain process. If a user doesn’t find the answer in the help, will he or she start googling the question on the web? If so, there won’t be any answers there.

The Every Page Is Page One philosophy assumes that help material is ubiquitous enough to have multiple competing sources on the web. If the help material is rarer and not all over the Internet, the user may value the content more highly, treating it as an essential and unique guide.

However, even if the content is not ubiquitous on the web, the user may treat the content with the same low-cost behavior because the web has rewired our brains to function this way. Nicholas Carr in The Shallows explains that Google has rewired our brains with shorter attention spans and made us more prone to distractions. I’m sure it has affected the way we search for information as well.

A user who searches in a specialized application’s help file and doesn’t immediately see an array of sites providing similar answers may close the help file and try other approaches, such as asking a friend, calling support, using trial and error, or simply giving up. Google has trained our brains to hunt and peck from a variety of sources rather than plodding through one big thick manual.

Still, users who are forced to rely on one site for information will probably give it more than a one-page glance.


One larger question to ask is whether the behavior of someone using a help system is the same as the behavior of someone using the web. Does help stand in a unique genre of material, one that plays by different rules, even when it’s on the web? Do users recognize official help material and dive more deeply into it, browsing and searching rather than immediately discarding it? Or do users act with the same behavior as they operate on the general web?

If there were different behaviors for different genres of content, I think the differences are evaporating. Everything is converging on the web, and we should design our help to fit more comfortably online.

Although I’m wrapping up this post (at a mere 3,700 words), this is not the end of the conversation. I’m going to record a podcast with Mark Baker, from Every Page Is Page One, later in the week. Stay tuned and check back later for a continuation of the discussion.


By Tom Johnson

I'm a technical writer working for the 41st Parameter in San Jose, California. I'm interested in topics related to technical writing, such as visual communication, API documentation, information architecture, web publishing, JavaScript, front-end design, content strategy, Jekyll, and more. Feel free to contact me with any questions.

  • http://humanistnerd.culturecom.net Ray Gallon

    Tom, very stimulating and thoughtful analysis. One thing, though: if every “page” is page one, we’re using the word rather loosely. We refer to web “pages” that actually have nothing much in common with pages from books or other print material. What if we abstracted the term, and called it an “information object?” Wouldn’t that make more sense? In that case, “every page is page one” might become “every information object stands alone.” Maybe then we analyse user behaviour without some of the preconceptions we bring from the world of print.

    • http://everypageispageone.com Mark Baker

Ray, I think “every information object stands alone” actually says the exact opposite of “every page is page one”. “Every information object stands alone” was true in the paper world. The defining characteristic of the web is that every information object is linked to other information objects.

      Every page is page one is true of a linked web of information because you don’t have any control over which page anyone will start from, or which link they will follow to arrive at your page.

      Your page may be the first page or the eighty-seventh page in the user’s journey through the web, but you have to treat it as page one, because page one is the only one that is written to be a potential starting point. If you don’t control the user’s path through the information, you can’t write page two (or page 87), you can only write page one.

If you simply chunk a book and throw it up on the web, as is so often the case today, the reader may land on page 87, but it is not going to work well for them.

      To be sure, there is some ambiguity between paper page and web page, but I would argue that the web meaning of “page” is coming to be assumed in discussions of this kind, not so much because more people are reading the web than are reading books (though that may now be the case) but because a book page is not a unit of information, just an arbitrary physical division. Only on the web is a page a unit of information.

  • http://everypageispageone.com Mark Baker

Tom, I don’t think “every page is page one” is a way of saying that the bounce rate is 100%. A bounce rate of 100% says that the reader only visited one page under a domain before moving to a page under a different domain. Every page is page one does not say that readers only read one page per domain. It says that when they navigate the web by search or by link, every page is a new page one for them, no matter how many pages they have already read.

    The same would be true in the book world if someone brought home six books from the library and read the first page of each one. The first page of the sixth book would be page six of the reading session, but still page one of the current book.

    The reader would understand this, of course. They would not expect the first page of the sixth book to flow from the first page of the fifth book. There is a reset of expectations that happens when you put down one book and pick up another. Opening a new book resets the page count to 1.

    Similarly, when you perform a search or follow a link on the web, the page count is reset to 1. And if you navigate the web by searching and following links, as opposed to turning pages, then there is a reset with each navigation, and thus every page is page one.

    The exception, of course, are those sites that break articles up into multiple arbitrary pages and provide links which are literally just turning pages. There seem to be two significant categories of sites that do this: ad supported sites that want to generate more ad impressions for a single story, and technical documentation that has been chunked out of FrameMaker books or assembled into Frankenbooks from DITA topics.

I consider one of the main design features of good every page is page one topics to be rich linking — primarily rich linking to ancillary material in your own content. The web is a hypertext medium, and people can link out of your content at any time. If you don’t provide links, users can create their own by highlighting a phrase and searching for it. Providing rich linking within your own content, and making every page work as page one after the act of linking resets the page count, can be a good way to reduce your bounce rate.

    • http://idratherbewriting.com Tom Johnson

Mark, thanks for clarifying. I think I better understand your point. It’s the behavior of turning from page to page that you find incongruous with user behavior. With your book analogy, though, don’t you mean users who might jump from page 1 to page 6 to page 56 and then page 21 and page 42, etc., skipping around, rather than moving among books? A user who moves from book A to book B is similar to the user who jumps from site A to site B. In the second instance, the jumping behavior would be considered a bounce, but not in the first instance.

      It’s hard to say whether a user’s behavior on a site follows a more logical method of browsing than randomly jumping around. I guess I haven’t ever constructed an online help file designed to be read in a sequential, page-turning fashion, so the idea of a user who clicks one page and then turns to the very next page, and then the next, seems very foreign to me.

      Maybe I’m taking the EPPO argument to the extreme. I would almost consider a Wikipedia article to be the posterchild of EPPO.

      Would it be possible for you to define the opposite of EPPO? What does that look like? Any examples that you can point me to in the world of help?

      • http://everypageispageone.com Mark Baker

Tom, I definitely consider Wikipedia to be a prime example of EPPO. When you follow a link in a Wikipedia article that leads to another Wikipedia article, you don’t expect that the second article is going to be written as a continuation of the first. It is going to be written as a page one, which it might well be for any reader who arrived at it by search.

        Unless there is an expectation that the page at the end of a link is sequentially related to the current page (such as when the link says “Next”, for instance), then there is an implicit reset when the user clicks the link. The page they arrive at may not be the first page they looked at today, but as far as the information structure is concerned, it is a page one.

        If you jump around in a book, skipping from page 1 to page 87, you don’t expect page 87 to be a continuation of page one, but you do expect it to be a continuation of page 86. You don’t expect it to be written as a starting point, and so you don’t expect that it will necessarily work for you. On the other hand, if you skip from page one of one book to page one of another book, you fully expect page one of the new book to work as a starting point, and not be a continuation of some other page.

        When a reader comes to a web page that happens to be created by bursting a book, or by deliberately constructing a Frankenbook, they have not accepted the consequences of deliberately jumping from page one to page 87. They have followed a link, or performed a search, and they expect the page they land on to work as a starting point, not to be something in the middle of some sequence they did not search for.

        So, “page one” does not mean, first page I read today, but an effective starting point for my current inquiry. And because every link starts a new inquiry (even if it is subsidiary to my initial inquiry), every page is page one.

        The opposite of EPPO, therefore, is Frankenbooks — the forcing of everything into a hierarchy linked by back, next, and home buttons, where no matter where you land, you are always in the middle and never at an effective starting point.

        (Real books, by the way, most definitely still have a place. Frankenbooks, on the other hand, are an abomination.)

        • http://idratherbewriting.com Tom Johnson

          I revised the bounce rates section in this post based on the comments here. (Thanks for clarifying. My update makes the bounce rates section less of a counterargument and more of a supporting argument for EPPO.)

  • http://www.sesam-info.net Jonatan Lundin

    Thanks Tom and Mark for shedding some light on the new communication paradigm. But I believe that there are some aspects that must be discussed more thoroughly before a “Include It All, Filter It Afterward” approach can work well for users of technical documentation.

    I have discussed these aspects here: http://www.excosoft.se/index.php/about-us/blog/item/28-how-do-a-technical-communicator-know-the-eppo-pages-to-write

    Any comments?

    • http://idratherbewriting.com Tom Johnson

      Jonatan, thanks for the well-thought-out post in response to this post. I especially liked your points about the lack of vocabulary and specificity users sometimes have when searching for answers. This imprecision in what users are looking for encourages them more towards browsing behavior. I really liked a lot of the quotes you used, and I may reference your post and those quotes in an upcoming post.

As a side note, I’m just wondering if SeSAM is getting much traction as a branded methodology. I’d much rather read the salient principles you’re espousing than see them referred to as SeSAM.

      • http://www.sesam-info.net Jonatan Lundin

        Hi Tom,

I agree very much with your point that we need to examine and discuss important principles without framing them in some tool or methodology. In a sense, it makes the discussion more objective. I will try to follow this rule as much as possible.

One of the important principles to discuss is the fact that many users cannot express their information need as keywords to an information system. I argue that many users of technical products need assistance in defining their problematic situation. The solution could be assistance that mimics the reference interview a librarian conducts when someone enters a library to find information on a certain topic. Or, consider such assistance the support technician who asks the user a lot of questions to understand the problem at hand.

        I’ve tried to design such guided assistance as a simple mock-up here: http://www.sesam-info.net/SUI_DishWasherABC_pres.ppsx. I’d be interested in hearing what people think about this kind of assistance and whether any such assistance is already being used out there. And, finally, what you would consider to be an EPPO page in guided assistance like the one linked above.
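Jonatan’s reference-interview idea can be pictured as a small question tree that narrows the user’s situation answer by answer until a topic is reached. The sketch below is purely illustrative (the questions, answers, and `resolve` helper are invented for this example, not SeSAM’s actual model):

```python
# A toy question tree for guided assistance: each answer narrows the
# user's situation until a single topic is reached. The dishwasher
# questions and topics here are made up for illustration.

TREE = {
    "question": "Is the dishwasher powering on?",
    "answers": {
        "no": {"topic": "Troubleshooting power problems"},
        "yes": {
            "question": "Are the dishes coming out clean?",
            "answers": {
                "no": {"topic": "Improving wash results"},
                "yes": {"topic": "Routine maintenance"},
            },
        },
    },
}

def resolve(node, answers):
    """Walk the tree with a sequence of answers; return the topic reached."""
    for answer in answers:
        node = node["answers"][answer]
    return node["topic"]

print(resolve(TREE, ["yes", "no"]))  # Improving wash results
```

Presumably a real system would generate such a tree from product metadata rather than writing it by hand, but the walk-the-answers mechanic would be the same.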

  • http://everypageispageone.com Mark Baker


    I think you are absolutely right that we still need a method for determining the initial set of EPPO topics to write, that the fact that the EPPO topics we produce are part of a living and growing network of information has to be taken into consideration when making that calculation, and that EPPO topic creation can and should continue after product release.

    The aspect of choosing the best process and the best tools for implementing EPPO is not something that has entered this discussion so far. To me, while you can write EPPO topics in isolation and simply put them out there for Google to find and readers to link to (which is exactly what we are doing in this series of interrelated blog posts from each of our blogs), in a commercial documentation environment I think it is still very important to create a coordinated set of related EPPO topics.

    One of the characteristics of a system for managing such a collection is that it should be able to provide both a rich browsing experience and rich internal linking through the topic set. At the same time, I think it is vital that the system allow you to fully integrate a new EPPO topic simply by adding it to the collection. The system should take care of inserting it into any appropriate browse sequences and link relationships. This is what the SPFE architecture is designed to accomplish.

    One of the features of SPFE is that it uses soft linking techniques to do ordering and linking based on content structure and metadata. One of the side benefits of this is that the system is really good at finding holes in the content. If you mention and mark up a task or a concept in one EPPO topic, and the system can’t find another topic on that subject, it can raise an error telling you that you need a new topic.
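The hole-finding behavior Mark describes (a marked-up subject with no topic covering it gets flagged) can be sketched roughly as follows. This is a toy illustration, not SPFE’s actual data model; the field names and topics are invented:

```python
# Sketch of a soft-linking gap check: each topic declares which
# subjects it covers and which it merely mentions. Any mentioned
# subject that no topic covers is reported as a hole in the content.

def find_content_holes(topics):
    """topics: list of dicts with 'title', 'covers', and 'mentions' keys.

    Returns a mapping of uncovered subject -> titles that mention it.
    """
    covered = {subject for t in topics for subject in t["covers"]}
    holes = {}
    for t in topics:
        for subject in t["mentions"]:
            if subject not in covered:
                holes.setdefault(subject, []).append(t["title"])
    return holes

topics = [
    {"title": "Installing the widget", "covers": {"install"},
     "mentions": {"configure", "license key"}},
    {"title": "Configuring the widget", "covers": {"configure"},
     "mentions": {"install"}},
]

print(find_content_holes(topics))
# {'license key': ['Installing the widget']}
```

Run against a whole topic set at build time, a check like this turns every marked-up mention into a standing test of the collection’s completeness.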

    I find that this system, coupled with a robust methodology for determining an initial topic list up front, provides a very good mechanism for ensuring the completeness of the information set. I can’t comment specifically on the suitability of SeSAM as a method for deriving that initial topic list, except to say that any methodology for developing such a list should be based first and foremost on user tasks, which I believe SeSAM is. Start with the task topics, and the soft linking mechanism in SPFE will show you what concepts and references you need.

    • http://idratherbewriting.com Tom Johnson

      Mark, I thought your note about providing a rich browsing experience was interesting. You said, “One of the characteristics of a system for managing such a collection is that it should be able to provide both a rich browsing experience and rich internal linking through the topic set.” It would be great, at some point, to see you expand on the browsing experience. The impression I get from EPPO is that it’s more geared to users who primarily search rather than browse. Supporting browsing behavior suggests more of a focus on organization, or on filtering tools of some kind.

      Re SPFE, I read the info at SPFE.info, but it would be great if there were some examples of help that implemented the technique. Are there any to check out?

      • http://everypageispageone.com Mark Baker

        Tom, I see EPPO not as a negation of the book model so much as the reversal of its priorities.

        With the book, the primary focus is on sequential reading, with secondary support for skipping around and for searching. Even though we consider searching to be of tertiary importance from a design standpoint, we still believe indexes are important.

        With EPPO, the primary focus is on random access and non-sequential reading. EPPO topics seek first to be accessible by search and to be useful when found by search or linking, but they may also provide a way to browse through a collection of topics, and even to read a set of topics in a particular order.

        Of course, when we casually create an information product, we tend to ignore all secondary and tertiary uses and just think about supporting the main method of use. Thus a casually created book may not provide an index and may not include frequent subheadings to support a browsing user.

        Similarly, a casually written EPPO topic (of which the web contains billions) may only work when found by search, but not support people finding it by browsing or sequential reading, and may not be richly linked to provide the user ways to navigate to ancillary information.

        But when we create books professionally, we think about and provide for alternate models of use besides sequential reading. When we create EPPO topics professionally, I believe we should similarly provide for alternate models of use, offering structured collections of EPPO topics that support browsing, and even sequential reading, for a reader who is so inclined.

        The difference, of course, is that in the book world the methods for supporting secondary and tertiary models of use are well established and well understood, whereas in the EPPO world of the web, there are no such established conventions, and we need to develop them.

        • http://howtowriteeverything.com Marcia Riefer Johnston


          I like your encouragement toward “structured collections of EPPO topics that support browsing, and even sequential reading.”

          I’ve seen this done well only rarely. What a refreshing experience! When content modules (a) stand alone AND (b) read coherently in sequence, readers can get value from the information whether they want a quick answer or a guided learning experience.

          Of course, it takes extra planning and insight to pull this off!

          (FYI to Tom, the best example I’ve seen comes from a Help system.)


  • http://www.write2help.com Just Plain Karen

    Tom, as usual, I enjoy reading your perspective on things, and this might be a little off-topic, but I think a 3700-word blog post is just heinous. (And I wouldn’t be bringing this up if I hadn’t already noticed your posts in general seem long.) I don’t have time to read something this long, and I hate reading long articles online. However, I don’t want to skip your posts for this reason alone. Any chance you might consider tightening down your writing to, say, 1000 words or less? Just my $.02

    • http://idratherbewriting.com Tom Johnson

      Thanks for your feedback, Karen. I wouldn’t say long articles online are “heinous.” The blog format just doesn’t support long-form content very well. I recommend using apps such as Instapaper to save the content to another format that may be more suitable to lengthy reading. This is how I tend to read posts. When I’m in scan mode and don’t have a lot of time, I’ll save articles to Instapaper to read later. For example, Jonatan’s post, linked to in the comments here, was one of them. This morning I woke up early and couldn’t go back to sleep. This was a perfect time for me to read the more lengthy post on his site. I read it via my Instapaper app on my iPhone while curled up on my bed.

      About length: I sometimes want to dig more deeply into topics than a 500-to-1,000-word post allows. I once read that the most-shared articles average about 1,600 words, which is longer than typical blog posts. Based on your feedback, though, I’ll probably try to keep my posts to about that length and break longer explorations into a series. Thanks for being candid.

      • http://www.write2help.com Just Plain Karen

        Thanks for the reply, Tom. I’ll check out InstaPaper.

        I’ve enjoyed the discussion between you and Mark on my comment. I still maintain that the article is too long for a blog post, and I can’t imagine subjecting a user to a 3700-word help topic, but maybe that’s a reflection of how I personally like to read and process information online.

        • http://everypageispageone.com Mark Baker

          Well, I would prefer not to subject a user to a 3700-word help topic either. But suppose there is a case in which describing the task in a useful manner genuinely requires 3700 words. What then is a better alternative than a 3700-word help topic?

        • http://idratherbewriting.com Tom Johnson

          Karen, thanks for your comment and welcoming attitude toward the discussion. Question: Can long-form content live online, and if so, what shape does it take? Or can long-form content only live in offline books? If it can only live offline, how do you get around all the shortcomings of the book format?

    • http://everypageispageone.com Mark Baker

      Karen, I think it is certainly true that the longer a piece is, the fewer people will read it. This is because the less important something is to me, the less time I will devote to it. A longer piece will only be read by people who find it really important to them.

      As a writer, though, I am not so much interested in being read by a lot of people who don’t think what I have to say is important. I want to be read by people, whatever their numbers, who do think what I have to say is important.

      As a reader, I am always disappointed if an article treats something that is important to me in a brief or superficial way. If it’s important, I want more.

      So, while Tom’s hit count numbers may go down for longer pieces, his real influence may increase with those who think what he is talking about is important. (I’ve blogged about this in the past: http://everypageispageone.com/2011/12/07/why-analytics-may-mislead/)

      My view is that any piece of writing should be as long as it needs to be to say what it has to say, neither more nor less.

      • http://idratherbewriting.com Tom Johnson

        Karen’s response actually, coincidentally, ties in with the topic of the post. She wants shorter, more concise articles, but the author (me) thinks that chopping this article up into, say, three separate posts would lose the thread of continuity, so I leave it as one single 3,700-word post.

        With help topics, what if it really takes 3,700 words to explain a feature in full? Well, the reader may have the same reaction as Karen and call it heinous. Okay, so let’s break up that help topic into three separate topics. However, the three have such a close relationship with each other that they are no longer EPPO topics. They are more like a sequential trio of articles.

        This is the balancing act of writing in the EPPO style. The topics will probably run longer to provide context and stand alone, but they can’t be too long, because the online format doesn’t work well with long content. On the other hand, shortening topics goes against the EPPO guidelines: stripped-down topics don’t give readers enough context to jump in at non-linear points without confusion.

        • http://everypageispageone.com Mark Baker

          Well, we shouldn’t be explaining features; we should be supporting tasks. But if it really takes 3700 words to support a task adequately, then that’s what it takes. I can’t see how splitting it into three pages connected with back/next links makes it less heinous; it just makes it heinous plus two clicks.

          It is also more heinous if my search results send me to the third page and I have to back up two pages to get to the beginning. And it is more heinous still if it is broken up into three discontinuous topics that are not linked, and which I have to find separately before I get the information I need.

          Some people might certainly find it heinous that a task takes 3700 words to explain, but that is the fault of the task (or the user), not the topic. A topic of 1200 words that did not fully support the task, but left the user hanging or guessing would certainly be heinous in its own right.

          I think what really happens, though, with tasks of that level of complexity, is that more people decide to seek help from more experienced colleagues rather than try to figure it out from the text. The longer text is read by fewer people, but the information it contains reaches less patient readers by other means.

          It is not about being read by the most people, but being read by the right people. It is not about traffic, it is about influence.

          In any case, by this measure, a 70,000-word book would certainly be more heinous, and a 250,000-word Frankenbook more heinous still.

          All that said, if the right length for a topic to cover its material is 3700 words, that is how long the EPPO topic should be in its source. If someone decides to split it into three pieces with back and next links on output, that is purely a formatting decision — a wrong formatting decision, in my view — but just a formatting decision.

          And that said, I do think any EPPO system should implement a review threshold that flags long topics for review to see if they really need to be that long. There should also be a review threshold for really short topics for the same reason.
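Such a review threshold is straightforward to sketch: count the words in each topic and flag outliers at both ends for editorial review. The thresholds and topic data below are invented examples, not a prescription:

```python
# Illustrative review thresholds: flag topics whose word counts fall
# outside a chosen band so an editor can decide whether they really
# need to be that long (or that short).

def flag_for_review(topics, min_words=300, max_words=3000):
    """Return (title, word_count, reason) for topics outside the band."""
    flagged = []
    for title, text in topics.items():
        count = len(text.split())
        if count < min_words:
            flagged.append((title, count, "suspiciously short"))
        elif count > max_words:
            flagged.append((title, count, "suspiciously long"))
    return flagged

# Stand-in topic bodies; repetition just produces a known word count.
topics = {
    "Resetting your password": "word " * 120,   # 120 words
    "Migrating a cluster": "word " * 3700,      # 3700 words
    "Configuring backups": "word " * 900,       # 900 words
}

for title, count, reason in flag_for_review(topics):
    print(f"{title}: {count} words ({reason})")
```

The point of the flag is a human review, not an automatic cut: a 3,700-word topic may survive review if, as Mark argues, that is genuinely what the task requires.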

  • Pingback: A Few Notes from Too Big To Know | I'd Rather Be Writing

  • Mark Leonard

    Mark, Do you have any links to product or enterprise websites that produce EPPO content, or something similar, that you like? Wikipedia is a great example but I’d love to see examples for an enterprise product or suite of products.

    This is a great discussion. Thanks all.

  • Pingback: Podcast: Include It All, Filter It Afterwards — Interview with Mark Baker | I'd Rather Be Writing

  • http://lopascribes.wordpress.com Lopa

    A very helpful and thoughtful post! Thanks Tom!

  • Pingback: Knowledge Has a New Shape, and It’s Not the Book | I'd Rather Be Writing
