EServer TC Library: The Most Popular Technical Communication Website in the World
EServer TC Library (tc.eserver.org), an indexed library of technical communication articles, is the most popular technical communication website in the world, according to Alexa.com, a site that measures web traffic and Internet reach. The EServer TC Library doesn't produce original articles itself, but rather has a team of indexing scholars adding links to worthwhile content across the web.
The indexers sort the content into twelve categories and add metadata such as the author, publication, and year to each work.
You can browse the library in various ways: you can search for keywords; click links in tag clouds; see the latest additions; or navigate by author, publication, year, language, or other metadata. You can also rate each article's quality and subscribe to the site's RSS feed.
According to Alexa.com, the EServer TC Library gets more visits than any other technical communication site. For example, if you compare the reach (total Internet users) of tc.eserver.org with idratherbewriting.com and stc.org, the result looks like this:
In short, the EServer TC Library dwarfs all other tech comm sites. Granted, EServer TC Library is a library, which people primarily use to browse content located elsewhere, so it's perhaps not in the same category as the other sites. Still, the sheer amount of traffic is impressive. With 17,000+ visitors and 100,000 hits a day, the site has a huge influence on the technical communication community.
I caught up with Geoff Sauer, the creator of the EServer TC Library, and chatted with him over email. Geoff is an assistant professor in the Rhetoric and Professional Communication program at Iowa State University.
I'm continually amazed by the quality of information that you add to the EServer TC Library.
You're very kind.
We started the site in 2001 because a few of my evening MA students were concerned about how few colleagues at their workplaces ever read peer-reviewed journals, or even had access to search for scholarly articles. Then, talking with their professors, a few of my students said they were surprised by how little the faculty read practitioners' writing online and in trade journals.
We originally envisioned a database-driven index -- a bit like a library database -- intended to help our academic readers find quality works by practitioners, and to permit our practitioner readers (who might not otherwise come across peer-reviewed articles) to see what scholarly research might address their questions. There are historic reasons for the dichotomy (which we don't underestimate), but it's always nice to hear from people who appreciate what we're trying to do.
How do you find all of the content for EServer TC Library? Searches from Google? Special search engine tools?
I'm the son of an English professor and an academic reference librarian, and I learned from my parents and during my Ph.D. work how to find new things to read, always. I don't use Google much, except for casual searches, but I browse the web frequently, read online extensively, and often follow links from sites I like. Several times a week I come across caches of new-to-me content which I feel might be useful for our readers (and for my students).
But the new works in the TC Library aren't all added by me; the TC Library has both an advisory board and an editorial board (see http://tc.eserver.org/about/) and these fifteen people consistently add works to the catalogue too, as do many of their (and my) students and former students. The site was designed in April of 2001 to encourage everyone to add works to the index. The 'Add a Site' link at the bottom of every one of the hundred thousand or so pages of our site enables anyone online to add sites to our index, particularly those they feel contribute significantly to the field. We also have a bookmarklet for those who are serious about adding new works to our site. So a great deal of the content on our site is added by technical communicators, not by me. Since 2001 we've been using all the Web 2.0 techniques we could to make the site inclusive for volunteers from the entire tech comm community.
You know how this works; you do the same thing with your Writer River site.
We'd like to make our site better at this, though. We're always encouraging our frequent users to help us improve the interface, to make it easier to add works. We have a discussion forum designed mostly just for this. Please encourage your readers to post, if they have suggestions or ideas about how to help us make it easier for people to post works we don't yet have in our index.
The EServer also has agreements with several academic publishing firms, which have FTP accounts on our servers that let them upload XML metadata whenever new issues of their tech comm journals are published (something like RSS feeds, but using more sophisticated schemas such as the SAGEmeta system, PRISM, and the Journal Archiving and Interchange DTD). This gives us articles from dozens of peer-reviewed journals. Some custom XSLT scripts I've developed then import this metadata into our database, so we can add peer-reviewed scholarly articles to our catalogue quickly and easily.
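As a rough illustration of that import step, here is a minimal sketch. Geoff describes XSLT scripts; this example swaps in Python's standard xml.etree.ElementTree module to show the same idea. The record format, element names, journal name, and URLs are all hypothetical and much simpler than SAGEmeta, PRISM, or the Journal Archiving and Interchange DTD, and the parse_issue function is an invented name, not part of the TC Library's code.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified publisher record; the real schemas are far richer.
SAMPLE_FEED = """
<issue journal="Example Journal of Technical Writing" year="2008">
  <article>
    <title>Indexing Practitioner Writing</title>
    <author>A. Writer</author>
    <url>http://example.org/articles/1</url>
  </article>
  <article>
    <title>Usability and Documentation</title>
    <author>B. Editor</author>
    <author>C. Scholar</author>
    <url>http://example.org/articles/2</url>
  </article>
</issue>
"""


def parse_issue(xml_text):
    """Turn one issue's XML into a list of flat catalogue records."""
    root = ET.fromstring(xml_text.strip())
    records = []
    for article in root.findall("article"):
        records.append({
            "title": article.findtext("title"),
            "authors": [a.text for a in article.findall("author")],
            # Issue-level attributes are copied onto every article record.
            "publication": root.get("journal"),
            "year": int(root.get("year")),
            "url": article.findtext("url"),
        })
    return records


records = parse_issue(SAMPLE_FEED)
for record in records:
    print(record["year"], record["title"], "-", ", ".join(record["authors"]))
```

Each record carries the same kind of metadata the indexers add by hand (author, publication, year), which is what would let a batch of journal articles drop into the catalogue in one pass.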
I notice that the content is often added in spurts, such as a dozen at a time, and all in one category. What's your process?
The system was built to encourage anyone on the Internet to add works whenever they wish. But that means, of course, that I can't predict when we'll get lots of new works and when we only get a few.
Some members of our advisory board at first, in 2001, worried that too-open access would be problematic. George Hayhoe, for example, wanted our site to employ good judgment to ensure the quality of works we indexed; he argued that was more important than quantity. But as our site developed, a few of us felt that, in conjunction with the 'rating' feature, this would be somewhat self-correcting. The Web 2.0 genre which emerged at this time tends to support this theory. Today, we tend to believe that it's a good thing for people to see that articles which advocate positions a majority of readers disagree with are low-ranked; in some ways, that's better than not including idiosyncratic articles at all.
To answer your question completely honestly, though, whenever I find a site that seems to have a good number of viable articles for the TC Library, I bookmark it on my laptop. Then, when I have some spare time in the evening (in front of the TV, for instance), I go down the list and add a few more works to the site, until the show (or my free time) ends. That may help to account for the trend you've noticed, when ten to twenty works are added in about an hour. That's often me. :)
How do you keep all your content from becoming a jumbled mess? Beyond the basic categories, are you adding tags? Do related results appear by keywords? How are you managing this sea of information?
I feel strongly that the fields of technical communication are not well-organized, overall. The STC likes to term technical communication a "profession," but my reading in the fields leads me to believe that this is a goal, rather than a present reality. Is usability a part of technical communication, or a separate profession? Arguments can be made both ways. Many technical communicators consider lots of diverse forms of work a part of TC, but when I talk with people who work in many specialties included in TC textbooks, often they don't agree. I think there's work to be done before we can define the clear subcategories of the "field" of tech comm.
Our category system attempts to work with that disparate collection of expertises.
When we first built the TC Library in 2001, three students and I developed a system of categories for each work that uses a four-level hierarchy of metadata categories. So a work may be classified as "Articles>Information Design>Databases/Case Studies", and it would then appear in directory views for any of those four categories (or any view that includes one or more of them).
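To make the classification idea concrete, here is a small sketch of how one category string could place a work in several directory views at once. It assumes that both ">" and "/" separate category names, which is a guess based on the example above; the function names and the second sample work are hypothetical, and the TC Library's actual database design is not described in the interview.

```python
from collections import defaultdict


def categories_for(path):
    """Split a path such as 'A>B>C/D' into its individual category names."""
    names = []
    for level in path.split(">"):
        # Assumption: a slash lists sibling categories at the same level.
        names.extend(part.strip() for part in level.split("/"))
    return names


def build_directory(works):
    """Map each category name to the titles of the works filed under it."""
    directory = defaultdict(list)
    for title, path in works:
        for name in categories_for(path):
            directory[name].append(title)
    return directory


works = [
    ("Hypothetical case study", "Articles>Information Design>Databases/Case Studies"),
    ("Hypothetical tutorial", "Articles>Usability>Methods"),
]
directory = build_directory(works)
print(sorted(directory))
print(directory["Articles"])
```

With this structure, a view for "Articles" lists both works, while a view for "Databases" lists only the first.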
That was our first idea for a taxonomy, in 2001 (when we estimated the site might someday contain about 5,000 works). Today, these categories include 408 discrete phrases. In early 2008 we added a tag cloud interface, where you can see how many works use each of those 408 terms.
In 2006, two colleagues in the field, JoAnna Springsteen and Saul Carliner, began a project to rethink our taxonomies and develop a better scheme for our works. Unfortunately, after a few months of work, they weren't able to agree on any alternative scheme that seemed to work better for us, and discontinued their work. You can try card-sorting as much as you wish, but when you look empirically at thousands of works, any system of categories is difficult to impose on all of them. Our fields are simply too diverse, still.
So we've continued with the same system we began with. If any of your readers have a suggestion for us, we'd be delighted to reconsider. We're watching with interest the work of the STC Body of Knowledge group. Though some members of our boards (including me) have some strong reservations about some of the choices in the STC BoK's draft map of "the field," we feel this sort of work, when done well, is much needed, and definitely a contribution to the discussion.
We have a small team that always works on our interface, to try to make our categories clear for users. Right now our home page doesn't represent these categories very well. It shows twelve top-level categories in bold, with 133 subcategories underneath them. But we're always working on prototypes of more effective home page interfaces (we usually keep the most promising at http://tc.eserver.org/beta/). Again, we always invite discussion from your readers. It's a discussion I feel the field needs to have.
Does the voting process make the good content rise to the top? In searches? Digg-like sorting? How many people vote on your articles?
Our system allows users to decide how they prefer to have directory views sorted. By default, all category views show as "unsorted" (in the beige select menu centered at the top of the directory listing). Our readers can choose whether to sort by author, date added (with the newest at the top), by user rating, title, or year published. I myself usually keep them sorted by rating. Clicking the 'Preferences' link just above that allows us to set lasting cookies with our favorite view.
But to answer the rest of your question, I've been working for a while on an article about how we in tech comm tend to rate our work. In brief, we have an open five-point Likert scale which permits anyone to post a 1 (poor) to 5 (best) rating for any work in our index. I have written an algorithm, which we run monthly, to eliminate multiple votes from the same person; it uses a few voting patterns we've noticed that would seem to bias the ratings of particular works. After removing spurious ratings, we average about 4.26 votes per work in our index.
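The interview doesn't spell out how that cleanup works, so the following is only a sketch of one simple approach: keep each voter's most recent rating for each work and discard the rest. The voter key is a hypothetical identifier (the site doesn't require accounts, so how "the same person" is recognized is not stated), and the function names are invented for illustration. The pattern-based checks Geoff mentions are not modeled here.

```python
from collections import defaultdict


def drop_repeat_votes(votes):
    """Keep only the last rating each voter gave each work.

    `votes` is a list of (voter_key, work_id, rating) tuples in the
    order they were cast. The voter key is a hypothetical identifier.
    """
    latest = {}
    for voter_key, work_id, rating in votes:
        if not 1 <= rating <= 5:
            continue  # ignore values outside the 1 (poor) to 5 (best) scale
        latest[(voter_key, work_id)] = rating
    return latest


def mean_rating_per_work(latest):
    """Average the surviving ratings for each work."""
    by_work = defaultdict(list)
    for (_voter, work_id), rating in latest.items():
        by_work[work_id].append(rating)
    return {work: sum(r) / len(r) for work, r in by_work.items()}


votes = [
    ("voter-a", "work-1", 5),
    ("voter-a", "work-1", 5),  # repeat vote, collapsed
    ("voter-a", "work-1", 4),  # latest vote from voter-a wins
    ("voter-b", "work-1", 2),
    ("voter-b", "work-2", 3),
]
clean = drop_repeat_votes(votes)
print(mean_rating_per_work(clean))
```

Keeping the latest vote rather than the first is a design choice made for this sketch; it lets a reader change their mind without counting twice.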
At present our system has a strong bias toward ratings of '3 (good)' (29.7% of ratings); after that comes '2 (ok)' (18.8%), '4 (great)' (18.0%), '1 (poor)' (16.8%) and '5 (best)' (16.7%). A fairly balanced distribution, heavily weighted toward the center (for some reason). I'll provide some reasoning about why this is the case in the article itself, which I hope will be published in 2009.
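A quick check of those figures shows what "weighted toward the center" means in practice. The snippet below uses only the percentages quoted above; the variable names are just for illustration.

```python
# Share of ratings at each point on the scale, as quoted in the interview.
shares = {1: 16.8, 2: 18.8, 3: 29.7, 4: 18.0, 5: 16.7}

total = sum(shares.values())
# Weighted mean: each score counts in proportion to its share of ratings.
mean = sum(score * pct for score, pct in shares.items()) / total
middle_three = shares[2] + shares[3] + shares[4]

print(f"total share: {total:.1f}%")
print(f"mean rating: {mean:.2f}")
print(f"ratings of 2-4: {middle_three:.1f}%")
```

The shares add up to 100%, the mean rating comes out at about 2.99 (almost exactly the midpoint of the scale), and roughly two-thirds of all ratings fall between 2 and 4.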
How many people do you have looking for this content?
We're an all-volunteer organization, with no revenue except for donations from generous people who wish to support us (I suppose I'll shamelessly add a URL here, http://tc.eserver.org/about/donate.php , where people can make donations if they wish). We won't run advertising. Because of this, we don't have "people" per se. I have some graduate students who believe in the project and try to help by adding works when they have some spare time. We have dozens of users who add their own works to our catalogue when they post new ones to their sites, to help publicize their writings. And we have volunteers who find and add quality works as a service to the field(s). But I can't tell you how many; you don't need an account to add a work to our site, so we don't track how many, exactly.
Surely you must feel motivated to write about some of these articles you find. Do you have a blog? Do you have another venue where you write about these topics?
I've never really enjoyed the blog format for my own writing. While I find them easy to read, blogs tend to be too short to produce much coherent argument about a topic. While I sometimes find them entertaining, for my own writing I tend to prefer longer, more careful arguments about a problem, its origin and history, and what we could/should/might do to address it. I go up next fall for tenure here at Iowa State University, and the cliché about "publish or perish" will be tested. The standard here for tenure is whether assistant professors (like me) have made an "impact on the profession." We'll see whether the senior faculty and administrators at Iowa State will credit my work on the TC Library site when considering my tenure case.
But it's unclear under the tenure system how much actual reward we will receive for such work. So when I write, it's mostly for scholarly conferences and peer-reviewed journals, and I'm working on a book project right now.
I do tweet (http://twitter.com/geoffsauer), if people care to follow extremely-short posts about things that happen. If people 'friend' me on Facebook, they will also occasionally hear about EServer and TC Library-related events and news.
How old is the technology that's running the EServer TC Library site? Are you running into any kinds of limitations?
We always need funds for equipment. The EServer itself runs fifty collections, of which the TC Library is only one. And the EServer's technical board (which I chair) keeps an up-to-date 'wish list' of equipment we'd purchase if we were to receive a grant or donation.
But at present, the TC Library database is running on a 2008 eight-CPU x 3.2GHz Core2 Duo Macintosh Pro workstation, which is about as fast as any computer a website in the field is likely to have. We're running a RAID 5 array we built ourselves (I teach students how to build RAID arrays -- it's much cheaper than purchasing such equipment pre-built). And it's got a gigabit ethernet connection to the Iowa State University OC-12 lines, which gives us respectable speed for a one-location, small-server setup.
We always need better technology; a second server workstation would allow the site to use a round-robin domain setup, permitting the site to remain 'up' during some necessary maintenance for the servers. It would also permit the site to remain faster when being used by multiple students in classes. (We use the TC Library in a number of English 302 'Business Communication' and 314 'Technical Communication' courses at Iowa State -- we teach 117 of these a year.) So we'd definitely like to add to our hardware.
But we're not unhappy with the infrastructure we have.
For as much as the EServer TC Library is used, does the recognition it receives match up?
I'm a scholar of new media and know a great many people in the field, and I don't know anyone who feels they receive "sufficient" recognition for their online work. But I've talked with enough people to see the pattern, and to avoid believing I have that problem -- too much. :)
We did receive an APEX Award of Excellence in 2008, which was nice. More than fifty academic libraries link to us as a recommended resource in writing, technical communication, or human-computer interaction, which is very rewarding to me. My local chapter of the STC (the Central Iowa Community) has expressed an interest in nominating me as an STC Associate Fellow, which would also be nice. The 2006 Report of the STC's Education Task Force suggested that the STC should work with the TC Library team to think about how to develop a comprehensive directory of online resources (they haven't, but it's nice to know the committee recommended us). You published the podcast interview with me in Tech Writer Voices in 2006, and there was a nice interview with me in 2007 in the International Journal for Technical Communication. Someone wrote a Wikipedia article about me in 2007, and now you're interviewing me for I'd Rather Be Writing, which is also nice. I'd say by any objective measure that we're well-recognized.
I'd personally like to have a bit of a discretionary budget so that I or members of our boards could attend more practitioner conferences -- STC, DocTrain, Watson, IEEE PCS, and others. That would help us to network more, and to raise our visibility among technical communicators. I'd also like to improve our relationship with organizations in the field, but that's on our wish list for the future. For the time being, we're working to improve the user interface and the experience design of the site, and hope that our users will tell their friends.
How does your purpose, "Our goal is to provide tech comm practitioners, students, teachers and managers a comprehensive single location from which to find and access the complete body of knowledge in our field," compare with the Body of Knowledge initiative by the STC?
Well, I've been following reports about the STC BoK project with some interest, as well as the Usability Professionals Association's seemingly now-moribund 2005 UPA "beta" BoK website. Obviously we both preceded the STC's initiative by a number of years, which suggests there may be a zeitgeist (a "spirit of the age") which tends toward the idea.
A number of people have asked me about similarities between the STC BoK and the TC Library project, but I haven't seen much resemblance. It seems to me that the STC BoK is using card-sorting exercises to develop a deductive map of the field, and the TC Library is attempting to categorize the works in the field, generating a map inductively. Between the two projects, we may be able to help to answer the questions the field still needs to address in the next few years. But we're using very different methods.
I'm always a bit cautious about Body of Knowledge ventures, though. In my experience, they're often associated with certification, creating maps of the "key" knowledges in a field so that -- soon after -- exams can be developed to determine how well-versed practitioners are in "the profession." This tends to be significantly revenue-generating for professional organizations which provide such a service. If the STC's BoK might enable such a plan, I'd be very skeptical. Try to imagine a test which could determine, effectively, whether any five people you know in the field knew "the central" tenets of our field. Ask five people you know what five "central tenets" might be. Try to imagine an examination that could equally test people with specialties in copyediting, technical writing, usability, content management, and technical illustration. I just don't believe tech comm currently has such homogeneity, and worry about any attempts to make real any single vision of what the "key" knowledges of our field are.
But I wasn't asked to be a member of the STC's BoK project, and I don't know if that's part of their mandate. I certainly think attempts to articulate the relationships among and between various branches of knowledge in our field would be a useful thing. As long as we accept them only very skeptically. :)
- Rendezvous with Knowgenesis (interview with Geoff about the TC Library)
- EServer TC Library RSS Feed
- Geoff Sauer's home page