The genius of Github and how it can transform tech comm
In this post, I explore Github and Github Pages. Github helps you put your development platform on the web; Github Pages renders your site directly from your code repository, blurring the distinction between a revision control repository as a storage container and a web publishing engine.
The genius of Github
In one of the recent podcasts on API documentation, I interviewed Joe Malin, an experienced API doc writer (and former programmer) who spent 7 years at Google and many other years at other Silicon Valley companies creating API documentation. One of the things Joe said that really got me thinking was how brilliant an innovation Github is.
I transcribed this part of the interview (it's around the 19-20 min. mark of the podcast):
Joe: ... Code samples, literal code files, also have a place. I think they're going to get a lot more popular. One of the main reasons is the existence of things like Google Code and Github and similar things that are basically open repositories of code for people to jump into and grab. Github, in my humble opinion, is one of the most revolutionary things that has happened to software in 20 years.
me: Why do you say that?
Joe: Because it basically gives you the ability to grab a repository of code that is under development, which means that if the developer updates the code, you can go in and pull down the updates. It's not just giving you code samples. It's giving you living code samples. And it's organized in a way, if the developer does it right, that allows you to build the code on your system, and not just grab the code sample.
Github has a lot of other features that make it easy to use. You can associate a wiki with your Github site. You can associate Github Pages, which allow you to basically put your entire developer experience into Github. You can have these things called Gists, which are code samples associated with your site. It's a brilliant idea and tremendously well implemented.
Android is using it to host code samples now, with the result that if you get the latest way of building Android apps, called Android Studio, you can get code samples that download into Android Studio as sample apps, and they come out of Github (I think, either Github or Google Code) -- in any event, they come down from a repository. If somebody at Google goes about updating them, you can find out about it and get the latest and greatest.
Here's the same two-minute excerpt from the podcast in audio form in case you want to listen to it.
If you want to listen to the whole podcast, go to the podcast episode.
Why Github is such an innovation
To summarize and expand a bit on Joe's comments, Github is one of many services where programmers upload, share, and version code in revision control repositories (in Github's case, repositories managed with Git). Almost every IT shop uses some form of revision control software for its source code. It may not be Git or Github. Maybe it's Mercurial, Perforce, Subversion, or something else. But programmers invariably use it.
Among revision control repositories, why is Github so special? In theory, any revision control repository might lay claim to a similar innovation. But Github combines code repositories with the web and open source software. You can create a public Github repository for free, for any project.
These repositories cater especially to open source projects, because you're putting your code online in a way that makes it easy for people to access and use the code (which is often the point).
Developers regularly push updates to the code repository. Other people can clone and fork the repositories. Forking a repository means that once you get the code, you build and branch it in a unique way to suit your own needs.
You can also run pull commands to get the latest updates -- the updates download almost instantly to the same folder where you initially cloned the repository. The file transfers are fast because you're not actually downloading new files -- you're downloading changes to the original files you already downloaded.
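As a rough sketch of the clone-then-pull cycle described above, the following Python snippet wraps the two Git commands involved. The repository URL and folder name are hypothetical placeholders, the helper function names are my own, and the snippet assumes Git is installed and available on your PATH.

```python
import subprocess

# Hypothetical placeholders, not a real project; substitute your own values.
REPO_URL = "https://github.com/example-user/example-repo.git"
LOCAL_DIR = "example-repo"


def clone_command(repo_url, local_dir):
    """Build the command that copies a remote repository into a local folder."""
    return ["git", "clone", repo_url, local_dir]


def pull_command(local_dir):
    """Build the command that fetches and merges new changes into an existing clone."""
    return ["git", "-C", local_dir, "pull"]


def run(command):
    """Run a Git command and raise an error if it fails (assumes Git is installed)."""
    subprocess.run(command, check=True)


# Example usage (uncomment and point REPO_URL at a real repository):
# run(clone_command(REPO_URL, LOCAL_DIR))  # one-time, full download
# run(pull_command(LOCAL_DIR))             # later: fetch only what changed
```

The clone happens once. Every later update is a pull run against the same folder, which is why it feels so much faster than downloading the project again.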
Hosted revision control services like Github provide an easy mechanism for sharing code bases and keeping others updated when there are changes. More than any other revision control host, Github enables open source projects to share code in a central hub where all team members can see, share, and update code.
Using revision control to manage code projects is a process that software developers have followed for years. It makes it easy to share and collaborate on code with others -- much easier than checking files in and out of a proprietary CMS or storage system somewhere. Github enables a phenomenon of "social coding," as they call it.
File formats for revision control
Revision control software works best with text files, that is, the type of file you can open and read in a text editor. When you have binary files (machine-readable only), revision control software isn't so good. For example, suppose you're sharing an Illustrator file, or an audio file, or a PDF. I don't think revision control can intelligently show you the differences between versions in a way that is human readable.
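To make the point about text files concrete, here is a minimal sketch of a line-by-line comparison between two versions of a text file. It uses Python's standard difflib module, not Git itself, and the topic text and file labels are made up for illustration.

```python
import difflib

# Two versions of a hypothetical help topic stored as plain text.
old_version = [
    "Click Save to store your changes.\n",
    "Click Cancel to discard them.\n",
]
new_version = [
    "Click Save to store your changes.\n",
    "Click Discard to throw them away.\n",
]


def text_diff(old_lines, new_lines):
    """Return a unified diff of two lists of lines as a single string."""
    return "".join(
        difflib.unified_diff(
            old_lines,
            new_lines,
            fromfile="topic.md (old)",  # labels only; no files are read
            tofile="topic.md (new)",
        )
    )


diff_text = text_diff(old_version, new_version)
print(diff_text)
```

Removed lines are prefixed with a minus sign, added lines with a plus sign, and unchanged lines with a space. A binary format offers no comparable line structure to compare, which is why this kind of readable report is hard to produce for it.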
If you switch away from binary file formats to authoring in text file formats, why not use a repository like Github to share files? In fact, using revision control repositories instead of a content management system was one of the best discussions I had at TC Camp. There are a lot of technical writers, especially those using DITA (which uses a text-file format), who are using revision control software instead of a CMS.
Programmers often have complicated workflow scenarios, with different branches, access permissions, and build configurations. I think managing content projects (instead of code projects) might actually be simpler. Why aren't more technical writers using revision control software to manage and collaborate on content? If you're using a text-file format, there's no good reason not to treat content with the same workflow as code.
Beyond replacing the CMS, the online code repository enables us to see and share code in new ways. You go beyond simple code samples and get live, working code for fully developed systems.
Beyond replacing the CMS
Instead of just sharing content or software code, what if we shared code for our help systems themselves? By this I mean the help themes, templates, or publishing engines we're using (in short, the help software) for publishing documentation.
Here's an example. I'm developing a Jekyll theme for documentation sites. The Github repository for my Jekyll theme is here: https://github.com/tomjoht/jekyll-doc. This is very much a work in progress. You can clone the site to your local computer, and then you will get all the files needed to run the same Jekyll theme. You can use this clone as is or fork your own version of the theme, adding it to a different repository.
In a recent post about ScrollSpy, I mentioned that GetBootstrap.com was built on Jekyll. You can see the code for their site in Github here. You can download the entire site and run it locally on your machine, because their documentation is built on Jekyll and made available on Github. Other static site generators (such as Docpad) would allow the same sharing and cloning of content.
Imagine if help platform development were open source
Now imagine if thousands of technical writers across the globe were using different themes for their documentation, and they stored these themes in Github repositories. You could easily clone a repository and see how someone is publishing their help content. We could all use and piggyback on each other's code. Or people could just jump-start their own doc projects by cloning a theme they like.
I'm not saying Github repos will replace vendors, because there are clear limitations on the size and scale of what you can do here (for example, Github doesn't host databases or provide authentication systems for your site -- the repositories just hold files). But as more tech writers begin using static site generators for their help, they can clone and share themes in ways that will help promote continued innovation.
The upward spiraling workflow
In this open source, constantly iterating workflow, the web dynamic looks a bit like this. One person puts project A out there, and tomorrow person B clones it, forks it, and improves it. Person B shares the forked repository online, and person C clones and forks it, making more improvements and sharing them. And so on. By piggybacking on each other's efforts and code, we can keep pushing toward a more advanced, robust solution for help content.
By the time person Z comes along, this whole help solution has already been built like a Ferrari. It's just a matter of cloning and customizing it through iterations that successively improve it.
This is the dynamic of the web applied to living code repositories. WordPress is a great example of this dynamic, with thousands of developers contributing to a vibrant community of plugins, themes, and other open source code.
When we apply it to the way we publish help content, we can see how revolutionary this innovation can be. It has the potential to make proprietary solutions unnecessary and empower lay technical writers to plug into robust help publishing systems that enable them to do content re-use, snippets, variables, conditional processing, and more in a professional, easy-to-implement way.
Code repository and publishing engine in same package
One particular innovation worth noting is Github Pages. If you look at my Github repository, you just see a bunch of code files. If you use Jekyll, behind the scenes, Github will actually build your Jekyll site each time you push to your repository. You don't have to build your site first and then push the built site to the repository (though you can do this too, especially if you're using plugins not allowed by Github Pages).
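As a hedged sketch of what "push to your repository" involves on the writer's side, the snippet below lists the three Git commands that stage, commit, and push an edit from a local clone. The folder name and commit message are hypothetical, the function name is my own, and the snippet assumes Git is installed and the clone already points at a Github repository with Github Pages enabled. The site build itself happens on Github's side and isn't shown here.

```python
import subprocess

# Hypothetical local clone of a Jekyll site; substitute your own folder.
SITE_DIR = "my-jekyll-site"


def publish_commands(site_dir, message):
    """Build the Git commands that send local edits to the remote repository."""
    return [
        ["git", "-C", site_dir, "add", "--all"],           # stage every changed file
        ["git", "-C", site_dir, "commit", "-m", message],  # record the change locally
        ["git", "-C", site_dir, "push"],                   # upload the commit to the remote
    ]


def publish(site_dir, message):
    """Run each command in order, stopping at the first failure."""
    for command in publish_commands(site_dir, message):
        subprocess.run(command, check=True)


# Example usage (uncomment inside a real clone):
# publish(SITE_DIR, "Add a page to the navigation bar")
```

There is no separate build or upload step in this list: the push is the publishing action.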
Here's a quick demo. Suppose I want to make a small update to my navigation bar, adding a page. The process takes about 20 seconds.
About Tom Johnson
I'm a technical writer based in the San Francisco Bay Area of California. In this blog, I write about topics related to technical communication: Swagger, agile, trends, learning, plain language, quick reference guides, tech comm careers, academics, and more. I'm interested in simplifying complexity, API documentation, visual communication, information architecture and findability, and more. If you're a technical writer of any kind (professional, transitioning, student), be sure to subscribe to email updates. You can learn more about me here. You can also contact me with questions.