Does Translation Mean You Should Omit Illustrations?

One can hardly dismiss the power of visuals. One of the oldest truisms in communication is that a picture is worth a thousand words. Instead of lengthy text, we praise infographics, diagrams, workflows, and other visual illustrations that communicate ideas. (See this collection of New York Times infographics.)

In Visual Language: Global Communication for the 21st Century, Robert Horn’s main premise is that the combination of text with visuals creates a powerful form of communication.

The other day a colleague, a graphic designer, told me she doesn’t read text on web pages; she moves right to the visual, she said — proudly.

In short, large blocks of text suck. Visuals rock. (See Nation Shudders At Large Block of Text.)

The Problem

Despite this triumph of visual communication, when it comes to technical documents that need translation, visual communication is problematic. If you’re translating the document into 10 languages, every screenshot you use requires translation as well. One screenshot becomes ten. Ten screenshots become 100. One hundred screenshots become 1,000.

The screenshots don’t just need translation, either. You also need access to other operating systems, the ability to maneuver around in other languages to reproduce the scenario behind each screenshot, and so on. For example, try reproducing an error message screenshot in 10 languages.
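To put rough numbers on the multiplication, here is a minimal Python sketch. The function name and the sample counts are illustrative assumptions, and the sketch assumes every screenshot has to be recaptured once per target language, with no reuse between languages.

```python
def localized_screenshot_count(source_screenshots: int, target_languages: int) -> int:
    """Return how many screenshots must be produced in total.

    Assumption (not from any real tool): each source screenshot is
    recaptured exactly once for every target language.
    """
    if source_screenshots < 0 or target_languages < 0:
        raise ValueError("counts must be non-negative")
    return source_screenshots * target_languages


if __name__ == "__main__":
    languages = 10  # hypothetical number of target languages
    for screenshots in (1, 10, 100):
        total = localized_screenshot_count(screenshots, languages)
        print(f"{screenshots} screenshot(s) in {languages} languages -> {total}")
```

The total grows linearly with both inputs, so each additional language adds as many new screenshots as the English document contains.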

Let’s say you forgo screenshots and instead stick with illustrations only. Although it’s possible to communicate basic ideas through wordless shapes, you may end up with the equivalent of Pictionary scribbles. Conversely, if you remove illustrations and other visuals, you end up with a text-heavy encyclopedia.

So it seems that technical communicators have the following options:

  • Pursue a costly route of image-based text, which may result in attractive but expensive and time-consuming documents.
  • Strip out all images and deliver text-heavy documents, which users may despise.
  • Deliver documents with mysterious and perplexing wordless shape diagrams.

An Alternative

Let’s explore an alternative to this conundrum. If you’ve ever assembled something from Ikea, no doubt you’ve marvelled at the wordless instructions that move you from step to step. There are no multi-lingual instruction manuals with Ikea products — just one global picture booklet.

Holly Harkness explains that the model works “because Ikea builds simplicity into their products from the get-go” (The wordless manual). In other words, wordless manuals work because the products are so easy to assemble, they don’t require words.

Recently while assembling an Ikea bookshelf, my wife struggled to understand this particular Ikea picture:


Who should you call, the local Ikea store, or Ikea headquarters?

She kept calling headquarters when she should have called the local store, apparently. Upon returning the bookshelf with its defective parts to the local store, the sales clerk asked why she didn’t call the store first. My wife had called — but to headquarters, not the local store.

I doubt the wordless picture model that Ikea adopts could work for all forms of documentation, especially software documentation. But certainly Ikea shows us that it’s possible to use wordless visuals to communicate an idea.

(By the way, you can browse Ikea’s manuals online here.)

Ikea’s Secret?

One hunch I have about Ikea’s visual technique is that they use a lot of small pictures in sequence, rather than several large diagrams (see this example).

In the Ikea model, you see about 20 smaller pictures to follow, rather than one or two large diagram-like pictures. Why? Without words, you’re forced to simplify the process you’re representing. The way you simplify an image is by breaking it into smaller images.

Perhaps the way to incorporate illustrations in documents that require translation is to chunk up the illustrations into simpler images that almost anyone can follow. You may end up with more images, but as a whole the images in sequence can help tell the same story that the single image would tell. (Note: I’m referring to illustrations, not screenshots.)

The Principle of Small Multiples

My hunch about why Ikea’s technique works ties in with a principle I read about in Edward Tufte’s Envisioning Information (a classic to have on any coffee table). Tufte says one principle of visual information is to show a series of small multiples:

Small multiples, whether tabular or pictorial, move to the heart of visual reasoning–to see, distinguish, choose (even among children’s shirts). Their multiplied smallness enforces local comparisons within our eyespan, relying on an active eye to select and make contrasts rather than on bygone memories of images scattered over pages and pages (p.33).

In other words, small multiples force you to compare between the images. In that comparison, you can derive some meaning. The differences tell a story.

To illustrate, Tufte includes the following image, called Color Coordination, which has been redrawn in Tufte’s book from Yumi Takahashi and Ikuyoshi Shibukawa.

Color Coordination, an example from Edward Tufte's Envisioning Information. This image shows the idea of color coordination through small multiples.

My Own Examples

In my own documentation, I’ve started to move to wordless pictures in sequence as well. Without understanding the context at all, what do you make of the following images? What story are they trying to tell?

Layers 1

Layers 2

Layers 3

By breaking up a single image, which might have been adorned with various callouts and labels, into a sequence of simple images, with slight variations between the images to tell the story, I hopefully communicate an idea without words. These images can be used in documentation in any language, without requiring translation.

Additionally, below each image I could use captions to elaborate on the meaning of the illustration. In the case above, my images are showing the idea of a layered calendar, similar to Google’s calendar. You have multiple calendars available, and you can turn the calendars on or off to determine what events show on the main calendar view.

Unless I included a lot of labels and callouts, this image wouldn’t work as a single image. But by breaking it into a series of images (small multiples), the meaning is clear, even without words.

By Tom Johnson

I'm a technical writer working for the 41st Parameter in San Jose, California. I'm interested in topics related to technical writing, such as visual communication, API documentation, information architecture, web publishing, JavaScript, front-end design, content strategy, Jekyll, and more. Feel free to contact me with any questions.

18 thoughts on “Does Translation Mean You Should Omit Illustrations?”

  1. ted

    I actually thought the image sequence represented layers in a graphic app – I didn’t realize it was a calendar, mostly because it was displayed on a diagonal – calendars are typically straight on – but it may be harder to show the layers that way.
    When I worked for a huge international company, we outsourced all translation and the more images a doc included, the more time and money they needed to translate. This was mostly due to software screen captures – we had a lot of them early on. For the other graphics, all they required was the source files – and only if there was text within. The concept of reducing or removing all text from graphics would have definitely saved some time and money back then – but since we were in software, the majority of our images were screen captures – and they had to recreate every one. That resulted in a mandate that we reduce all images in all deliverables across the board.

    1. Tom Johnson

      Ted, thanks for your feedback. You’re right that my graphic would be a lot better straight-on rather than diagonal. I’m a hack at best in Illustrator. I may revise it based on your comment.

      There is quite a difference between illustrations and screenshots. I don’t really see how screenshots could be salvageable in a translation scenario, other than translating every one of them.

    2. Melanie Blank

      I didn’t realize it was a calendar, either! Sorry, Tom – I didn’t know what it was supposed to represent. :(

  2. John Melendez


    I think one thing that needs to be mentioned here about illustrations that need to be used for audiences of different languages is the use of numeric callouts. Numeric callouts obviate the need to translate text embedded in the graphic – much like the kind you see here:

    The method for using numeric callouts is to label the parts of the graphic as you see here:

    In a table adjacent to the figure with the numeric callouts, you define the called-out items in whatever source language you like – English, for example. When the document containing the graphic and the accompanying English-language text gets translated, there is no need to redo any text in the graphic.

    I thought this practice was common knowledge. What does everyone else think?

    1. Tom Johnson

      John, thanks for pointing this technique out. A colleague and I were discussing it yesterday, and despite the cognitive disconnect of having to refer between numbers and a reference table below, I think it’s a completely practical approach for translation. Thanks for bringing this up.

      By the way, on the topic of callouts, I did write a post about them a few months ago.

    2. Melanie Blank

      John — I agree with you, and this is the way I have been taught to handle callouts when material will be translated. Makes perfect sense. No text on the image itself. (I like this method even if translation isn’t an issue.)

    3. Val Swisher

      John – you took the words right out of my keyboard. One of our basic rules is “no callouts in the illo.” Instead, use numeric callouts that correspond to translatable text in the main body of the content.

    4. Paul K. Sholar

      1) Why would you use a screenshot containing English text in all internationalized documents, regardless of whether you are using numbered callouts to isolate the translated text?

      2) Isn’t it possible that different human-language versions of the same application have enough differences in screen layout such that they are noticeably different from a screenshot taken of the application showing English text?

  3. mike

    Couple things. Thing one: let’s not forget accessibility. Even if you provide graphics, you might be obliged, depending on your audience, to describe what the graphic shows.

    Thing two: Screenshots are popular in tutorials about using software. If you do straight-up screenshots, you almost certainly will be capturing on-screen text that would need to be localized in non-English versions of the docs. An alternative is to “conceptualize” the graphics to not show the text directly, along the lines of what you’re showing here. The downside, tho, is that creating a conceptualized version of something like a screenshot is 10x (or more) slower than using simple screenshots. This then becomes a tradeoff. Do you reduce the number of graphics that you use in order to accommodate the pace at which they can be conceptualized (let’s assume that 10x factor)? And if you do, are you creating a less-than-optimal experience for your E1 readers (that is, using fewer graphics than you want) to accommodate the process overhead of using those same graphics for E2+ readers? (The other alternative, of course, is to leave it up to the localizers to re-shoot the screenshots in the target language, which among other things means that they have to be running, live, the same version of the same software in order to capture the screenshots.)

    1. Tom Johnson

      Man, I didn’t even think about the time sink involved in creating conceptualized versions of graphics. You’re right — a screenshot might actually be a lot easier to create and maintain. Thanks for pointing that out.

      1. mike

        We’ve been arguing about that tradeoff for years around here. :-) (Part of a larger discussion about the tradeoffs — perceived or otherwise — to be made to accommodate eventual localization and/or E2+ readers.)

        1. Tom Johnson

          I haven’t heard the E1 and E2 designations for readers. What do they stand for?

          It seems a little silly to worry about translating into 10 languages when 85% of my readers speak English, Spanish, or Portuguese. It makes sense to me to create a deliverable (such as a series of videos) in English first to win that audience. Then as the product matures and stabilizes, translate the videos into the other main languages. Those other 7 languages — Russian, Italian, Korean, Japanese, etc., would just have to be satisfied with text, recognizing that they don’t comprise the majority. But is that ethnocentric and offensive?

          1. mike

            E1 = English as first language, E2 = English as second language, etc. I got this from Edmond Weiss. (See

            How much you have to worry about localization and non-English readers depends on your product. Over 50% of our audience is in non-English-speaking countries. We routinely sim-ship products in English + 8 core non-English languages. (We also provide documentation in additional languages using other — e.g. community-assisted — channels.) So obviously we are obliged to worry a lot about the impact on localization of decisions about documentation.

            Your situation sounds quite different, so your priorities and approaches are obviously also different. I don’t think there’s a universal answer here. As I was noting originally, our particular concerns have influenced our discussion of how to add art to documentation. I actually like the approach(es) discussed here — “conceptualized” art, even for screen shots — which we have worked with quite a bit in the past. (You can see some examples here: However, as noted, the process overhead really slows down the production of art for screenshot-heavy documentation like tutorials, so we’ve had to think hard about what to do. There certainly is an argument, often heard, that localization can take care of itself and we should do what’s best for the E1 reader. However, see above. On the other hand. Etc. :-)

          2. Tom Johnson

            Thanks for pointing me to the examples of the conceptualized art. That’s really interesting. I can see how there would be a lot of overhead for doing that.

            Also, thanks for the link to the source for the E1 and E2 terms.

            By the way, I didn’t know you worked at Microsoft. Do you ever work with Harry Miller?
