Translating taxonomies and categories

So as I mentioned, Livia, Jorge and I have been looking into international information architecture. What happens when you run a site in multiple languages or locales and need to manage the information architecture of that site? Can you just translate a taxonomy from one language to another? We are gathering a lot of material, and we'll start sharing it and opening up the conversation. Me, I plan to write a series of blog posts on international or global IA, of which this is the first:
  1. Translating taxonomies and categories
  2. Translating categories, translating terms
  3. Translating the Dewey Decimal Classification system
  4. Designing the relationship between content and locales
  5. Emergent i18n effects in folksonomies
  6. The Maori versus Dewey, and why limiting access can be culturally appropriate.
Let me move this along. I'd like to talk a bit about the translatability of categories today. There is much more to come later. This is a long and rambling post, so bear with me.

Say I run a website with recipes. We have an extensive soup section. Now, the word "soup" isn't just a word; it is also a category, in which you can classify many particular soups, for example the soup you ate today (a specific instance of a soup). "Tomato soup" is also a category; on our site it's a subcategory of "soups". So you can see there is a difference between a word and a category. A category might be "everything about this company", in which we can put all the information about a company, and it has a label, "About us", consisting of two words.

Information architects like to group things together (into categories) to make them easy to find. We've done research with our users, and it turns out many of them would like, on a recipe site, to browse various chunky soups. Chunky soups are soups with big bits floating in them. Some people like chunks in their soup - I certainly do, most of the time. In the US, chunky soups is a well-known category (ask anyone what it is). So we'd like a link on our site saying "chunky soups", under which you can then find various types and examples of those soups.

First, we have a labeling problem in English: "chunky soups" is actually a trademark of Campbell (a big soup maker). You're legally not allowed to use it. But we strike a deal with them and they let us use it. So "chunky soups" is a category, and even better, a category (and a label) that our users understand. Some particular soups will be part of the category, others won't. We start using it on our site, and our users find what they need. Peachy.

Then we start developing a Spanish version of our site. And a French one, at the same time. We want to get into these markets.
Can we just translate the category "chunky soups", and if so, can we put the same soups in the translated category? Is the category even relevant to Spanish users? I asked some Colombian, Spanish-speaking friends (a wholly unscientific survey) how they classify soups. They said a soup is either a "sopa", a "caldo" or a "crema". Chunky soup as a category doesn't seem relevant for this user group.

Let's be clear: it's not that chunky soups don't exist in Colombia. Colombians have chunks in their soups, big ones; I've tried them. It's that the category "chunky soups" just isn't used in daily life and isn't relevant. In practice, it doesn't exist. I don't think our Spanish-speaking users (if my little survey extends to all Spanish speakers) will look for chunky soups on our website.

Let me be even clearer. I am not talking about dictionary definitions here. I am not trying to find out what the "real" meaning of chunky soup is, or what the real meaning of a "caldo" is. I ask my users: the way they classify things is what matters. Looking it up in a dictionary only helps so much. Asking a chef for the "correct" translation is problematic too. You want the category used and understood by users, not by a domain expert like a chef.

So we seem to have an example of a category in one language or culture that doesn't really exist (or isn't useful) in another language or culture.

A similar example is "chowder", an English category of soup for which I honestly don't know a Dutch equivalent. Let me stress this: it is not just that I don't know the translation of the word. I don't know of the existence of the category in Dutch. I've never heard anyone mention anything like a "chowder" soup. And I like soup! A "clam chowder", for example, would as far as I know just be a "clam soup" (translated) in Dutch. No chowder involved. Through Google, I find this definition for chowder: "A thick American soup made of meat or fish and vegetables with spices. It is almost like a stew."
Note how they explain the term to English speakers from other cultures by mentioning it is "almost like a stew".

A second issue with translating categories is something one might call "semantic overlap". I didn't invent the term; it seems to be well known in discussions of language and words (although I am still searching for a definition). The only difference here is that I am talking about categories, not words.

A category in one language might have an applicable translation, but the translated category often doesn't mean exactly the same thing. For example, the Spanish word for "house" is "casa". But the meaning of the category "casa" might not be 100% identical to the category "house". It is conceivable that, if you ask Spanish speakers to point out "casas" in a city, they'll at some point indicate a building that you, as an English speaker, would never classify as a "house". (I am not sure this is a valid example. My friends kick me when I start asking "what kinds of X exist in Spanish" again, so I haven't actually tested this. Better examples are welcome.)

In other words, categories in different languages often don't mean 100% the same thing. And that missing overlap can create problems for our website. If we place products in a category in one language and then translate that category, we can't automatically assume that all the same products belong in the translated category.

I am still working out examples of this stuff. I'm not even sure I'm right with all these statements. The only way I've found to get examples is to ask native speakers, so it takes some time. Comments are appreciated!

A third problem with translating categories lies in the relationships between categories. Categories are often grouped in taxonomies, in trees (with varying structures). You click "soups" first, then you get subcategories like "vegetable soups" or "meat soups" or whatever.
I am not sure that you can always assume that every category in the taxonomy can be translated. Some languages might have less granularity in how they classify things in a certain domain. (I won't mention 100 words for snow, don't worry. I don't think that is exactly what this is about.) In other words, in English you might have category A, a subcategory AB, and a subcategory of that, ABC. It is conceivable that in Spanish there is no word for AB, just for A and ABC. I haven't found an example yet, though. Again, comments appreciated!

So this is all very interesting: culture-specific categories, semantic overlap of translations, translating relationships between categories. But is it practical? Have you encountered problems like this in practice? Just because it's intellectually interesting doesn't mean this path in our research will also turn out to be particularly practical.

A final note: the translatability of categories seems to be closely related to the ambiguity of your taxonomy. In a taxonomy of countries (although Tibetans might disagree) or a taxonomy of products, there is little ambiguity, and translation should be fairly straightforward. In a subject taxonomy that helps people find stuff, there might be a lot of ambiguity, and translation might be harder. Ambiguous taxonomies are also the ones that require the most research by the information architect, so you could say that if you need a lot of research to develop a category, you'll also need to work hard to translate it.

Comments and such are very welcome. Remember, this is our thinking in the very early stages. Also, the soup example I used is just an example. It may not even be correct.

Here are some of the other examples and thoughts I've been playing with over the last few days. Access to native speakers is crucial for this work, and it's hard work finding good examples, so if you can shoot down my examples, please do. If you can provide better ones, that's even more appreciated.
  • "Habitación" in Spanish means, pretty much, "room". But not entirely: if you ask a Spanish speaker to count the habitaciones in a house, they won't count the living room. That's a problem of semantic overlap. There are other translations for "room" in Spanish, but I don't think there is an equivalent of "habitación" in English, at least not one that's as commonly used. A funny thing happened when I was asking native speakers about this, by the way. They wouldn't hesitate to say, "There are 2 habitaciones in this house", but if I pressed on (to get all the info), they'd start doubting themselves and say, "Maybe I was mistaken." They're not. It's like usability testing: the user is right.
  • "Vaso" is a decent translation for "cup". But again, I think there are differences. I didn't have a chance to explore them much though.
  • Does the basic-levelness of a category have something to do with its translatability? You would expect a basic level category to be universal.
  • I don't think the chunky soup example is invalid just because the category was introduced by a company (or was it?). But I'd like to find better examples.
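
For readers who think in code, here is a minimal sketch of two of the problems above: semantic overlap (the "habitaciones" example) and a missing intermediate level (the A / AB / ABC example). It assumes each locale gets its own category tree with its own members, rather than one tree with translated labels. The `Category` class, the helper functions and all the item names are made up for illustration; they are not taken from any real system or survey.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Category:
    """A category in one locale: a label, its own items, and subcategories."""
    label: str
    items: set[str] = field(default_factory=set)
    children: list[Category] = field(default_factory=list)


def all_items(category: Category) -> set[str]:
    """Collect the items in a category and in all of its subcategories."""
    found = set(category.items)
    for child in category.children:
        found |= all_items(child)
    return found


def compare(first: Category, second: Category) -> dict[str, set[str]]:
    """Show where two 'translated' categories agree and disagree on members."""
    items_first, items_second = all_items(first), all_items(second)
    return {
        "shared": items_first & items_second,
        "only_first": items_first - items_second,
        "only_second": items_second - items_first,
    }


def depth(category: Category) -> int:
    """Number of levels in the tree, counting the root as level 1."""
    return 1 + max((depth(child) for child in category.children), default=0)


# Semantic overlap: hypothetical house with two bedrooms and a living room.
rooms_en = Category("rooms", {"bedroom-1", "bedroom-2", "living-room"})
rooms_es = Category("habitaciones", {"bedroom-1", "bedroom-2"})
room_diff = compare(rooms_en, rooms_es)

# Missing level: the English tree has A > AB > ABC, the Spanish one A > ABC.
tree_en = Category("A", children=[
    Category("AB", children=[Category("ABC", {"item-1", "item-2"})]),
])
tree_es = Category("A", children=[Category("ABC", {"item-1", "item-2"})])

print(room_diff["only_first"])
print(depth(tree_en), depth(tree_es))
```

The point of keeping one tree per locale is that `compare` makes the disagreement visible (here, the living room is a "room" but not a "habitación") instead of hiding it behind a translated label. Whether that is practical for a real site is exactly the open question of this post.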
# Nov 28, 2004