Racial and ethnic classifications as an example of classification challenges
If you want to learn about some of the specific challenges in developing taxonomies, have a look at the racial and ethnic classifications used in the US census. The development, evolution and discussions around this taxonomy highlight many of the problems you can encounter on a smaller scale when developing taxonomies for websites. These problems are inherent to what it means for us to classify. There's no way around them.
In October 1997, the Office of Management and Budget (in the USA) announced revised standards for federal data on race and ethnicity. The taxonomy is as follows:
Please choose your race (one or more):
- American Indian or Alaska Native
- Asian
- Black or African American
- Native Hawaiian or Other Pacific Islander
- White
- Some Other Race
Please choose your ethnicity (only one):
- Hispanic
- Non-Hispanic
Hispanics can be of any race (so you can choose Black and Hispanic). The Some Other Race category was introduced in the Census 2000 questionnaires and was not originally part of the standard taxonomy.
One could write a book about this taxonomy. I'll try to keep this short and funky.
In 1977, the taxonomy was like this:
Please choose your race (only one):
- White
- Black
- American Indian and Alaskan Native
- Asian and Pacific Islander
Please choose your ethnicity:
- Hispanic
- Non-Hispanic
Back then, the racial categories were considered scientifically valid and mutually exclusive. Obviously, things have changed since.
In 1990, "Other Race" was added, but the biggest change came with the 1997 revision: people can now choose more than one race. In the 1990 census, half a million people ignored the instructions and checked more than one box, so something had to be done. Imagine being a child whose parents are of different races and being told to pick just one.
One result is that data from the 1990 census cannot easily be compared with data from the 2000 census. This is nothing new. Almost every census for the past 200 years has collected racial data differently from the one before it, and extracting racial trends is deeply problematic.
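To make the comparison problem concrete, here is a minimal sketch in Python. It tries to count answers given under the newer multiple-choice scheme using the 1977 categories. The mapping is my own reading of the two lists above, and `roll_up` and the sample responses are invented for illustration; this is not how any statistical agency actually bridges the data.

```python
from collections import Counter

# Assumed mapping from each newer category to its closest 1977 category.
# "Some Other Race" is deliberately absent: the 1977 list has no counterpart.
NEW_TO_OLD = {
    "White": "White",
    "Black or African American": "Black",
    "American Indian or Alaska Native": "American Indian and Alaskan Native",
    "Asian": "Asian and Pacific Islander",
    "Native Hawaiian or Other Pacific Islander": "Asian and Pacific Islander",
}

def roll_up(responses):
    """Count each response under a 1977 category where that is possible."""
    counts = Counter()
    for selected in responses:
        old = {NEW_TO_OLD.get(category) for category in selected}
        if len(old) == 1 and None not in old:
            # Every box ticked maps to the same old category.
            counts[old.pop()] += 1
        else:
            # Several old categories, or a category with no old equivalent.
            counts["(not comparable)"] += 1
    return counts

# Invented responses, for illustration only.
responses = [
    {"Asian"},
    {"Native Hawaiian or Other Pacific Islander"},
    {"White", "Black or African American"},
    {"Some Other Race"},
]
print(roll_up(responses))
```

Two of the four invented answers have no home in the older scheme, and the two that do have lost the distinction the newer scheme was designed to capture. Any rule for forcing the leftovers into a single category would be a policy decision, not a technical one.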
Change in taxonomies is something we need to prepare for. It means we will not always be able to effectively compare data over time. It also means we should avoid building the taxonomies we expect to change (and most will) too deeply into the infrastructure of our websites (say, URLs or database schemas).
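One way to keep a taxonomy out of the infrastructure is sketched below, using Python's built-in sqlite3 module. Categories live in their own table, items link to them through a join table (so an item can sit in more than one category), and URLs carry only the item's id. The table names, the `/items/` URL pattern and the sample data are all hypothetical; treat this as one possible layout, not the right one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE categories (id INTEGER PRIMARY KEY, label TEXT NOT NULL);
    -- Many-to-many: an item may sit in several categories at once.
    CREATE TABLE item_categories (
        item_id INTEGER NOT NULL REFERENCES items(id),
        category_id INTEGER NOT NULL REFERENCES categories(id),
        PRIMARY KEY (item_id, category_id)
    );
""")

def item_url(item_id):
    # The URL carries only the item's id, never a category path.
    return f"/items/{item_id}"

def labels_for(item_id):
    # Look up the current labels of every category the item belongs to.
    rows = conn.execute(
        "SELECT c.label FROM categories c"
        " JOIN item_categories ic ON ic.category_id = c.id"
        " WHERE ic.item_id = ? ORDER BY c.label",
        (item_id,),
    )
    return [label for (label,) in rows]

# Hypothetical intranet content.
conn.execute("INSERT INTO items VALUES (1, 'Expense policy')")
conn.execute("INSERT INTO categories VALUES (10, 'HR'), (11, 'Finance')")
conn.execute("INSERT INTO item_categories VALUES (1, 10), (1, 11)")

url_before = item_url(1)
# Relabelling a category touches one row; no URL or item row changes.
conn.execute("UPDATE categories SET label = 'People' WHERE id = 10")
assert item_url(1) == url_before
print(labels_for(1))
```

Had the URL been something like /hr/expense-policy, the same relabelling would have broken every link pointing at the page.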
Of course, a racial taxonomy is deeply suspect. Scientists these days generally agree that race and ethnicity are social constructions. Humans cannot be categorized in a taxonomy of races based on biological information in a scientifically valid way. However, race continues to be a social reality in the US. It is this social reality that the taxonomy is trying to capture. Since the social reality changes, the categories will continue to change. Are you recognizing any of this in your own work yet?
This is one reason why people are asked to self-categorize. In the past, census enumerators were instructed to report a person's race based on observation - you can imagine the problems.
Self-categorization of course has many problems of its own: people may perceive their choice of race to have some influence on their future (job availability), which can affect their choice. And many people have only limited awareness of their own genealogy, so they may not know what race they are supposed to be.
The race categorizations are heavily discussed and disputed every time they are changed. Many political groups argue for or against certain changes in the taxonomy.
The reason is simple: the categories have an impact on policy. If a certain group isn't categorized in the taxonomy, they can't be easily measured, and it becomes much harder to lobby for certain changes that should benefit that group. For example, for the 2000 census many advocacy groups for racial minorities encouraged multiracial people to check only a single race (the minority race). Classification is political, and if you've ever worked for a large company trying to implement an intranet, you'll recognize this.
There is much (much!) more to say, and I feel bad for only touching briefly on such a fascinating topic, so here's some bedtime reading to get you started:
- Recommendations from the Interagency Committee for the Review of the Racial and Ethnic Standards to the Office of Management and Budget Concerning Changes to the Standards for the Classification of Federal Data on Race and Ethnicity.
- Racial and Ethnic Classifications Used in Census 2000 and Beyond
- Using the New Racial Categories in the 2000 Census