How Search Can Help You

How Search Can Help You Understand Your Audience: "It seems to me that many of the metrics with which we measure user interaction with the web are deeply flawed, and provide ample evidence that the internet was invented by physicists and technologists, not marketing and advertising executives."

# Jun 7, 2003

The structure of content and metadata

I was explaining what a database is to my gf the other day (she got it in about 90 seconds), and I've been thinking about structured content a lot lately. Here are some thoughts (some extremely basic).

Unstructured content.
Anything you type in your word processor program is structured content, unless you are one of those thousand monkeys typing for a thousand years (and even then). It is structured because you write sentences, paragraphs, you draw relationships in your head between things you write and assign meaning. The problem is that the computer doesn't know that - it doesn't know what is the title of the piece for example - so it can't do much with this structure in your (or the readers') head. That's why we call this unstructured content: it is unstructured for the computer. Most of the following structures are attempts to structure things for computers. None of these structures (even ontologies) can capture all the finesse and subtleties of the structures in our heads (most structure exists in our heads, not in the world out there).

Metadata.
Not just data about data, metadata is really defined by its use. If you use it like metadata, it is metadata. Sometimes data can be metadata in some circumstances, and plain data in other circumstances.

An ordered list
An ordered list consists of elements in a certain order. Separating things (into elements) is useful for the computer, because now it knows there is more than one thing, not just a big Word file, and it can start doing nice things for you, like sorting these elements alphabetically. The computer remembers the order you put things in even though it doesn't understand that order (it's not anything logical like an alphabetical ordering), because the order may have meaning to you (the most important things are at the top).
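A minimal Python sketch of the idea, with made-up list items: the computer keeps the order you typed without knowing why, and can sort the elements because they are separate things.

```python
# Hypothetical list items, typed with the most important one first.
items = ["pay rent", "call the landlord", "buy apples"]

# The list keeps the order you entered, whatever that order means to you.
most_important = items[0]

# Because the items are separate elements, the computer can also sort them.
alphabetical = sorted(items)
```

Sorting returns a new list here, so the original (meaningful to you) order is not lost.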

Databases.
A relational database is different from a list, not only in that you can manipulate things (like sorting them alphabetically) more easily than if you were to just type a list in a word processor, but also in that things are related to each other (i.e., a person is related to an address, or a product is related to a price, so you can list all products of a certain price, for example).
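A rough illustration using Python's built-in sqlite3 module. The table names, column names and rows are all invented for this sketch; it only shows a person related to an address, and products looked up by price.

```python
import sqlite3

# An in-memory database with hypothetical tables and rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE addresses (person_id INTEGER REFERENCES people(id), city TEXT);
    CREATE TABLE products (name TEXT, price_cents INTEGER);
    INSERT INTO people VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO addresses VALUES (1, 'Brussels'), (2, 'New York');
    INSERT INTO products VALUES ('notebook', 400), ('pen', 150), ('pencil', 150);
""")

# A person is related to an address: follow the relation with a join.
people_with_city = conn.execute(
    "SELECT people.name, addresses.city FROM people "
    "JOIN addresses ON addresses.person_id = people.id "
    "ORDER BY people.name"
).fetchall()

# A product is related to a price: list all products of a certain price.
same_price = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM products WHERE price_cents = ? ORDER BY name", (150,)
    )
]
```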

XML
An XML document also has structure, but isn't particularly relational. Imagine an article with a title, an author, a header, an introduction and the main body (divided in paragraphs). If you put tags (much like HTML tags) around all of those (and follow some rules), you have XML. XML is great at structuring content (and even better at exchanging stuff between applications), but not particularly good at relating content like a relational database does. Structure and relationships are the two basic elements we are discussing: they are different things. Get your head around them.
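As a sketch, here is what the article example could look like, read with Python's standard xml.etree.ElementTree module. The tag names and the text inside them are assumptions made up for the illustration, not any standard vocabulary.

```python
import xml.etree.ElementTree as ET

# A hypothetical article; every tag name here is invented for the example.
doc = """<article>
  <title>The structure of content and metadata</title>
  <author>Example Author</author>
  <introduction>Some thoughts on structure.</introduction>
  <body>
    <p>First paragraph.</p>
    <p>Second paragraph.</p>
  </body>
</article>"""

article = ET.fromstring(doc)

# Now the computer does know which part is the title.
title = article.findtext("title")
paragraphs = [p.text for p in article.find("body").findall("p")]
```

The tags give the computer the structure (title, author, paragraphs), but nothing in the document relates this article to other things the way a database row would.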

Ontologies
Ontologies really expand the relational model: not only are things related in complex ways, they are related in different ways: there is more than one type of relationship. You don't just draw an arrow between Peter and information architecture, you say: "Peter has as profession information architecture". Once you build a complex model like this, complex programs can take advantage of that information, and, for example, if you are looking for information architects, the program would know that Peter is the person you should talk to. The problem with ontologies is that they are so darn, well, complex. They are hard to get your head around, hard to create and especially hard to write programs for because they are so flexible. Many ontologies start by creating limits: someone decides to only use these relationships and these types of elements. Often this is directly related to how this information will be used in the interface, although purists say that you should create an ontology without worrying how the information will be used (I disagree).
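One common way to sketch typed relationships is as (subject, relationship, object) triples. The Python below is a toy under that assumption: the relationship names and every fact other than Peter's profession are invented, and real ontology tools do far more than this.

```python
# Each fact names its relationship type instead of being a bare arrow.
# Everything except Peter's profession is made up for the example.
triples = [
    ("Peter", "has_profession", "information architecture"),
    ("Peter", "lives_in", "New York"),
    ("Ann", "has_profession", "anthropology"),
    ("Ann", "lives_in", "New York"),
]

def subjects(relation, obj):
    """Return every subject that has the given relationship to obj."""
    return [s for (s, r, o) in triples if r == relation and o == obj]

# Looking for information architects? The program finds Peter.
architects = subjects("has_profession", "information architecture")
```

Restricting which relationship names are allowed is exactly the kind of limit mentioned above.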

Topicmaps.
A topicmap is a structure in which you can build ontologies. A topicmap provides a standard structure (topics, relationships and occurrences) and a technical environment (an XML language to express your topicmaps, a query language, tools, ...). So it is easier to build ontologies with topicmaps because a lot of the complex, hard work has been done already. A key advantage of topicmaps is that they have merging capabilities built in - a very useful feature. Topicmaps are cool, but haven't taken off in a big way yet. I believe that will change within a year or two, although the fact that topicmaps decrease lock-in effects means that adoption by vendors of corporate technology will be problematic.
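To give a feel for merging, here is a heavily simplified Python sketch. It assumes each topic is keyed by an identifier for its subject and that two topics with the same identifier describe the same thing; the dictionary layout, the identifiers and the example.org URLs are all hypothetical, and real topicmap merging rules are richer than this.

```python
def merge(map_a, map_b):
    """Merge two toy topicmaps: topics with the same subject key become one."""
    merged = {}
    for topic_map in (map_a, map_b):
        for subject, topic in topic_map.items():
            target = merged.setdefault(
                subject, {"names": set(), "occurrences": set()}
            )
            target["names"] |= topic["names"]
            target["occurrences"] |= topic["occurrences"]
    return merged

# Two hypothetical maps that both know something about the same subject.
peter = "http://example.org/subject/peter"
map_a = {peter: {"names": {"Peter"}, "occurrences": {"http://example.org/blog"}}}
map_b = {
    peter: {"names": {"P. Example"}, "occurrences": {"http://example.org/talk"}},
    "http://example.org/subject/ia": {
        "names": {"information architecture"},
        "occurrences": set(),
    },
}

merged = merge(map_a, map_b)
```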

Taxonomy.
A taxonomy is a word that is used differently by people with different backgrounds, so in this discussion I will use it in a generic way. A taxonomy is a tree-ish structure in which you can put metadata. Not the nicest of definitions, I realize :)

Topics/terms/nodes.
In a taxonomy, you have terms/topics/nodes. A topic is something that exists, but can have different words to describe it. A term is a word (or more than one word). A node is a term used by programmers to describe leaves (another term used by programmers) on the tree.

A tree taxonomy.
A tree taxonomy is the structure most websites are organized in. Every node (or leaf) except the root has exactly one parent.
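A small Python sketch of such a tree, stored as a child-to-parent mapping. The section names are invented; the point is that one parent per node gives every page exactly one path back to the home page (a breadcrumb trail, basically).

```python
# Hypothetical site sections; each node maps to its single parent.
parent = {
    "Home": None,  # the root has no parent
    "Products": "Home",
    "Laptops": "Products",
    "About": "Home",
}

def path_to_root(node):
    """Walk up the tree from a node to the root."""
    path = [node]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path

breadcrumb = path_to_root("Laptops")
```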

A polyhierarchical taxonomy.
A tree where nodes can have more than one parent. These structures are often necessary when creating large trees in which to organize things - such is the nature of classification (Yahoo is an example). Polyhierarchy means you can classify things better, so it will be easier for people to find stuff, but it is somewhat harder to implement (both in the backend code and in the interface).
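A sketch of why the backend gets harder, in Python with invented category names: each node now has a list of parents, so finding everything above a node means following several branches and making sure shared ancestors are only counted once.

```python
# Hypothetical directory categories; a node maps to a list of parents.
parents = {
    "Directory": [],
    "Arts": ["Directory"],
    "Computers": ["Directory"],
    "Computer Art": ["Arts", "Computers"],  # two parents
}

def ancestors(node):
    """Collect every category above a node, following all parents."""
    found = set()
    stack = list(parents[node])
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(parents[current])
    return found

above = ancestors("Computer Art")
```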

A faceted taxonomy.
A faceted taxonomy consists simply of multiple tree taxonomies, used together, with the rule that the individual taxonomies should be exclusive, i.e. that a topic/term in one facet cannot possibly belong to another facet. Faceted taxonomies happen to be one of the structures that are extremely useful on the web, because we have found ways to build interfaces around them that people find easy to use.
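A minimal Python sketch of how an interface might filter on facets. The recipes and the two facets (course and cuisine) are made up, and each facet is flattened to a single level here instead of being a full tree, to keep the example short.

```python
# Hypothetical items; each one gets a value in every facet.
recipes = [
    {"name": "gazpacho", "course": "starter", "cuisine": "Spanish"},
    {"name": "paella", "course": "main", "cuisine": "Spanish"},
    {"name": "lasagna", "course": "main", "cuisine": "Italian"},
]

def filter_by(items, **chosen):
    """Return the names of items that match every chosen facet value."""
    return [
        item["name"]
        for item in items
        if all(item[facet] == value for facet, value in chosen.items())
    ]

# The user picks one value in each of two facets.
spanish_mains = filter_by(recipes, course="main", cuisine="Spanish")
```

Because the facets are exclusive ("main" can never be a cuisine), the choices can be combined in any order.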

Classification.
The act of saying: "This thing belongs to this category (for example subject, or location)". Classification is subtly different from assigning properties, where you say: "This thing has this property (for example creation date)". There is some overlap between classification and assigning properties (you could say assigning an author is giving something a property, or classifying it as being written by the author).

Classification systems.
Most websites will have multiple taxonomies used for various purposes. The combination of all these is called a classification system.

A controlled vocabulary.
We are going to get subtle for a second: a controlled vocabulary isn't so much about classifying things or assigning properties. It deals with the things within the system, called terms. Terms are words (or groups of words). That's why it's called a vocabulary. A CV controls the use of terms. There are various types of CV's. A simple example is the synonym ring: Term A = Term B = Term C. As you can see, it is a simple structure that controls the use of these terms. You can make this more complex by saying: Term A is preferred (you should use that instead of the other terms). CV's are often used to improve search engines.
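Here is a small Python sketch of a synonym ring with a preferred term, and of how a search engine might use it. The terms and function names are made up for illustration.

```python
# A hypothetical synonym ring: Term A = Term B = Term C.
synonym_ring = {"couch", "sofa", "settee"}
preferred = "sofa"  # Term A is preferred

def normalize(term):
    """Replace any term in the ring with the preferred term."""
    return preferred if term in synonym_ring else term

def expand(term):
    """For search: look for every term in the ring, not just the one typed."""
    return sorted(synonym_ring) if term in synonym_ring else [term]

indexed_as = normalize("couch")
search_terms = expand("settee")
```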

A thesaurus.
An even more complex CV. A classic thesaurus has this structure: central is a preferred term, which can have a parent (a 'broader term'), siblings ('variant terms'), children ('narrower terms') and related terms. Some people think this type of thesaurus is the end-all of CV's, but you can keep expanding the types of relationships: you could define what types of related terms existed. You could add types of variant terms (acronym, Latin name (when doing species), ...). At some point, you'd realize you need the ability to keep defining different types of relationships in your model, and you would have created an ontology.
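A rough Python sketch of one thesaurus entry with the four relationship types from the description above. The terms are invented, and the query expansion at the end is just one possible use.

```python
# One hypothetical entry, keyed by its preferred term.
thesaurus = {
    "dogs": {
        "broader": ["mammals"],
        "variants": ["canines", "Canis familiaris"],
        "narrower": ["poodles", "terriers"],
        "related": ["pets"],
    },
}

def preferred_term(word):
    """Map a preferred or variant term to the preferred term, if known."""
    for term, entry in thesaurus.items():
        if word == term or word in entry["variants"]:
            return term
    return None

def expand_query(word):
    """Search on the preferred term plus its narrower terms."""
    term = preferred_term(word)
    if term is None:
        return [word]
    return [term] + thesaurus[term]["narrower"]

query = expand_query("canines")
```

Adding a fifth or sixth key to each entry (acronym, Latin name, ...) is how this structure starts drifting towards an ontology.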

Yeah baby.
If you read this far, kudos to you. There are many types of structure and relationships that we can use to design websites. Which ones you choose depends on how you are going to use them. Most structures mentioned above have been found to be useful for web development (ontologies are still rare). There is still a lot of work to be done to identify the best structures for web design, to develop interfaces for them and to develop efficient ways of populating them.

By the way, after finishing this I realized I had been inspired by Victor's excellent metadata glossary :)

# Jun 7, 2003

Powerpoint criticism is easy, but

Powerpoint criticism is easy, but what is powerpoint but a souped-up outliner? The main appeal of Powerpoint is as a presentation structuring tool - by making you think in outlines, it helps you design the structure of the presentation. Thus, Powerpoint really is an Information Architecture tool.

The fact that people then project those slides on a wall during the presentation is an unfortunate side effect, but we can't reasonably expect most presenters to have information design skills as well, so I can live with that.

# Jun 7, 2003

If I try to fix

If I try to fix my MT template for my RSS feed with the templates on the MT site, it doesn't work because I don't have the latest version of MT. If I search Google, I find nothing. Arg! Where is a basic RSS template for MT version 2.21?

# Jun 6, 2003

Enabling Dimensions:Imagine... Using the computer

Enabling Dimensions: "Imagine... Using the computer blind-folded - Typing wearing a pair of oven mittens - Navigating a website with the mouse unplugged - Web surfing with tape over your spectacles. Strange but not so strange!"

# Jun 6, 2003

Oracle is doing a hostile

Oracle is doing a hostile take-over on PeopleSoft. For those who don't know, PeopleSoft sells things like Enterprise Portals (like SAP). It's one of the few things companies are spending money on these days, and the move is towards selling complete enterprise solutions. Whoever wins in the enterprise portal market has a good handle to sell all the other enterprise systems as well (lock-in at work: integration nightmares mean IT departments are reluctant to buy products from too many different companies, even if the products from other companies are superior).

Motley Fool: "Just this past Monday, PeopleSoft announced that it would be buying smaller rival J.D. Edwards (Nasdaq: JDEC) for $1.7 billion in stock. The combination of the two companies would have catapulted them ahead of Oracle, to take over the No. 2 spot in the enterprise-applications software market. PeopleSoft had hoped to close the deal this fall. [..] Does Oracle really want PeopleSoft? Or does it just want to prevent the merger of two rivals that would surpass it once combined?"

InfoWorld: "If Oracle succeeds in its bid to acquire PeopleSoft, users are in for a rough transition, industry analysts said Friday."

IT-Director: "The normally staid world of the ERP gorillas has been thrown into turmoil this week. First PeopleSoft announced plans to acquire JD Edwards. Then Baan announced its sale and subsequent alignment with SSA technologies. Now Oracle has stated that it will be making an offer on Monday to buy PeopleSoft entirely. The only one we haven’t heard from yet this week is SAP.
[...]
Instead, Oracle states that it will neither sell PeopleSoft applications to any new customers nor integrate the product lines of the two companies, thus reducing integration risks for customers trying to tie disparate systems together. Instead, the firm is pledging to offer streamlined, automated migration paths for PeopleSoft’s customers to move over to Oracle e-Business suite over time." (Good analysis here)

# Jun 6, 2003

"Intranet Focus Ltd provides consulting

"Intranet Focus Ltd provides consulting services on intranet and extranet deployment and management. We develop content management and intranet strategies based on information audits, advise on information architecture design and implementation, and the selection of content management and search software."

# Jun 6, 2003

Anyone know good places to

Anyone know good places to look for internships/temp jobs for a beginning anthropologist (not me) in New York City? Professional organizations? Job postings of universities? Other ideas?

# Jun 6, 2003

DaveNet : New York Times

DaveNet : New York Times Archive and Weblogs: "As I understand it, the Times wishes to encourage people with weblogs to point to and comment on New York Times articles, but it also must protect sources of revenue that are not related to weblogs." So the Times online archives will now be free for links coming from weblogs with a specific querystring attached. Interesting. The question is: is there an agreement to keep these free (for x years at least)?

# Jun 6, 2003

inflight correction: "Something little observed

inflight correction: "Something little observed outside of engineering is that pre-globalization, machinery had cultural accents. You could tell if you were working on something that was American, British, French, or Japanese. Science is science, but where you have choices available for achieving an end, cultural intonations will come through."

# Jun 5, 2003

Joi Ito's Web: No-Shop Agreements:

Joi Ito's Web: No-Shop Agreements: "Basically, my point is that if you decide that you like each other and REALLY want to work together but that it will take a lot of work before the actual transaction happens, a no-shop allows both parties to focus on building the business. It's like an agreement that after two people are engaged, you both don't date anymore."

# Jun 5, 2003

I was having a drink

I was having a drink with my gf in Williamsburg last Sunday when I was visiting a friend, and I saw a local magazine with an intriguing cover. Looking closer, it looked darn much like a social network graph on that cover. Hand-drawn. I checked the inside and all it said was "Mark Lombardi". My friend came to the bar a bit later and I showed him, and he told me the Mark Lombardi story. Turns out the guy was doing lots of these diagrams in the 80s. Now they are considered art. He was doing a lot of them about politicians and arms dealers and such, so the CIA reportedly kept watch on him. Then one day, they found him hanging dead in his apartment in Williamsburg. Here are some pictures: first one. Second one.

# Jun 2, 2003

(Dutch) Finally! AIfIA in

(Dutch) Finally! AIfIA in Dutch. Information architecture information in Dutch. Send this link to your friends.

# Jun 1, 2003

Amazon.com: Sponsored Links: Amazon now

Amazon.com: Sponsored Links: Amazon now sells sponsored links in the "People who are interested in this book may also be interested in ...". I'd love to hear experiences on how effective this is.

# May 31, 2003

BitTorrent looks like a good

BitTorrent looks like a good solution for making large files available (it makes each downloader a P2P server for others to download the file), but here's my question: it doesn't seem to solve the problem of, say, 1 download an hour of 50 Meg. That download will come from your server, and you'll still build up quite a large bandwidth bill over time. It only works when more people download stuff concurrently, because the BitTorrent client will be open on their computer and P2P will start working. Is this interpretation correct?
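To put a number on that worry, here is the back-of-the-envelope arithmetic as a Python sketch. It assumes the worst case described above (every download is served entirely by your own server because no two downloaders overlap), a 30-day month, and 1 GB = 1000 MB; those assumptions are mine, not measurements.

```python
# Worst-case assumptions: no peers overlap, so the server sends every byte.
file_size_mb = 50
downloads_per_hour = 1
hours_per_month = 24 * 30  # assuming a 30-day month

monthly_mb = file_size_mb * downloads_per_hour * hours_per_month
monthly_gb = monthly_mb / 1000  # assuming 1 GB = 1000 MB
```

Under those assumptions the server would push about 36 GB a month for this one file.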

# May 31, 2003

Dan Saffer is pursuing a

Dan Saffer is pursuing a Master's degree in Interaction Design at Carnegie Mellon University and will be keeping a blog about the experience. Should be good for the ones amongst us who would like to go back to school but can't gather the courage to make the move.

# May 31, 2003

The IWIPS conference (2003, 17-19

The IWIPS conference (2003, 17-19 of July in Berlin) this year looks not so interesting: the same talks that have been given there (Dray Associates, Aaron Marcus) for years - the same preoccupation with Hofstede. But this one may be worth the trip: "Guidelines have become an established aid to the development of usable user interfaces. In this paper we examine the validity of guidelines across cultures, suggesting that they are specific to the culture in which they were developed. We go on to suggest that the ability of Design Patterns to encapsulate context, and give examples of solutions that have proven to be successful in that context, may be a more effective aid to the design of culturally localised software."

# May 31, 2003

Check it out and do

Check it out and do some home redecorating.

# May 30, 2003

I visited Belgium last week

I visited Belgium last week and gave a talk (that's me and Peter Bogaards in them pictures) about Information Architecture at the Belgium chapter of the Society For Technical Communication.

I found out that IA in Belgium (and most of Europe) stands nowhere. The UK is ok. Holland seems to have a bit of IA going on - they have an information design tradition to build on. But Belgium has nothing - this was the first event discussing IA in Belgium I was told. (!) It may be because design is taught and perceived as an art in Belgium, in art academies. There seems to be little understanding of design as having anything to do with research, or as an analytical activity.

That's too bad. Belgium hosts much of Europe's institutions, and they sure could use some IA. At the talk, there was lots of interest from decision makers - managers from various levels obviously struggle with IA problems, and seem to have a feeling that this "IA" thing might have some answers.

There is also almost no user centered design in Belgium. I spoke with Vero Vanden Abeele who turns out to be the only person teaching user centered design in Belgium. I hope we catch up. I did get some business cards from a few consultants who seem to be doing some IA-like stuff, but I have to look into that a bit more. On the pro side of it all: if I ever (not for a while!) decided to go work in Belgium, the place seems ripe for some good IA's and UCD people.

# May 28, 2003

Paper prototyping: the book (yes,

Paper prototyping: the book (yes, I'm back from my holiday).

# May 28, 2003

# May 20, 2003

I'm all for breaking new

I'm all for breaking new ground but they have got to be kidding (turn on sound). It's a usability firm!

# May 17, 2003

Globe Alive [main]: "GlobeAlive BETA

Globe Alive [main]: "GlobeAlive BETA is the first search engine to list live people as search results."

# May 17, 2003

inflight correction: "Classification is like

inflight correction: "Classification is like modelling; useful up to a point."

# May 16, 2003

IBM to deliver Information Integrator

IBM to deliver Information Integrator | CNET News.com: "Formerly called Xperanto, DB2 Information Integrator acts as a dedicated search engine for corporate information, collating data from multiple sources. Rather than having to install a huge, centralized database called a data warehouse to store that disparate information, companies can use DB2 Information Integrator to query several sources and present a consolidated result."

# May 16, 2003

Sleuthing Out Data - Emerging

Sleuthing Out Data - Emerging Technology - CIO Magazine May 1,2003: "More and more, the problems that earn CIOs their paychecks revolve around making it easier for users to explore huge volumes of data. They do this through finding known objects in huge search spaces, assembling top-down overviews that summarize the important points of a topic, and helping searchers decide what they really want when their initial search ideas are confused, misguided or ambiguous." Sounds like IA.

Even though they have a simplistic idea of categorization ("trees"), there's a good bit about the politics of searching: "It's difficult for anyone to understand who hasn't lived through it to appreciate how political categorization management is [...] We had a category nomination process. We had a category retirement process. They all required long meetings." Ouch. Categorization by committee is even worse than design by committee.

# May 16, 2003

O'Reilly Network: Information Architecture Meets

O'Reilly Network: Information Architecture Meets Usability [May. 13, 2003]: "We spoke with both Lou and Steve about the advantages of their joint seminars, the common pitfalls of web usability and information architecture, and the state of the web industry today."

# May 15, 2003

reveries - tim armstrong -

reveries - tim armstrong - google: "You know and love Google as a search engine. Tim Armstrong, its VP of Advertising, wants you to know that Google is a top media property, too."

# May 14, 2003

Even though I like my

Even though I like my current host, they are too expensive (or my sites are too popular): I'm getting additional bandwidth bills. I need to move the poorbuthappy domain. Poehosting looks good - anyone used them? Any other suggestions? My requirements are: multiple MySQL databases, PHP, over 10Gigs of bandwidth a month (and scalable) for under US$ 20. The usual (multiple emails, ...). Nice to have: Apache rewrite.

# May 12, 2003

simplelógica:creación_web: "Nos basamos en los

simplelógica:creación_web (Spanish): "We rely on web standards to build attractive, usable and effective sites."

# May 12, 2003

Jonathon Delacour: Enabling CJK language

Jonathon Delacour: Enabling CJK language support. If you add, say, Korean characters to your text and a user hasn't installed a Korean font, they will see a bunch of boxes. What is a user friendly way of helping them out? Something like: "(Korean - only see boxes?)"

- Indicate which language it is
- Have a link to a page that explains how to install the fonts

# May 11, 2003

I wanted to update my

I wanted to update my RSS feed, but after using the MT default templates I get validation error upon validation error (pubDate must be an RFC-822 date). Where can I find a valid RSS feed? (I was using 0.91 - I don't particularly care for 1.0).
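For reference, this is the date shape the validator is asking for, sketched in Python rather than as an MT template (I don't know the right MT 2.21 tags, so none are shown). Python's standard email.utils module formats dates in the RFC 2822 style, the four-digit-year successor to RFC 822; the timestamp itself is an arbitrary example.

```python
from datetime import datetime, timezone
from email.utils import format_datetime

# An arbitrary example timestamp, in UTC.
published = datetime(2003, 5, 10, 12, 0, 0, tzinfo=timezone.utc)

# Day name, day, month name, year, time, zone: the layout pubDate expects.
pub_date = format_datetime(published, usegmt=True)
rss_line = "<pubDate>" + pub_date + "</pubDate>"
```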

# May 10, 2003

Hackers and Painters: "I think

Hackers and Painters: "I think the answer to this problem, in the case of software, is a concept known to nearly all makers: the day job. This phrase began with musicians, who perform at night. More generally, it means that you have one kind of work you do for money, and another for love.

Nearly all makers have day jobs early in their careers. Painters and writers notoriously do. If you're lucky you can get a day job that's closely related to your real work. Musicians often seem to work in record stores. A hacker working on some programming language or operating system might likewise be able to get a day job using it."

# May 9, 2003

CEO For A Day: Signal

CEO For A Day: Signal vs. Noise Weblog / Blog (by 37signals): "If you were in charge of 37signals, what would you do differently? Are we focusing too much on one thing and/or not enough on another? Are we missing opportunities or mostly getting it right? How does it look from the outside? What say you?"

# May 9, 2003

Recent post on a usability

Recent post on a usability list, about security when entering username/passwords: "Actually, in our limited testing so far, any user data entry errors have been immediately and easily resolved by the user without help (other than the error message). What I see happen in testing is: (for example) a user enters a wrong number or password, they read the "error" message explaining the entered information was incorrect, the user re-enters correct information (carefully) and gains entry."

Unless you are working from a Windows laptop that was previously connected to an extra keyboard and are now typing from the laptop keyboard (a typical scenario when logging in after taking your laptop away from the base station). In that case, you have to press a well hidden key combination or your keyboard will not function correctly: certain letters will show up as numbers. I was logged out from our network like this, on a Saturday. Worse than the caps lock key. One idea: if people mistype their password, give them an additional textfield to check their keyboard entry with their password, and explain how to fix keyboard problems. Like "Using a laptop? Type your password here (it will show up on the screen) to check your keyboard settings - they may change if you have unplugged an external keyboard recently or pressed the CAPS-LOCK key." (and include how to fix it) This may be overkill though - I have no idea.

# May 9, 2003

Fifteen Tips for Remote Collaboration

Fifteen Tips for Remote Collaboration (via IASlash). I have found that working remotely with people you know well (i.e. people you have worked with in person) works well. Working remotely with people you don't know that well is a lot harder.

# May 9, 2003

Revenge of the Nerds: "Let

Revenge of the Nerds: "Let me start by admitting that I don't know much about ICAD. I do know that it's written in Lisp, and in fact includes Lisp, in the sense that it lets users create and run Lisp programs.

It's fairly common for programs written in Lisp to include Lisp. Emacs does, and so does Yahoo Store. But if you think about it, that's kind of strange. How many programs written in C include C, in the sense that the user actually runs the C compiler while he's using the application? I can't think of any, unless you count Unix as an application. We're only a minute into this talk and already Lisp is looking kind of unusual."

# May 8, 2003

A new Amazon patent application,

A new Amazon patent application, filed in May 2002, but made public on Thursday, would cover a system that allows people to preorder a used item from an unspecified seller when that item isn't yet offered by anyone else on the site.

# May 8, 2003

Like many people, I am

Like many people, I am continually amazed by the obesity of people in the US, and coming up with theories to explain it is kinda fun. I have two theories so far.

One: individualism. The US is an extremely individualistic society, and it shows in the way they consume food. People here customarily order things like "a sandwich on this type of bread with extra that without that and with some that and something else on top, and can I have a water of this brand with that, oh, and make that sandwich with this type of mayo".

It's baffling. I often have trouble when ordering food because I can't do that. Kids are taught that way as well: they get to order whatever they want - it is the American way. Individualism also means family meals are probably less popular here than in, say, Europe.

Second theory: the US is the country of the big. Portions are huge. Third theory: processed foods are cheaper than unprocessed foods.

(ok so I have more than 2 theories)

Fourth theory: infrastructure. The US is the country of the car. In Holland, almost everywhere next to the car lane, there is a separate lane for bicycles, and then one for people who walk. Here, it is hard to walk anywhere because it always feels like you're walking on the highway (no lanes for bicycles). Kids get brought to school in cars or buses in most areas.

I'm probably wrong or badly informed on many of these things, so feel free to correct me.

# May 8, 2003

Fascinating: MELISSA BATESON: "Life is

Fascinating: MELISSA BATESON: "Life is filled with choices: a hungry starling has to decide which field to forage in, a peahen has to choose between the various magnificent peacocks she encounters on a lek and we have to choose which brands to buy every time we visit the supermarket. I am interested in how both animals and humans make decisions between alternative options, especially when the options on offer differ in more than one attribute."

# May 7, 2003

Alphagalileo: "AlphaGalileo is the fast

Alphagalileo: "AlphaGalileo is the fast effective way to get news to journalists around the world. AlphaGalileo provides instant access to news, images, background information and a database of experts."

# May 7, 2003

Edge: WHY DO SOME SOCIETIES

Edge: WHY DO SOME SOCIETIES MAKE DISASTROUS DECISIONS?: "My UCLA undergraduates, and Joseph Tainter as well, have identified a very surprising question; namely, failures of group decision-making on the part of whole societies, or governments, or smaller groups, or businesses, or university academic departments."

# May 7, 2003