Given these ideas
about a search-results structure, and its relationship to a URL
namespace and a doctitle namespace, let’s plan and then
implement a multidocbase, multiengine search system. We’ll
start with the ProductAnalysis docbase that we’ve been working
on for several chapters. To that we’ll add a second data
component—an NNTP conferencing system that, we’ll
suppose, is a less formal, less structured complement to the
ProductAnalysis docbase. In its newsgroups, analysts can gather
source materials, discuss work in progress, and share email that
merits the attention of the group. In Chapter 13,
we’ll look at how to set up this kind of NNTP-based intranet
conferencing system. For now, we need only concern ourselves with the
data store: a bunch of files that begin with the headers Newsgroups:, From:, Date:, and Subject:.
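To make that data store concrete, here is a minimal Python sketch of pulling those four headers out of one stored message. It uses the standard library's email package. The sample message, the address in it, and the read_headers helper are invented for illustration, and the sketch assumes each file holds a single message in the usual header-block, blank-line, body layout.

```python
from email import message_from_string

# Hypothetical sample message: the header names come from the text,
# the values (including the address) are invented for illustration.
SAMPLE = """\
Newsgroups: analyst.sources
From: Jon Udell <udell@example.com>
Date: Mon, 11 May 1998 09:30:00 -0400
Subject: LDIF spec

The LDIF format...
"""

def read_headers(text):
    """Return the four headers the search system cares about, plus the body."""
    msg = message_from_string(text)
    return {
        "newsgroups": msg["Newsgroups"],
        "from": msg["From"],
        "date": msg["Date"],
        "subject": msg["Subject"],
        "body": msg.get_payload(),
    }

record = read_headers(SAMPLE)
print(record["newsgroups"], "|", record["subject"])
```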
To begin, let’s enumerate the abstract markers we’ll use to organize search results, and map out the relationships between these markers and each of our two docbases (see Table 8.1).
Table 8-1. Mapping Docbase Elements to Abstract Markers
| ProductAnalysis Docbase Elements | Abstract Markers | NNTP Conference Elements |
|---|---|---|
| Docbase | Type | conference |
| ProductAnalysis | Subtype | newsgroup |
| creation date | Date | date |
| analyst | Author | from |
| title | Title | subject |
| {summary} | Summary | {summary} |
| company | | |
| product | | |
Only two mappings come for free. It’s quite clear which docbase elements should map to the abstract markers DATE and AUTHOR. A third mapping, to SUMMARY, would also be straightforward, but as the curly braces are meant to indicate, that element isn’t defined as an explicit element in either docbase. That’s not an insurmountable problem, but it means we’ll have to do a bit of extra work to support this mapping.
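As a rough sketch of what those three mappings might look like in code, the Python below keeps one lookup table per source for DATE and AUTHOR and computes SUMMARY on the fly. The field names follow Table 8.1. Everything else is an assumption made for illustration: the idea that records arrive as plain dictionaries with a body key, the names MARKER_MAP, to_markers, and derive_summary, and the rule that a summary is simply the first few words of the body.

```python
# Field names follow Table 8.1. The record layout (plain dicts with a
# "body" key) and the summary rule are assumptions, not the book's design.

def derive_summary(record, limit=40):
    """Neither docbase stores a summary, so build one from the body text."""
    text = " ".join(record.get("body", "").split())  # collapse whitespace
    return text if len(text) <= limit else text[:limit].rstrip() + "..."

# The two mappings that "come for free": abstract marker -> source field.
MARKER_MAP = {
    "ProductAnalysis": {"DATE": "creation date", "AUTHOR": "analyst"},
    "conference": {"DATE": "date", "AUTHOR": "from"},
}

def to_markers(source, record):
    """Translate one source record into abstract markers."""
    markers = {marker: record[field]
               for marker, field in MARKER_MAP[source].items()}
    markers["SUMMARY"] = derive_summary(record)  # the "extra work"
    return markers
```

The point of the design is that DATE and AUTHOR are pure lookups, while SUMMARY needs a function because there is no stored field to look up.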
I’ve arranged the columns in a way that implies further mappings, but I haven’t yet finalized them. To see why not, we need to mock up a few different search-results screens and explore how the abstract markers interact with real data. Suppose the plan is this:
DATE TYPE SUBTYPE TITLE AUTHOR SUMMARY
Table 8.2 shows a search-results screen based on that plan.
Table 8-2. Search-Results Plan, Version 1
| May 1998 | | | | |
|---|---|---|---|---|
| TYPE | SUBTYPE | TITLE | AUTHOR | SUMMARY |
| PA | Microsoft | IE 5.0 | Jon Udell | Version 5 features... |
| CON | analyst.contacts | Did you try Tim | Jon Udell | Tim’s the LDAP guru... |
| PA | Netscape | Directory Server | Ben Smith | The latest rev... |
| CON | analyst.sources | LDIF spec | Jon Udell | The LDIF format... |
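One reading of this plan is that DATE acts as a group heading and the remaining markers form the columns under it. The Python sketch below follows that reading. The two rows echo the table, but the exact days are invented (the table shows only the month), and both the group_by_month helper and the newest-first ordering are assumptions.

```python
from collections import defaultdict
from datetime import date

# Rows echo the version 1 table; the days of the month are invented.
results = [
    {"DATE": date(1998, 5, 4), "TYPE": "PA", "SUBTYPE": "Microsoft",
     "TITLE": "IE 5.0", "AUTHOR": "Jon Udell",
     "SUMMARY": "Version 5 features..."},
    {"DATE": date(1998, 5, 11), "TYPE": "CON", "SUBTYPE": "analyst.sources",
     "TITLE": "LDIF spec", "AUTHOR": "Jon Udell",
     "SUMMARY": "The LDIF format..."},
]

COLUMNS = ("TYPE", "SUBTYPE", "TITLE", "AUTHOR", "SUMMARY")

def group_by_month(rows):
    """Bucket rows under a 'Month Year' heading, newest first (an assumption)."""
    groups = defaultdict(list)
    for row in sorted(rows, key=lambda r: r["DATE"], reverse=True):
        groups[row["DATE"].strftime("%B %Y")].append(row)
    return groups

for month, rows in group_by_month(results).items():
    print(month)
    for row in rows:
        print("  " + " | ".join(row[col] for col in COLUMNS))
```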
In this example, the mappings to the TYPE marker are PA, a label (or icon) that denotes a record from the ProductAnalysis docbase, and CON, which signifies a conference message. Already we’re in a bit of conceptual trouble. Recall that ProductAnalysis is a member of the Docbase family. So perhaps we should use the TYPE marker DOC, for Docbase, and demote the instance name ProductAnalysis to the SUBTYPE position. That would map more comfortably to the conference component, where the TYPE marker is CON, and newsgroup names occupy the SUBTYPE slot. Table 8.3 shows another version based on that idea:
Table 8-3. Search-Results Plan, Version 2
| May 1998 | | | | |
|---|---|---|---|---|
| TYPE | SUBTYPE | TITLE | AUTHOR | SUMMARY |
| DOC | ProductAnalysis | Microsoft, IE 5.0 | Jon Udell | Version 5 features... |
| CON | analyst.contacts | Did you try Tim | Jon Udell | Tim’s the LDAP guru... |
| DOC | ProductAnalysis | Netscape, Directory Server | Ben Smith | The latest rev... |
| CON | analyst.sources | LDIF spec | Jon Udell | The LDIF format... |
This solution maps TYPE and SUBTYPE nicely, but what’s going on with TITLE? In version 1, we ignored the TITLE element of the ProductAnalysis record and mapped COMPANY to the abstract SUBTYPE, then PRODUCT to the abstract TITLE. In version 2, we’ve again ignored the specific TITLE and mapped the COMPANY/PRODUCT cluster to TITLE. These efforts reflect a sense that, although COMPANY and PRODUCT occupy a lonely position at the bottom of column 1 in Table 8.1, lacking an obvious mapping to column 2, they are nonetheless highly salient features of their data set and therefore should play a prominent role here. For the same reason, the COMPANY/PRODUCT cluster usurps the role of TITLE in the display template for the ProductAnalysis docbase (see Example 6.8).
What if we add a second SUBTYPE and promote COMPANY to the role of SUBTYPE2? That yields the structure shown in Table 8.4.
Table 8-4. Search-Results Plan, Version 3
| May 1998 | | | | | |
|---|---|---|---|---|---|
| TYPE | SUBTYPE1 | SUBTYPE2 | TITLE | AUTHOR | SUMMARY |
| DOC | ProductAnalysis | Microsoft | IE 5.0 / New version of IE | Jon Udell | Version 5 features... |
| CON | analyst | contacts | Did you try Tim | Jon Udell | Tim’s the LDAP guru... |
| DOC | ProductAnalysis | Netscape | Directory Server / New directory server | Ben Smith | The latest rev... |
| CON | analyst | sources | LDIF spec | Jon Udell | The LDIF format... |
This is certainly a credible solution. On the ProductAnalysis side, we’ve found a home for COMPANY and kept PRODUCT in the abstract TITLE slot. What’s more, we’ve found a home for the specific TITLE as well, in a PRODUCT/TITLE cluster that maps to the abstract TITLE. On the conference side, we’ve unpacked the structured newsgroup names to create a very clean and natural SUBTYPE1/SUBTYPE2 mapping.
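Unpacking a newsgroup name this way takes very little code. Here is a minimal Python sketch, assuming we split on the first dot only. The helper name split_newsgroup is hypothetical.

```python
def split_newsgroup(name):
    """Split 'analyst.contacts' into (SUBTYPE1, SUBTYPE2) at the first dot."""
    subtype1, _, subtype2 = name.partition(".")
    return subtype1, subtype2

print(split_newsgroup("analyst.contacts"))
```

A name with no dot leaves SUBTYPE2 empty, and a name with more than two components leaves everything after the first dot in SUBTYPE2, so the fit is only as clean as the naming scheme behind it.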
Should we use version 3? It’s purely a judgement call, but even though this version arguably delivers the best and most complete mapping, I’m going to vote for a modified version 1. Here are my reasons:
Version 3’s mappings may be elegant, but they chew up a lot of precious screen real estate. In a search-results display, datatype indicators provide useful clues as to the sources of information. These clues help users decide which links to follow, but other factors weigh more heavily in those decisions. Version 1 gets to the point more quickly.
Version 3’s TYPE/SUBTYPE1/SUBTYPE2 structure seems to work out well for the two docbases we’re considering here, but two is a pretty small sample. I’m not sure that other docbases we might want to plug in will be able to comfortably fill this structure.
It’s true that version 1 fails to make a strong analogy between ProductAnalysis as a member of the Docbase family and analyst.contacts as an instance of NNTP conferencing. On the other hand, the apparently unlikely COMPANY <-> NEWSGROUP mapping seems quite strong. The SUBTYPE marker can mean “one of the set of reports about Microsoft products” or “one of the set of messages posted to analyst.contacts.” Both meanings are appropriate and useful.
This process of analogy making is subtle and elusive. It’s also vital when you’re trying to build an information system that makes the best possible use of diverse data sources. How do you learn to do it? For me, the best guide is Douglas Hofstadter’s Fluid Concepts and Creative Analogies. It’s not about information architecture at all, but Hofstadter’s insights into how we make and use analogies should be profoundly useful to every information architect.
Let’s implement version 1, then, but with a few changes suggested by our exploration of alternatives. We’ll steal version 3’s idea of using a PRODUCT/TITLE cluster to integrate the specific TITLE from the ProductAnalysis docbase. And we’ll represent TYPE using labeled icons. The icons’ pictures will indicate general types (Docbase, Conference), and their labels will denote specific types (the ProductAnalysis docbase, the analyst newsgroup). Figure 8.3 shows a refined version of the finished design.
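Before building the real thing, it may help to see the modified version 1 mapping written down as data. The Python sketch below is one hedged reading of the plan, not the finished implementation. The input field names follow Table 8.1. The two function names are hypothetical, as is the pair used to stand in for a labeled icon. Taking the conference icon's label from the first component of the newsgroup name is also an assumption.

```python
def product_analysis_row(rec):
    """Map a ProductAnalysis record to the modified version 1 markers."""
    return {
        "TYPE": ("Docbase", "ProductAnalysis"),  # (icon picture, icon label)
        "SUBTYPE": rec["company"],
        "TITLE": f'{rec["product"]} / {rec["title"]}',  # PRODUCT/TITLE cluster
        "AUTHOR": rec["analyst"],
    }

def conference_row(msg):
    """Map a conference message to the same markers."""
    return {
        # Assumption: the icon label is the first part of the newsgroup name.
        "TYPE": ("Conference", msg["newsgroup"].split(".")[0]),
        "SUBTYPE": msg["newsgroup"],
        "TITLE": msg["subject"],
        "AUTHOR": msg["from"],
    }

# Sample values taken from the tables earlier in this section.
print(product_analysis_row({"company": "Microsoft", "product": "IE 5.0",
                            "title": "New version of IE",
                            "analyst": "Jon Udell"}))
print(conference_row({"newsgroup": "analyst.sources", "subject": "LDIF spec",
                      "from": "Jon Udell"}))
```

DATE and SUMMARY are left out here because they map the same way in every version we considered.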