The tools that enable libraries to provide efficient information services are a relatively recent phenomenon. In libraries up to the mid-19th century, finding materials depended on the personal knowledge of the librarian. This is analogous to the current state of the virtual library: information is largely retrieved by asking around, putting messages on email lists and so on, rather than by using a coherent set of finding tools. However, as we are finding today with the virtual library, the mass of information grew to the point that no individual could be aware of all the information resources. Users of libraries in the 19th century found that new, systematic tools were necessary. Librarians such as Panizzi at the British Museum and Dewey in the American public library system developed the systems of cataloguing and classification that we use today. Even the centrepiece of the "traditional" library, the reference desk, is a relatively recent phenomenon, unknown until the 1890s (Rothstein 1953). As an aside, librarians' interest in technology, as evinced by WWW, is not new. Libraries were pioneering users of typewriters, photocopying and electric light.
We are now at a point where the virtual library requires tools equivalent to Dewey's classification system and Panizzi's cataloguing rules. To explore this issue, I have been examining the concept of a virtual special library: taking the special library, which serves a particular user community (normally an organisation, but also members of a specialist discipline), and translating it into the online environment. On the Internet we can see a number of virtual special libraries evolving. A Campus-Wide Information System (CWIS) is an obvious example of the virtual special library, serving a particular organisation's information needs. We are also seeing virtual special libraries for specialist disciplines, unbounded by geographic location. An example for the library and information studies (LIS) discipline is the Bulletin Board for Libraries (BUBL) in the UK. This gathers together documents of interest to the LIS community, and imposes a structured order on them. Originally BUBL was implemented in bulletin board software, but it is now run as a gopher server, with an optional WWW server gateway. BUBL has been very successful, because:
In this connection, WWW evangelists should be wary of making too much of image-based browsers such as Mosaic. Character mode browsers such as Lynx are still very effective as navigation tools and can be installed in organisations with minimal impact on bandwidth. Even where graphical browsers are being used, we need to educate users in the tricks of making economical use of bandwidth, for instance by turning off images, or by using WinMosaic's right button click to display inline images selectively. Searchers for real information don't have 30 seconds to wait while a pretty header graphic appears. Designers should be wary of creating documents that are browser specific, and should consider how a document could be used by a character mode browser.
Images can, however, be used effectively to augment information retrieval. One example in the prototype SSBUBL is the use of thumbnail images of the front pages of periodicals in the lists of contents pages. Users of traditional libraries tend to remember what was on the cover of a periodical they are attempting to retrieve, rather than the volume and issue number. The use of images can provide a simulation of the act of browsing the periodical stacks, with the advantage that the issues are on the shelves and in order!
An advantage, and a trap, of the virtual special library is that links to other resources can create the impression of a much larger resource than is actually present. For instance, the section providing contents pages of recent LIS periodicals can include NZ periodicals stored locally, but provide a relatively seamless link to BUBL's contents page service. The disadvantages are that users can be confused about the ownership of material, and that the links, location or contents of material at remote sites can change without warning.
The NZ Internet environment has some special requirements, being at the end of a long and expensive trans-Pacific link, so the balance between holding materials locally and providing links is weighted towards local holding.
The recent development of local caching for WWW has been seen as a boon, since this has the potential to greatly economise on bandwidth. However, there are information issues to be resolved, since the cached document may not be the most up-to-date. If local caching becomes more widespread on the Web, the management issues of providing useful "use-by" dates must be addressed. To date, creators of electronic documents have not been as fastidious as hard copy publishers in labelling documents with dates of publication and editions; this may be an area where we have to learn from the paper environment.
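As a rough illustration of what a "use-by" date could mean for a local cache, the following Python sketch decides whether a cached copy can still be served. It is a minimal sketch under stated assumptions, not a description of any existing caching server: the `CachedDocument` record, the `is_fresh` function and the one-day `DEFAULT_SHELF_LIFE` fallback are all hypothetical names and values introduced for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class CachedDocument:
    """A locally cached copy of a remote document (hypothetical record)."""
    url: str
    body: str
    fetched_at: datetime                # when the local copy was made
    use_by: Optional[datetime] = None   # "use-by" date from the creator, if any


# Assumed fallback: how long to trust a copy that carries no "use-by" date.
DEFAULT_SHELF_LIFE = timedelta(days=1)


def is_fresh(doc: CachedDocument, now: datetime) -> bool:
    """Return True if the cached copy may be served without re-fetching."""
    if doc.use_by is not None:
        # The creator labelled the document: honour the stated date.
        return now < doc.use_by
    # Unlabelled document: fall back on a locally chosen shelf life.
    return now - doc.fetched_at < DEFAULT_SHELF_LIFE
```

The second branch is where the management problem shows up: when a document carries no date, the cache has to guess, and any fixed shelf life will be wrong for some documents.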
The fact that WWW provides a hypertext dimension to the basic hierarchical framework does not change the nature of the structure. In most cases the hypertext aspect is peripheral and the basic structure is still a hierarchical one, if only because a too "hypertexted" model will be confusing to users.
In many cases hierarchical menus are augmented by keyword searching, either of all the documents in the store, or of header information. An area that needs to be addressed by WWW designers is the consistency of the various tools for keyword searching. In most cases users are given little guidance as to what the search language is. Searching on single terms is relatively straightforward, but users of full text retrieval in stores such as Dialog and Lexis will be aware that "real" searches require more sophisticated tools. No user of Dialog or Lexis would dream of carrying out a search without having the boolean AND to combine concepts, the boolean OR to specify alternative terms, and a suite of proximity operators to specify various levels of connection between words and phrases. However, few keyword search implementations in WWW give the user information about how, if at all, these tools have been implemented.
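To make the three kinds of operator concrete, here is a small Python sketch of AND, OR and a simple proximity operator over a positional index. It is an illustration only, and does not reproduce the query languages of Dialog, Lexis or any WWW search tool; the function names (`build_index`, `search_and`, `search_or`, `search_near`) and the sample documents are invented for the example.

```python
import re
from typing import Dict, List, Set

Index = Dict[str, Dict[str, List[int]]]


def tokenize(text: str) -> List[str]:
    """Split text into lower-case alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs: Dict[str, str]) -> Index:
    """Map each term to {document id: [word positions in that document]}."""
    index: Index = {}
    for doc_id, text in docs.items():
        for pos, term in enumerate(tokenize(text)):
            index.setdefault(term, {}).setdefault(doc_id, []).append(pos)
    return index


def postings(index: Index, term: str) -> Dict[str, List[int]]:
    """Look up a term, ignoring case; unknown terms give an empty result."""
    return index.get(term.lower(), {})


def search_and(index: Index, a: str, b: str) -> Set[str]:
    """Documents containing both terms (combining concepts)."""
    return set(postings(index, a)) & set(postings(index, b))


def search_or(index: Index, a: str, b: str) -> Set[str]:
    """Documents containing either term (alternative terms)."""
    return set(postings(index, a)) | set(postings(index, b))


def search_near(index: Index, a: str, b: str, distance: int) -> Set[str]:
    """Documents where the two terms occur within `distance` words of each other."""
    hits = set()
    for doc_id in search_and(index, a, b):
        pos_a = postings(index, a)[doc_id]
        pos_b = postings(index, b)[doc_id]
        if any(abs(i - j) <= distance for i in pos_a for j in pos_b):
            hits.add(doc_id)
    return hits


docs = {
    "d1": "Cataloguing rules for the virtual library",
    "d2": "Library classification and cataloguing",
    "d3": "Gopher servers on campus",
}
index = build_index(docs)
print(sorted(search_and(index, "cataloguing", "library")))      # ['d1', 'd2']
print(sorted(search_or(index, "gopher", "classification")))     # ['d2', 'd3']
print(sorted(search_near(index, "cataloguing", "library", 3)))  # ['d2']
```

Even in this toy form the operators behave differently enough that a user who is not told which of them a service supports cannot predict what a two-word query will return.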
Very few WWW services use a controlled vocabulary or subject heading approach, which, following the model of the traditional library, would seem to be the logical next step. BUBL has its expandable alphanumeric classification (which can be searched via the keyword search facility). However, my impression is that most users use this as one would a shelf location, for the retrieval of a document which is already known, rather than as a subject searching tool.
One reason for the lack of a controlled indexing language approach is that this approach requires people resources to maintain it. The sheer volume of information appearing in the virtual library precludes a band of indexers attaching LC subject headings to documents in it. Developments in automatic indexing may in future provide a way of economically implementing a controlled indexing language.
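A very simple form of automatic indexing can be sketched as a lookup from the words found in a document to preferred headings. The Python below is a minimal sketch under stated assumptions: the `THESAURUS` table and its headings are invented for illustration (they are not LC subject headings), and `assign_headings` is a hypothetical name. A practical system would need a far larger vocabulary and some handling of ambiguity.

```python
import re
from typing import Dict, List

# Hypothetical thesaurus: each preferred heading lists the variant terms
# that should lead to it. The headings are illustrative only.
THESAURUS: Dict[str, List[str]] = {
    "Cataloguing": ["cataloguing", "cataloging", "catalogue", "catalog"],
    "Periodicals": ["periodical", "periodicals", "journal", "journals", "serials"],
    "Electronic mail": ["email", "e-mail", "electronic mail"],
}


def assign_headings(text: str) -> List[str]:
    """Return the preferred headings whose variant terms occur in the text."""
    lowered = text.lower()
    headings = []
    for heading, variants in THESAURUS.items():
        for variant in variants:
            # Match whole words or phrases only, not fragments of longer words.
            if re.search(r"\b" + re.escape(variant) + r"\b", lowered):
                headings.append(heading)
                break
    return sorted(headings)


print(assign_headings("New journals announced by e-mail"))
# ['Electronic mail', 'Periodicals']
```

The point of the mapping is that a searcher who asks for "Periodicals" retrieves the document even though its author wrote "journals".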
It is often argued that keyword searching obviates the need for a controlled vocabulary approach. However, the pitfalls of the free text approach to retrieval in large document stores have been well documented (Blair and Maron 1985). In practice searchers may have a number of reasons for not searching on the keywords used by the author of the document. These include:
What lessons can be drawn from this for the structuring of information for other virtual special libraries? Since it seems that the hierarchical menu is likely to be the basic model for information retrieval on the Web, we need to work on conventions for making such menus as consistent as possible. We can identify basic groupings that will appear in most virtual special libraries, e.g.
Another area that needs to be drawn on is the body of classification theory developed in LIS. Classification theorists such as Ranganathan have developed fundamental classes of facets (for instance personality/ matter/ energy/ space/ time or PMEST) that could provide guidelines for the development of consistent menu structures (Rowley 1987).
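One way to picture how facets might yield consistent menus is to tag each resource with facet values and always nest the menu levels in the same facet order. The Python sketch below assumes this interpretation; it is not taken from Ranganathan or Rowley, and `FACET_ORDER`, `build_menu` and the sample resources are hypothetical.

```python
from typing import Dict, List

# The five fundamental facets, in the fixed order used for every menu.
FACET_ORDER = ["personality", "matter", "energy", "space", "time"]


def build_menu(resources: List[Dict[str, str]], facets: List[str]) -> dict:
    """Nest resource titles under the chosen facets, always in PMEST order."""
    # Whatever order the caller names the facets in, the menu levels
    # follow FACET_ORDER, so every menu is laid out the same way.
    ordered = [f for f in FACET_ORDER if f in facets]
    menu: dict = {}
    for res in resources:
        node = menu
        for facet in ordered:
            node = node.setdefault(res.get(facet, "(unspecified)"), {})
        node.setdefault("_items", []).append(res["title"])
    return menu


# Invented sample data for illustration.
resources = [
    {"title": "Sample NZ contents page", "personality": "Periodicals",
     "space": "New Zealand", "time": "1994"},
    {"title": "Sample UK contents page", "personality": "Periodicals",
     "space": "United Kingdom", "time": "1994"},
]
menu = build_menu(resources, ["space", "personality"])
print(list(menu))                 # ['Periodicals']
print(list(menu["Periodicals"]))  # ['New Zealand', 'United Kingdom']
```

Because the level order is fixed by the facet scheme rather than by each designer, two virtual special libraries built this way would present their menus in the same sequence.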
The use of keyword searching is a valid alternative to the hierarchical menu but needs to provide more sophisticated text searching tools such as boolean and proximity operators. These tools need to be accessible to the user.
Garman, Nancy (1994). User-friendly? The irony of it all. Online (July 1994) p. 8-9.
Roesler, Marina and Donald T. Hawkins (1994). Intelligent agents: software servants for an electronic information world. Online (July 1994) p. 19-32.
Rothstein, Samuel (1953). The development of the concept of reference service in American Libraries, 1850-1900. Library Quarterly 23(1):1-15.
Rowley, Jennifer E. (1987). Organising knowledge. Aldershot: Gower.
Travis, Irene (1990). Application of knowledge-based systems to classification in libraries. In: Aluri, Rao; Riggs, Donald E., eds. Expert systems in libraries. Norwood N.J.: Ablex. p. 222-293.