The underlying structure of most hypertext environments, including the World Wide Web, does not inherently provide support for enhanced guidance through an information space. Users are often unknowingly led down irrelevant or uninteresting trails, with little or no support to guide them toward more relevant material. The hypertext community has conducted extensive research on enhanced guidance, trails and tours. Tools such as Walden's Paths offer enhancements to hypertext environments by providing a form of guided tour, a statically constructed path of relevant information. We discuss a more dynamic concept, called user-centric navigation, whereby a user is guided through an information space based upon his unique combination of characteristics and preferences.
To provide for individualized navigation, we extend the functionality of a hypermedia browser with a new type of multi-ended link, called a COOL link. Similar navigational techniques have been developed in the past for intelligent tutoring systems, but these tend to be brittle and domain-dependent. Our implementation of COOL offers a flexible, domain-independent approach that can be used to enhance hypertext environments without altering their underlying structure. While our techniques can be integrated with any hypertext environment, this paper is focused strictly on the World Wide Web. We further discuss our implementation, which takes advantage of SPI, a "soft" infrastructure facilitating semantic analysis of mixed hypermedia resources.
We likewise encourage enhancements to the Web that facilitate an overall concept called user-centric navigation (UCN). In UCN, a user is automatically guided through an information hyperspace, creating an individualized trail. Our suggested enhancements use three hypertext technologies that are not yet supported by the Web: (i) guided navigation; (ii) multi-ended and ambiguous links; and (iii) widespread use of distributed characterization/annotation schemes (footnote 1). The lack of support for these areas can be perceived as a shortcoming of the Web. We discuss these perceived shortcomings first, before offering our suggested improvements in later sections.
Lack of Support for Guided Navigation: The Web has typically been used for publishing material that is relevant to a large body of people, and not for providing material that can be individualized to fit a user's particular desires, goals, and personal traits. In order for an author to successfully provide his audience with material to fit their particular needs, the author must develop special enhancements to his Web pages. Such is the case with Walden's Paths [SH95], where an intermediary information server is placed between the user and the Web to filter content and direct the user in a form of guided tour. The user of Walden's Paths is free to choose among a collection of statically constructed paths of relevant information. The author of a path must define which resources are relevant, which leaves little or no room for individualization. In contrast, we offer the UCN concept and its associated tools for providing individualized experiences directly on the Web, with limited infrastructure enhancements.
Lack of Support for Multi-ended Link Development: Web browsing tools enable a user to manually navigate through "hot" links to related resources. One drawback of this traditional style of browsing is that the user is often unaware of relevant aspects of an information space (or collection of resources) until he has successfully navigated through the space. Often a user will select a link and become surprised, contextually confused or, even worse, unintentionally "lost in cyberspace." This can be particularly problematic when the user is under time constraints, or under learning and comprehension constraints. Moreover, with this one-to-one static mapping of a hyperlink to a resource, there is no inherent way to individualize a user's navigation through the information space. UCN requires a more dynamic environment, where navigation is guided and multi-variant.
Lack of Widespread Use of Distributed Characterization/Annotation Schemes: Given the enormous amounts of information being added to the Web each day, we predict the future of Web search and navigation will hinge on successful inferences made from a semantic infrastructure. While annotation schemes (such as META tags) are available today, there is not yet enough widespread use of these schemes on which to base improved search, retrieval and individualization techniques. Today's most popular Web search engines and indexing systems employ one or more "weak" techniques. These include keyword/phrase identification, manually built ontological categorization, and limited HTML-derivable content analysis. While these engines are useful for locating and retrieving information, they are not particularly good at matching a user with individualized information, which would require extensive information analysis. An alternative is to build search engines which apply automated, sophisticated semantic/content-analysis algorithms. Traditional analysis of this sort, regardless of the medium, is notoriously difficult to automate and can be exceptionally inefficient. A scheme lying between the "weak" and more sophisticated methods could prove useful, informative and efficient toward UCN.
We refer to this type of automated guidance as user-centric navigation (UCN). More precisely, UCN is the automated guidance of an individual through an information hyperspace, creating an individualized trail by relating the user's profile to the most relevant information resources (footnote 2).
Guided Navigation: There are domains in which it is useful to go beyond the guided tours offered by Walden's Paths [SH95] and NoteCards [TR88]. We envision educational tools that provide individualized tours, where each student is automatically guided on a tour based upon their particular preferences and traits. This is the conceptual basis for UCN. Consider the case where students in the same class, yet with differing reading abilities, need to learn about the same general topic. Web resources with similar content but different instructional reading levels would be useful, especially if reachable through the same familiar means. As a student progresses, the browsing tool guides him on an appropriate path for his abilities, offering the most appropriate resources at each stop along the path. The student still controls his progress through the hyperspace, moving directly on the path or taking side excursions.
We view the UCN mode of guided navigation as a paradigm shift from the more passive modes of hypertext navigation (i.e., where static trails are authored and followed). UCN facilitates the development of a dynamic trail. Once a user initiates an information seeking activity, the UCN browsing tool automatically guides him toward relevant material. Trails would no longer be authored statically; they would be dynamically created for the individual user based upon context, user interaction and user needs. We believe that our suggested enhancements to the Web (described next) move us toward user-centric navigation, although the UCN concept is considerably broader than these enhancements alone.
Link Development (COOL Links): We propose a new style of hypermedia link, called a COOL link (footnote 3), as a means of individualizing navigation. COOL links augment the traditional hypermedia "hot" link paradigm with a dynamic runtime link that is associated with a collection of component resources. A COOL link is a special case of a multi-ended link, one in which the emphasis is placed upon the automatic selection of the "most appropriate" resource for a particular user. Clicking on a COOL link takes the user to one of the component resources specified in the link. Each of the distinct resources in a COOL link collection can provide different benefits to specific users (though each resource may contain information about the same subject or topic).
The burden of choosing a link component from the collection is placed upon the browsing tool at runtime. Various criteria for choosing a component resource can be implemented by link selection algorithms. In the educational domain, the pedagogical "appropriateness," associated grade level, or readability index of a resource in relation to a particular student's profile may weigh heavily on the choice. Likewise, in the marketing domain, the color, size, cost, or style of a particular item in relation to a consumer may affect the selection. We have developed a generic link selection mechanism and incorporated it into our Web implementation. We refer to the algorithms that perform link selection from a user's profile and a COOL link as user-centric link selection (UCLS) algorithms (footnote 4).
There are two input parameters to any UCLS algorithm: (i) a comprehensive user profile, specifying the traits and preferences of the user; and (ii) semantic descriptions of the candidate resources.
Individualization Through User Profiles: Without an algorithm to select one resource from a collection, a COOL link is little more than an ambiguous link. Yet, even with an available algorithm, there must still be enough user information available to select one resource from the collection. If a comprehensive user profile is available, the UCLS algorithms can attempt to find a best match between the candidate resources and the individual (footnote 5). We are currently testing our methods using manually developed user profiles, which are complete enough to allow the UCLS algorithms to make a selection. User profile development is not currently a major focus in our efforts, but is an important future direction.
Individualization through Semantic Descriptions of Resources: The second parameter to a UCLS algorithm offers a way to evaluate the candidate resources themselves. Through the adoption of a distributed annotation scheme, the characteristics of a resource can easily be asserted over the Web. For example, let's suppose there is an encyclopedic description of the planet Saturn published on the Web. The fact that the description is written on a fourth grade independent reading level may be useful information to a UCLS algorithm. This fact could be expressed directly on the Web using, say, the notation:
indep_reading_level : 4.0.
Similarly, suppose that an HTML file, as viewed through a browser, contains roughly 45% text (vs. images, etc.). This can be expressed using:
percentage_of_text : 0.45.
As it turns out, both of these characteristics are pedagogically useful [MHM92], [MH95].
Where do such characterizations belong? The SHOE system [LSR96] places similar kinds of semantic information directly in the HTML. There, HTML extensions include a "knowledge markup syntax" which allows authors to "use HTML to directly classify their web pages and detail their web pages' relationships and semantic attributes in a machine-readable form." Though useful, this approach has its drawbacks. HTML is a language best suited to express Web browser and hypertext display directives, rather than semantic content. Extending HTML in this way imposes a severe limitation on semantic expression, namely that it becomes the HTML author's responsibility. It is not clear that the burden of semantic expression should be shouldered by an HTML author, nor that semantic descriptions useful to a wide and varied audience are easily created by a single author. Moreover, non-HTML resources (e.g., audio and video files) cannot be characterized through SHOE-style HTML extensions. What is needed is a uniform characterization scheme that is distinct from HTML, for all Web resource types.
The use of the META tag is another approach for annotating semantic information. Annotations can refer to the HTML file containing the META tags or a separate resource. META tags are tightly coupled to the HTML specification. Using the META tag approach to annotation, semantic information is embedded directly within HTML files; even when META tags are separated from the resource being referenced, the semantic information resembles the syntax of HTML.
We have developed a third approach, called SPI (for Semantic Property Infrastructure), expressly for asserting semantic/content characterizations of resources in a shared, distributed multimedia environment. Such assertions are made in SPI files, which are the core of the infrastructure. With a SPI infrastructure, semantic characterizations are (i) divorced from presentation directives (i.e., HTML); (ii) independent of resource file type; (iii) specified by anyone, not just the resource author; and (iv) distributed anywhere on the multimedia network. The SPI file author describes a resource for use by a specific audience or user community. The best descriptions are those which can be automatically employed to help prune the search, selection and retrieval space of resources for that specific audience. If different characteristics are appropriate for another audience, more than one SPI file can be made available. Likewise, if a more specialized community wishes to use the characterizations of a particular SPI file, they can indirectly reference that SPI file and override or extend properties at their discretion. This is an example of a robust and extensible semantic architecture.
We have integrated SPI and its related tools with other (i.e., non-Web) multimedia educational environments, and found it effective. Therefore, we have also adopted SPI for our Web-based efforts toward UCN. Figure 1 depicts the partial incorporation of SPI files into the Web. Resources of any media type are depicted by the small circles. Thin, black lines represent traditional "hot" links. Solid gray lines denote pointers from a SPI file to a resource; dashed lines to a chained SPI file.
The novelty and promise of this approach is that sophisticated and complete semantic/content characterizations of a resource needn't be supplied at a single source. Instead, the work can naturally distribute itself among the (potentially) many users of a resource. Comprehensive characterizations can be built with very little effort on any one individual's behalf, with SPI files referencing other SPI files in a chain of descriptions about a Web resource. We have found SPI to be informative, efficient, and easy to integrate with the Web.
Implementing COOL Links: For demonstration purposes, our Web-based COOL links are implemented with JavaScript. With JavaScript, we can capture mouse movements over sensitive "cool spots," and then respond to user clicks (i.e., selection). A COOL link in an HTML file looks similar to a traditional "hot" link, except that it references multiple SPI files instead of one specific linkable resource. The HTML would look similar to the following:
This is a <a href="" OnMouseOver="this.href=cref(
'http://www.i-a-i.com/spifiles/saturn.spi',
'http://www.nasa.gov/planets/spifiles/rings.spi',
'http://www.stsi.gov/images/satrings.spi')">
COOL Link Example</a>.
Each of the SPI files associated with the link describes a resource appropriately for a specific audience of users. The objective is for UCLS algorithms to select the most appropriate link. The resource that is deemed the most appropriate is the one to which the user is then taken. Figure 2 depicts the integration of COOL links into the Web structure using SPI. Each collection of gray arrows stemming from a resource represents a single COOL link collection. Each COOL link component references a SPI file, depicted by rectangles. Each SPI file references, directly or indirectly, a resource that it describes.
Other features for interacting with COOL links will function as in other systems with computed links (e.g., Microcosm offers a menu of link choices [LDG96]).
Implementing SPI: A simple yet effective way to implement SPI is to add a single new file type to the Web (footnote 6). (We use the extension .spi for files of this type.) An individual SPI file can reside anywhere on the Web. A SPI file expresses, using a formal language, the semantic content of another Web resource file (e.g., an HTML file or JPEG), either directly or indirectly. The former is achieved by referencing a resource file within the SPI file; the latter by linking to another SPI file, which in turn directly or indirectly references a resource. Both are illustrated below.
An Example: Figure 3 depicts a SPI file which partially characterizes a hypothetical on-line book chapter about the planets.

Implementing User Profiles: User profile development is currently handled through standard HTML forms, interacting with a CGI process on our Web server that stores the information in a database. To use the demonstration, each user must register before navigating through the Web space. If a user fails to register, COOL links operate as ambiguous links. Our profiles are currently sparse, for ease in demonstrating the concepts. Enhancing user profile development remains a distinct and separate task in our research.
Figure 3: An example (hypothetical) .spi file, called ch8.spi, used to characterize the Web resource found at http://www.acepub.com/book12/ch8.html.
res_name : The Planets of our Solar System /* Resource name */
res_loc : http://www.acepub.com/book12/ch8.html /* Resource location */
subject : science
keywords : planets, solar system, astronomy
language : english /* Language of resource */

The file contains a list of <key> : <value> association pairs which characterize some aspects of the chapter. (This is a simplistic version of a formal SPI language, useful for illustration purposes.) The resource being described is an HTML file located at www.acepub.com/book12/ch8.html, as indicated by the res_loc (resource location) key. Included are characterizations of the book's title, subject, keywords, and language of delivery. (One can easily imagine many other useful keys, as well as a standardization of such keys. We leave the issues of key selection and standardization for future pedagogical research.)
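The key/value layout described above is simple enough to read mechanically. The following Python sketch is illustrative only and is not our SPI parser: it assumes one pair per line, /* ... */ comments, and a split at the first colon (so that URLs in values survive). The names parse_spi and CH8_SPI are hypothetical, and the text follows Figure 3 with the resource location taken from the figure caption.

```python
import re

# Matches /* ... */ comments, which the illustrative SPI notation allows.
COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)


def parse_spi(text):
    """Parse '<key> : <value>' lines into a dict, ignoring comments."""
    properties = {}
    for line in COMMENT.sub("", text).splitlines():
        # Split at the first colon only, so 'http://...' values stay intact.
        key, sep, value = line.partition(":")
        if not sep or not key.strip():
            continue  # blank line, or no key/value pair on this line
        properties[key.strip()] = value.strip()
    return properties


# Contents modeled on Figure 3 (ch8.spi).
CH8_SPI = """\
res_name : The Planets of our Solar System /* Resource name */
res_loc  : http://www.acepub.com/book12/ch8.html /* Resource location */
subject  : science
keywords : planets, solar system, astronomy
language : english /* Language of resource */
"""

ch8 = parse_spi(CH8_SPI)
print(ch8["res_loc"])
```

Values are kept as plain strings here; interpreting them (for example, splitting the keywords list or converting numbers) is left to whichever engine consumes the characterization.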
The SPI file in the figure can be used in one of two ways: (i) parsed, interpreted and evaluated as is; or (ii) as a foundation on which to base a more specific characterization of the book chapter it references.
To illustrate the second option, suppose that a school district would like to use the original SPI file, as it provides useful information, but wishes to further characterize the chapter with their particular students and/or curriculum in mind. This is done by creating a new file, depicted in Figure 4, in which the original SPI file is referenced using:
res_loc : http://www.acepub.com/book12/ch8.spi

(Notice the .spi extension.) In doing this, the school district builds upon the more general, initial characterization to include reading levels, text percentage, etc.
The process can continue, yielding increasingly complete characterizations: a school can reference this latter SPI file (and hence indirectly reference the initial file from Figure 3), and so on (footnote 7).
Figure 4: A .spi file intended to augment or modify ch8.spi, the file depicted in Figure 3.
res_loc : http://www.acepub.com/book12/ch8.spi /* Location of parent SPI file */
instr_reading_lvl : 5.3 /* Instructional reading level */
indep_reading_lvl : 6.2 /* Independent reading level */
percentage_of_text : 65 /* Percentage of text in chapter */
ped_model : exploratory /* Supported pedagogical model */
time_inclass : 15 /* In-class time req'd to read chapter */
time_online : 10 /* Online time req'd */
stdnt_role : research /* Student's role in using resource */
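One way to picture how a chained characterization might be resolved is sketched below in Python. It is a sketch under stated assumptions, not the formal SPI semantics: we assume that a res_loc ending in .spi marks an indirect reference, that properties in the referencing file override or extend those of the file it references, and that the final res_loc is that of the underlying resource. The function resolve, the in-memory STORE (standing in for files fetched from the Web), and the district URL are all hypothetical.

```python
SPI_SUFFIX = ".spi"


def resolve(location, store, _seen=None):
    """Merge the properties along an SPI chain.

    Properties in a referencing file override or extend those of the
    file it references. A repeated location is reported as a cycle.
    """
    seen = _seen if _seen is not None else []
    if location in seen:
        raise ValueError("cycle in SPI chain: " + " -> ".join(seen + [location]))
    seen.append(location)

    own = store[location]
    parent = own.get("res_loc", "")
    if parent.endswith(SPI_SUFFIX):
        # Indirect reference: start from the parent's resolved properties,
        # which already carry the res_loc of the underlying resource.
        merged = resolve(parent, store, seen)
        merged.update({k: v for k, v in own.items() if k != "res_loc"})
        return merged
    return dict(own)  # direct reference to a resource


# Hypothetical in-memory stand-in for SPI files fetched from the Web.
STORE = {
    "http://www.acepub.com/book12/ch8.spi": {
        "res_name": "The Planets of our Solar System",
        "res_loc": "http://www.acepub.com/book12/ch8.html",
        "subject": "science",
    },
    # The district's URL is invented for this example.
    "http://district.example/ch8.spi": {
        "res_loc": "http://www.acepub.com/book12/ch8.spi",
        "instr_reading_lvl": "5.3",
        "ped_model": "exploratory",
    },
}

chapter = resolve("http://district.example/ch8.spi", STORE)
print(chapter["res_loc"], chapter["instr_reading_lvl"])
```

The resolved characterization carries both the publisher's general properties and the district's additions, while still pointing at the chapter itself rather than at an intermediate SPI file.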
Implementing the UCLS Algorithms: The UCLS algorithms are implemented as C++ shared libraries, and invoked through a CGI process on our Web server. When a user clicks on a given COOL link, the JavaScript will call the CGI process with the corresponding SPI files in the link collection. The CGI process will then invoke the UCLS algorithms for the current user profile and return the most appropriate link to the client browser. Tie breaking is handled by order of appearance of SPI files in the COOL link.
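The selection step, including the tie-breaking rule, can be sketched in a few lines. The Python below is not our C++ implementation. The matching criterion (distance between a profile's reading_level and a resource's indep_reading_lvl), the fallback to the first component when no profile exists, and the names select_link and reading_level_distance are all assumptions made for illustration, as are the candidate locations and values.

```python
def reading_level_distance(profile, properties):
    """Hypothetical criterion: gap between user and resource reading level."""
    return abs(float(properties["indep_reading_lvl"]) - profile["reading_level"])


def select_link(profile, candidates, distance=reading_level_distance):
    """Return the location of the best-matching candidate.

    candidates is a list of (location, properties) pairs, in the order
    in which the SPI files appear in the COOL link.
    """
    if not candidates:
        raise ValueError("a COOL link needs at least one component")
    if profile is None:
        # Assumed default action when no user profile is available.
        return candidates[0][0]
    # min() returns the first minimal item, so ties go to the SPI file
    # that appears earlier in the link.
    best = min(candidates, key=lambda c: distance(profile, c[1]))
    return best[0]


# Illustrative candidates; locations and values are made up.
candidates = [
    ("a.spi", {"indep_reading_lvl": "4.0"}),
    ("b.spi", {"indep_reading_lvl": "6.2"}),
    ("c.spi", {"indep_reading_lvl": "6.2"}),
]

print(select_link({"reading_level": 6.0}, candidates))
```

Passing the criterion as a parameter reflects the idea that different domains would weigh different properties; a marketing criterion, for instance, could be substituted without changing the selection logic.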
The techniques, tools and implementations described in this paper help to facilitate the UCN concept, but do not fully encompass UCN. We intend to continue our research in this area, while further developing some of the previously mentioned tools.
One specific area to which we intend to devote further research is the automated collection of user preferences. We envision sophisticated methods for collecting user characteristics, such as agent-based information gatherers and intelligent reasoners based upon machine learning principles. One would assume that much can be learned through an individual's interaction with their Web browser. We will implement less sophisticated methods in the near future.
We will also determine the proper technology for applying distributed annotations to resources. If our proposed SPI architecture is amenable, then we will need to further develop and standardize the tools associated with SPI. Already under development is an editor for asserting specific domain-dependent characteristics about a resource. This editor is currently being used to produce SPI files, but could easily be used to produce information adhering to other standards (e.g., META tags).
We believe that annotation schemes such as SPI are intended to be used by agents and engines designed to interpret resource characterizations. Their power derives in large measure from the expressibility of the implementation language and from the soundness of the interpretive engines and agents. Two issues therefore arise: (i) the efficiency with which annotation files can be linked and conjoined to build more complete characterizations; and (ii) the sophistication used to interpret those characterizations. We intend to investigate these issues.
If we prove the utility of COOL links, we will then seek to incorporate related link types in HTML, providing a more elegant implementation that is simple for HTML authors to use.
Under consideration are new ideas toward UCN. One in particular has to do with the automatic composition of disparate resources into coherent presentations. We call this concept Dynamic Multimedia Composition (DMC). The author of a DMC presentation describes each of the desired components without actually specifying the distinct components themselves (i.e., the actual URLs of the components). When a given user accesses the presentation while browsing, each of the components is selected at runtime, based upon the user's preferences and the author's component descriptions. This type of environment would be a significant leap forward in individualized instruction and requires advanced authoring tools.
The continued success of the Web and other large information hyperspaces is contingent upon automated tools that efficiently guide the information gatherer toward relevant and appropriate material. In this paper, we have briefly described our techniques toward developing individualized experiences over the Web, as part of a larger concept called UCN. UCN is founded upon previous hypertext research, yet the dynamic individualization of a hypermedia trail utilizing COOL links is a novel approach toward enhanced Web navigation.
1 A variety of related topics have been considered by the World Wide Web Consortium in the past [http://www.w3c.org/pub/WWW/DesignIssues/], although we offer further reasoning toward the utility of these techniques.
2 Others [BU45, TR88, ZE88] have laid the far-sighted foundation for our conceptual development. We offer our own description here for clarity and to show implementational movement toward these goals.
3 The term COOL is an acronym for "Collection Of Objects Link."
4 We've found the UCLS algorithms to be particularly suitable for electronic courseware, where COOL links are used within scholastic material to individualize instruction. Although most of our research has been in the educational domain, we believe that there are a number of UCLS algorithms that have the potential to be useful in the Web community in general.
5 Without the availability of a user profile, a COOL link reduces to an ambiguous link. In this case, there is no mechanism by which the UCLS algorithm can select a link, other than at random, or through some other default action. Therefore, the issue of automated user profile development is a large and important one, especially since common practice is to ignore or abandon manual entry of user profiles after first time entry.
6 As noted earlier, our SPI implementation can be replaced by other classification/annotation technologies including SHOE-style HTML extensions, and META tags, either alone or in combination. We chose to use SPI for its clarity and ease of implementation, as well as its ease in integration with our other tools.
7 The formal details of SPI, including (i) SPI file consistency checking and cycle detection, especially with linked SPI files, and (ii) parsing and interpreting SPI files, are left out for brevity.
This research was supported, in part, by DARPA contract #N66001-95-C-8626.