Dynamic catalogues on the WWW

Maria Milosavljevic^a and Jon Oberlander^b

^aLanguage Technology Group, Microsoft Research Institute,
School of MPCE, Macquarie University, Sydney NSW 2109, Australia
Maria.Milosavljevic@mq.edu.au

^bHuman Communication Research Centre,
2 Buccleuch Place, University of Edinburgh, Edinburgh EH8 9LW, U.K.
J.Oberlander@ed.ac.uk

Abstract: Electronic catalogues are here to stay; however, static WWW documents will not aid the user in finding what she is looking for. We argue for the use of natural language generation techniques to dynamically produce hypertext documents on the WWW, resulting in what we call DYNAMIC HYPERTEXT. A dynamic hypertext document can be tailored more precisely to a particular user's needs and background, thus helping the user to search more effectively. We describe the automatic generation of WWW documents and illustrate with two implemented systems.
Keywords: Natural Language Generation; Adaptive hypertext; Dynamic hypertext; Human–computer interaction; User modelling

1. Introduction

The advent of the World Wide Web (WWW) has led to an explosive increase in the quantity of electronic documents available on-line. From the comfort of home, we can browse through libraries, buy our groceries or even visit a museum. However, the vast majority of the documents available on the WWW are static in nature, unable to be tailored to any particular user's requirements or abilities. Authors must either construct general-purpose documents which are written according to a wide audience model or must write (and continually update) multiple documents for users' anticipated needs.

Hypertext by its very nature opens the door to user interaction with documents. We can capitalise on this interaction by building hypertext systems which adapt the material presented to a user in a WWW document on the basis of an existing user model or the user's previous interactions with the system. However, there are limits to the flexibility that current methods afford. We argue for the incorporation of natural language generation techniques into such systems, resulting in what we call DYNAMIC HYPERTEXT. Dynamic hypertext generation systems draw on research in natural language generation (NLG) to dynamically create and adapt WWW documents to the user's needs.

In this paper, we briefly outline the architecture and the benefits of dynamic hypertext systems on the WWW; we argue that, by making more effective use of a user model and the discourse history, NLG techniques can offer highly flexible WWW documents. We illustrate the advantages of such dynamic hypertext techniques through examples from two implemented systems: the PEBA-II and the ILEX text generation systems, which dynamically produce descriptions of entities as WWW pages.

2. Dynamic hypertext via natural language generation

2.1. Natural Language Generation

The aim of NLG systems is to produce coherent natural language text from an underlying representation of knowledge. NLG can be viewed as a goal-driven planning process, involving the formulation of texts that satisfy some communicative goal, such as "describe the echidna". NLG systems embody two main processing components: text planning and surface realisation.

Text planning typically encapsulates all those decisions involving choices of what to say. Based on the discourse goal(s), the text planner must decide what is relevant in a particular situation, and organise this content in a way that allows realisation of a coherent discourse that guides the hearer's inferences. This is achieved by composing a discourse plan using facts from the knowledge base. For example, McKeown's [6] schema-based approach stores a number of plan outlines in a plan library and fills in the appropriate information from the knowledge base. A model of the user's knowledge can be used to tailor the text to the individual user's knowledge; see [11] for a good example of this approach. In addition, the ongoing discourse with each particular user can be recorded in the discourse history component, enabling the system to adapt future texts to what has been said before (see [9]). The discourse plan is realised as natural language utterances by the surface realisation component. This makes use of knowledge of the natural language's grammar and lexicon to produce well-formed utterances that convey the required semantic content.
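As a minimal sketch of the two components just described, the following Python fragment separates text planning (selecting and ordering facts according to a schema, and omitting facts the user model marks as known) from surface realisation (here reduced to sentence templates). The knowledge base contents, the schema, the templates and all function names are hypothetical and chosen only for illustration; they are not taken from any of the systems cited.

```python
# Hypothetical toy knowledge base: entity -> {attribute: value}.
KNOWLEDGE_BASE = {
    "echidna": {
        "class": "monotreme",
        "covering": "spines",
        "diet": "ants and termites",
    },
}

# A plan outline in the spirit of a schema: an ordered list of attributes.
DESCRIBE_SCHEMA = ["class", "covering", "diet"]

# Sentence templates standing in for a real grammar and lexicon.
TEMPLATES = {
    "class": "The {entity} is a {value}.",
    "covering": "It is covered in {value}.",
    "diet": "It eats {value}.",
}


def plan_text(entity, known_facts):
    """Text planning: fill the schema from the knowledge base,
    skipping (entity, attribute) pairs the user already knows."""
    facts = KNOWLEDGE_BASE[entity]
    return [
        (attr, facts[attr])
        for attr in DESCRIBE_SCHEMA
        if attr in facts and (entity, attr) not in known_facts
    ]


def realise(entity, plan):
    """Surface realisation: turn each planned fact into a sentence."""
    return " ".join(
        TEMPLATES[attr].format(entity=entity, value=value)
        for attr, value in plan
    )


if __name__ == "__main__":
    print(realise("echidna", plan_text("echidna", set())))
```

Passing a non-empty set of known facts, for example `{("echidna", "diet")}`, shortens the plan, which is the simplest form of tailoring to a user model.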

2.2. Dynamic hypertext

DYNAMIC HYPERTEXT systems are NLG systems which take advantage of hypertextual interaction to give the user freedom to perform high-level discourse planning, thereby reducing the burden on the NLG system of having to reason more deeply about the user's goals. A key element in any dynamic hypertext system is that the hypertext network and the nodes of this network (the WWW documents themselves) are dynamically created at run-time when the user requests them; there are no existing hypertext documents, and there may not even be any pre-existing representations of what could be documents within the system. This is in contrast to ADAPTIVE HYPERTEXT systems, which adapt the content of documents in a fixed hypertext network, according to the user's knowledge of the concepts within that document. Within such systems, documents may be annotated with the conditions under which particular segments and hypertext links to other concepts are considered appropriate given a particular user's knowledge. This enables the system to present different views of the same document to different users; however, this flexibility is still limited to the author's written text segments. For a survey of existing adaptive hypertext systems and further elaboration of the concepts involved, see Brusilovsky [1].
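To make the contrast concrete, the adaptive approach can be sketched as a fixed document whose author-written segments carry display conditions over the user's known concepts. The segment texts, the `show_if` annotation and the `render` function below are assumptions made for illustration, not the mechanism of any particular adaptive hypertext system.

```python
# A fixed, author-written document. Each segment is annotated with a
# condition over the set of concepts the user is assumed to know.
SEGMENTS = [
    {"text": "The echidna is a monotreme.",
     "show_if": lambda known: True},
    {"text": "A monotreme is a mammal that lays eggs.",
     "show_if": lambda known: "monotreme" not in known},
    {"text": "Like the platypus, it lays eggs.",
     "show_if": lambda known: "platypus" in known},
]


def render(segments, known):
    """Present the view of the document appropriate to this user."""
    return " ".join(s["text"] for s in segments if s["show_if"](known))


if __name__ == "__main__":
    print(render(SEGMENTS, set()))
    print(render(SEGMENTS, {"monotreme", "platypus"}))
```

Different users see different views, but every sentence that can appear was written in advance by the author, which is the limitation noted above.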

A dynamic hypertext system operates in a similar way to traditional NLG systems; a knowledge base contains information about the concepts in the domain, and the system selects which elements of the knowledge base are important for creating the required WWW document. The surface realisation component of a dynamic hypertext system must encode HTML links into the text in order to produce a document which can be viewed using a WWW browser interface. The hypertext links represent follow-up questions which the user can ask [10], and are generally concepts (or other entities) that can be described by the system. Dynamic hypertext systems must decide whether a link is justified; that is, whether there is more to say about the concept, or whether all the useful information about the particular concept has already been included in the current document. In operation, the user can effectively perform the high-level discourse planning for the system, driving the system by selecting hypertext links. Each hypertext link indicates a new discourse goal to the system. Knott et al. [4] provide a useful survey and comparison of existing dynamic hypertext systems. For more information on the advantages of such systems see Levine et al. [5], Milosavljevic et al. [7] and Dale et al. [2]. We now introduce two particular systems we have been involved with.
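The link decision described above might be sketched as follows. A mention of a concept is realised as an HTML link only if the knowledge base holds at least one fact about it that the current document has not already stated; the link target then encodes the new discourse goal. The knowledge base, the `/describe?entity=` URL scheme and the function names are hypothetical, introduced here only to illustrate the idea.

```python
from html import escape
from urllib.parse import quote

# Hypothetical toy knowledge base: concept -> {attribute: value}.
KNOWLEDGE_BASE = {
    "echidna": {"class": "monotreme", "diet": "ants"},
    "monotreme": {"definition": "egg-laying mammal", "example": "platypus"},
    "ants": {},
}


def link_is_justified(concept, facts_in_document):
    """True if there is more to say about the concept than the
    current document already says."""
    facts = KNOWLEDGE_BASE.get(concept, {})
    return any((concept, attr) not in facts_in_document for attr in facts)


def render_mention(concept, facts_in_document):
    """Realise a concept mention, as a link when one is justified."""
    if link_is_justified(concept, facts_in_document):
        # The URL scheme is an assumption; it encodes the follow-up
        # discourse goal "describe <concept>".
        return '<a href="/describe?entity={}">{}</a>'.format(
            quote(concept), escape(concept))
    return escape(concept)


if __name__ == "__main__":
    print(render_mention("monotreme", set()))
    print(render_mention("ants", set()))
```

In this sketch, selecting the generated link would send the system a request to describe the linked concept, so the user's choice of link supplies the next discourse goal.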

2.2.1. The PEBA-II system

PEBA-II is an NLG system which produces descriptions and comparisons of animals represented in a taxonomic knowledge base. The system makes use of a user model and discourse history in order to produce texts which take into account the user's knowledge. In particular, the system makes inferences about the user's specific knowledge and draws comparisons with similar or familiar entities, thus building on her existing knowledge. See Milosavljevic et al. [7] and Milosavljevic [8] for more details.
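The idea of comparing a new entity with a familiar one can be illustrated with a small sketch. It should not be read as PEBA-II's actual algorithm: the toy animal data, the similarity measure (a count of shared attribute values) and the function names are all assumptions made for this example.

```python
# Hypothetical toy taxonomy; attribute values are illustrative only.
ANIMALS = {
    "echidna": {"parent": "monotreme", "covering": "spines",
                "diet": "ants and termites"},
    "platypus": {"parent": "monotreme", "covering": "fur",
                 "diet": "small invertebrates"},
    "porcupine": {"parent": "rodent", "covering": "spines",
                  "diet": "plants"},
}


def shared_count(target, other):
    """Number of attributes on which the two entities agree."""
    return sum(1 for attr, value in ANIMALS[target].items()
               if ANIMALS[other].get(attr) == value)


def choose_comparator(target, familiar):
    """Pick the familiar entity most similar to the target, or None
    if the user knows nothing that shares any attribute value."""
    candidates = sorted(e for e in familiar if e in ANIMALS and e != target)
    best = max(candidates, key=lambda e: shared_count(target, e),
               default=None)
    if best is None or shared_count(target, best) == 0:
        return None
    return best


def compare(target, other):
    """Split the target's attributes into similarities and differences."""
    attrs = sorted(ANIMALS[target])
    same = [a for a in attrs if ANIMALS[target][a] == ANIMALS[other].get(a)]
    diff = [a for a in attrs if ANIMALS[target][a] != ANIMALS[other].get(a)]
    return same, diff


if __name__ == "__main__":
    familiar = {"porcupine"}  # entities the user model marks as known
    other = choose_comparator("echidna", familiar)
    print(other, compare("echidna", other))
```

A text planner could then mention the similarities first and the differences second, so that the description builds on what the user already knows.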

2.2.2. The ILEX system

The ILEX system generates descriptions of objects displayed in the National Museums of Scotland's 20th Century Jewellery Gallery. As well as being accurate, ILEX's labels must convey information which is: important, in the sense of helping educate the visitor more broadly; and interesting, since when the descriptions are boring, the visitor can just walk away. To help meet these criteria, ILEX uses a simple user model, a discourse history, and its own system agenda of communicative goals. Thus, the user has freedom to explore any object in the gallery at any time, but the descriptions produced are constrained, via the system's agenda of educational goals, which it strives to achieve when the opportunity arises. See Knott et al. [4] and Hitzeman et al. [3] for more details.
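The interplay between user-driven browsing and a system agenda can be sketched as follows. This is an illustrative simplification rather than ILEX's implementation: the object descriptions, the goal messages, the priority scheme and the class and function names are invented for the example.

```python
# Hypothetical object descriptions, invented for illustration.
DESCRIPTIONS = {
    "brooch": "This brooch is made of silver.",
    "necklace": "This necklace is made of plastic beads.",
}


class Agenda:
    """System agenda: educational goals waiting for an opportunity."""

    def __init__(self, goals):
        # Each goal is (priority, message, objects it can attach to).
        self.pending = sorted(goals, key=lambda goal: goal[0], reverse=True)

    def take_opportunity(self, obj):
        """Return and retire the highest-priority pending goal that is
        relevant to obj, or None if there is no such goal."""
        for goal in self.pending:
            if obj in goal[2]:
                self.pending.remove(goal)
                return goal[1]
        return None


def describe(obj, agenda, history):
    """Describe the object the user chose, adding one agenda message
    when the object offers an opportunity to convey it."""
    parts = [DESCRIPTIONS[obj]]
    message = agenda.take_opportunity(obj)
    if message is not None:
        parts.append(message)
    history.append(obj)  # discourse history: what has been described
    return " ".join(parts)


if __name__ == "__main__":
    agenda = Agenda([
        (1, "Silver was widely used in this gallery's pieces.", {"brooch"}),
        (2, "Many designers experimented with new materials.",
         {"brooch", "necklace"}),
    ])
    history = []
    for obj in ["necklace", "brooch", "brooch"]:
        print(describe(obj, agenda, history))
```

The user remains free to choose any object in any order, while each goal on the agenda is conveyed at most once, at the first relevant opportunity.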

References

[1] Brusilovsky, P., Methods and techniques of adaptive hypermedia, User Modeling and User Adapted Interaction 6(2–3), 1996.
[2] Dale, R., Oberlander, J., Milosavljevic, M. and Knott, A., Integrating natural language generation and hypertext to produce dynamic documents, Interacting with Computers, in press.
[3] Hitzeman, J., Mellish, C. and Oberlander, J., Dynamic generation of museum Web pages: the intelligent labelling explorer, Archives and Museum Informatics 11: 105–112, 1997.
[4] Knott, A., Mellish, C., Oberlander, J. and O'Donnell, M., Sources of flexibility in dynamic hypertext generation, in: Proc. of the 8th International Workshop on Natural Language Generation, Herstmonceux, Sussex, UK, 1996.
[5] Levine, J., Cawsey, A., Mellish, C., Poynter, L., Reiter, E., Tyson, P. and Walker, J., IDAS: combining hypertext and natural language generation, in: Proc. of the 3rd European Workshop on Natural Language Generation, Innsbruck, Austria, 1991, pp. 55–62.
[6] McKeown, K.R., Discourse strategies for generating natural-language text, Artificial Intelligence, 27: 1–41, 1985.
[7] Milosavljevic, M., Tulloch, A. and Dale, R., Text generation in a dynamic hypertext environment, in: Proc. of the 19th Australasian Computer Science Conference, Melbourne, Australia, 1996.
[8] Milosavljevic, M., Augmenting the user's knowledge via comparison, in: Proc. of the 6th International Conference on User Modelling, Sardinia, 1997.
[9] Moore, J.D., A reactive approach to explanation in expert and advice-giving systems, Ph.D. Thesis, University of California, Los Angeles, 1989.
[10] Moore, J.D. and Swartout, W.R., Pointing: a way toward explanation dialogue, in: Proc. of AAAI90, 1990.
[11] Paris, C.L., The use of explicit user models in text generation: tailoring to a user's level of expertise, Ph.D. Thesis, Columbia University, 1987.