An open framework for collaborative distributed information management

D. DeRoure (a), W. Hall (a), S. Reich (a), A. Pikrakis (b), G. Hill (c), and M. Stairmand (d)

(a) Department of Electronics and Computer Science, University of Southampton, U.K.
dder@ecs.soton.ac.uk, wh@ecs.soton.ac.uk and sr@ecs.soton.ac.uk

(b) Department of Informatics, University of Athens, Athens, Greece
pikrakis@di.uoa.gr

(c) Multicosm Ltd, Southampton, U.K.
gjh@multicosm.com

(d) Parallel Applications Centre, Southampton, U.K.
mas@pac.soton.ac.uk

Abstract
The MEMOIR project supports researchers working with a vast quantity of distributed information, by assisting them in finding both relevant documents and researchers with related interests. It is an open architecture based on the existing Web infrastructure. Key to the architecture is the use of proxies: to support message routing for dynamic reconfiguration and extension of the system, to collect information about the trail of documents that a user visits, and to insert links on-the-fly. In this paper we present the MEMOIR framework and its rationale, and discuss early experiences with the system.

Keywords
Information re-use; Trails; Agents; Open system

1. The open architecture of MEMOIR

Effective access to documents and effective collaboration between researchers can be crucial to the competitiveness of large companies. MEMOIR (Managing Enterprise-scale Multimedia using an Open Framework for Information Re-use) addresses this problem by giving users extra assistance with both tasks in return for minimal extra effort. To support collaboration amongst users, MEMOIR uses open hypermedia link services, trail bases, and software agents that help users manage their information.
The MEMOIR architecture consists of a set of components that communicate using HTTP, extended by additional tag/value pairs that express the semantics of MEMOIR messages. The main components are a message router, an interface manager that serves as both a proxy and a link service, the object-oriented database system ITASCA, and an agent server.
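As an illustration only, the idea of extending HTTP with tag/value pairs could be sketched as follows. The paper does not specify how the pairs are carried, so the use of extra header fields, the `X-Memoir-` prefix, and the tag names `Action` and `User` are all assumptions made for this sketch.

```python
# Hypothetical header prefix: an assumption, not taken from the MEMOIR system.
MEMOIR_PREFIX = "X-Memoir-"


def encode_message(tags: dict[str, str]) -> dict[str, str]:
    """Turn tag/value pairs into extra HTTP header fields."""
    return {MEMOIR_PREFIX + tag: value for tag, value in tags.items()}


def decode_message(headers: dict[str, str]) -> dict[str, str]:
    """Recover the tag/value pairs, ignoring ordinary HTTP header fields."""
    prefix = MEMOIR_PREFIX.lower()
    return {
        name[len(MEMOIR_PREFIX):]: value
        for name, value in headers.items()
        # HTTP header names are case-insensitive, so compare in lower case.
        if name.lower().startswith(prefix)
    }


# An ordinary request header plus two illustrative MEMOIR tags.
headers = {"Host": "example.org"}
headers.update(encode_message({"Action": "FindSimilarUsers", "User": "alice"}))
tags = decode_message(headers)
```

Because components that do not understand the extra fields can simply ignore them, a scheme of this kind would leave ordinary Web traffic through the proxy unaffected.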
The key component is the message router, which acts as a hub with which other system services register. Its design is based on the Microcosm filter model (Hill et al., 1993). A component makes its services available by registering their details with the message router, which can then route matching requests from other components to the newly registered one. This registration model allows the system to be tailored dynamically to specific needs.
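The register-then-route behaviour described above can be sketched in a few lines. This is a simplified registry for illustration, not the Microcosm filter chain or the MEMOIR implementation; the class name `MessageRouter`, the `Action` tag used for dispatch, and the example service are hypothetical.

```python
from collections import defaultdict
from typing import Callable

Message = dict[str, str]              # a message is a set of tag/value pairs
Handler = Callable[[Message], Message]


class MessageRouter:
    """Hub with which services register and through which requests are routed."""

    def __init__(self) -> None:
        self._handlers: dict[str, list[Handler]] = defaultdict(list)

    def register(self, action: str, handler: Handler) -> None:
        """Announce that `handler` can service requests for `action`."""
        self._handlers[action].append(handler)

    def unregister(self, action: str, handler: Handler) -> None:
        """Withdraw a service at run time."""
        self._handlers[action].remove(handler)

    def route(self, message: Message) -> list[Message]:
        """Forward a request to every service registered for its action tag."""
        action = message.get("Action", "")
        return [handler(message) for handler in self._handlers.get(action, [])]


def keyword_agent(message: Message) -> Message:
    """Hypothetical service that accepts keyword-extraction requests."""
    return {"Action": "Keywords", "Document": message["Document"], "Status": "queued"}


router = MessageRouter()
router.register("ExtractKeywords", keyword_agent)
replies = router.route(
    {"Action": "ExtractKeywords", "Document": "http://example.org/report"}
)
```

Since services can be registered and withdrawn while the router is running, new functionality can be added without changing the components that issue requests.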
Standard Java-enabled Web browsers serve as the user interface to the MEMOIR system, which minimises installation and maintenance overhead. The only configuration necessary is to set the interface manager as the browser's proxy server and to connect to it once in order to log in. It is via the interface manager that users view the Web, document management systems, and linkbases. Thus, existing data can be re-used without being stored within the system; only metadata, such as a document's keywords, is stored.
A dedicated agent server manages a set of software agents (Pikrakis et al., 1998) and provides agent services that are essentially responsible for data mining and resource discovery tasks such as searching the Web. Users can select services such as finding similar persons based on trail information, launching keyword extraction on documents they have visited, or finding out which other users' trails contain a given document.

2. Collaboration supported by trails, links and agents

The MEMOIR architecture is an evolution of the Distributed Link Service (DLS) (Carr et al., 1995). While the DLS, like other open hypermedia systems, treats hypermedia links as first class objects, MEMOIR promotes another kind of object: the trail (Bush, 1945; Nicol et al., 1995). A user's trail is the set of actions (such as opening a document) that they have performed on the documents visited in pursuing a certain task. By matching trails, we match users. MEMOIR lets the user ask questions such as "who else has read this document?" and "what else should I read?". Software agents answer these questions for the user, mainly by mining trail information, by extracting keywords automatically, and by maintaining different user profiles. Answers therefore depend on the selected profile, e.g., similarity scores of documents or the user's personal profile. At this stage, simple profiles have been created to capture the nature of the user's work role, e.g., sales manager or researcher.
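A minimal sketch of trail matching follows, under two simplifying assumptions that are not taken from the paper: a trail is reduced to the set of documents a user has visited, and trails are compared by Jaccard overlap. The function names and sample data are hypothetical, and the actual agents may use a different similarity measure.

```python
Trails = dict[str, set[str]]  # user name -> documents visited


def jaccard(a: set[str], b: set[str]) -> float:
    """Overlap of two trails: shared documents divided by all documents."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def who_else_read(document: str, user: str, trails: Trails) -> list[str]:
    """Answer "who else has read this document?"."""
    return sorted(u for u, docs in trails.items() if u != user and document in docs)


def similar_users(user: str, trails: Trails) -> list[tuple[str, float]]:
    """Rank the other users by trail similarity, most similar first."""
    own = trails[user]
    scores = [(other, jaccard(own, docs)) for other, docs in trails.items() if other != user]
    return sorted(scores, key=lambda pair: (-pair[1], pair[0]))


def what_else_to_read(user: str, trails: Trails) -> list[str]:
    """Answer "what else should I read?" from the trails of similar users."""
    own = trails[user]
    suggestions: list[str] = []
    for other, score in similar_users(user, trails):
        if score > 0:
            # Suggest documents the similar user has seen but this user has not.
            for doc in sorted(trails[other] - own):
                if doc not in suggestions:
                    suggestions.append(doc)
    return suggestions


trails = {
    "alice": {"d1", "d2", "d3"},
    "bob": {"d2", "d3", "d4"},
    "carol": {"d5"},
}
readers = who_else_read("d2", "alice", trails)   # ["bob"]
ranking = similar_users("alice", trails)         # [("bob", 0.5), ("carol", 0.0)]
reading_list = what_else_to_read("alice", trails)  # ["d4"]
```

In this example bob shares two of four distinct documents with alice, so he is both the answer to "who else has read this document?" for d2 and the source of the single recommendation d4.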

3. Summary and conclusion

Distributed information systems can be large and complex: information is difficult to find, leading to duplication of effort where sharing would be more cost effective. MEMOIR provides a simple, open solution using existing Web infrastructure and, unlike other recommender systems, recommends people as well as documents. MEMOIR treats trails as first class objects which can be created, stored and manipulated by the components of the system, including agents.
While the MEMOIR trials focus on two particular industries, pharmaceuticals and chemicals, we believe the approach to be generic: for example, we are also applying it to historical research using major historical archives including Wellington, Mountbatten and Churchill, to audio archives, virtual museums, virtual art galleries and hybrid libraries, as well as to the administrative system of the ECS Department at Southampton.

Acknowledgements

We would like to acknowledge the contribution of all the members of the MEMOIR team. The project is supported by the European Union's ESPRIT programme (No. 22153). The work of Sigi Reich has been supported by the Austrian Fonds zur Förderung der wissenschaftlichen Forschung (FWF, grant No. J1507-INF). We would also like to acknowledge the support of EPSRC project No. GR/K73060.

References

Bush, V., As we may think, The Atlantic Monthly, 176(1): 101–108, 1945.
Carr, L., DeRoure, D., Hall, W., and Hill, G., The distributed link service: A tool for publishers, authors and readers, World Wide Web Journal, 1(1): 647–656, 1995.
Hill, G., Wilkins, R., and Hall, W., Open and reconfigurable hypermedia systems: A filter based model, Hypermedia, 5(2): 103–118, 1993.
Nicol, D., Smeaton, C., and Slater, A.F., Footsteps: Trail-blazing the Web, in: Proc. of the 3rd International World Wide Web Conference, Darmstadt, Germany, April 1995.
Pikrakis, A., Bitsikas, T., Sfakianakis, S., Hatzopoulos, M., DeRoure, D., Hall, W., Reich, S., Hill, G., and Stairmand, M., Memoir – software agents for finding similar users by trails, in: PAAM98, 3rd Intl. Conference on The Practical Application of Intelligent Agents and Multi-Agents, London, UK, March 1998.