Brian Kelly, UKOLN, University of Bath, BATH, BA2 7AY
{b.kelly@ukoln.ac.uk}
This paper describes the use of Dublin Core metadata on the Exploit Interactive web magazine to provide enhanced searching services. A lightweight approach has been used which relies on commercially available software. This approach demonstrates the benefits of Dublin Core and shows how commercially available software allows those benefits to be deployed easily, without significant software development expertise.
The Dublin Core metadata initiative [1] was set up to define the metadata needed to support resource discovery. The initiative reached agreement in defining a core set of resource discovery attributes [2]. Despite the agreement we have not yet seen widespread deployment of services which make use of Dublin Core metadata. This is due to several reasons including lack of support for Dublin Core metadata by major search engine vendors, lack of support in popular HTML authoring tools and lack of support in many indexing packages.
The Exploit Interactive web magazine [3] is funded by the European Commission's Telematics for Libraries programme. Exploit Interactive is produced by UKOLN (the UK Office for Library and Information Networking), a small applied research and dissemination unit based at the University of Bath. One of UKOLN's main areas of interest is resource discovery metadata. UKOLN uses Exploit Interactive as a test bed for its research interests and to demonstrate innovative technologies.
Due to the limited funding and technical support available it was decided that Exploit Interactive should make use of commercially available software. A Windows NT server running the Microsoft SiteServer software [4] was chosen in order to broaden UKOLN's platform base, which previously had been limited to Unix.
Exploit Interactive runs on Microsoft SiteServer software. Individual articles and navigational elements are stored as HTML fragments. Active Server Pages (ASP) are used to assemble the HTML fragments into a valid HTML document. Further details of the architectural model are available [5].
Unlike article content and navigational elements, the metadata is not stored as an HTML fragment. Metadata values (such as author and title) are stored as ASP variables. An ASP script converts the variables into the appropriate HTML format. This decision was made so that the metadata values can be reused: for example, the title information is used in the HTML <TITLE> element, and the title and author details are used in the citation details which are displayed at the bottom of every article.
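The reuse of a single set of metadata values in several places can be sketched as follows. This is a minimal illustration in Python, not the ASP script used on the site; the `article` dictionary, the `render_head` and `render_citation` helpers, the citation layout and the sample values are all assumptions made for the example.

```python
from html import escape

# Hypothetical metadata values for one article
# (on the real site these are held in ASP variables).
article = {
    "title": "Sample Article Title",
    "author": "John Smith",
}


def render_head(meta):
    """Build the <TITLE> element and Dublin Core <META> elements from one set of values."""
    title = escape(meta["title"], quote=True)
    author = escape(meta["author"], quote=True)
    return "\n".join([
        f"<TITLE>{title}</TITLE>",
        f'<META NAME="DC.Title" CONTENT="{title}">',
        f'<META NAME="DC.Creator" CONTENT="{author}">',
    ])


def render_citation(meta):
    """Reuse the same values for a citation line (layout is an assumption)."""
    return f'{escape(meta["author"], quote=False)}, "{escape(meta["title"], quote=False)}"'


head_html = render_head(article)
citation = render_citation(article)
```

Because both helpers read from the same dictionary, a correction to the title or author needs to be made only once.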
Exploit Interactive provides a simple system for managing Dublin Core metadata. However, in order to convince users of the importance of metadata and to motivate them to implement metadata on their own services, it is necessary to demonstrate the benefits.
Although the Search component of the SiteServer software does not explicitly mention support for Dublin Core, it does allow arbitrary HTML <META> elements to be indexed. Note that, due to a bug in the Search software, periods (.) are not allowed in <META> element names, so the Dublin Core metadata appears in the form <META NAME="DC_Creator" CONTENT="John Smith">. However, it is possible to replace the underscores by periods in the search catalog schema definition.
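The naming workaround amounts to a simple two-way mapping between the dotted Dublin Core names and the underscore form that the indexer accepts. The sketch below illustrates that mapping in Python; the function names are hypothetical, and the round trip assumes that the original Dublin Core names contain no underscores of their own.

```python
def to_indexable_name(dc_name):
    """Replace periods with underscores so the name is accepted in a <META> element."""
    return dc_name.replace(".", "_")


def to_schema_name(indexed_name):
    """Restore the periods, mirroring the renaming done in the catalog schema.

    Assumes the original name contained no underscores.
    """
    return indexed_name.replace("_", ".")


def meta_element(dc_name, content):
    """Emit a <META> element using the underscore form of the name.

    The content value is assumed to need no HTML escaping in this sketch.
    """
    return f'<META NAME="{to_indexable_name(dc_name)}" CONTENT="{content}">'


example = meta_element("DC.Creator", "John Smith")
```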
A simple search facility which provides a pull-down menu for searching the full text of articles or by title or author is available [6].
Dublin Core metadata can be used to provide more advanced searching than simple fielded searching. Exploit Interactive has developed a service based on Dublin Core which enables users to search by issue, by article type, etc. In order to implement such services, additional Dublin Core metadata attributes are needed. These are listed below.
Description | Function | Example |
Issue number (e.g. 1) | Searching in a particular issue (or range of issues) | <meta name="DC.Relation.IsPartOf" content="http://www.exploit-lib.org/issue4/"> |
Type of article (Regular, Feature, News, etc.) | Searching for a particular article type(s) (e.g. Regular or Feature article, but not News) | <meta name="DC.Type" content="text.article.feature" scheme="Exploit-categories"> |
Funding body for article, such as "tfl" (Telematics For Libraries). | Searching for articles about projects funded by a particular funding body. | <meta name="DC.Subject" content="tfl" scheme="Exploit-article-funders"> |
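To show how the attributes in the table could drive a restricted search, the following Python sketch collects the <meta> elements from a page and tests them against search criteria. It is an illustration only and does not reflect how SiteServer performs the search; the `MetaCollector` class, the `matches` function and the type values other than "text.article.feature" are assumptions.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collect <meta> name/content pairs; a name may occur more than once."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name and content is not None:
            self.fields.setdefault(name, []).append(content)


def matches(fields, issue_url=None, article_types=None, funder=None):
    """Return True if the metadata satisfies every criterion that was supplied."""
    if issue_url is not None and issue_url not in fields.get("DC.Relation.IsPartOf", []):
        return False
    if article_types is not None and not set(article_types) & set(fields.get("DC.Type", [])):
        return False
    if funder is not None and funder not in fields.get("DC.Subject", []):
        return False
    return True


# The three example elements from the table above.
page = """
<meta name="DC.Relation.IsPartOf" content="http://www.exploit-lib.org/issue4/">
<meta name="DC.Type" content="text.article.feature" scheme="Exploit-categories">
<meta name="DC.Subject" content="tfl" scheme="Exploit-article-funders">
"""

collector = MetaCollector()
collector.feed(page)
fields = collector.fields

# "text.article.regular" and "text.article.news" are hypothetical type values.
is_regular_or_feature = matches(
    fields, article_types={"text.article.regular", "text.article.feature"}
)
is_news = matches(fields, article_types={"text.article.news"})
```

Criteria left as `None` are ignored, so the same function covers a search by issue alone, by funding body alone, or by any combination.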
An illustration of the enhanced search interface is shown below.
Details of how SiteServer was configured in order to support use of DC metadata are available [7].
This paper demonstrates how shrink-wrapped commercial software can be used to provide enhanced searching services based on Dublin Core metadata. If similar approaches using other popular indexing tools are successful, this will motivate information providers to make use of Dublin Core metadata.