Introduction
Electronic media have many advantages over paper. These
advantages include
- flexibility with regard to editing, displaying,
and annotating existing documents,
- easy and quick transport of documents
by means of computer networks, possibly over long distances,
- support of several (virtual) copies
of a document in addition to the central original
document,
- support of fast and convenient search mechanisms like
full text searches and hypertext links.
Though these advantages of electronic media
are undisputed, paper is still the most important medium
for information exchange.
The continued preference for paper is partly a
matter of habit. However, paper is not favoured for
subjective reasons only: the creation and
reading of paper documents have been refined
over thousands of years. As a result,
methods have evolved that support the
presentation and perception of information
within this kind of document. Most of us
rely on these mechanisms
for cognitive support in
situations such as: "I
can't remember the exact location of the
information, but it is in the last quarter
of the book and on the same page there
is a picture of a WWW browser."
Therefore, it seems sensible to incorporate
such cognitive support for readers into
electronic document access systems,
especially if paper is the original source
of the document. In this case, the document
was specifically designed for this medium,
and any transformation into a purely electronic
representation would necessarily
lead to a loss of information.
Given the huge heritage as well as the
ongoing creation of paper documents,
incorporating this cognitive support
into electronic document access systems
by means of facsimile delivery
seems necessary. It may also
have a positive side effect:
converting paper documents into electronic
documents involves, or at least for
large quantities of documents
should involve, optical character recognition (OCR).
Though OCR has improved a lot over the past
years, it is still far from perfect [3].
Displaying facsimiles
hides these recognition
errors from the users. Of course,
methods must then be used internally
that take these errors into account,
e.g. in full text searches (see
section 3.2).
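To illustrate what taking recognition errors into account in a full text search can mean, the following minimal sketch matches a query against OCR output while tolerating a small number of character errors. It is not the method of section 3.2; the function names, the edit-distance criterion, and the sample OCR text are assumptions made for illustration only.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # delete char_a
                               current[j - 1] + 1,       # insert char_b
                               previous[j - 1] + cost))  # substitute
        previous = current
    return previous[-1]


def tolerant_search(query, ocr_words, max_errors=1):
    """Return positions of OCR words within max_errors edits of the query."""
    query = query.lower()
    return [pos for pos, word in enumerate(ocr_words)
            if edit_distance(query, word.lower()) <= max_errors]


# Hypothetical OCR output in which "browser" was misread as "br0wser".
ocr_words = "The WWW br0wser shows the page as a facsimile".split()
hits = tolerant_search("browser", ocr_words)
```

The user still sees the unaltered facsimile; only the hidden OCR text is searched, so a misrecognized word can be found without ever being displayed.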
Hypertext representation is advisable in order
to make use of the extended functionality
an electronic document access system may provide.
However, with regard to an electronic library that
consists of a large quantity of documents,
a manual conversion of linear documents
into hypertext documents is not feasible.
In order to manage the automatic conversion
of paper documents into hypertext documents,
we have designed a system called HYPERFACS
that is currently being implemented [8].
This system includes an existing,
elaborate browsing tool with, among others,
the following characteristics:
- integration of distinct hypertext documents,
- integration of distinct hypertext document types
(ASCII and facsimile),
- public and private hypertext links and annotations,
- manual modification of the link and annotation base,
- full text searches with keyword in context (KWIC)
display,
- graphical browser showing the hierarchical relationship
between different hypertext nodes,
- implementation for SUN SPARCSTATIONS using the XVIEW
toolkit.
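As a rough illustration of the keyword in context (KWIC) display mentioned in the list above, the sketch below prints each occurrence of a keyword together with a few neighbouring words. It does not reproduce the HYPERFACS browsing tool; the function name, the context width, and the output layout are assumptions.

```python
def kwic(text, keyword, width=3):
    """Return one display line per occurrence of keyword in text.

    Each line shows up to `width` words of left and right context,
    with the left context right-aligned so the keywords line up.
    """
    words = text.split()
    lines = []
    for i, word in enumerate(words):
        # Compare without surrounding punctuation and without case.
        if word.strip(".,;:!?").lower() == keyword.lower():
            left = " ".join(words[max(0, i - width):i])
            right = " ".join(words[i + 1:i + 1 + width])
            lines.append(f"{left:>30} [{word}] {right}")
    return lines


sample = ("Paper is still the most important medium. "
          "Electronic media complement paper documents.")
for line in kwic(sample, "paper", width=2):
    print(line)
```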
Though this system is already available across a
network making use of the X window system, the
additional usage of a different access system, WWW and
MOSAIC, is sensible in the following cases:
- the full functionality of the original HYPERFACS
system is not needed,
- the library system should be implemented locally
without having access to the needed hardware and/or
software (database management system TRANSBASE)
environment,
- the library should be integrated seamlessly into
the WWW [2].
In these cases, the results of the hypertext conversion
of the HYPERFACS system have to be transformed into data
that can be handled by the WWW, that is, delivered via HTTP.
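One conceivable shape for such WWW data is a small HTML page per scanned page that embeds the facsimile image and links to its neighbours. The sketch below only illustrates this idea; the file naming scheme, the image format, and the function name are hypothetical and are not taken from the HYPERFACS system.

```python
from html import escape


def facsimile_page(title, page, total,
                   image_pattern="page{:04d}.gif",
                   page_pattern="page{:04d}.html"):
    """Build an HTML page that shows one facsimile image with navigation links.

    The file name patterns are assumptions chosen for this example.
    """
    parts = [f"<html><head><title>{escape(title)}, page {page}</title></head><body>"]
    parts.append(f'<img src="{image_pattern.format(page)}" '
                 f'alt="Facsimile of page {page}">')
    nav = []
    if page > 1:
        nav.append(f'<a href="{page_pattern.format(page - 1)}">Previous</a>')
    if page < total:
        nav.append(f'<a href="{page_pattern.format(page + 1)}">Next</a>')
    parts.append("<p>" + " | ".join(nav) + "</p>")
    parts.append("</body></html>")
    return "\n".join(parts)


first = facsimile_page("Manual", 1, 3)
last = facsimile_page("A & B", 3, 3)
```

Because the linear page order is known from scanning, links of this kind can be generated without manual work; links derived from the document's content would need the results of the conversion step.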
In the following we will describe some of the
functionality of this WWW library. This includes
solutions that are already implemented, solutions that
have been tested, and projected ones, together with
some of their advantages and disadvantages.
Section 2 deals with the
basic process of converting paper documents
into WWW pages. Section 3 focuses on
the access mechanisms to the prepared WWW pages.
These access mechanisms include direct access by
means of URIs (cf. [1]), direct search mechanisms like
full text searches and associative search mechanisms
like links. Section 4 gives some
ideas on possible enhancements of such a system as
well as the environments in which such a system could
be used. All of the screendumps in this paper have been
taken from one hypertext base that was
built by transforming the software manual
"Benutzerhandbuch iXVIEW/SQL"
(Copyright 1991 iXOS Software GmbH).