2.2 Transformation into WWW data

As mentioned in section 1, converting the data used in our own hypertext system into WWW-accessible data is sensible in several cases. The main objective of such a transformation is to map as much of the original system's functionality as possible onto WWW functionality. Such a conversion could be implemented by a gateway that translates all incoming HTTP requests into appropriate events and creates all needed nodes dynamically by accessing the original database (see fig. 2). However, doing so would abandon one of the goals of the WWW integration: independence from a fixed software environment, namely from a specific database management system.

Instead, a different approach was chosen: all needed parts of the database are extracted and transformed into data files before browsing time. Afterwards, these files are directly accessible and usable by CGI scripts [6], which enables an adequate representation of hypertext nodes and links.

With regard to nodes, this transformation, which is done before browsing time, has two main aspects. For representation purposes, the original TIFF images used in the HYPERFACS system are converted into GIF images. Because of the time such a conversion takes (approx. 1-2 seconds for a 1000 x 1000 black-and-white TIFF image on a SPARCstation 10), it is sensible to do it in advance, although it could also be done on the fly. For access purposes, specific data is stored with each page:

Each page is given a name. Of course, within a paper document a page usually has no name. The name is therefore formed from the main heading on the page in question if one exists, or otherwise from the last heading preceding the page. Together with the page number, this artificial page name is unambiguous and helpful for users.
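The naming rule can be sketched as follows. This is a minimal illustration and not the HYPERFACS code: the function `page_name`, its input (one heading or `None` per page), and the "heading, p. N" output format are all assumptions made for the example.

```python
from typing import Optional, Sequence


def page_name(main_headings: Sequence[Optional[str]], page_index: int) -> str:
    """Build an artificial page name (hypothetical helper).

    main_headings[i] is the main heading found on page i + 1, or None
    if that page contains no heading (assumed input format).
    """
    heading = None
    # Prefer the page's own heading; otherwise walk back to the last
    # heading that precedes the page.
    for i in range(page_index, -1, -1):
        if main_headings[i]:
            heading = main_headings[i]
            break
    page_number = page_index + 1
    if heading is None:
        # No heading on or before this page: fall back to the number alone.
        return f"p. {page_number}"
    # Appending the page number keeps names distinct across pages that
    # share the same heading.
    return f"{heading}, p. {page_number}"


headings = ["Introduction", None, "Document preparation", None]
names = [page_name(headings, i) for i in range(len(headings))]
```

Pages 2 and 4 carry no heading of their own in this sample, so they inherit the preceding heading and are distinguished only by the page number.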

The name of a full-text database can be added if such a tool is available. Search scripts can then use a real database instead of a flat file system (see section 3.2). This may be sensible for larger hypertext bases, because it reduces the time needed for searching. At first glance, this may seem to contradict our goal mentioned above: independence from a fixed software environment. However, because the data stored in this database is restricted to full-text data, the effort required to adapt the scripts to a specific database management system is very small.

Considering the node data mentioned above, it would also be possible to encode it directly in a WWW page using HTML. However, this approach is not sensible, because one and the same node may look different depending on how it is accessed: via normal access, in zoom-in or zoom-out mode, or in highlighting mode (see sections 3.1 and 3.2). Instead, the stored data is used by CGI scripts that create nodes dynamically.
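A rough sketch of such dynamic node creation is given below. It only illustrates the idea that one stored record yields different HTML per access mode. The record fields, the `page` and `mode` query parameters, and the scale factors are assumptions, and highlighting mode is left out for brevity.

```python
from html import escape
from urllib.parse import parse_qs

# Hypothetical per-page record; the field names are illustrative only.
PAGES = {
    "12": {"name": "Document preparation, p. 12", "image": "page12.gif",
           "width": 1000, "height": 1000},
}

# Assumed scale factors for three of the access modes named in the text.
SCALE = {"normal": 1.0, "zoom-in": 2.0, "zoom-out": 0.5}


def render_node(query_string: str) -> str:
    """Return the HTML for one node, depending on the requested mode."""
    params = parse_qs(query_string)
    page_id = params.get("page", [""])[0]
    mode = params.get("mode", ["normal"])[0]
    if page_id not in PAGES or mode not in SCALE:
        return "<html><body><p>Unknown page or mode.</p></body></html>"
    page = PAGES[page_id]
    # The same stored record is rendered at a mode-dependent size.
    width = int(page["width"] * SCALE[mode])
    height = int(page["height"] * SCALE[mode])
    title = escape(page["name"])
    return (
        f"<html><head><title>{title}</title></head><body>"
        f'<img src="{escape(page["image"])}" width="{width}" '
        f'height="{height}" alt="{title}" ismap>'
        "</body></html>"
    )


zoomed = render_node("page=12&mode=zoom-in")
```

A static HTML file would fix one of these variants in advance, which is why the text prefers generating the node at request time.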

With regard to links, the transformation again involves aspects of representation and content. For the representation of links, a static approach has been chosen: the bounding boxes of link regions are drawn into the images during the conversion from TIFF to GIF. This is done mainly for two reasons. First, the manipulation of images takes time and should be avoided whenever possible. Second, the link base cannot be modified using a WWW browser, so the images never need to be updated on that account. This inability to modify links is due to the fact that the WWW works as a stateless system, whereas link modification within HYPERFACS makes heavy use of states. For example, during the manual creation of a link, four basic states are used (disregarding additional states for fixing reading and modification rights):

  1. before the start of the link's source is fixed,
  2. before the end of the link's source is fixed,
  3. before the start of the link's destination is fixed,
  4. before the end of the link's destination is fixed.
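The four states listed above can be modelled as a small state machine. The sketch below is an illustration under assumed names (`LinkState`, `LinkEditor`) and is not taken from HYPERFACS. It shows that each click is only meaningful relative to the state left by the previous one, which is what a stateless request/response protocol does not provide on its own.

```python
from enum import Enum, auto
from typing import List, Optional, Tuple

Point = Tuple[int, int]
Region = Tuple[Point, Point]


class LinkState(Enum):
    AWAIT_SOURCE_START = auto()  # 1. before the start of the source is fixed
    AWAIT_SOURCE_END = auto()    # 2. before the end of the source is fixed
    AWAIT_DEST_START = auto()    # 3. before the start of the destination is fixed
    AWAIT_DEST_END = auto()      # 4. before the end of the destination is fixed
    DONE = auto()                # all four points are known


class LinkEditor:
    """Collects four clicks; the state must survive between clicks."""

    _ORDER = list(LinkState)

    def __init__(self) -> None:
        self.state = LinkState.AWAIT_SOURCE_START
        self.points: List[Point] = []

    def click(self, point: Point) -> LinkState:
        if self.state is LinkState.DONE:
            raise ValueError("link is already complete")
        self.points.append(point)
        # Advance to the next state in declaration order.
        self.state = self._ORDER[self._ORDER.index(self.state) + 1]
        return self.state

    def link(self) -> Optional[Tuple[Region, Region]]:
        """Return (source region, destination region) once complete."""
        if self.state is not LinkState.DONE:
            return None
        s0, s1, d0, d1 = self.points
        return (s0, s1), (d0, d1)


editor = LinkEditor()
states = [editor.click(p) for p in [(0, 0), (10, 10), (50, 50), (60, 60)]]
```

The `editor` object holds the intermediate points in memory between clicks. A server handling each request independently would need some other means of carrying that state from one request to the next.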

With respect to content, an approach similar to normal clickable images is taken: the regions of link sources are stored in map files, with a separate map file for each node. However, these files are slightly enhanced. In addition to the shape, the coordinates, and the destination URI of a link region, the map files contain entries that describe the link type and the name of the node/page to which the link points. Thus, it is possible to give the user additional information before transferring to a new page (see section 3.3). Furthermore, the map file for a specific page contains the coordinates of a link's destination region, which enables highlighting.
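To make the enhanced map file concrete, the following sketch parses one and resolves a click. The actual file syntax is not given in the text, so the tab-separated layout, the field order, the sample URI, and the link type value `reference` are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Rect = Tuple[int, int, int, int]  # x1, y1, x2, y2


@dataclass
class LinkEntry:
    shape: str         # shape of the source region, e.g. "rect"
    source: Rect       # coordinates of the source region
    uri: str           # destination URI
    link_type: str     # enhancement: type of the link
    dest_name: str     # enhancement: name of the destination node/page
    dest_region: Rect  # enhancement: destination region, for highlighting


def parse_rect(text: str) -> Rect:
    x1, y1, x2, y2 = (int(v) for v in text.split(","))
    return (x1, y1, x2, y2)


def parse_map(text: str) -> List[LinkEntry]:
    """Parse the assumed format: six tab-separated fields per line."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        shape, source, uri, link_type, dest_name, dest_region = line.split("\t")
        entries.append(LinkEntry(shape, parse_rect(source), uri, link_type,
                                 dest_name, parse_rect(dest_region)))
    return entries


def find_link(entries: List[LinkEntry], x: int, y: int) -> Optional[LinkEntry]:
    """Return the first rectangular link region containing the click."""
    for entry in entries:
        x1, y1, x2, y2 = entry.source
        if entry.shape == "rect" and x1 <= x <= x2 and y1 <= y <= y2:
            return entry
    return None


SAMPLE = ("rect\t100,200,300,240\t/cgi-bin/node?page=12\treference\t"
          "Document preparation, p. 12\t50,400,600,460\n")
entries = parse_map(SAMPLE)
hit = find_link(entries, 150, 220)
```

With the extra fields available, a script can show the link type and destination page name before the user follows the link, and can pass `dest_region` on so that the target area is highlighted on the destination page.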

Figure 2: Possible gateway to HYPERFACS system database



