Towards a World-Wide Data Base

by

Erik Sandewall

Department of Computer and Information Science

Linköping University

58183 Linköping, Sweden

Consistent use of WWW technology

HTML pages may access conventional databases.
But why should the database technology be taken as given?

Think of "WWW technology" as the use of large structures, built from small, interlinked text files.

(HTML pages use one particular syntax in those text files; they may co-exist with other syntaxes).

Our proposal: organize a database using WWW technology.

That means:

Example of database object

France:countries =

   { CAPITAL   ~  Paris,

     LANGUAGE  ~  french,

     CURRENCY  ~  FRF,

     NEIGHBORS ~  { Belgium, Germany, Switzerland,
                    Italy, Spain, Andorra },

     ANTHEM-FIRSTLINE ~ "Allons enfants de la Patrie",

     ANTHEM    ~  [http://vir.liu.se/lit/anthems/France.html] }
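
For readers who want to hold such an object in a program, the following is a minimal sketch in Python (the talk's own implementation is in Lisp). The class names Ref and Url and the dictionary layout are assumptions made for this illustration; they are not part of the proposal.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Ref:
    """Hypothetical wrapper: a symbol naming another object, optionally with its type."""
    name: str
    type: Optional[str] = None


@dataclass(frozen=True)
class Url:
    """Hypothetical wrapper: a bracketed URL value such as [http://...]."""
    href: str


# The France:countries object from the slide, property by property.
france = {
    "name": "France",
    "type": "countries",
    "properties": {
        "CAPITAL": Ref("Paris"),
        "LANGUAGE": Ref("french"),
        "CURRENCY": Ref("FRF"),
        # A set-valued property: order carries no meaning.
        "NEIGHBORS": frozenset(
            Ref(n) for n in ("Belgium", "Germany", "Switzerland",
                             "Italy", "Spain", "Andorra")
        ),
        # A string-valued property.
        "ANTHEM-FIRSTLINE": "Allons enfants de la Patrie",
        # A URL-valued property: the text itself lives in another file.
        "ANTHEM": Url("http://vir.liu.se/lit/anthems/France.html"),
    },
}
```

The point of the distinction between Ref, plain strings and Url is that a symbol links to another database object, while a URL links to an arbitrary document.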

Contents of the present talk

First application: electronic colloquium

Functions of an ordinary scientific colloquium:

An electronic colloquium does this using a structure of WWW pages.

Electronic colloquium = display of dynamic data structure

Types of objects:

String-valued properties:

Text-valued properties:

Fundamental observation 1:

In a well-designed colloquium, consisting of a browsable set of HTML pages, the same data appears many times.

Examples in Electronic Colloquium on Spatial and Temporal Reasoning (ECSTER)

This requires an underlying database representation, from which the HTML pages can be generated. Generation can be done on demand (query mode) or on update (derivation mode). Our colloquium application uses generation on update.
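
The difference between the two generation modes can be sketched as follows. This is an illustration in Python, not the colloquium's Lisp code; PageStore, render_author and the page format are hypothetical names invented for the sketch.

```python
import html


class PageStore:
    """Toy store: database objects in, generated HTML pages out."""

    def __init__(self, render, derive_on_update=True):
        self.render = render                  # function: (name, objects) -> HTML
        self.derive_on_update = derive_on_update
        self.objects = {}                     # object name -> property dict
        self.pages = {}                       # object name -> generated HTML

    def update(self, name, props):
        self.objects[name] = props
        if self.derive_on_update:
            # Derivation mode: regenerate every page at update time, because
            # the changed data may be displayed on several pages.
            self.pages = {n: self.render(n, self.objects) for n in self.objects}

    def page(self, name):
        if self.derive_on_update:
            return self.pages[name]           # already generated, just serve it
        # Query mode: generate on demand from the current data.
        return self.render(name, self.objects)


def render_author(name, objects):
    """Hypothetical page generator for an author object."""
    props = objects[name]
    return "<h1>%s</h1>" % html.escape("%s %s" % (props["FIRSTNAME"], name))


store = PageStore(render_author)              # generation on update
store.update("Baral", {"FIRSTNAME": "Chitta"})
print(store.page("Baral"))
```

Regenerating every page on each update is the crudest possible strategy; a real generator would presumably track which pages depend on which objects.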

Examples of database objects for an article and an author

Baral96A : confpapers =
  {FIRSTAUTHOR  ~ Baral,
   OTHERAUTHORS ~ {Gabaldon, Provetti},
   TITLE ~ "Formalizing narratives using nested circumscription",
   CONFERENCE ~ CommonSense96,
   WEBPAGE ~ [http://www-formal.Stanford.EDU/tjc/96FCS/Final-Papers/alfredo@cs.utep.edu.ps],
   RESTOPICS ~ {filtering}}

Baral : authors =
  {FIRSTNAME ~ "Chitta",
   WEB ~ [http://cs.utep.edu/chitta/chitta.html],
   EMAIL ~ [chitta@cs.utep.edu],
   AFFIL-GROUP ~ ElPaso-KRG,
   PAPERS ~  {Baral95A:confpapers, Baral95B:confpapers,
              Baral96A:confpapers},
   GENERATORS ~ {contents ~ (ALLMYTENANTS indexpaper-full)},
   BRC ~ [/brs/A/Baral/descrip.html]}

Fundamental observation 2:

Generation of HTML pages in a well-designed colloquium requires attention to lots of small details.

Conclusion: a straightforward query language is not enough. One needs programs, in particular large numbers of small program modules (object orientation) and/or a rule-based system.

Our colloquium application uses Lisp as its programming language.
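
The idea of many small program modules can be illustrated with a registry of generator functions, one per object type and view. The sketch below is in Python rather than Lisp, and the registry, the decorator and the view names "heading" and "citation" are assumptions made for the illustration; the actual modules of the colloquium application are not shown in this talk.

```python
import html

GENERATORS = {}   # (object type, view name) -> function producing an HTML fragment


def generator(obj_type, view):
    """Register one small module for one type/view combination."""
    def register(fn):
        GENERATORS[(obj_type, view)] = fn
        return fn
    return register


@generator("authors", "heading")
def author_heading(name, props):
    full = html.escape("%s %s" % (props["FIRSTNAME"], name))
    # Small detail: link the name only if the author has a web page.
    if "WEB" in props:
        return '<a href="%s">%s</a>' % (html.escape(props["WEB"], quote=True), full)
    return full


@generator("confpapers", "citation")
def paper_citation(name, props):
    # Small detail: first author first, the other authors in sorted order.
    authors = [props["FIRSTAUTHOR"]] + sorted(props.get("OTHERAUTHORS", ()))
    return "%s: <i>%s</i>" % (html.escape(", ".join(authors)),
                              html.escape(props["TITLE"]))


def generate(obj_type, view, name, props):
    """Dispatch to the module registered for this type and view."""
    return GENERATORS[(obj_type, view)](name, props)
```

Each module handles one detail of one kind of page, which is where a plain query language would fall short.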

Architecture of the implementation

    /-------------\    /-------------\    /--------------\    
    |             |    |             |    |              |
    | text files, |    | database    |    | colloquium   |
    | one per     |--->| browser/    |===>| as browsable |
    | object      |    | generator   |    | structure of |
    |             |<===| written in  |<***| HTML pages   |
    |             |    | CommonLisp  |    |              |
    |             |    |             |    |              |
    \-------------/    \-------------/    \--------------/
                              *          /        |
                              *   /-----/         |
                              *  /                |
                         maintainer           end user

Information structure design: important issues

1. Straightforward notation of discrete mathematics is used

2. Some syntactic richness for data elements (strings, URLs, etc.)

Example:
Baral96A : confpapers =
  {FIRSTAUTHOR  ~ Baral,
   OTHERAUTHORS ~ {Gabaldon, Provetti},
   TITLE ~ "Formalizing narratives using nested circumscription",
   CONFERENCE ~ CommonSense96,
   WEBPAGE ~ [http://www-formal.Stanford.EDU/tjc/96FCS/Final-Papers/alfredo@cs.utep.edu.ps],
   RESTOPICS ~ {filtering}}
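
Because the notation is so plain, a reader for it stays small. The following Python sketch parses the object above; the grammar is inferred from the examples in this talk only, so it is an assumption rather than a specification, and all function and class names are hypothetical.

```python
import re

# Token kinds: quoted string, bracketed URL, punctuation, bare symbol.
TOKEN = re.compile(r'''\s*(?:
      (?P<string>"[^"]*")
    | (?P<url>\[[^\]]*\])
    | (?P<punct>[{}(),~=:])
    | (?P<symbol>[A-Za-z0-9_.\-]+)
    )''', re.VERBOSE)


def tokenize(text):
    text, pos, tokens = text.rstrip(), 0, []
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise SyntaxError("unexpected character at offset %d" % pos)
        tokens.append((m.lastgroup, m.group(m.lastgroup)))
        pos = m.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens, self.i = tokens, 0

    def peek(self):
        return self.tokens[self.i] if self.i < len(self.tokens) else (None, None)

    def take(self, expected=None):
        kind, text = self.peek()
        if kind is None or (expected is not None and text != expected):
            raise SyntaxError("expected %r, got %r" % (expected, text))
        self.i += 1
        return kind, text

    def symbol(self):
        kind, text = self.take()
        if kind != "symbol":
            raise SyntaxError("expected a symbol, got %r" % text)
        return text

    def ref(self):
        # A reference is a name, optionally followed by ':' and a type.
        name = self.symbol()
        if self.peek()[1] == ":":
            self.take(":")
            return (name, self.symbol())
        return name

    def value(self):
        kind, text = self.peek()
        if kind == "string":
            self.take()
            return text[1:-1]
        if kind == "url":
            self.take()
            return ("URL", text[1:-1])
        if text == "{":
            return self.braces()
        if text == "(":
            self.take("(")
            items = []
            while self.peek()[1] != ")":
                items.append(self.value())
            self.take(")")
            return items
        return self.ref()

    def braces(self):
        # '{a ~ b, ...}' becomes a dict, '{a, b, ...}' a list of members.
        self.take("{")
        pairs, members = [], []
        while self.peek()[1] != "}":
            item = self.value()
            if self.peek()[1] == "~":
                self.take("~")
                pairs.append((item, self.value()))
            else:
                members.append(item)
            if self.peek()[1] == ",":
                self.take(",")
        self.take("}")
        if pairs and members:
            raise SyntaxError("mixed mapping and set inside braces")
        return dict(pairs) if pairs else members


def parse_object(text):
    p = Parser(tokenize(text))
    name = p.ref()
    p.take("=")
    props = p.value()
    if p.peek()[0] is not None:
        raise SyntaxError("trailing input after object")
    return name, props


EXAMPLE = '''
Baral96A : confpapers =
  {FIRSTAUTHOR  ~ Baral,
   OTHERAUTHORS ~ {Gabaldon, Provetti},
   TITLE ~ "Formalizing narratives using nested circumscription",
   CONFERENCE ~ CommonSense96,
   WEBPAGE ~ [http://www-formal.Stanford.EDU/tjc/96FCS/Final-Papers/alfredo@cs.utep.edu.ps],
   RESTOPICS ~ {filtering}}
'''
name, props = parse_object(EXAMPLE)
```

Sets are returned as lists here so that nested values need not be hashable; elisions such as "..." are not handled.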

How does one identify the URL corresponding to a particular symbol?

Information structure design: important issues

3. Objects are referenced by name + type

4. The URL location of an object is given with its type or in a concierge

The former alternative works if all objects of that type are stored together. Example:
authors : types =
  {ACCESSMODE ~ SUBDIR,
   ACCESSPATH ~ [file:/info/www/ext/brs/A],
   PROPERTIES  ~ 
      {LASTNAME ~ STRING,
       FIRSTNAME ~ STRING,
        ...
       WEB ~ STRING}, 
   TYPICAL-MEMBER ~ Pattern}
This specifies that the descriptive text file for the object Baral:authors is located at
   file:/info/www/ext/brs/A/Baral/descrip.horl
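
The lookup just described can be sketched in a few lines of Python. The rule "ACCESSPATH, then the object name, then descrip.horl" is inferred from this single example, and the function name object_location is hypothetical; other access modes are not described here and are therefore rejected by the sketch.

```python
# The part of the authors:types object that determines storage location.
AUTHORS_TYPE = {
    "ACCESSMODE": "SUBDIR",
    "ACCESSPATH": "file:/info/www/ext/brs/A",
}


def object_location(name, type_props, filename="descrip.horl"):
    """Return the URL of the descriptive text file for an object of this type."""
    if type_props.get("ACCESSMODE") != "SUBDIR":
        # Assumption: only the SUBDIR mode shown in the example is handled.
        raise ValueError("only ACCESSMODE SUBDIR is sketched here")
    # One subdirectory per object, below the access path of the type.
    return "%s/%s/%s" % (type_props["ACCESSPATH"].rstrip("/"), name, filename)


print(object_location("Baral", AUTHORS_TYPE))
```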

Information structure design: important issues

4. The URL location of an object is given with its type or in a concierge (cont'd)

The latter alternative is used if the objects of that type are not to be stored together. Example (for objects of type types):
system-core : concierges =
   {TENANTS ~
       {articles:types ~ 
            {ACCESSPATH ~ [file:/home/erisa/wpd/cb]},
        authors:types  ~ 
            {ACCESSPATH ~ [file:/info/www/ext/brs/cb]},
        cities:types   ~ 
            {ACCESSPATH ~ [http://www.isy.liu.se/wwdb/geogr]},
           ...
        webpages:types ~ 
            {ACCESSPATH ~ [file:/info/www/ext/brs/cb]} }}
Note the function of a concierge: it records, for each of its tenants, where that object is stored.
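
A concierge lookup can be sketched as follows, again in Python and only as an illustration. The table repeats the four tenants shown above (the elided ones are left out), and the function name access_path and the idea of searching several concierges in order are assumptions of the sketch.

```python
# Tenants of the system-core concierge, keyed by (name, type).
SYSTEM_CORE = {
    ("articles", "types"): {"ACCESSPATH": "file:/home/erisa/wpd/cb"},
    ("authors", "types"): {"ACCESSPATH": "file:/info/www/ext/brs/cb"},
    ("cities", "types"): {"ACCESSPATH": "http://www.isy.liu.se/wwdb/geogr"},
    ("webpages", "types"): {"ACCESSPATH": "file:/info/www/ext/brs/cb"},
}


def access_path(concierges, name, obj_type):
    """Search the concierges in order; return the tenant's ACCESSPATH or None."""
    for tenants in concierges:
        entry = tenants.get((name, obj_type))
        if entry is not None:
            return entry["ACCESSPATH"]
    return None


print(access_path([SYSTEM_CORE], "cities", "types"))
```

Since the access path may itself be an http URL, objects of one database can reside on different servers.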

Scenario for world-wide data base

In the World-Wide Web

In the World-Wide Data Base

"If I were to use this..."

"... should I use your implementation or write my own?"

Whichever. The present program is free, of course, but it is also small. It is easy to re-implement such a database browser/generator if you have a programming language that is good for data structures.

"... and does it run fast enough?"

No problems. The latest statistics are available separately.