Towards a World-Wide Data Base
by
Erik Sandewall
Department of Computer and Information Science
Linköping University
58183 Linköping, Sweden
Consistent use of WWW technology
HTML pages may access conventional databases.
But why should the database technology be taken as given?
Think of "WWW technology" as the use of large structures, built
from small, interlinked text files.
(HTML pages use one particular syntax in those text files; they may
co-exist with other syntaxes).
Our proposal: organize a database using WWW technology.
That means:
- use an object-oriented database;
- represent each object by its own text file;
- let database processor(s) load objects dynamically, like a
WWW browser.
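The three points above can be sketched as follows. This is only an illustration, not the talk's implementation (which is written in CommonLisp): the class name `ObjectStore`, the `<root>/<type>/<name>.txt` directory layout, and the `loads` counter are all assumptions made for the example. It shows a processor that reads an object's text file the first time the object is referenced and reuses it afterwards, much as a browser fetches a page when a link is followed.

```python
from pathlib import Path
import tempfile


class ObjectStore:
    """Loads one text file per object on first access and caches it."""

    def __init__(self, root):
        self.root = Path(root)
        self.cache = {}   # (name, type) -> raw text of the object's file
        self.loads = 0    # number of files actually read from disk

    def path_for(self, name, type_name):
        # Hypothetical layout (an assumption): <root>/<type>/<name>.txt
        return self.root / type_name / f"{name}.txt"

    def get(self, name, type_name):
        key = (name, type_name)
        if key not in self.cache:
            path = self.path_for(name, type_name)
            self.cache[key] = path.read_text(encoding="utf-8")
            self.loads += 1
        return self.cache[key]


def demo():
    """Create one object file, fetch it twice, and report the load count."""
    with tempfile.TemporaryDirectory() as tmp:
        type_dir = Path(tmp) / "countries"
        type_dir.mkdir()
        (type_dir / "France.txt").write_text("{ CAPITAL ~ Paris }",
                                             encoding="utf-8")
        store = ObjectStore(tmp)
        first = store.get("France", "countries")
        second = store.get("France", "countries")
        return first, second, store.loads
```

In this sketch the second `get` call is served from the cache, so only one file read takes place.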
Example of database object
France:countries =
{ CAPITAL ~ Paris,
LANGUAGE ~ french,
CURRENCY ~ FRF,
NEIGHBORS ~ { Belgium, Germany, Switzerland,
Italy, Spain, Andorra },
ANTHEM-FIRSTLINE ~ "Allons enfants de la Patrie",
ANTHEM ~ [http://vir.liu.se/lit/anthems/France.html] }
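To make the notation concrete, here is a minimal reader for objects written like the one above. It is a sketch under stated assumptions, not the actual parser: the grammar is inferred from the examples in this talk only, and the function names and the `(kind, text)` representation of atoms are invented for illustration. Braces are read as a record when their items use `~` and as a set otherwise; parenthesized forms are not handled.

```python
import re

# One token: "string", [url], a punctuation mark, or a bare symbol.
TOKEN = re.compile(
    r'\s*(?:"([^"]*)"|\[([^\]]*)\]|([{}~,=])|([^\s{}~,=\[\]"]+))')


def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character at offset {pos}")
        string, url, punct, symbol = m.groups()
        if string is not None:
            tokens.append(("string", string))
        elif url is not None:
            tokens.append(("url", url))
        elif punct is not None:
            tokens.append(("punct", punct))
        else:
            tokens.append(("symbol", symbol))
        pos = m.end()
    return tokens


def parse_value(tokens, i):
    """Parse one value starting at tokens[i]; return (value, next index)."""
    kind, text = tokens[i]
    if (kind, text) == ("punct", "{"):
        return parse_braces(tokens, i + 1)
    if kind == "punct":
        raise ValueError(f"unexpected {text!r}")
    return (kind, text), i + 1


def parse_braces(tokens, i):
    """Parse the inside of {...}: a dict if items use '~', else a list."""
    pairs, elements = {}, []
    while tokens[i] != ("punct", "}"):
        item, i = parse_value(tokens, i)
        if tokens[i] == ("punct", "~"):
            if not (isinstance(item, tuple) and item[0] == "symbol"):
                raise ValueError("property name must be a symbol")
            value, i = parse_value(tokens, i + 1)
            pairs[item[1]] = value
        else:
            elements.append(item)
        if tokens[i] == ("punct", ","):
            i += 1
    if pairs and elements:
        raise ValueError("cannot mix 'KEY ~ value' items and plain elements")
    return (pairs if pairs else elements), i + 1


def parse_object(text):
    """Parse 'Name:type = value' and return (name, type, value)."""
    tokens = tokenize(text)
    eq = tokens.index(("punct", "="))
    header = "".join(t for _, t in tokens[:eq])
    name, _, type_name = header.partition(":")
    body, end = parse_value(tokens, eq + 1)
    if end != len(tokens):
        raise ValueError("trailing input after object body")
    return name, type_name, body


FRANCE = """
France:countries =
{ CAPITAL ~ Paris,
  LANGUAGE ~ french,
  CURRENCY ~ FRF,
  NEIGHBORS ~ { Belgium, Germany, Switzerland,
                Italy, Spain, Andorra },
  ANTHEM-FIRSTLINE ~ "Allons enfants de la Patrie",
  ANTHEM ~ [http://vir.liu.se/lit/anthems/France.html] }
"""

name, type_name, body = parse_object(FRANCE)
```

After this runs, `body` is a dictionary keyed by property name; symbols, strings, and URLs stay distinguishable through their `kind` tag.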
Contents of the present talk
- The first application: electronic colloquium
- Architecture of the implementation
- Important issues in information-structure design
- Scenario for a World-Wide Data Base
- Can I use it?
First application: electronic colloquium
Functions of an ordinary scientific colloquium:
- Focuses on a narrowly defined topic of research
- Presentation of recent results from inside and outside the group of
participants
- Information about recent and forthcoming scientific events
- Share the work of keeping up-to-date about results
- Awareness of "who is who"
An electronic colloquium does this using a structure of WWW pages.
Electronic colloquium = display of dynamic data structure
Types of objects:
- publication (article, paper, book)
- conference or journal where published
- publisher of journal or proceedings
- author
- affiliation (research group, etc)
- research issue within selected topic (represented by a keyword)
String-valued properties:
- title of publication
- E-mail address of author
- URL of author, conference, etc.
Text-valued properties:
- full text of publication
- abstract
- open reviews
- commentary and debate
Fundamental observation 1:
In a well-designed colloquium, consisting of a browsable set of
HTML pages, the same data appears many times.
Examples in Electronic Colloquium on
Spatial and Temporal Reasoning (ECSTER)
This requires an underlying database representation, from which the
HTML pages can be generated. Generation can be done on demand
(query mode) or on update (derivation mode).
Our colloquium application uses generation on update.
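Generation on update (derivation mode) can be illustrated with a small sketch. It is not the colloquium's generator: the class `DerivedSite`, the page names, and the explicit dependency lists are assumptions made for the example. The idea shown is that each page declares which objects it reads, and updating an object regenerates exactly the pages that display its data, so the same fact can appear on many pages while being stored once.

```python
from html import escape


class DerivedSite:
    """Regenerates dependent pages whenever an object is updated."""

    def __init__(self):
        self.objects = {}    # object name -> dict of properties
        self.pages = {}      # page name -> generated HTML text
        self.renderers = {}  # page name -> (render function, object names read)

    def add_page(self, page, render, depends_on):
        self.renderers[page] = (render, set(depends_on))

    def update(self, name, properties):
        """Store the object, then regenerate every page that reads it."""
        self.objects[name] = properties
        regenerated = []
        for page, (render, deps) in self.renderers.items():
            # Skip pages whose other inputs have not been stored yet.
            if name in deps and deps.issubset(self.objects):
                self.pages[page] = render(self.objects)
                regenerated.append(page)
        return regenerated


def author_page(objs):
    first = escape(objs["Baral"]["FIRSTNAME"])
    return f"<h1>{first} Baral</h1>"


def paper_page(objs):
    first = escape(objs["Baral"]["FIRSTNAME"])
    title = escape(objs["Baral96A"]["TITLE"])
    return f"<h1>{title}</h1><p>by {first} Baral</p>"


site = DerivedSite()
site.add_page("authors/Baral.html", author_page, ["Baral"])
site.add_page("papers/Baral96A.html", paper_page, ["Baral", "Baral96A"])

first_run = site.update(
    "Baral96A",
    {"TITLE": "Formalizing narratives using nested circumscription"})
second_run = site.update("Baral", {"FIRSTNAME": "Chitta"})
```

Here the author's first name is stored once but appears on both pages, so the second update regenerates both. In query mode the render functions would instead be called when a page is requested.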
Examples of database objects for an article and an author
Baral96A : confpapers =
{FIRSTAUTHOR ~ Baral,
OTHERAUTHORS ~ {Gabaldon, Provetti},
TITLE ~ "Formalizing narratives using nested circumscription",
CONFERENCE ~ CommonSense96,
WEBPAGE ~ [http://www-formal.Stanford.EDU/tjc/96FCS/Final-Papers/alfredo@cs.utep.edu.ps],
RESTOPICS ~ {filtering}}
Baral : authors =
{FIRSTNAME ~ "Chitta",
WEB ~ [http://cs.utep.edu/chitta/chitta.html],
EMAIL ~ [chitta@cs.utep.edu],
AFFIL-GROUP ~ ElPaso-KRG,
PAPERS ~ {Baral95A:confpapers, Baral95B:confpapers,
Baral96A:confpapers},
GENERATORS ~ {contents ~ (ALLMYTENANTS indexpaper-full)},
BRC ~ [/brs/A/Baral/descrip.html]}
Fundamental observation 2:
Generation of HTML pages in a well-designed colloquium requires
attention to lots of small details.
Conclusion: a straightforward query language is not enough.
One needs to use programs, in particular, large numbers of small
program modules (object-orientation) and/or a rule-based system.
Our colloquium application uses Lisp as its programming language.
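The "many small program modules" idea can be sketched as a table of per-type generator functions. The example is in Python rather than Lisp and is purely illustrative: the `generator` decorator, the `GENERATORS` table, and the HTML produced are assumptions, not the application's actual code. Each object type gets its own small function that takes care of that type's presentation details.

```python
from html import escape

GENERATORS = {}  # type name -> function(name, props) returning HTML


def generator(type_name):
    """Register a small rendering function for one object type."""
    def register(func):
        GENERATORS[type_name] = func
        return func
    return register


@generator("authors")
def render_author(name, props):
    full_name = escape(f"{props['FIRSTNAME']} {name}")
    email = escape(props["EMAIL"])
    return f'<li><a href="mailto:{email}">{full_name}</a></li>'


@generator("confpapers")
def render_paper(name, props):
    authors = [props["FIRSTAUTHOR"], *props.get("OTHERAUTHORS", [])]
    author_text = escape(", ".join(authors))
    title = escape(props["TITLE"])
    return f"<li>{author_text}: <i>{title}</i></li>"


def render(name, type_name, props):
    """Dispatch on the object's type; fall back to a plain list item."""
    func = GENERATORS.get(type_name)
    if func is None:
        return f"<li>{escape(name)}</li>"
    return func(name, props)


paper_html = render("Baral96A", "confpapers", {
    "FIRSTAUTHOR": "Baral",
    "OTHERAUTHORS": ["Gabaldon", "Provetti"],
    "TITLE": "Formalizing narratives using nested circumscription",
})
```

Adding a new detail for one type then means editing one small function, which is the property the conclusion above asks for.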
Architecture of the implementation
/-------------\    /-------------\    /--------------\
|             |    |             |    |              |
| text files, |    | database    |    | colloquium   |
| one per     |--->| browser/    |===>| as browsable |
| object      |    | generator   |    | structure of |
|             |<===| written in  |<***| HTML pages   |
|             |    | CommonLisp  |    |              |
|             |    |             |    |              |
\-------------/    \-------------/    \--------------/
       *                /                     |
       *         /-----/                      |
       *        /                             |
    maintainer                            end user
Information structure design: important issues
1. Straightforward notation of discrete mathematics is used
2. Some syntactic richness for data elements (strings, URLs, etc)
Example:
Baral96A : confpapers =
{FIRSTAUTHOR ~ Baral,
OTHERAUTHORS ~ {Gabaldon, Provetti},
TITLE ~ "Formalizing narratives using nested circumscription",
CONFERENCE ~ CommonSense96,
WEBPAGE ~ [http://www-formal.Stanford.EDU/tjc/96FCS/Final-Papers/alfredo@cs.utep.edu.ps],
RESTOPICS ~ {filtering}}
How does one identify the URL corresponding to a particular symbol?
Information structure design: important issues
3. Objects are referenced by name + type
4. The URL location of an object is given by its type or by a concierge
The former alternative works if all objects of that type are stored
together. Example:
authors : types =
{ACCESSMODE ~ SUBDIR,
ACCESSPATH ~ [file:/info/www/ext/brs/A],
PROPERTIES ~
{LASTNAME ~ STRING,
FIRSTNAME ~ STRING,
...
WEB ~ STRING},
TYPICAL-MEMBER ~ Pattern}
This specifies that the descriptive text file for
the object Baral:authors is located at
file:/info/www/ext/brs/A/Baral/descrip.horl
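The lookup just described can be written as a short function. This is a sketch, not the system's resolver: the function name `locate` is invented, only the SUBDIR access mode shown above is handled, and treating `descrip.horl` as the fixed file name for every object in that mode is an assumption drawn from this single example.

```python
def locate(name, type_def, filename="descrip.horl"):
    """Return the URL of an object's text file from its type definition.

    Assumption: in SUBDIR mode each object has its own subdirectory
    under ACCESSPATH, holding a file named `filename`.
    """
    mode = type_def["ACCESSMODE"]
    if mode != "SUBDIR":
        raise NotImplementedError(f"access mode {mode!r} is not covered here")
    base = type_def["ACCESSPATH"].rstrip("/")
    return f"{base}/{name}/{filename}"


# The relevant part of the 'authors : types' object shown above.
AUTHORS_TYPE = {
    "ACCESSMODE": "SUBDIR",
    "ACCESSPATH": "file:/info/www/ext/brs/A",
}

baral_url = locate("Baral", AUTHORS_TYPE)
```

With the type definition from the example, `baral_url` is the path given above for Baral:authors.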
Information structure design: important issues
4. The URL location of an object is given by its type or by a concierge (cont'd)
The latter alternative is used if the objects of that type are not to
be stored together. Example (for objects of type types):
system-core : concierges =
{TENANTS ~
{articles:types ~
{ACCESSPATH ~ [file:/home/erisa/wpd/cb]},
authors:types ~
{ACCESSPATH ~ [file:/info/www/ext/brs/cb]},
cities:types ~
{ACCESSPATH ~ [http://www.isy.liu.se/wwdb/geogr]},
...
webpages:types ~
{ACCESSPATH ~ [file:/info/www/ext/brs/cb]} }}
Note the functionality of a concierge!
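A concierge lookup might be sketched like this. The function `access_path` and the dictionary representation are assumptions for illustration; the tenants listed are only those visible in the example, with the elided entries left out. How a file name is then formed under the returned path is not stated in the talk, so the sketch stops at the access path.

```python
# The tenants that are visible in the 'system-core : concierges' example.
SYSTEM_CORE = {
    "TENANTS": {
        "articles:types": {"ACCESSPATH": "file:/home/erisa/wpd/cb"},
        "authors:types": {"ACCESSPATH": "file:/info/www/ext/brs/cb"},
        "cities:types": {"ACCESSPATH": "http://www.isy.liu.se/wwdb/geogr"},
        "webpages:types": {"ACCESSPATH": "file:/info/www/ext/brs/cb"},
    }
}


def access_path(name, type_name, concierges):
    """Ask each concierge in turn where the object name:type is kept."""
    key = f"{name}:{type_name}"
    for concierge in concierges:
        tenant = concierge["TENANTS"].get(key)
        if tenant is not None:
            return tenant["ACCESSPATH"]
    raise KeyError(f"no concierge knows {key}")


cities_path = access_path("cities", "types", [SYSTEM_CORE])
```

The concierge acts as a directory service: objects of one type can live in different places, local or remote, and the processor only needs to know which concierges to ask.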
Scenario for a World-Wide Data Base
In the World-Wide Web
- Contributions are dispersed world-wide
- Contributions may be added locally and used from anywhere
- Contributions contribute because of the use of HTML (~ standardized)
- Contributions are easy to display, but...
- ... extraction of the contents of contributions is difficult
In the World-Wide Data Base
- Contributions are dispersed world-wide
- Contributions may be added locally and used from anywhere
- Contributions contribute because of the use of HORL (> standardized)
- It is straightforward to extract and process information contents
from contributions
- Contributions are displayed by generating HTML pages, which may
be simple or very sophisticated.
"If I were to use this..."
"... should I use your implementation or write my own?"
Whichever. The present program is free, of course, but it is also
small. It is easy to re-implement such a database browser/generator,
if you have a programming language that is good for data structures.
"... and does it run fast enough?"
No problems. Click here for
latest statistics.