Phoenicia: A Model for World-Wide Web Based Campus Information Systems.
Biological Sciences Division Academic Computing, The University of Chicago.
The popularity of the World-Wide Web has spawned a proliferation of web-based
campus information systems, providing enhanced ease of access and
publication of a variety of campus information resources. History has
shown, however, that more data does not always translate into better information. As the size and complexity of web-based campus information systems continue to
increase, a major challenge has become to provide a consistent presentation
interface and a set of utilities that enable information users and providers to maintain information and transform it into structured knowledge.
Phoenicia is a prototype web-based campus information system developed in the
context of the Phoenix Project at the University of Chicago. In its current incarnation, Phoenicia consists of a suite of CGI Perl scripts and supporting structured data-stores, developed around an extensive object-oriented model of campus information data. It provides users with personalized on-line access to a wide variety of information resources, ranging from administrative resources
(registrar data, promotional/expository information, event calendars, grant and funding opportunities, current research activities, faculty publications) to class materials (syllabi, bulletin boards, lecture notes, quizzes), to general Internet resources. In addition, it supports distributed information management by enabling users to interactively maintain both the information accessed
through Phoenicia's interface and the interface and structure of the Phoenicia environment itself.
In this paper we describe: the design considerations that have guided our development efforts thus far; the general features of the Phoenicia architecture; and our plans for further extensions to the system, in support of teaching, research and clinical care.
1. Introduction & Background
Not long ago, computer-based Campus Wide Information Systems (CWISs) were
novel entries to the University's Information Technology environment.
Therefore, focus was appropriately placed on the often arduous and complex processes involved in assembling and delivering information to users via some network-based communication channel. Due to the success of these efforts, CWISs have quickly grown to include a wide range of information resources including: event calendars (sports, music/film/theater, lectures/seminars/workshops, academic calendars); promotional/expository information (books, pamphlets, newsletters, directories, primarily describing campus functions and services); lists (job openings, available housing, class schedules); grant and funding opportunities; current research activities; faculty publications; and access to the on-line library catalog.
As described by Judy Hallman[1], the benefits provided by a CWIS are manifold:
a single "window" into a broad range of information resources;
24-hour access; easier maintenance and timeliness; distributed access capability to larger audiences; ability to present archival data; and cost savings in printing and continuing phone and personnel support.
Most recently, advancements in distributed hypermedia systems have enabled CWISs to become "standardized" to the point where users can now easily and routinely access both local and remote Campus information resources. These systems typically are built using a client-server architecture and employ graphical user interfaces to support rich text, graphics, and other media presentations.
1.1 Gopher
The Gopher system, developed in 1991 by the University of Minnesota Microcomputer, Workstation, Networks Center, has been the most popular of these systems. Over 100 Colleges and Universities maintain Gopher-based CWISs. The popularity of Gopher is due to three reasons: Gopher does not place high demands on the client or network (the information transferred and displayed is ASCII text); establishing a Gopher server is an easy undertaking; and Gopher provides access to other Internet information delivery methods, including File Transfer Protocol (FTP), Wide Area Information Servers (WAIS), and CSO phone book servers, amongst others.
A keyword-based utility for searching most gopher-server menu titles in the entire gopher "space" is available. Called Veronica (Very Easy Rodent-Oriented Net-wide Index to Computerized Archives), its availability has improved the process of managing access to the abundance of information available through distributed Gopher servers. Veronica is only a first step, however, in the types of tools a user needs to locate and use information in an easy to use, relevant, and timely manner.
1.2 The World Wide Web
Though Gopher and supporting indexing and searching tools such as Veronica have dramatically improved the functionality of CWISs, it has been the World Wide Web (WWW) that has had the greatest impact in redefining what CWISs can offer users. Popularized by the availability of clients such as Mosaic from the National Center for Supercomputing Applications (NCSA), the WWW has opened new vistas in the type of information presented and the manner in which that information is made available to users on campuses and around the globe.
The WWW marries the ability to easily support and deliver multiple data types (rich text, graphics, sound, and animation) with the use of hypermedia as its method of presentation. Like Gopher, the WWW browsers can access many existing
data systems via existing protocols (FTP, NNTP) or via the Web's native protocol, HTTP, and a gateway. Thus, the WWW has become a useful "meta-viewer" for the majority of
Internet-based information sources. With this powerful combination of capabilities, the WWW has been adopted by many Colleges and Universities, and is being aggressively pursued as a "next generation" platform for CWISs.
1.3 The Biological Sciences Division Office of Academic Computing & the Phoenix Project
The Biological Sciences Division Office of Academic Computing (BSDAC) at the University of Chicago is the primary resource for support and development of instructional and research computing applications in the biomedical sciences. We are driven by the goal of unifying traditionally isolated teaching, research and clinical computing resources. As this paper demonstrates, however, our mandate often takes us beyond the discipline-centric boundaries relating to "biological information," and instead impels us to address issues which are at the very heart of CWISs.
Two significant development projects have been initiated in support of the over-arching effort we call the Phoenix Project. The first is an effective X-Windows based What-You-See-Is-What-You-Get (WYSIWYG) HTML browser/editor that allows users to easily author and publish information on the WWW, and has been described elsewhere.[2]
The second project has been to develop a robust data model to integrate the wide variety of information contained in the world of Phoenix. This data model, and its subsequent presentation via the WWW, we call Phoenicia.
2. Phoenicia Architecture
Our perspective in developing Phoenicia has been to consider the Web as a presentation medium, rather than as a storage or data model; indeed, over 90% of the HTML served from our site is dynamically generated, rather than stored
as HTML flat files. The heart of Phoenicia resides, instead, in the data model we have developed to integrate the wide variety of available campus information resources. This information model, implemented with a suite of server-side scripts and a supporting database, is an object-oriented representation of campus information. It supports conventional academic entities such as departments and lectures, and allows users to interactively create and maintain information relating to them through a Web interface. It further enables users, however, to create their own entities either by `re-modeling' existing ones or by creating them de novo. This approach has enabled us to provide members of the university community with a highly flexible and dynamic information environment, through which they can easily access and maintain instructional, research, and administrative information, as well as refine the structure of the environment itself.
2.1 General Features
The basic architecture of Phoenicia is outlined in Fig. 1. It consists of a four-tiered structure comprising: an HTTP server; a suite of CGI-compliant scripts; a Sybase relational database; and a number of supporting data stores and information services. The HTTP server (NCSA httpd 2.14) performs its customary role of serving HTML and authenticating clients. It is supported by a suite of CGI Perl scripts, which act as the gateway between the HTTP server and the heart of Phoenicia -- the database server. They resolve client data requests, query and update the database and/or other data stores, and ultimately return HTML to the server.
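The request path through the gateway tier can be illustrated with a minimal sketch. It is written in Python rather than Perl, uses an in-memory SQLite database as a stand-in for the Sybase server, and assumes a hypothetical `objects` table; the function names, the schema, and the sample value are illustrative only and are not taken from the Phoenicia scripts.

```python
import html
import sqlite3
from urllib.parse import parse_qs


def make_demo_db() -> sqlite3.Connection:
    """Build an in-memory stand-in for the database tier (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE objects (id INTEGER PRIMARY KEY, name TEXT, value TEXT)")
    # Made-up sample row, for illustration only.
    db.execute("INSERT INTO objects VALUES (1, 'telephone-number', '555-0100')")
    db.commit()
    return db


def handle_request(db: sqlite3.Connection, query_string: str) -> str:
    """Resolve a client request, query the database, and return HTML."""
    params = parse_qs(query_string)
    try:
        object_id = int(params["id"][0])
    except (KeyError, ValueError):
        return "<P>Bad request</P>"
    row = db.execute(
        "SELECT name, value FROM objects WHERE id = ?", (object_id,)
    ).fetchone()
    if row is None:
        return "<P>No such object</P>"
    name, value = row
    # Escape stored data before embedding it in the generated HTML.
    return f"<B>{html.escape(name)}:</B> {html.escape(value)}"


if __name__ == "__main__":
    print(handle_request(make_demo_db(), "id=1"))
```

In a real CGI deployment the query string would come from the server's environment and the HTML would be written to standard output after a content-type header; both are omitted here to keep the sketch self-contained.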
Our database server, a Sybase relational database, serves a dual function as a
data store and as a complex data index. While much of the
data maintained through Phoenicia resides in disparate external data
stores, the structure of the Phoenicia environment is primarily stored in the
database itself. As a data store, it maintains all object definitions, as well as
small atomic data, such as names, telephone numbers and the like. In its role
as an index, it maintains both the relative position or coordinates of objects
within the Phoenicia environment, and the addresses of externally stored
data.
Database records corresponding to a faculty member's research
description, for example, contain data specifying the relationships between the description and other objects within the Phoenicia infrastructure, as well as direct and
indirect pointers to the text of the research description itself.
2.2 The Information Model
The object model we have adopted in Phoenicia is purely a logical model, and is
not the outgrowth of a particular conceptual schema of campus information. The constraints it imposes on information objects concern the nature of the logical relationships between them, rather than any semantic preconception about the structure of the modelled information. This approach was selected to provide users with as much latitude as possible in determining the behavior of the environment, and to enable them, thereby, to define for themselves its semantic structure. Our model is thus a low-level construction; it is based upon a distinction between base and composite objects, and an associated organization of these objects into classes. All of these constructs, including the class definitions themselves, are editable by users, subject to appropriate permissions.
2.2.1 Base objects
Base-objects are defined as atomic objects, that is, objects which do not
comprise other objects. They are of the following basic types: text; image; audio; video; URL; or SQL. They consist of an identifier, a name, and a single value or attribute. The identifier uniquely specifies the identity of the object within Phoenicia, and the name is a `human-friendly' descriptor. The value of the object in the database is either a text or numeric variable (in the case of internal storage), or a pointer to the external storage address where the value of the object is actually stored.
A telephone-number base-object, for instance, consists of an ID, a name, and an internally stored character string of value "xxx-xxx-xxxx". A photo-of-student object consists of an ID, a name, and the file path of the corresponding image file within the file system.
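A base-object as described above might be modelled along the following lines. This is a sketch under stated assumptions: the field names, the `resolve` method, and the use of a file path as the only kind of external pointer are hypothetical and are not drawn from the Phoenicia implementation.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class BaseObject:
    """Atomic object: an identifier, a name, and one value (hypothetical layout)."""
    object_id: int
    name: str
    kind: str                            # text, image, audio, video, URL, or SQL
    value: Optional[str] = None          # internally stored value, if any
    external_path: Optional[str] = None  # pointer to an externally stored value

    def is_external(self) -> bool:
        """True when the value lives outside the database."""
        return self.external_path is not None

    def resolve(self) -> bytes:
        """Return the value, following the pointer when it is stored externally."""
        if self.external_path is not None:
            return Path(self.external_path).read_bytes()
        if self.value is None:
            raise ValueError(f"object {self.object_id} has no value")
        return self.value.encode("utf-8")


if __name__ == "__main__":
    phone = BaseObject(1, "telephone-number", "text", value="xxx-xxx-xxxx")
    print(phone.resolve())
```

A photo-of-student object would be built the same way, with `external_path` set to the image file's location and `value` left empty.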
2.2.2 Composite objects
Composite-objects are, as their name suggests, aggregate objects composed of other base and/or composite-objects. They occur either as lists (collections of like objects belonging to the same class), or as assemblies (collections of dissimilar objects).
Thus, the address-book list-object might contain all address base-objects, whereas a person assembly-object might contain telephone-number and photo base-objects, and an address-book list-object.
Composite objects differ from base-objects in that they do not have values or attributes per se; rather attributes are assigned to them by association (specified by their class membership), through a cascade of pointers to base-objects. This feature endows the data model with exceptional flexibility, as objects are thereby enabled to share common attributes.
The robs-address-book and John-CV composite-objects can thereby share the John-telephone-number attribute. Both objects will thus be kept current whenever John's phone number is updated through either object.
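The sharing of attributes through pointers can be sketched as follows. The `ObjectStore` registry and its method names are assumptions made for illustration; the point is only that composites hold object IDs rather than copies, so a single update is visible through every composite that refers to the object.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass
class BaseObject:
    object_id: str
    value: str


@dataclass
class CompositeObject:
    """Holds only the IDs of its members, never copies of their values."""
    object_id: str
    member_ids: List[str] = field(default_factory=list)


class ObjectStore:
    """Hypothetical registry mapping object IDs to base or composite objects."""

    def __init__(self) -> None:
        self._objects: Dict[str, Union[BaseObject, CompositeObject]] = {}

    def add(self, obj: Union[BaseObject, CompositeObject]) -> None:
        self._objects[obj.object_id] = obj

    def update(self, object_id: str, value: str) -> None:
        obj = self._objects[object_id]
        if not isinstance(obj, BaseObject):
            raise TypeError("only base-objects carry a value")
        obj.value = value

    def values_of(self, object_id: str) -> List[str]:
        """Follow the cascade of pointers down to base-object values.

        No cycle detection is attempted in this sketch.
        """
        obj = self._objects[object_id]
        if isinstance(obj, BaseObject):
            return [obj.value]
        values: List[str] = []
        for member_id in obj.member_ids:
            values.extend(self.values_of(member_id))
        return values


if __name__ == "__main__":
    store = ObjectStore()
    store.add(BaseObject("John-telephone-number", "xxx-xxx-xxxx"))
    store.add(CompositeObject("robs-address-book", ["John-telephone-number"]))
    store.add(CompositeObject("John-CV", ["John-telephone-number"]))
    store.update("John-telephone-number", "yyy-yyy-yyyy")
    print(store.values_of("robs-address-book"), store.values_of("John-CV"))
```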
2.2.3 Class definitions
All objects, be they of base or composite type, are defined by their derivation from a particular object class. It is the object class definition associated with a particular object that confers on the object its general properties and behaviors. Class definitions endow the data model with two significant features.
First, they define an object's associated attributes and methods when the object is initially instantiated or created, and thereby enforce integrity between the objects in a given class at creation.
Second, because objects in the Phoenicia data structure only have attributes by
association, the model can support dynamic inheritance of class definitions by their member objects.
Thus, if a user modifies the class definition of the address-book class by substituting the home-phone-number object for the work-phone-number object, all member address-book objects will inherit that change.
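Dynamic inheritance of this kind follows naturally when member objects consult their class definition at read time instead of copying it at creation. The sketch below assumes hypothetical `ClassDefinition` and `MemberObject` types and a hypothetical second address book; it illustrates the idea and is not the Phoenicia schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ClassDefinition:
    """Names the attribute slots that every member object exposes."""
    name: str
    attribute_slots: List[str]

    def substitute(self, old_slot: str, new_slot: str) -> None:
        """Replace one slot with another, keeping its position."""
        index = self.attribute_slots.index(old_slot)
        self.attribute_slots[index] = new_slot


@dataclass
class MemberObject:
    """Keeps no slot list of its own; it defers to its class on every read."""
    object_id: str
    class_def: ClassDefinition
    bindings: Dict[str, str] = field(default_factory=dict)  # slot -> base-object ID

    def attributes(self) -> Dict[str, Optional[str]]:
        """Return the slots currently defined by the class, with any bound IDs."""
        return {slot: self.bindings.get(slot) for slot in self.class_def.attribute_slots}


if __name__ == "__main__":
    address_book = ClassDefinition("address-book", ["name", "work-phone-number"])
    rob = MemberObject("robs-address-book", address_book, {"name": "n1"})
    address_book.substitute("work-phone-number", "home-phone-number")
    print(rob.attributes())
```

Because `attributes` reads the class definition each time it is called, every existing member reflects the substitution without being rewritten.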
2.2.4 Display and Manipulation of Objects
Objects can be explicitly embedded into served Phoenicia documents for display or editing purposes, through the mark-up extension to HTML described below.
<OBJ src="object identifier list" SURROUND="display expression">
The object identifier list specifies a single object ID or a set of object IDs.
These are provided as a comma-delimited list of IDs, or as an equivalent SQL statement. The display expression consists of an HTML string with embedded place-holders ($1, $2,...), into which the retrieved object values are to be inserted, according to their order in the object identifier list.
Thus the following mark-up defining the display of an address-book object:
<OBJ src="name-ID, Address-ID, Phone-number-ID" SURROUND="<B>Name:</B>$1<BR> <B>Address:</B>$2<BR><B>Phone:</B>$3">
would correspond to the following display presentation of the entry:
Name: name
Address: address
Phone: phone
The mark-up can also be used by Phoenicia scripts as general presentation templates. In such cases, the object identifier list is composed only of class-object IDs; the IDs for the particular instance of the class for which the template is being invoked are passed to the template by the HTML-generating script.
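The expansion of the OBJ mark-up can be sketched as a simple substitution pass. The sketch assumes the attribute order and quoting shown in the example above, handles only the comma-delimited form of the object identifier list (not the SQL form or the template mode), and looks values up in a plain dictionary; the function name and these simplifications are ours.

```python
import html
import re
from typing import Dict

# Matches the tag form used in the text: src first, then SURROUND.
OBJ_TAG = re.compile(r'<OBJ\s+src="([^"]*)"\s+SURROUND="([^"]*)"\s*>', re.IGNORECASE)
PLACEHOLDER = re.compile(r"\$(\d+)")


def expand_obj_tags(document: str, values: Dict[str, str]) -> str:
    """Replace each OBJ tag with its display expression, place-holders filled in."""

    def expand(match: "re.Match") -> str:
        ids = [part.strip() for part in match.group(1).split(",")]
        template = match.group(2)

        def fill(placeholder: "re.Match") -> str:
            position = int(placeholder.group(1))
            if not 1 <= position <= len(ids):
                return placeholder.group(0)  # leave out-of-range place-holders as-is
            # $n refers to the n-th ID in the object identifier list.
            return html.escape(values.get(ids[position - 1], ""))

        return PLACEHOLDER.sub(fill, template)

    return OBJ_TAG.sub(expand, document)


if __name__ == "__main__":
    markup = ('<OBJ src="name-ID, Address-ID, Phone-number-ID" '
              'SURROUND="<B>Name:</B>$1<BR> <B>Address:</B>$2<BR><B>Phone:</B>$3">')
    lookup = {"name-ID": "name", "Address-ID": "address", "Phone-number-ID": "phone"}
    print(expand_obj_tags(markup, lookup))
```

Matching `$` followed by digits as a unit means that `$10` is read as the tenth place-holder rather than as `$1` followed by a zero.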
2.3 Information Management
One of the foremost concerns in designing Phoenicia was to remedy the common redundancies, and resulting inefficiencies, associated with the maintenance and manipulation of campus information at the University of Chicago. These
typically result from failures in either the technical or the administrative integration of data. Consequently, our approach was twofold: to develop tools that would support the technical integration of data across differing hardware and software platforms; and to develop both a model of campus information and a corresponding information management protocol through which the benefits of technical integration could be effectively applied.
2.3.1 Technical considerations
Current university information services are supported on a range of hardware as broad in scale as in technological sophistication (from mainframe to desktop computer). Of principal interest to our Web development effort are: our registrar's Student Information System (PIC operating system running on a mainframe); the Biological Sciences Division's BSD-MIS server (Sybase running under UNIX IV and V on Sun Sparcstations); and a variety of local information stores such as departmental information and faculty data-archives (running on Windows/DOS/Mac desktop machines).
The technical integration of these information sources in Phoenicia is relatively straightforward; wherever possible, live data feeds are used. This includes remote SQL transaction processing with the BSD-MIS Sybase server, as well as file-system sharing using NFS mounted volumes. In some cases, however, we must rely on deferred updates, as in the case of our daily data feed (NFS) from the registrar's mainframe.
2.3.2 Administrative Integration
The absence of technical solutions to the integration of campus information has compelled information consumers to become information duplicators and maintainers as well, in addition to functioning as information providers. Inasmuch as a user's consumption of information outweighs their own information output, any lack of integration leads to a significant drain on the user's resources. The attendant gradual degradation of data integrity resulting from such duplication is no less significant an injury to the business of information management. These patterns of use, defined though they may be by dated technological delivery systems, continue to present a significant impediment to the effective management of information.
Along with the provision for technical integration of campus information resources, we are also revising current information management practices to take advantage of Phoenicia's enhanced information environment. In particular, our provisions for distributed access to, and maintenance of, information resources at the object level make it possible to reconsider common preconceptions governing the creation and maintenance of information resources. The 'document' need no longer constitute the standard information denomination. Instead, Phoenicia supports the direct ownership of data objects, ranging in size and complexity from a simple text field to a sophisticated description of a university division encompassing thousands of distinct information objects. Users are considered to be both consumers and providers of these information objects, and their respective information management duties are assigned to them accordingly.
Fig. 3 illustrates this distinction between the conventional data management
model (upper panel), and that which we are implementing through Phoenicia (lower panel). Two administrators -- Persons A and B -- are each charged with the maintenance of a report -- reports A and B -- each containing common information elements. In the conventional scheme each administrator maintains copies of both their own and their counterpart's reports (maintenance of their own report requires knowledge of the value of their counterpart's data objects), thereby duplicating all shared data. In our scheme objects are embedded in each of the reports and their maintenance is assigned to their respective owners. Thus, any modification of a red object, belonging to Person A, is dynamically reflected in all presentations of that object, including report B.
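The owner-maintained scheme can be sketched in a few lines. The `OwnedObject` and `Report` types, the `update_object` function, and the rule that only the owner may write are assumptions made for illustration; the paper does not specify how permissions are enforced.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class OwnedObject:
    object_id: str
    owner: str
    value: str


class Report:
    """A report embeds object IDs rather than copies of the values."""

    def __init__(self, title: str, object_ids: List[str]) -> None:
        self.title = title
        self.object_ids = object_ids

    def render(self, objects: Dict[str, OwnedObject]) -> str:
        """Look up the current value of each embedded object at display time."""
        return "; ".join(objects[i].value for i in self.object_ids)


def update_object(objects: Dict[str, OwnedObject], object_id: str,
                  user: str, new_value: str) -> None:
    """Allow an update only from the object's owner (hypothetical policy)."""
    obj = objects[object_id]
    if obj.owner != user:
        raise PermissionError(f"{user} does not own {object_id}")
    obj.value = new_value


if __name__ == "__main__":
    objects = {
        "a1": OwnedObject("a1", "Person A", "old"),
        "b1": OwnedObject("b1", "Person B", "b-data"),
    }
    report_b = Report("Report B", ["b1", "a1"])
    update_object(objects, "a1", "Person A", "new")
    print(report_b.render(objects))
```

Person B never edits or copies Person A's object, yet report B shows its current value the next time it is rendered.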
3. Future Directions
Phoenicia has evolved out of our need for an object-based information model through which to integrate distributed authorship and maintenance of academic information at the University of Chicago. The World-Wide Web plays a pivotal role in this environment by providing a flexible cross-platform presentation and manipulation medium for information maintained on our server. Our development of an object model and supporting toolkit outside of the Web, however, results from the current absence of provisions for object support in HTML and HTTP.
It seems likely that provisions for such direct object support will be unavailable through the Web over the short term, and that we will need to pursue the approach we have adopted for the foreseeable future. Our development plans do include, however, the extension of our current scheme to support the distribution of Phoenicia object servers across campus. In this regard we shall be looking to extend our object-specifier 'HTML mark-up' to a generalized form, akin to the syntax of the URL markup in HTML. We are also considering extending the interoperability of Phoenicia at the object level by integrating support for the OpenDoc, Object Linking and Embedding (OLE), and Open Database Connectivity (ODBC) standards into Phoenicia.
[1] Hallman, J. "Campus-Wide Information Systems." University of North Carolina at Chapel Hill, May 19, 1992.
[2] Lavenant, M. G. and Kruper, J. A. "The Phoenix Project: Distributed Hypermedia Authoring" in Proceedings of the First International World-Wide Web Conference, Geneva, 1994.
John Kruper is Director of the Office of Biological Sciences Division
Academic Computing (BSDAC) at the University of Chicago. This group was founded two years ago to refashion instruction and training in the Biological Sciences by applying new technologies to the teaching and learning process. With a multidisciplinary team consisting of programmers, curriculum specialists, and media designers, BSDAC seeks to make it possible for physicians, scientists, and students to work and study in a fundamentally new manner by electronically linking the classroom, the research laboratory, the clinical exam room, and other remote locations including the home office.
Dr. Kruper is also a Lecturer in the Biological Sciences Collegiate Division, where he teaches classes in Genetics, Recombinant DNA Technology, Simulation & Modeling in the Biological Sciences, and Molecular BioComputing.
Dr. Kruper received undergraduate degrees in biochemistry and molecular and cell biology from the Pennsylvania State University, a Master's degree in Molecular Virology from the University of Chicago, and a Doctorate of Arts
degree in Biology Education from the University of Illinois at Chicago. He also did post-doctoral research with William Wimsatt in the Department of Philosophy at the University of Chicago before assuming his current role as Director of BSD Academic Computing.
Dr. Kruper's research interests include distributed database and hypermedia systems, the use of simulation and model building to support science education, and (with the DNA Learning Center of Cold Spring Harbor Laboratory) characterizing the diffusion of curriculum innovation.
Marc Lavenant has been lead developer on the Phoenix Project at the University of Chicago since its inception two years ago. Prior to joining the project he pursued research on the molecular biology of protein structure and cellular computation. His primary research interest is the design of self-assembling knowledge systems. He is currently
responsible for designing and coordinating the development of Phoenicia.
Email correspondence should be addressed to m-lavenant@uchicago.edu