JESSICA: an object-oriented hypermedia publishing processor

Robert A. Barta (a) and Markus W. Schranz (b)

(a) EUnet Austria,
Diefenbachg. 35, 1150 Vienna, Austria

rho@Austria.EU.net

(b) Distributed Systems Group, Technical University of Vienna,
Argentinierstr. 8–184/1, 1040 Vienna, Austria

schranz@infosys.tuwien.ac.at

Abstract
The lifecycle of Web applications covers the design, implementation, and maintenance of the services. First-generation Web development tools concentrated on the creation of single pages. Later Web engineering tools integrated the management of complete Web sites and the navigation model. But only a few attempt to cover all aspects of the lifecycle, especially the maintenance task, which is essential in a medium as dynamic as the Web. To increase the manageability of large Web services and make them more flexible, we introduce the JESSICA engineering system, which employs an object-oriented abstraction model for the hypermedia information. An object-oriented language describes components of the Web service that are easy to manage, reusable, highly dynamic and of polymorphic type, covering all elements of a complete Web site. The objects are accessible throughout the lifecycle for management and maintenance activities. A compiler maps the abstract service description to the file-based repository of a standard Web server. We demonstrate the feasibility of the engineering system by managing the Vienna International Festival Web site, a multilingual database Web application on culture and arts, containing 300+ static pages and several interactive services.

Keywords
Web service engineering; Object orientation

1. Introduction

With the widespread acceptance and installation of the World Wide Web in recent years, hypermedia services have become available to a large community of users. The growing information medium has been attracting providers from many areas who want to reach their customers and demonstrate innovative services. Such Web applications vary in size and nature of contents. Since the early steps of the WWW at the beginning of this decade, several generations of Web applications have emerged. At first, static services dealing with publication-oriented hypertext came up; later, database integration offered content dynamics. More recently, highly interactive services using server-side and client-side scripting as well as dynamically generated hyperlinks have come to dominate the WWW.

Managing and maintaining such Web services becomes non-trivial when the size of the system exceeds a certain limit. Keeping the data organization, the information mapping to Web pages, and the navigation design manageable while providing a consistent interface in terms of layout and usability are basic requirements for successful Web services. From a software engineering perspective the Web is a new application domain. The lifecycle of WWW services [6,12] is comparable to standard software development, and Web engineering tools have been developed in parallel to the Web services. A first generation of development tools concentrated on the creation of single pages for Web services. Examples of such systems are the HotMetal Editor [8] and Microsoft FrontPage [5]. More advanced editors, such as the NetObjects Fusion product [11], tried to integrate the development of a whole Web site. Those systems were restricted to passive hypertext presentation, supporting consistent layout across pages with only limited support for the navigation model. Other tools concentrate on the semantics of the service, introducing support for multiple languages [7].

Some more advanced systems support the development of dynamic applications [2,3]. They provide the integration of reusable interaction components and the maintenance of state in session-oriented applications. Tools from the area of hypermedia design address a wider scope of the application lifecycle. The RMCase tool [4] employs strategies defined in the RMM methodology [10] to implement a hypertext system. To make use of this approach, the Web engineer must define the service in terms of entities and relations.

Currently the main research area for Web service engineers is the maintenance of WWW applications. A basic problem is caused by the nature of WWW servers that structure the information in file-based Web resources. The best design approaches are useless unless there is support for the mapping of the design step to the final implementation of the service. Recent engineering tools support the management of hypertext information on an abstract object-oriented level. The WebComposition system [6] tries to reuse components, e.g. design artifacts, on multiple pages and manage the service maintenance through handling abstract objects. The approach of dealing with objects to support highly manageable Web services can also be found in the W3Objects [9] system and HTML++ [1,13].

In order to provide Web engineering support for a complete Web site, further efforts have to be made. To cover both the lifecycle of the Web application and the integration of static information pages, multimedia data and interactive scripts, a system needs to be able to manage abstract objects of polymorphic types. We propose the JESSICA system, which describes components of a hypermedia system in an object-oriented language and provides a compiler to map the abstract system definition to highly dynamic Web services. The abstract modeling of the service supports the separation of layout and content information and the reuse of artifacts that occur in several places, such as page designs for both static Web pages and dynamic script outputs. Web service maintenance is reduced to the manipulation of named objects, which are smaller than the final file-based Web service information and give direct access to the relevant content data. The expansion of the defined objects and their mapping to the Web server's repository are managed by the JESSICA compiler, the hypermedia publishing processor.

We will discuss the benefits of the JESSICA system using a popular case study. First we explain the object-oriented approach by describing the syntax of the JESSICA language in Section 2. Section 3 discusses the JESSICA system's implementation and its support for the lifecycle and flexibility of Web applications. The system has been used to implement the Vienna International Festival 1998 Web service (WFW). The strengths of JESSICA are illustrated by the approaches and concepts implemented in the festival's Web site in Section 4. The final section describes the status of our work, plans for future improvements and concluding remarks.
 
 

2. The language specification

JESSICA uses a special object-oriented language to manage its objects and describe a complete set of information. A perfect playground for employing the JESSICA language is the engineering of WWW sites, containing a set of multimedia documents and interactive scripts. The syntax is based on SGML since this provides future compatibility with editors supporting markup language documents. The goal of the language design is to keep the syntax small and simple but nevertheless comprehensive and descriptive. We avoided imperative constructs like flow control statements and sequence ordering. The only restrictions that can be made are implicitly stated by the compiler and some PRAGMA directives. The language consists of OBJECTS, TEMPLATES, PACKAGES and PRAGMAS and elements to describe the interrelationship between them.

2.1. JESSICA objects

Objects describe a document or parts thereof. An object consists of an object header, containing a set of named attributes and the optional object body. Each object can have some of the following attributes:
Object name
The object can be explicitly named or be assigned an automatically generated name by the compiler. Names can be used elsewhere to create a reference to the object.
Object destination
The destination attribute controls the target where the object is to be created. For simplicity let us only consider the file instance of the DST attribute. Using DST="file:index.html", the user can force the name of the document described by the object to be "index.html".
MIME type
Each object has a MIME type describing the type of the information that is described therein. The default type is "text/html".
 
Fig. 1. JESSICA object.

In Fig. 1 a simple example of a JESSICA Object is presented. Running the JESSICA Compiler on this example will generate the corresponding HTML page using the DST attribute to deduce the filename. An object can have several DST streams. Since there is no attribute DST given, the compiler will generate a unique name for this object.
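The naming rule for destinations can be illustrated with a minimal Python sketch. It is not taken from the JESSICA compiler: the function name `resolve_filename` and the pattern used for generated names are assumptions made for illustration, and only the file form of DST is handled.

```python
import itertools

# Hypothetical counter used to invent unique names for objects without DST.
_auto_ids = itertools.count(1)


def resolve_filename(dst=None):
    """Return the output filename for an object.

    dst is a destination string such as "file:index.html".
    If no destination is given, a unique name is generated instead.
    """
    if dst is None:
        # The naming pattern is an assumption, not JESSICA's actual scheme.
        return f"object-{next(_auto_ids)}.html"
    scheme, sep, path = dst.partition(":")
    if not sep or scheme != "file" or not path:
        raise ValueError(f"unsupported DST value: {dst!r}")
    return path
```

Calling `resolve_filename("file:index.html")` yields the explicit name, while two calls without an argument yield two different generated names.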

2.2. JESSICA templates

Templates in JESSICA are "incomplete" objects, i.e. objects with variable parts not yet bound to values. The variable parts (VAR) have to be named to be addressed later. Templates need to be instantiated somewhere else in the JESSICA document.
 
Fig. 2. JESSICA template.

Figure 2 shows a JESSICA Template with the variable title. The compiler keeps track of variable bindings in all objects and substitutes the defined values on object instantiation. A variable may even be assigned a default value, which is used unless the instance defines content for it.
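As a rough illustration of variable binding with defaults, the following Python sketch uses the standard `string.Template` class. The `$name` placeholder notation is Python's, not JESSICA's, and the function name `instantiate` is hypothetical.

```python
from string import Template


def instantiate(template_text, defaults, bindings):
    """Fill the variable parts of a template.

    Values supplied by the instance (bindings) take precedence over the
    defaults declared with the template. A variable that has neither a
    binding nor a default raises KeyError.
    """
    values = {**defaults, **bindings}
    return Template(template_text).substitute(values)


layout = "<html><head><title>$title</title></head><body>$body</body></html>"
page = instantiate(layout, {"title": "Untitled"}, {"body": "Hello, world"})
```

Here `page` carries the default title because the instance binds only `body`; passing a `title` binding would override the default.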

2.3. Template instantiation

Objects can be instances of templates and inherit parts of the information described there. The inheriting object uses the attribute SRC to name the template which it is instantiating. Figure 3 shows the instantiation of the previous template.
 
Fig. 3. JESSICA object instantiation.

The sourcing (SRC) creates a new instantiation of the template CuteLayout. Afterwards HelloWorld has the same structure as the template, with the variable parts defined through the local object. This part can be addressed as HelloWorld.title.

The local object inside the package in Fig. 3 is anonymous (without a name defined); it only feeds a value via DST to the title component of HelloWorld. The example also shows that nesting of objects is allowed in the JESSICA language. The package is necessary to prevent the inner object from being interpreted as content of the outer object. Actually, the value could have been provided somewhere else as well.

Since templates in JESSICA describe incomplete objects, nesting them allows one template to instantiate another. We call this nesting template inheritance: a template is instantiated, but new variable parts are added. In this case, not all variable parts may be bound to values, so the object compilation results in another template. Template inheritance provides a clean subtyping concept for related objects.
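The idea that a partially bound template is itself a template can be sketched in Python as follows. This is an illustration only: the `$name` notation and the helper names `bind` and `open_variables` are assumptions, and `string.Template.safe_substitute` stands in for the compiler's binding mechanism.

```python
import re
from string import Template

# Matches $name placeholders (Python notation, used here for illustration).
_VAR = re.compile(r"\$(\w+)")


def bind(template_text, bindings):
    """Bind some variables; unbound variables are left in place."""
    return Template(template_text).safe_substitute(bindings)


def open_variables(template_text):
    """Return the sorted names of variables that are still unbound."""
    return sorted(set(_VAR.findall(template_text)))


base = "<title>$title</title><body>$body</body>"
# Instantiating 'body' with content that adds a new variable part
# results in another template, not in a complete object.
derived = bind(base, {"body": "<h1>$headline</h1>"})
page = bind(derived, {"title": "Hello", "headline": "Hello World"})
```

After the first step `derived` still has the open variables `headline` and `title`; only `page` is complete and could be written to its destination.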

2.4. Object import

In the previous examples we only used templates in the SRC attribute. The SRCing of objects other than local JESSICA templates makes them addressable in the current scope. Figure 4 illustrates the import of AustraliaNews into the local object scheme.
 
Fig. 4. JESSICA object importing.

One can specify several SRC attributes. In general, the compiler will try to match all streams against each other using a MIME-sensitive pattern matching process. If the pattern matches only a substream, the match is repeated on the rest of the stream, thereby segmenting it. This can be exploited for iteration over subobjects.

The prime use will be a match of a complete object against one (or more) incomplete objects (templates). With this mechanism existing objects can be restructured and matched components can be reused somewhere else for digests. Pattern matching honors optional, multiple and arbitrary attributes in objects and follows a longest-match-first algorithm. There is also a notation for addressing bindings.
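The repeated matching that segments a stream can be approximated with a regular expression in Python. This is only an analogy: JESSICA's matcher is MIME-sensitive and works on templates, whereas the sketch below uses a hand-written, hypothetical pattern for a news page and lets named groups play the role of variable bindings.

```python
import re

# Hypothetical pattern for one news item; named groups act as variables.
ITEM = re.compile(
    r"<h2>(?P<headline>.*?)</h2>\s*<p>(?P<text>.*?)</p>",
    re.DOTALL,
)


def segment(stream, pattern=ITEM):
    """Match the pattern repeatedly over the stream.

    Each match covers one substream; the search then continues on the
    rest of the stream, so the result is one set of bindings per segment.
    """
    return [match.groupdict() for match in pattern.finditer(stream)]


news = "<h2>Opening</h2><p>The festival opens.</p>\n<h2>Tickets</h2><p>On sale now.</p>"
items = segment(news)
```

Each entry of `items` holds the bindings of one matched segment, which could then be fed into a digest template.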

The MIME type is implicitly imported together with the object (as is the case with HTTP). Alternatively, one can explicitly specify a MIME type, as in SRC="file(text/plain):somefile.txt". This MIME type and that of the environment control the interpretation of the stream.
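A small Python sketch shows how a source specification of the form quoted above could be split into its parts. The function `parse_src` and the regular expression are assumptions for illustration; the real grammar of SRC values may allow more than is covered here.

```python
import re

# scheme, optional "(mime/type)", a colon, then the location.
_SRC = re.compile(
    r"^(?P<scheme>[A-Za-z][A-Za-z0-9+.-]*)"
    r"(?:\((?P<mime>[^()]+)\))?"
    r":(?P<location>.+)$"
)


def parse_src(spec):
    """Split a source specification into (scheme, mime, location).

    mime is None when no explicit type is given, meaning the type has to
    be taken from the imported object itself.
    """
    match = _SRC.match(spec)
    if match is None:
        raise ValueError(f"cannot parse SRC value: {spec!r}")
    return match["scheme"], match["mime"], match["location"]
```

For the example from the text, `parse_src("file(text/plain):somefile.txt")` returns the scheme `file`, the explicit type `text/plain` and the location `somefile.txt`.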

2.5. Non-textual objects

JESSICA allows the management of objects of any MIME type. This is particularly important for the engineering of a complete Web site, including multimedia data, images and interactive scripts. Figure 5 shows examples of non-textual objects in JESSICA.
 
Fig. 5. Non-textual object.

For portability reasons the language allows only 7-bit characters (us-ascii). Binary objects can be encoded arbitrarily but the conversion process has to be specified. If the object's MIME type does not match the SRC MIME type, an implicit conversion is necessary. By default the target MIME type is the same as the object's MIME type.

2.6. JESSICA packages

In an earlier example, the concept of packages was used. JESSICA Packages introduce a new scope to avoid name space pollution. References to local objects and packages are made by prepending the package name.
 
Fig. 6. A JESSICA package.

Figure 6 shows a global package for all objects in the JESSICA document (the local scope). Packages can be global, local, or local to another package. They may be named or anonymous.
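Package scoping can be pictured as a tree of name spaces. The Python sketch below is an assumption-laden illustration: the class `Package`, the dotted notation for qualified names and the rule of searching enclosing scopes from the inside out are chosen for the example and are not confirmed details of the JESSICA compiler.

```python
class Package:
    """A scope that holds named objects and nested packages."""

    def __init__(self, name=None, parent=None):
        self.name = name          # None for an anonymous package
        self.parent = parent      # None for a global package
        self.members = {}
        if parent is not None and name is not None:
            parent.members[name] = self

    def define(self, name, obj):
        self.members[name] = obj

    def resolve(self, dotted):
        """Resolve a possibly qualified name such as 'WF98.index'."""
        head, *rest = dotted.split(".")
        scope = self
        # Assumed rule: look for the first component in enclosing scopes.
        while scope is not None and head not in scope.members:
            scope = scope.parent
        if scope is None:
            raise KeyError(dotted)
        target = scope.members[head]
        for part in rest:
            if not isinstance(target, Package) or part not in target.members:
                raise KeyError(dotted)
            target = target.members[part]
        return target


root = Package()
festival = Package("WF98", root)
festival.define("index", "festival start page")
news = Package("news", festival)
news.define("index", "news overview")
```

Inside `news` the plain name `index` refers to the local object, while `WF98.index` reaches the start page of the enclosing package, so equal names in different packages do not collide.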

Packages (containing their local objects) can be imported by specifying a SRC attribute. Package exporting is done by specifying a DST attribute. The generation of a Perl package is done by DST="(application/x-perl-package) file: WF98.pl", triggering the currently defined embed process for the MIME type conversion.

2.7. Object references

The JESSICA system keeps track of all objects and their names (explicitly given or assigned automatically by the system). JESSICA allows an object to address other objects (local or non-local) inside a logical name space, and dead links can be detected at compile time.
 
Fig. 7. A JESSICA reference.

When such a reference (see Fig. 7) has to be resolved within another object of type text/html, then an HTML anchor will be generated (<A HREF="index.html">here</A>). For other document types different behaviors can be defined using pragmas.
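The resolution of a reference inside a text/html object can be sketched in Python as follows. The function `resolve_ref` and the dictionary of known destinations are hypothetical; the sketch only mirrors the behavior described above, namely that a known name becomes an anchor and an unknown name is reported at compile time.

```python
from html import escape


def resolve_ref(name, label, destinations):
    """Turn a reference to a named object into an HTML anchor.

    destinations maps object names to the files they are written to.
    An unknown name is reported as a dead link instead of producing
    a broken anchor.
    """
    if name not in destinations:
        raise LookupError(f"dead link: no object named {name!r}")
    href = escape(destinations[name], quote=True)
    return f'<A HREF="{href}">{escape(label)}</A>'


anchor = resolve_ref("HelloWorld", "here", {"HelloWorld": "index.html"})
```

With the mapping above, `anchor` is exactly the string `<A HREF="index.html">here</A>` quoted in the text.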

2.8. JESSICA pragmas

Pragmas are used to specify the system's behavior in special environments. If special attributes (e.g. link colour) should be used for a link, this knowledge can be bound into a pragma. The mapping of the abstract objects into file-based storage (as used for Web services) may require different destination directories for passive HTML documents, multimedia data or active server-side scripts. This information can also be controlled via pragmas.

By default a pragma has global scope. Every pragma may also use a SCOPE attribute to define the object to which the pragma refers.
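How global and scoped pragmas might interact is sketched below in Python. The record type `Pragma`, the key `link-colour` and, in particular, the rule that a scoped pragma takes precedence over a global one are assumptions for illustration; the text only states that pragmas are global unless a SCOPE is given.

```python
from typing import NamedTuple, Optional


class Pragma(NamedTuple):
    key: str
    value: str
    scope: Optional[str] = None   # None means global scope


def effective_pragma(pragmas, key, object_name):
    """Return the value of a pragma as seen by one object.

    Assumed rule: a pragma scoped to the object wins over a global one.
    Returns None when no matching pragma exists.
    """
    fallback = None
    for pragma in pragmas:
        if pragma.key != key:
            continue
        if pragma.scope == object_name:
            return pragma.value
        if pragma.scope is None and fallback is None:
            fallback = pragma.value
    return fallback


pragmas = [
    Pragma("link-colour", "blue"),
    Pragma("link-colour", "red", scope="TicketForm"),
]
```

Under these assumptions the object `TicketForm` sees the value `red`, while every other object falls back to the global `blue`.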
 
 

3. The JESSICA system

The engineering of mid-size and large WWW applications requires some methodological approach to keep the process manageable. As proposed in [6] and [9] an object-oriented approach is feasible to manage a large set of information. Since WWW services contain replicated parts (corporate design layouts, sponsor logos on multiple pages), object reuse can help to dramatically reduce the size of the service description needed to design and create the application.

The object-oriented JESSICA language provides a suitable abstraction for engineering large WWW sites. The implementation step of the WWW application's lifecycle is performed by the JESSICA compiler. The language, together with the compiler, builds the JESSICA System.

3.1. Implementation

The JESSICA compiler is implemented in Java and is currently being tested as a prototype at the Technical University of Vienna. It is based on the HTML++ system [1], developed by the authors. Major updates to the language specification and the compiler resulted in this Java-based system for engineering hypermedia services.

JESSICA uses compiler generators (JLex, JavaCup) to scan and parse the documents written in the JESSICA language. The compiler interprets packages and objects and tries to expand incomplete elements in order to create the corresponding output. The prime use of the JESSICA system is to manage large WWW sites. In such cases, objects describe documents and scripts on Web servers and their interrelations. Global packages are used to group objects to complete Web sites and ease their management.

References in objects may be expanded to HTML links. This feature provides a simple management of the hyperlink navigation model of a Web site, since relations within the Web site can be implemented using the abstract naming scheme of the JESSICA objects.

The Java prototype of the JESSICA compiler has three main tasks: reading and parsing the source document encoded in the JESSICA language, retrieving and expanding the SRC attributes within the recognized objects, and deparsing the generated contents to the specified destinations. Thus, the architecture of the Java system can be represented by the model shown in Fig. 8.
 

Fig. 8. The architecture of the JESSICA compiler.

The compiler follows a fixed-point strategy, expanding incomplete objects until the list of incomplete objects is empty or none of the remaining objects can be resolved any further. Since incomplete objects can contain unretrieved objects in the SRC attribute, the expansion step may take several iterations until the DeParser can map the objects and packages to their file-based representation in the Web server's repository.
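The fixed-point strategy can be sketched in a few lines of Python. The sketch abstracts each object to the set of names it sources and is written for illustration only; the function `expand_all` and the example object names are hypothetical, and the actual prototype is a Java program.

```python
def expand_all(sources):
    """Expand objects until nothing changes any more.

    sources maps each object name to the set of names it sources.
    An object becomes complete once everything it sources is complete.
    Returns the completion order and the names that stay unresolved.
    """
    complete = []
    done = set()
    pending = dict(sources)
    progress = True
    while pending and progress:
        progress = False
        for name in list(pending):
            if pending[name] <= done:
                done.add(name)
                complete.append(name)
                del pending[name]
                progress = True
    return complete, sorted(pending)


objects = {
    "page": {"layout", "news"},
    "layout": set(),
    "news": {"feed"},
    "feed": set(),
    "broken": {"missing"},
}
order, unresolved = expand_all(objects)
```

In this example three passes complete `layout`, `feed`, `news` and `page`; a fourth pass makes no progress, so the loop stops and `broken` is reported as unresolved because it sources an object that does not exist.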

3.2. Lifecycle of WWW services

The strength of JESSICA is its support for all steps of the lifecycle of WWW applications. Like its predecessor HTML++, the JESSICA system follows the RMM methodology [10], where the specific steps for hypermedia information systems are discussed. A closer look at the lifecycle implementation in HTML++ is presented in [12].

The data organization follows the entity-relationship approach, where all the information is structured in entities and their relations. JESSICA supports this approach in using abstract objects to define and describe the data for the hypermedia system. RMM's slice design to break up large entities into suitable objects for current hypermedia platforms is reflected in providing templates for WWW pages and script outputs. The navigation model is designed using the JESSICA object name space that is transferred to actual hyperlinks in the implementation step.

The actual implementation, i.e. the mapping of the abstract information description based on objects into file-based structures required by standard WWW servers is provided by the JESSICA compiler. Some quality assurance features allow the online checking of the navigation model (undefined links, unreachable documents) and the version management of the hypermedia application.
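The two navigation checks mentioned above, undefined links and unreachable documents, amount to a simple graph traversal. The following Python sketch illustrates the idea on a hypothetical link table; the function `check_navigation` and the document names are assumptions, not part of the JESSICA system.

```python
from collections import deque


def check_navigation(links, root):
    """Check a navigation model given as name -> list of referenced names.

    Returns (undefined, unreachable): links that point to unknown
    documents, and documents that cannot be reached from the root.
    """
    undefined = sorted(
        (source, target)
        for source, targets in links.items()
        for target in targets
        if target not in links
    )
    seen = {root}
    queue = deque([root])
    while queue:
        document = queue.popleft()
        for target in links.get(document, ()):
            if target in links and target not in seen:
                seen.add(target)
                queue.append(target)
    unreachable = sorted(set(links) - seen)
    return undefined, unreachable


site = {
    "index": ["program", "news"],
    "program": ["index", "tickets"],
    "news": [],
    "orphan": ["index"],
}
undefined, unreachable = check_navigation(site, "index")
```

For this table the check reports the link from `program` to the undefined document `tickets` and flags `orphan` as unreachable from `index`.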

A very important point in engineering hypermedia systems is the maintenance of the services and applications. This major issue is often neglected in WWW service management and is very time- and cost-intensive. The object-oriented approach adds a level of abstraction to the data management, thus allowing the service manager to work on named objects and identify data more easily than in the final hypermedia system. The integration of external sources like database information, system call outputs and remote documents supports dynamic services on the WWW. Using the import feature properly combined with database-controlled content management can add high flexibility to the hypermedia application [13]. A case study based on these properties is presented in Section 4.

3.3. Flexibility and polymorphism in JESSICA-based systems

The management of abstract objects and the integration of external information through sourcing external objects make JESSICA a powerful engineering system for highly flexible hypermedia services. Dynamic documents in objects' SRC attributes may change the complete Web site in a single run of the compiler. Handling the object contents through comfortably editable sources (e.g. databases) saves time and money.

An additional strength of the JESSICA system is the integration of static hypermedia documents with dynamic (even interactive) server-side scripts. Given this combination, a Webmaster is able to manage his/her complete site by working only on the abstract description level of the JESSICA language. Basically this is possible through the handling of multiple MIME types and the type conversion capability of the JESSICA compiler. This allows the handling of both static information objects and dynamic scripting applications as well as hypermedia contents in JESSICA objects. The compiler transfers the polymorphic objects into the corresponding presentations on the Web server, occasionally using some pragma directives. JESSICA objects may even have multiple destinations defined, which forces the compiler to create multiple (polymorphic) copies of objects. These can be different representations of the same content, based on the specified MIME types.
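The creation of several representations from one object can be sketched in Python with a small table of type conversions. Everything in the sketch is hypothetical: the converter table, the crude tag-stripping conversion to text/plain and the function `emit` merely illustrate how one content item with several destinations leads to several typed copies.

```python
import re

# Hypothetical conversions, keyed by (source type, target type).
CONVERTERS = {
    ("text/html", "text/html"): lambda text: text,
    # Crude conversion for illustration: drop everything that looks like a tag.
    ("text/html", "text/plain"): lambda text: re.sub(r"<[^>]+>", "", text),
}


def emit(content, mime_type, destinations):
    """Create one copy of the content per destination.

    destinations is a list of (path, target MIME type) pairs. A missing
    converter raises KeyError, i.e. the conversion must be specified.
    """
    return {
        path: CONVERTERS[(mime_type, target)](content)
        for path, target in destinations
    }


copies = emit(
    "<h1>Programme</h1>",
    "text/html",
    [("programme.html", "text/html"), ("programme.txt", "text/plain")],
)
```

The result maps each destination to its own representation of the same content, here the unchanged HTML and a plain-text variant.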
 
 

4. Case study: the Vienna International Festival

The Wiener Festwochen (Vienna International Festival) Web presence (WFW) [14] has been managed by the Distributed Systems Group of the Technical University of Vienna since 1995. The WFW presents the top cultural event in Vienna, Austria on the Internet. Major parts of the application with more than 300 static pages and various interactive scripts refer to the online schedule of the spring event, mainly theatre and music theatre programs, concerts, lectures and exhibitions over more than 10 weeks. Events are categorized into different groups of performances and each performance stores information on actors, biographies, venues, descriptions, multimedia data and references to related events. The ticket reservation facility introduces Internet commerce to the case study. Since 1997 we have been using a secure Web server to protect the online ticket ordering system from eavesdropping. Ticket order information is dynamic since availability may change on any single request. A news service provides up-to-date information on performances as well as the latest press releases. Such information and content data of each performance may change during the festival period and need to be updated online. Using the JESSICA system, the WFW application is flexible enough to handle such changes without restructuring and re-implementing the complete Web site.

In this section the special problems of dynamic services and the benefits gained from using a system like JESSICA are discussed. We will explain how the abstract data management using objects and the import of external structured information from databases promote the flexibility of the WFW application. It is not the purpose of this paper to list JESSICA code for the WFW. To stress the benefits of the JESSICA system we discuss some approaches and concepts implemented in the case study.

4.1. Separation of Web service content from layout design

The abstract object-oriented data management in JESSICA allows the separation of layout design and HTML specific information from pure content. This provides a clean and easy manipulation of data units. The information manager and the application engineer concentrate on the data content rather than dealing with layout details and Web document formats.

Layout information defined as JESSICA templates can be reused for multiple content objects to have a similar appearance on the browser. In the WFW each performance of the festival was based on the same design layout reusing one JESSICA template. The separation also reduced the source code to one third of the final WWW document's size.

The separation strategy makes the WFW flexible to general application changes like the introduction of browser frames or sponsor logos. The integration of the new information into layout objects of the database changes any affected WWW document on the next JESSICA compiler run or database script access. Extensions of the existing service are handled through insertion of the JESSICA objects and templates. A single recompilation step transfers the updated objects into the desired service.

4.2. Abstract data model for content management

The content information on the performances presented at the Vienna International Festival is managed in a relational database. The abstract database level provides easy and specific manipulation of particular data portions. Form-based content updates change only the significant portions of data. The current version of the information in the database is brought to the Web on a run of the JESSICA compiler.

The abstraction of the data allows the reuse of snippets for Web documents of even different types. In the case of the WFW layout design, data from the database is used both for creating static HTML documents and for the appearance of server-generated script documents that show dynamic data or results of interactive queries.

4.3. Multilingual support and security

As an international service, the WFW offers information on cultural events in more than one language. Users can select from within the service whether they prefer to see the content in German or in English. Once a language track has been selected, the application maintains the user's choice until he or she changes it.

The engineering system needs to support multilingual services and language-specific static and dynamic services. Both the language-independent information (such as layout design) and the language-specific sections are stored in the WFW database. The abstract JESSICA objects guarantee a proper navigation model maintaining a pre-selected language track. Object references in JESSICA provide a simple modeling scheme for hyperlinks and allow easy maintenance of language tracks. The service engineer simply uses the language-dependent name of the related object in the REF tag to create the proper navigation model. The object-based approach to information management makes it easy to integrate new languages and to manage language-specific data. Information in further languages can be added without reorganization of the Web application.

4.4. Dynamic data integration

The WFW includes data that are highly dynamic and changing. Navigating through the WFW, the user may access the interactive calendar, where s/he can select performances of specific categories on particular dates. The result of this query is a list of available performances that depends on data from the WFW database. Since this result is dynamically generated, the application may provide or refuse ticket reservation for particular performances according to temporal changes. On selection of the ticket reservation option, tickets of available categories are offered to the user.

The ticket reservation system is available on both a standard HTTP server and a secure Web server to protect user data from eavesdropping. The usage of secure transmission guarantees privacy for specific user data. The state of the current ticket availability is controlled by the information manager of the Vienna International Festival. S/he changes the set of tickets manually via the access-restricted content management form. The application may deactivate the ticket reservation functionality on a temporary basis. An update-based rerun of the JESSICA compiler rebuilds the WFW to guarantee its high flexibility to dynamic data.

4.5. Data size and manageability

From our experience with the WFW over the last three years and with some local services implemented in JESSICA and its predecessor HTML++, we calculated figures to compare the size of the final Web service to the amount of data stored in the abstract JESSICA objects. Due to the reuse of parts of the pages (corporate design, sponsor logos, navigation elements), the abstract source amounts to only 32% of the size of the final WFW application that is generated on a JESSICA compiler run. In addition, managing content through named objects and their relevant parts, without having to deal with HTML details, is easier than changing the final pages one by one with a standard HTML editor.
 
 

5. Conclusion and future work

WWW service engineering is a complex task for large WWW sites that is supposed to cover the complete lifecycle of a Web application. We introduced the JESSICA system, which provides support for the design, development and maintenance of Web services based on an abstract object-oriented view of the hypermedia information set. The object-oriented model defines artifacts of WWW data that manage objects of polymorphic type, including parts of Web pages, hypermedia data and interactive scripts. The system allows the definition of templates for Web pages and thus the reuse of components for replication on multiple Web pages, including passive information as well as dynamic script output. The separation of content and layout information offers a modularized description of the service. A special naming scheme provides easy management of related information and the navigation model. Through the integration of external data repositories, the JESSICA system adds high flexibility to Web applications. Updating content information through databases and changing content information on the objects eases the maintenance of the Web service.

Currently we employ the JESSICA system on several Web servers to prove its feasibility for mid-size and large Web applications. We have been engineering the popular Vienna International Festival Web site since 1995 and have used JESSICA to implement the most recent version. Benefits in the administration of Web sites as well as code reduction through reuse and the integrated navigation management support the use of JESSICA for engineering such Web applications. To promote acceptance of the system, which we plan to open to the public, we are currently implementing an editor in Java. The object-oriented approach will significantly benefit from an interface that provides OO design facilities, which we are currently investigating. Comprehensive debugging information and the editor will turn JESSICA into a portable engineering environment for large Web services.
 
 

References

[1] R. Barta, What the heck is HTML++, Technical Report TUV-1841-95-06, Technical University of Vienna, Distributed Systems Group, October 1995. 
[2] A. Crespo and E. A. Bier, WebWriter: a browser-based editor for constructing Web applications, in: 5th International World Wide Web Conference, Paris, France, May 1996. 
[3] The WebObjects HomePage, NeXT Software Inc., http://www.next.com/WebObjects/ 
[4] A. Diaz, T. Isakowitz, V. Maiorana, and G. Gilabert, RMC – A tool to design Web applications, in: The World Wide Web Journal, 4th International World Wide Web Conference, December 1995. 
[5] FrontPage Home Page, Microsoft Corp., http://www.microsoft.com/FrontPage/ 
[6] H.-W. Gellersen, R. Wicke, and M. Gaedke, WebComposition: an object-oriented support system for the Web engineering lifecycle, Computer Networks and ISDN Systems, 29(8–13):1429–38, April 1997.
[7] D. Hillbrecht, MuLaW – the Multi Language Web authoring System, http://www-c.informatik.uni-hannover.de/~dh/mulaw/, November 1997. 
[8] The HotMetal HomePage, SoftQuad Inc., http://www.sq.com/ 
[9] D.B. Ingham, S.J. Caughey, and M.C. Little, Supporting highly manageable Web services, Computer Networks and ISDN Systems, 29(8–13):1405–16, April 1997.
[10] T. Isakowitz, E.A. Stohr, and P. Balasubramanian, RMM: a methodology for structured hypermedia design, Communications of the ACM, 38(8), August 1995.
[11] NetObjects Fusion Home Page, http://www.NetObjects.com/ 
[12] M. Schranz, Lifecycle of WWW services: an experience report, in: 9th International Conference on Software Engineering and Knowledge Engineering, Madrid, Spain, June 1997. 
[13] M. Schranz, Engineering flexible World Wide Web services, in: Symposium on Applied Computing, Atlanta, Georgia, February 1998, to appear. 
[14] The Vienna International Festival/Wiener Festwochen Homepage, http://www.festwochen.or.at/ 
 

URL

HTML++, http://www.infosys.tuwien.ac.at/HTML++/
WebObjects, http://www.next.com/WebObjects/
FrontPage, http://www.microsoft.com/FrontPage/
MuLaW, http://www-c.informatik.uni-hannover.de/~dh/mulaw/
HotMetal, http://www.sq.com/
NetObjects Fusion, http://www.NetObjects.com/
Vienna International Festival / Wiener Festwochen, http://www.festwochen.or.at/
 

Vitae

Robert A. Barta is information manager at EUnet Austria. He received an MSc in Computer Science in 1991 and a PhD in Computer Science in 1995 from Vienna University of Technology. His current research interests include formal specification techniques and distributed information management.

Markus W. Schranz is currently preparing a PhD thesis on Web service engineering at the Distributed Systems Group within the Information Systems Institute, Vienna University of Technology (VUT). He obtained an MSc in Computer Science in 1994 and is currently affiliated as a research and teaching member with the Distributed Systems Group. His general research interest is in distributed information systems engineering and the World Wide Web.