Fifth International World Wide Web Conference
May 6-10, 1996, Paris, France

    HTTP_CM: (simple_http, initialise, bscw.gmd.de, 80)

    (bscw_http, handle_request, [HTTP_CM], {`URL': `^\/bscw\/?'})
    (bscw_http, get_icon, [HTTP_CM], {`URL': `^\/icons\/?'})

Figure 3. Basic interest-configuration for the HTTP IS to AM mapping

The interest configuration file has two parts separated by a blank line: a header, which specifies the Communication Modules to be initialised, and a body, which describes the mapping from requests to calls to Application Modules. Each entry in the header names a CM to register with the system, for which interest information may then be provided in the body. If any parameters are given, the IS will invoke the named module's initialisation method and pass it the specified initialisation parameters (represented as strings). In the example above the CM simple_http will be initialised with a function call of the form:

    simple_http.initialise(`bscw.gmd.de', `80')

The CM will then initialise--in this case starting up a server process to listen for HTTP connections at bscw.gmd.de:80. It is expected to handle any errors on initialisation itself--for example, if the IS is being signalled to re-parse its configuration file and the server is already running on port 80. (To re-start the CM itself it is necessary to kill the CM process and then either signal the IS to re-parse its configuration file or start the CM by hand from the command line.)
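The header handling described above can be sketched as follows. This is a minimal illustration in present-day Python, not the toolkit's own code: the function names parse_header_line and initialise_cm, the regular expression, and the stand-in simple_http object are all assumptions made for the example, and the entry syntax is inferred from Figure 3.

```python
import re
import types

# Assumed header-entry syntax, inferred from Figure 3:
#   NAME: (module, method, param1, param2, ...)
HEADER_LINE = re.compile(r"^(\w+):\s*\((.*)\)\s*$")


def parse_header_line(line):
    """Split one header entry into (cm_name, module, method, params)."""
    match = HEADER_LINE.match(line.strip())
    if match is None:
        raise ValueError("not a CM header entry: %r" % line)
    cm_name, fields = match.group(1), match.group(2)
    parts = [field.strip() for field in fields.split(",")]
    module = parts[0]
    method = parts[1] if len(parts) > 1 else None
    params = parts[2:]  # parameters stay as strings, as the text states
    return cm_name, module, method, params


def initialise_cm(entry, modules):
    """Call module.method(*params) when an initialisation method is named."""
    cm_name, module, method, params = entry
    if method is None:
        return None  # a bare entry only registers the CM name
    return getattr(modules[module], method)(*params)


# Stand-in for the real simple_http CM, which would start a server process.
simple_http = types.SimpleNamespace(
    initialise=lambda host, port: "listening on %s:%s" % (host, port)
)

entry = parse_header_line("HTTP_CM: (simple_http, initialise, bscw.gmd.de, 80)")
result = initialise_cm(entry, {"simple_http": simple_http})
```

Passing the parameters through unconverted keeps the sketch consistent with the call simple_http.initialise(`bscw.gmd.de', `80'), where the port arrives as a string and the CM is responsible for interpreting it.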

Details of the linkage between requests, formatted as request objects by a CM, and calls to specific Application Modules are specified in the body as mappings between AM handler functions and regular expressions. Each entry takes a similar format to the CM descriptions: it names the module and function to invoke when the IS matches a request object against the regular expression. The request object is usually the only parameter passed in this function call. With the configuration above, requests to the simple_http CM where the request URL takes the form /bscw/* (`*' is a wildcard; the pattern is anchored to the start of the URL) result in the following AM invocation:

    bscw_http.handle_request(aRequestObject)

Evaluating the configuration file from top to bottom, the IS attempts to match each request object against each regular expression, and whenever a match is obtained the associated AM handler function is invoked. It is therefore possible for more than one AM to be invoked for the same request, in which case the output from the first AM is passed as an optional second parameter to the subsequent AM. In practice we have found that this is rarely needed, as AMs can be defined which perform the coordination themselves. One particularly useful technique for which we have used this mechanism, however, is to re-parse the HTML output from AMs, transparently to the AMs themselves, by passing the output and original request to an HTML formatting module. This module uses details of the W3 client found in the request object, together with a table of client capabilities, to ensure the HTML returned can be interpreted correctly--for example, substituting preformatted text for an HTML 3.0 table. Further uses might include encoding of output, automatic logging and so on.
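The top-to-bottom matching and output chaining described above might look roughly like the sketch below. It assumes that a request object behaves like a dictionary, that the originating CM is recorded under a `CM' key, and that interests have already been parsed into (handler, CM names, patterns) tuples; none of these details is given in the paper. The client name and capability table are invented for illustration.

```python
import re

# Hypothetical capability table; the client name is invented for illustration.
CLIENT_CAPABILITIES = {"TextBrowser/1.0": {"tables": False}}


def handle_request(request, previous=None):
    """Stand-in for bscw_http.handle_request: returns a page with a table."""
    return "<table><tr><td>a_folder</td></tr></table>"


def format_html(request, previous=None):
    """Stand-in formatting AM: fall back to preformatted text if needed."""
    if previous is None:
        return None
    capabilities = CLIENT_CAPABILITIES.get(request.get("USER_AGENT"), {})
    if capabilities.get("tables", True):
        return previous
    # Crude tag stripping, enough to show the idea.
    return "<pre>" + re.sub(r"<[^>]+>", "", previous) + "</pre>"


def dispatch(request, interests):
    """Invoke, in file order, every handler whose patterns all match."""
    output, matched = None, False
    for handler, cm_names, patterns in interests:
        if request.get("CM") not in cm_names:
            continue
        if not all(re.search(rx, request.get(attr, ""))
                   for attr, rx in patterns.items()):
            continue
        # Later handlers receive the earlier output as a second parameter.
        output = handler(request, output) if matched else handler(request)
        matched = True
    return output


INTERESTS = [
    (handle_request, ["HTTP_CM"], {"URL": r"^\/bscw\/?"}),
    (format_html, ["HTTP_CM"], {"URL": r"^\/bscw\/?"}),
]

request = {"CM": "HTTP_CM", "URL": "/bscw/a_folder/",
           "USER_AGENT": "TextBrowser/1.0"}
page = dispatch(request, INTERESTS)
```

Because the formatting module is simply a second interest registered for the same pattern, the first AM needs no knowledge of it, which is the transparency the text refers to.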

The AMs implement the linkages to the applications themselves. Typical AMs might implement a simple routine, invoke an application using a command line call, or provide a simple client for a database server; the developer can choose the most appropriate implementation and has the full details of the request available. Should authentication be required the AM can detect the absence of authentication information in the request object and return a request for authentication which is forwarded to the CM by the IS. The CM will then request authentication from the client and forward the authentication information to the AM.
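As a rough illustration of the authentication exchange, the sketch below has an AM return a marker object when no authentication information is present, which a CM then translates into a protocol-level challenge. The AuthRequest class, the `AUTH_USER' attribute name, the realm string and the function names are assumptions for the example; only the HTTP 401 status and the WWW-Authenticate header come from HTTP itself.

```python
from dataclasses import dataclass


@dataclass
class AuthRequest:
    """Hypothetical marker an AM returns to ask for authentication."""
    realm: str


def handle_request(request):
    """Stand-in AM: demand authentication when none is in the request."""
    if "AUTH_USER" not in request:
        return AuthRequest(realm="BSCW")
    return "<html>workspaces of %s</html>" % request["AUTH_USER"]


def http_cm_respond(result):
    """How an HTTP CM might turn an AM result into (status, headers, body)."""
    if isinstance(result, AuthRequest):
        headers = {"WWW-Authenticate": 'Basic realm="%s"' % result.realm}
        return 401, headers, ""
    return 200, {"Content-Type": "text/html"}, result


challenge = http_cm_respond(handle_request({"URL": "/bscw/"}))
granted = http_cm_respond(handle_request({"URL": "/bscw/", "AUTH_USER": "a_user"}))
```

Keeping the challenge protocol-neutral in the AM and protocol-specific in the CM means a different CM could satisfy the same AuthRequest in its own way.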

It is possible to register multiple interests for an AM with the IS allowing AMs to handle different kinds of requests. This approach also allows multiple applications to be registered with the same IS by defining AMs to provide the request-application mapping. To add or remove an AM requires only modification of the IS configuration file and signalling the IS to re-read the interest information. In addition to the IS we have implemented a number of simple CM toolkit components as well as more general routines--for example, components for basic authentication, access control, directory listing and so on which have allowed us to construct a simple W3 server using our toolkit.

The toolkit has been implemented entirely in Python. Although this greatly simplifies the problems of invoking and calling functions from CMs and AMs, it would be possible to modify the interfaces between these components so that parameters were passed as environment variables or on a stream, in a similar manner to CGI. Another option would be to integrate the ILU system to allow function calls to AMs written in different languages and possibly located on remote machines; as an ILU binding is available for Python, this would allow access from the Python IS to AMs written in C, C++ or Modula-3 (and further language bindings are currently being developed).
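The CGI-like alternative mentioned above could be sketched as below: the request attributes are exported as environment variables to a separate AM process, and its standard output is taken as the response. This is only one possible reading of that option; the function name call_am_as_process is hypothetical, and a small child Python process stands in for an AM written in another language.

```python
import os
import subprocess
import sys


def call_am_as_process(command, request):
    """Run an external AM, passing request attributes in its environment."""
    env = dict(os.environ)
    env.update({name: str(value) for name, value in request.items()})
    completed = subprocess.run(
        command, env=env, capture_output=True, text=True, check=True
    )
    return completed.stdout  # the AM's response, read from its output stream


# Stand-in AM: a child process that reads one attribute from its environment.
child = [sys.executable, "-c", "import os; print('op=' + os.environ['OP'])"]
output = call_am_as_process(child, {"REQUEST_METHOD": "GET", "OP": "show"})
```

The trade-off is the one the text implies: any language can implement the AM, but every value is flattened to a string and each request costs a process start.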

Re-deploying the BSCW system with the toolkit

As an example of the toolkit in use we now briefly describe a re-deployment of the BSCW Shared Workspace system using the toolkit components (Figure 4). We have implemented a simple CM which parses HTTP requests into a request object and forwards these to the IS. Using the interest configuration shown in Figure 3 the IS passes the request object to a handler function in the bscw_http AM. This then extracts the necessary details, such as action to be performed, and invokes the relevant function in the BSCW core, passing the request object and a flag to indicate that an HTML response is required. The BSCW system returns a page of HTML to the AM which is forwarded to the client via the HTTP CM.

In our implementation of the HTTP CM, all parameters for a GET or POST request are parsed into attribute-value pairs and stored in the request object. Further information such as PATH_INFO is also stored in this manner. For example, the following URL:

    http://bscw.gmd.de/a_workspace/a_folder/?op=show

(which requests the operation to generate a listing of the contents of a BSCW workspace folder) would be represented in the request object as:

    `REQUEST_METHOD': `GET'
    `PATH_INFO': `/a_workspace/a_folder/'
    `OP': `show'

(plus a number of other object attributes).
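A minimal sketch of this parsing step, using the present-day Python standard library rather than the original CM code, is given below. The function name build_request and the use of a plain dictionary as the request object are assumptions; upper-casing the parameter names follows the `OP' example above.

```python
from urllib.parse import parse_qsl, urlsplit


def build_request(method, url):
    """Turn a request method and URL into attribute-value pairs."""
    parts = urlsplit(url)
    request = {"REQUEST_METHOD": method, "PATH_INFO": parts.path}
    for name, value in parse_qsl(parts.query):
        request[name.upper()] = value  # `op=show' becomes `OP': `show'
    return request


request = build_request(
    "GET", "http://bscw.gmd.de/a_workspace/a_folder/?op=show"
)
```

For a POST request the same parse_qsl call could be applied to a form-encoded body, so that both methods yield the same kind of attributes.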

Figure 4. Re-deployment of the BSCW system using the toolkit

As shown in Figure 4 we have also used the toolkit to enable access to the BSCW system from electronic mail. This required writing a CM which was executed on receipt of a BSCW-specific mail message. Such email was delivered using a Unix .forward file, and the job of the CM was again to extract attribute-value pairs, in this case from the header and body of the mail message, and forward these to the IS. Only two lines were added to the IS interest configuration file to support email requests (Figure 5), and a special AM written, mainly to map email addresses taken from the mail message header into BSCW user login names. The only changes required to the BSCW system itself were to modify the output routines to optionally generate plain text as well as HTML.
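The mail CM and the address mapping performed by the mail AM might be sketched as follows. The paper does not specify how attributes are laid out in the message body, so the `name=value' line format, the `CM' key, the LOGIN_NAMES table, the example address and both function names are assumptions made for illustration.

```python
from email import message_from_string
from email.utils import parseaddr

# Hypothetical table mapping sender addresses to BSCW login names.
LOGIN_NAMES = {"a.user@example.org": "auser"}


def mail_to_request(raw_message):
    """Extract attribute-value pairs from the header and body of a message."""
    message = message_from_string(raw_message)
    request = {"CM": "MAIL_CM"}
    for name, value in message.items():
        request[name.upper()] = value
    # Assumed body format: one `name=value' pair per line.
    for line in message.get_payload().splitlines():
        name, separator, value = line.partition("=")
        if separator:
            request[name.strip().upper()] = value.strip()
    return request


def login_for(request):
    """Map the sender's email address to a login name, or None if unknown."""
    _, address = parseaddr(request.get("FROM", ""))
    return LOGIN_NAMES.get(address.lower())


raw = (
    "From: A User <a.user@example.org>\n"
    "Subject: bscw\n"
    "\n"
    "op=show\n"
    "path=/a_workspace/a_folder/\n"
)
request = mail_to_request(raw)
login = login_for(request)
```

The resulting `SUBJECT' attribute is what a mail interest pattern such as `^bscw$' would be matched against.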

    HTTP_CM: (simple_http, initialise, bscw.gmd.de, 80)
    MAIL_CM: (simple_mail) 

    (bscw_http, handle_request, [HTTP_CM], {`URL': `^\/bscw\/?'})
    (bscw_http, get_icon, [HTTP_CM], {`URL': `^\/icons\/?'})
    (bscw_mail, handle_request, [MAIL_CM], {`SUBJECT': `^bscw$'})

Figure 5. The configuration file enabling HTTP and email access to BSCW

Conclusions and Further Work

We have highlighted a problem with the current, API-based approach to enabling applications for the W3. It is our belief that enabling applications through CGI or other APIs is too restrictive for many applications and too `heavyweight', in that many of the features provided by the W3 server are not required. On the other hand, W3 servers are well-suited to their basic task of serving static documents and pages of information, and the features they provide are specialised and optimised for this mode of operation. An attempt to increase their flexibility for application deployment by opening up their implementations through a more powerful API could result in larger and more complex servers and may compromise their performance.

In contrast, we have described a `component-based' approach to enabling applications for access from the Web in a more flexible manner using an application integration toolkit. This facilitates more modular and `lightweight' application deployment where specific modules can be integrated as required. Further, separation of the protocol-specific communication handling components from the application modules and provision of an interest-based integration service allows enabling of applications for a variety of other access methods such as electronic mail with few or no changes to the applications themselves.

Our current activities include the extension of the toolkit to provide further Communication Modules--for example, to enable access via FTP. We are also looking at methods to improve performance, such as replacing the request-handling of the IS with a multi-threaded implementation rather than the current forking approach. In the longer term we would like to investigate the possibilities for re-implementing the toolkit with a full object-oriented solution, perhaps using ILU and/or a CORBA-compliant implementation. This may be a step towards bridging the current gulf between the W3 and object-oriented technologies--an area receiving much attention from the W3 consortium [10] and the W3 community in general.

Acknowledgments

We are grateful to Wolfgang Prinz and Richard Freund for information on the POLITeam World-Wide Web interface, and to our colleagues in the BSCW project within which this work took place. (More information on the BSCW project is available from http://bscw.gmd.de/)

References

[1] Lycos Inc., Lycos FAQ, available from http://www.lycos.com/info/faq.html
[2] Netcraft Ltd, The Netcraft Web Server Survey, March 1996, available from http://www.netcraft.co.uk/survey/
[3] Information Dimensions Inc., BASISplus Product Brief, available from http://www.dmo.hp.com/gsyinternet/productbriefs/infode.html
[4] Bentley, R., Horstmann, T., Sikkel, K. and Trevor, J., Supporting collaborative information sharing with the World Wide Web: The BSCW Shared Workspace system, in Proceedings of the 4th International WWW conference, Boston, O'Reilly, December 1995, pp 63-74 (also available from http://orgwis.gmd.de/~bscw/papers/boston-95/BOSTON.html)
[5] Robinson, D., The WWW Common Gateway Interface Version 1.1, January 1996, available from http://www.ast.cam.ac.uk/~drtr/cgi-spec.html
[6] Klöckner, K., Mambrey, P., Sohlenkamp, M., Prinz, W., Fuchs, L., Kolvenbach, S., Pankoke-Babatz, U. and Syri, A., POLITeam: Bridging the gap between Bonn and Berlin for and with the users, in Proceedings of ECSCW'95, Stockholm, Sweden, 10-14 September, 1995, pp 17-32
[7] Everitt, P., The ILU Requester: Object Services in HTTP Servers, available from http://www.w3.org/pub/WWW/TR/WD-ilu-requestor
[8] Informations Väverna AB, The Spinner WWW Server, available from http://spinner.infovav.se/
[9] Xerox Corporation, Inter-Language Unification--ILU, available from ftp://ftp.parc.xerox.com/pub/ilu/ilu.html
[10] W3C, Integrating Object Technology and the Web, available from http://www.w3.org/pub/WWW/OOP/

Author Details

Jonathan Trevor, Richard Bentley and Gerrit Wildgruber

German National Research Centre for Information Technology (GMD FIT.CSCW)
Schloss Birlinghoven
D-53754 Sankt Augustin
Germany

Email: {trevor, bentley, wildgruber}@gmd.de

Exorcising daemons: A modular and lightweight approach to deploying applications on the Web

Last Modified: 10:51am MET, March 08, 1996
