WWWinda: An Orchestration Service for WWW Browsers and Accessories

Yechezkal-Shimon Gutfreund
John Nicol
Russell Sasnett
Vincent Phuah
GTE Laboratories
Waltham, MA 02254
proj380@gte.com

Abstract

Most current WWW browsers are constructed as one large monolithic program. There are several problems with such an approach:

It is hard to add new functionality to a browser.
Viewers for new media types are added as external programs - instead of as embedded viewers.
A monolithic browser runs on one appliance. A user of such a browser is tethered to the computer/appliance that is running the browser - unable to move to new locations.
One cannot easily have different components of the browser running on a PDA, beeper, cellular phone, projection screen, remote control, etc.
Current monolithic browsers restrict the protocol and communication channel to their slave media viewers. That is, one cannot have a live continuous stream (e.g. live video) delivered via a WWW browser to a video viewer.

We are therefore creating a new WWW browser architecture. In this architecture WWW browsers are constructed out of a "flock" of distributed accessories. Each accessory implements a different function: HTML viewer, forms entry, graphical navigation tool/mapper, HTML editor, etc.

To bind these accessories into a unified interface we have created the WWWinda Orchestration service. WWWinda is an extended form of the Linda programming language [2] tailored for WWW applications. It is implemented in TCL/TK [12] and TCL-DP [14] and runs on both UNIX machines and PCs. It provides orchestration functions for coordinating distributed processes, such as shared variables, resource sharing, messaging, and synchronization services.

Problem Overview

Two emerging trends in WWW browsers are modular browsers and embeddable browsers. Modular browsers are composed out of several independent programs, processes, or threads. Each component is allowed to independently implement its features, speak its own inter-application protocol, and provide custom display and GUI functions. This independence allows each sub-component to provide optimum functionality for display and interaction with each media type.

Embeddable browsers, on the other hand, value tight integration. People want their WWW browsers to be able to present compound documents that consist of text, forms, audio, video, and pictures. These different media must appear as a single integrated document. The look and feel, the operation of the GUI, and the layout should look like a professionally unified document. All embedded components, whether continuous media, text, or forms, should mesh together seamlessly.

At first glance, these two competing needs, the heterogeneity of modular browsers and the homogeneity of embeddable browsers, may seem utterly irreconcilable. However, we believe we can achieve a fusion of these two goals, and that this can be done not only for a browser running on one platform, but also for a browser that is running on several distributed platforms with different operating systems.

However, before describing the details of our implementation, it would probably be worthwhile to give an overview of the architecture and some sample applications. In this way we hope to give more concrete substance as to how and why one creates an architecture with such outlandish goals.

Architecture Overview

The current Mosaic browser architecture allows for extensions via the external viewer mechanism. While this is a useful mechanism, it has its limits. External viewers operate asynchronously from the main viewer. There is no coordination between what is happening on one viewer and another, either visually or algorithmically.

We are interested in creating WWW browsers that have symbiotic accessories. These accessories have continuous communication channels with each other. Unlike external viewers, which receive one message (a MIME object) at start-up, accessories can communicate for as long as they are active. These communication channels are used to propagate events (e.g. end of video, start of audio, or button presses), to co-ordinate mapping on the screen, and to provide a variety of orchestration services.

Each accessory implements a different part of the WWW browser. One (or more) accessories serve as an HTML viewer. Another serves as an HTML editor. Another can be a graphical WWW road-map showing the documents reachable directly or indirectly from the current document. Some accessories act as allocators of shared resources. For example, there is an accessory to manage screen layout, an accessory to manage mouse and keyboard ownership, and an accessory to police access to shared Internet connections to ensure that continuous audio and video have priority over datagram fetches.

Related Work

The concept of an accessory should be familiar to people who work with TCL/TK [12] and, to a lesser degree, to those who work with Microsoft Windows [10] applets. In these systems one migrates small constrained interactive functions to accessory applications. For example, interactive drawing tools, chart tools, debuggers, paint tools, etc. are implemented as applets or TK tools that operate in tandem with each other.

Maintaining a communications channel between application components is an important part of these systems. With applets, one uses OLE to invoke and communicate information, while in TCL/TK one uses the send mechanism (X11 property lists) or the TCL-DP [14] RPC mechanism.

This model of software, where compound applications are fabricated out of a flock of inter-communicating sub-application tools, is also gaining acceptance in the CORBA [11] community.

There have also been some efforts at creating modular embedding mechanisms for WWW browsers, most notably the W3API [1]. Other work in this area includes the x-exec protocol [13] and an embedded multimedia browser [3]. However, these techniques do not appear as elegant, simple, and concise as the WWWinda approach (at least in the authors' opinion).

Sample Applications

One feature many people have requested for WWW browsers is a USENET news reader. However, a full-function news browser needs a history list, a kill file, and GUI functions for posting, replying, and sending mail. Although the current Mosaic browser allows one to view a newsgroup, there is no mechanism for reading a history list and using it to prune out articles. Using WWWinda accessories, one can have one accessory for the news viewer, another that reads the history list, another that fetches the news articles from the NNTP server, and another that prunes the articles and sends the resulting list to the news viewer.

We can add a new accessory each time we have to: (a) speak a new protocol, (b) open new files or databases, (c) implement new GUI functions. In this way we extend our WWW browser in a modular fashion. When we need new functionality we implement a new accessory and plug it into our flock of currently executing WWW browser accessories. Used in this manner, accessories can be thought of as plug-and-play add-on modules.

One major advantage of a modular accessory-based approach is that many of these components can be re-used to create other applications. For example, the news reader accessories could easily be re-used to create a full-function HTML multimedia mail tool, or a multimedia database browser, or a multimedia authoring and editing suite.

Another function for WWWinda accessories is to provide automatic forms fill-out. Frequently one finds forms requesting such information as name, mail address, etc. It can be tiresome to keep filling in the same information. It would be nice to have an accessory that is always attached to one's WWW browser that would use a personal database to fill out any recognized fields in a form. This, however, would require some sort of standard tagging convention for form entry fields. Once this occurs, a filter accessory could automatically notice standard fields, request the necessary information from a database, and return a form pruned to only unfilled fields. The auto-fillable information would be added via hidden entry fields.
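The filter step described above can be sketched in a few lines of TCL. This is only an illustration under stated assumptions: the field tags (name, email, quantity), the contents of the personal database, and the procedure name autoFill are all hypothetical, since no standard tagging convention for form fields exists yet.

```tcl
# Hypothetical personal database keyed by (assumed) standard field tags.
set personal [dict create name "A. User" email user@example.org]

# Split a form's field tags into auto-filled values (to be sent as
# hidden entry fields) and the fields the user still has to complete.
proc autoFill {fields db} {
    set hidden [dict create]
    set remaining {}
    foreach field $fields {
        if {[dict exists $db $field]} {
            dict set hidden $field [dict get $db $field]
        } else {
            lappend remaining $field
        }
    }
    return [list $hidden $remaining]
}

lassign [autoFill {name email quantity} $personal] hidden remaining
puts "hidden:    $hidden"
puts "remaining: $remaining"
```

Here the form shown to the user would be pruned to the single quantity field, while name and email travel along as hidden values.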

Accessories do not have to reside on the same computer. One would like to be able to have both full-feature accessories for high-end workstations and low-end accessories for PDAs such as Apple's Newton or IBM's Simon. Intelligent filter accessories would provide reasonable translations to the lower-grade features available on the PDA.

PDA-based accessories are one step toward providing ubiquitous access to the Internet and WWW. PDA-based accessories allow one to migrate from home to office to remote office while maintaining continuous contact. One transfers the current state of one's work to the PDA when one moves from station to station. One can also use the PDA to keep track of expenses on trips, and to maintain personal information such as diaries of news articles read, electronic signatures, security codes and passwords, and other information that one would like to be able to "plug" into whatever workstation one's nomadic life takes one to. It would also seem to make a lot of sense to keep information that one uses to personalize one's browser (.mailcap, etc.) in the PDA so that wherever one goes, these personalizations are present.

WWWinda

The WWWinda architecture is based on a Linda-like [2] co-ordination facility. Information and events that are to be shared between accessories are deposited in a TupleSpace. TupleSpace is a software implementation of a shared distributed memory. TupleSpace appears as a single shared blackboard where accessories can deposit tuples. Accessories are notified when the values of tuples change or when tuples are created. In this way information structures can be shared and events can be passed.

WWWinda allows for multiple TupleSpaces. This allows shared information to be modularized. Accessories can restrict their interest and access to only the TupleSpaces that they are interested in.

In the current implementation, TupleSpaces are managed and controlled by a single program called the Orchestrator. This was done only for speed of implementation. There is no reason why TupleSpace cannot be implemented among a flock of Orchestrators each running on a separate machine. Or we could take a more Linda-like approach and have no central repository. However, since we have not seen any significant performance problems, there is no immediate need to change the implementation. If we do decide to change this at a later date - these changes will be invisible to the accessories - since it will not require any changes to their interface.

A diagram of the current implementation can be seen in figure 1.

Figure 1: Current WWWinda Architecture

The interface to the Orchestrator is as simple and minimalistic as we could make it. Each accessory uses the connect command to connect to a particular TupleSpace. The syntax of the connect command is:

	connect TupleSpace

For example, connect GLOBAL could connect the accessory to a TupleSpace where global information about other TupleSpaces is kept. If the TupleSpace already exists, the Orchestrator connects the accessory to it. If the TupleSpace does not exist, the Orchestrator will create it first. There is also a:

	disconnect TupleSpace

command that the accessory uses to detach from a TupleSpace. Tuples are placed into a TupleSpace via the out command:

	out TupleSpace Tuple

Unlike Linda there is no in command. Once a tuple is placed in a TupleSpace it is automatically forwarded to each accessory that is connected to that TupleSpace. Tuples appear in the accessory's global variable space with the name: TupleSpace(tuple). The Tuples themselves are composed as TCL list objects. For example, the tuple {"timestamp" 55.46} or {"url:" http://kesser.gte.com/ size 5733} would be tuples with 2 and 4 elements respectively.
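The forwarding behaviour of connect, disconnect, and out can be modelled in plain TCL. The following is a minimal in-process sketch, not the WWWinda implementation: each accessory is simulated by a child interpreter, the TCL-DP network transport is replaced by a direct eval, and the orch namespace, its procedure signatures, and the split of a tuple into a name and a value are assumptions made for illustration.

```tcl
# In-process model of the Orchestrator. Each accessory is a child
# interpreter; the real system would forward over TCL-DP instead.
namespace eval orch {
    variable members   ;# array: TupleSpace name -> list of accessories
    array set members {}

    # Attach an accessory, creating the TupleSpace on first use.
    proc connect {accessory space} {
        variable members
        if {![info exists members($space)]} {
            set members($space) {}
        }
        if {$accessory ni $members($space)} {
            lappend members($space) $accessory
        }
    }

    # Detach an accessory from a TupleSpace.
    proc disconnect {accessory space} {
        variable members
        if {![info exists members($space)]} { return }
        set idx [lsearch -exact $members($space) $accessory]
        if {$idx >= 0} {
            set members($space) [lreplace $members($space) $idx $idx]
        }
    }

    # Forward a tuple to every connected accessory, where it shows up
    # as the global array element TupleSpace(name).
    proc out {space name value} {
        variable members
        if {![info exists members($space)]} { return }
        foreach accessory $members($space) {
            $accessory eval [list set ::${space}($name) $value]
        }
    }
}

set viewer [interp create]
set editor [interp create]
orch::connect $viewer VIEWER
orch::connect $editor VIEWER
orch::out VIEWER url http://kesser.gte.com/
orch::disconnect $editor VIEWER
orch::out VIEWER size 5733

puts [$viewer eval {array get VIEWER}]
puts [$editor eval {array get VIEWER}]
```

In this model the viewer ends up with both VIEWER(url) and VIEWER(size), while the editor, having disconnected, only ever sees VIEWER(url).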

Accessories can use the TCL trace command to notice new tuples or a change in value in an old tuple. For example, to call the procedure called Update every time the VIEWER(url) tuple is written:

	trace variable VIEWER(url) w Update

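The Update procedure will be called every time the VIEWER(url) tuple changes. TCL passes it the array name (VIEWER), the element name (url), and the operation (w); the procedure can then read the new value from the variable.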
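A possible shape for such an Update procedure is sketched below. It uses trace add variable, which is the current spelling of the older trace variable form shown above. TCL invokes the handler with the array name, the element name, and the operation, and the handler looks up the new value itself. The seen log and the second URL are placeholders for illustration, and the two set commands stand in for tuples forwarded by the Orchestrator.

```tcl
# Log of tuple updates seen by this accessory (illustrative only).
set seen {}

# Trace handler: receives the array name, the element name, and the
# operation, then reads the freshly written value through upvar.
proc Update {arrayName element op} {
    upvar 1 $arrayName space
    lappend ::seen [list $arrayName $element $space($element)]
}

trace add variable VIEWER(url) write Update

# Simulate the Orchestrator forwarding two tuples.
set VIEWER(url) http://kesser.gte.com/
set VIEWER(url) http://example.org/

foreach entry $seen { puts $entry }
```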

The Orchestrator also has a TupleScope, which allows one to monitor and edit tuples interactively while WWWinda is running. Scripts can also be added to the Orchestrator to perform additional orchestration services. These scripts can either be pre-built into the Orchestrator or added on-the-fly by accessories.

Analysis

WWWinda is a very small minimalist system. There are only 4 commands. Implementation of the core system took only 2 weeks including creating a visual TupleScope for examining and editing TupleSpace. Most of the speed of implementation came from implementing the system on top of TCL/TK and TCL-DP. WWWinda can be implemented directly in "C", but implementing in TCL allowed for speedier development and a faster port from UNIX to PCs.

Performance is quite acceptable. The speed of sending a tuple is only slightly more than for a normal RPC. Message traffic is minimized by having multiple TupleSpaces. Accessories only need to receive the subset of tuples they are interested in.

However, while tuple-based inter-application messaging is fast, it should NOT be used for binary continuous data (e.g. audio/video). A/V data has real-time latency and throughput requirements which are not suited to tuple-based communication. Communication via tuples should be thought of as out-of-band communication. In-band communication is better handled by protocols such as ATM level 5 (which is what we use in our laboratory). In figure 1 we indicate in-band communication with a thick line and tuple communication with a thin line.

This means that one can view WWWinda tuple communication as a signaling protocol. It is useful for signaling events such as the end or start of video. The actual heavy-duty transport of the video should be done via a protocol tuned for video. WWWinda tuples are also good for sharing data structures such as a screen and window allocation map.

The Orchestrator allows one to implement scripts that look for patterns of distributed activity. For example, we can implement load-balancing algorithms based on patterns of activity. Alternatively we can place the screen management control centrally in the Orchestrator. However, there is no reason why this logic cannot also be migrated to an accessory. The WWWinda architecture should be seen as a flexible approach providing for a variety of implementation paradigms.

Applications

WWWinda is actually the fourth generation of Orchestration mechanism we have created at GTEL. Previous versions have been presented at NOSSDAV [5] and at ACM MultiMedia/Siggraph'94 [4]. The first use to which WWWinda was put was a new implementation of the distributed karaoke system that was shown at Siggraph'94.

Figure 2: Distributed Karaoke System

In this application, each instrument is an accessory running on a different machine (Harpsichord on hal.gte.com and Oboe on albion.gte.com). Each instrument maintains a wavetable of sounds that were synthesized using CSound [15]. There are wavetables for Cello, Flute, Harpsichord, Oboe, Piano, Violin, and Sine Wave.

Each accessory was implemented in TCL using a special shell composed out of the TCL-DP, WWWinda, and AudioFile interfaces. AudioFile [9] is a distributed audio server. It works much like the X-Windows server. It allows remote clients to connect to it and send audio data. The AudioFile server will mix the audio data together and play it out of a speaker. We modified AudioFile to provide support for CD-quality sound (44.1 kHz 16-bit stereo), and to support transport of data over TCP/IP over ATM Level 5.

Figure 3: Architecture of Distributed Karaoke System

Each instrument/accessory operates independently. One can press a button to select a new wavetable or select a different speaker (left, right, mono, stereo) for the instrument at any time, even in the middle of a note. No instrument is aware of how many other instruments there are, nor which speakers they are playing on. Commands to play a note are received via a tuple called KAROKE(note). When a new note appears in this TupleSpace, all instruments play it according to their current wavetable and speaker configuration. This independence leads to a very dynamic system. New accessories can be added, deleted, or migrated in response to user requests, or network and host load changes.

Each instrument is being directed what to play via signals being sent from the Orchestrator. These signals are in the form of tuples. The actual in-band audio information is being sent from the instrument to the AudioFile server via a separate ATM channel.

The keyboard accessory provides the notes that are to be played. When the user presses a key on the keyboard accessory a note tuple is placed in TupleSpace. The keyboard is also capable of playing a sequence of tuples. For example: a harmonic scale, a Bach fugue, or Scott Joplin's "The Entertainer".
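The interplay between the keyboard and the instruments can be illustrated with a small single-process sketch. It is a model only: each instrument is a child interpreter, the AudioFile output is replaced by a log of what would have been played, the Orchestrator's forwarding is replaced by a loop, and the note names (C4, D4, ...) and the procedures makeInstrument and playNote are assumptions, not part of WWWinda.

```tcl
# Each instrument is modelled as a child interpreter holding its own
# wavetable and speaker choice; the audio output is replaced by a log.
proc makeInstrument {wavetable speaker} {
    set i [interp create]
    $i eval [list set wavetable $wavetable]
    $i eval [list set speaker $speaker]
    $i eval {
        set played {}
        # Runs whenever a new KAROKE(note) tuple arrives.
        proc PlayNote {arrayName element op} {
            upvar 1 $arrayName space
            global wavetable speaker played
            lappend played "${wavetable}/${speaker}:$space($element)"
        }
        trace add variable KAROKE(note) write PlayNote
    }
    return $i
}

set harpsichord [makeInstrument Harpsichord left]
set oboe        [makeInstrument Oboe right]
set instruments [list $harpsichord $oboe]

# Keyboard side: deposit a note tuple, forwarded to every instrument.
proc playNote {note} {
    foreach i $::instruments {
        $i eval [list set KAROKE(note) $note]
    }
}

# A sequence of note tuples, with a wavetable change in mid-sequence.
foreach note {C4 D4 E4} { playNote $note }
$oboe eval {set wavetable Flute}
playNote F4

puts [$harpsichord eval {set played}]
puts [$oboe eval {set played}]
```

Neither instrument knows about the other, and the oboe's switch to the Flute wavetable takes effect on the next note without the keyboard being involved.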

This karaoke duet could easily be extended. One can add more instruments on the fly (running on different machines). Or one can add a score accessory that will display the musical score and highlight the current note that is being played. A more interesting extension would be a microphone accessory. Digital Signal Processing (DSP) software inside the microphone accessory could be used to interpret voice input as cues for which note to play. The user could then sing into the microphone and the instruments would accompany him (as is possible today on some electronic keyboard instruments).

This last application shows some of the power of the WWWinda architecture. It is possible to use cues and information coming from live audio or video sources to drive and control distributed accessory software. We view this as an important feature in providing more lively and reactive interfaces.

The entire karaoke implementation in WWWinda took less than one week (using pieces from the previous ToolTalk-based implementation that was shown at MultiMedia'93).

WWW-TV

We are currently using WWWinda to extend WWW browsers. We wish to provide live video and audio embedded inside a WWW browser. This could be done by modifying the Mosaic browser, or when Tk V4.0 arrives, we can use TkWWW (which is our current plan). However, we have been able to do some preliminary experiments using Mosaic.

Our first experiment was to provide in-line audio and video. That is, when one fetches an HTML document, one can have audio and video objects play on start-up - rather than having to manually press links that point to MIME audio and video objects.

The technique used to do this is less elegant than what we will be able to do with Tk V4.0, but it seems to work quite well. Rather than modify Mosaic sources (which change rapidly) we decided to use the proxy mechanism to "wrap" the Mosaic application. We start Mosaic with its HTTP proxy set to a special WWWinda accessory called proxy. This accessory is responsible for actually fetching all HTTP-accessible documents. If the fetched document has special comments of the form:

	<!--proxycommand arg1 arg2 .... -->

Then the proxy accessory will execute that command. Since the proxy is written in TCL, these commands can be any command in the TCL scripting language. Thus one can run programs, send messages, etc. In our case, we use the proxy command mechanism to send messages to the Orchestrator to start up video and audio viewers in parallel. The proxy accessory will return the HTML document (including comments) and video and audio viewers will start up at the same time.
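The comment-scanning step of the proxy might look like the following sketch. It assumes that the special comments use the standard HTML comment delimiters, that every comment body is a candidate command, and that comments are not nested. The procedure name extractProxyCommands, the startViewer command, and the file names are hypothetical. The sketch only extracts the commands and does not execute them.

```tcl
# Return the body of every HTML comment in a document, trimmed.
# Assumes standard <!-- ... --> delimiters and no nested comments.
proc extractProxyCommands {html} {
    set commands {}
    foreach {whole body} [regexp -all -inline {<!--(.*?)-->} $html] {
        lappend commands [string trim $body]
    }
    return $commands
}

set page {<html><body>
<!--startViewer video news.mpg-->
<h1>Late-breaking story</h1>
<!-- startViewer audio news.au -->
</body></html>}

foreach cmd [extractProxyCommands $page] { puts $cmd }
```

The document itself is passed through to the browser unchanged, comments included, so the scan has no effect on what Mosaic renders.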

This experiment took less than one day to implement. However, there are still some significant gaps. For example, there are significant security holes opened up by allowing the proxy to execute unrestricted TCL code. This problem can be alleviated by restricting the proxy commands, as is done in SAFE-TCL.
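One way to restrict the proxy commands, in the same spirit as SAFE-TCL, is TCL's safe child interpreters: commands such as exec, open, and socket are hidden inside them, and the parent exposes only the commands it chooses through aliases. The sketch below is an assumption about how this could be wired up, not the authors' implementation; startViewer and runProxyCommand are hypothetical names, and the viewer start-up is replaced by a log entry.

```tcl
# Stand-in for the real action: record which viewers were requested.
set started {}
proc startViewer {kind source} {
    lappend ::started [list $kind $source]
    return ok
}

# Safe child interpreter: dangerous commands are hidden, and only the
# allow-listed alias below reaches back into the proxy.
set sandbox [interp create -safe]
$sandbox alias startViewer startViewer

# Evaluate one command from a document inside the sandbox.
proc runProxyCommand {cmd} {
    global sandbox
    if {[catch {$sandbox eval $cmd} result]} {
        return "rejected: $result"
    }
    return $result
}

puts [runProxyCommand {startViewer video news.mpg}]
puts [runProxyCommand {exec ls}]
```

The first command runs through the alias, while the second is rejected because exec is not visible inside the sandbox. A safe interpreter does not by itself prevent a document from looping forever, so a production proxy would need further limits.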

Figure 4: WWW-TV snapshot of a late-breaking story

Another application we are working on is called WWW-TV. In this example a user will walk around with a PDA equipped with a beeper attachment. A scanner-robot accessory will be continuously running on the user's host machine. The scanner-robot will examine TV-news captions, e-mail messages, WWW newslines, newsgroups, IRC, etc. The scanner-robot will be programmed to look for interesting key phrases; in this case the user has programmed it to look for references to President Bill Clinton.

When the scanner discovers an interesting event, it will beep the user's PDA with the item. This can be seen on the simulated PDA accessory at the bottom of figure 4. The user can then walk to any nearby machine connected to the Internet. The scanner accessory has annotated the beeper message with a list of live-TV and Internet news sources that are carrying the story. It has also placed "bookmarks" in the TV stories so that the user can watch the story from the beginning. The user checks off (on his PDA) the sources he feels provide the best quality (as shown in figure 4). The resulting WWW-TV-news journal is created by compositing the live news images, still images, and text from the newspapers into a single unified document, shown on the left of figure 4. If the user is not near an Internet box with a high-speed line, a low-grade display can be shown on his PDA.

WWW-TV demonstrates a principle we are calling InformationInterface-on-demand. One is able to migrate from one location to another, and the interface will scale dynamically based on network capacity and interface quality. This is made possible because accessories are modular "plug-and-play" objects. They can dynamically re-connect to different accessories (based on changing network connectivity and topology) and be dynamically replaced (to allow for higher quality display).

Figure 5: Architecture for WWW-TV

Conclusions

The WWWinda architecture allows us to create much more powerful WWW browsers. We can extend the WWW browser to include embedded video and audio viewers. We can augment WWW browsers to provide continuously running scanning robot accessories. We can have portions of the WWW browser running on a PDA. If we are at a remote site, we can rely on the PDA to give us low-grade access to the Internet. However, when we come in the vicinity of a better connected terminal we should be able to increase the bandwidth of our interface. The transition in interface will feel like the change from an 8mm movie to 70mm VistaVision.

The result will be an untethered interface to the Internet. We will be free of stationary computer appliances. One will have ubiquitous access to all the information available on the Internet, but be able to nomadically move from location to location.

The key element in this is symbiotic accessories: accessories that can fully and continuously communicate with each other, together with a very simple but powerful communications mechanism (TupleSpaces) that these distributed accessories use to signal events and share data structures.

Acknowledgments

The authors gratefully thank the MultiMedia and TCL-DP Team at Berkeley: Larry Rowe, Brian Smith, and Gordon Chaffee. WWWinda is heavily leveraged on top of the TCL-DP facility. Thanks also go to John Ousterhout for creating TCL/TK. We also wish to thank Bert Bos, who has encouraged our work in this area, and the NCSA Mosaic team, who awakened the Internet community to the potential of WWW (kudos to the CERN WWW team) and the Internet.

References

[1] Bert Bos. W3a: WWW Applets API proposal. http://www.let.rug.nl/~bert/W3A/W3A.html.
[2] Nicholas Carriero and David Gelernter. How to Write Parallel Programs: A Guide to the Perplexed. ACM Computing Surveys. November, 1989.
[3] Michael D. Doyle. Embedded Program Objects in Distributed Hypermedia Systems. Inquiries to: Martha Luehrmann, UC Office of Technology Transfer, 510-748-6611, martha@ott.ucop.edu, Reference: UC Case 94-108. http://visembryo.ucsf.edu/
[4] Yechezkal-Shimon Gutfreund, Jose Diaz-Gonzalez, Russell Sasnett, and Vincent Phuah. CircusTalk: An Orchestration Service for Distributed MultiMedia. Proceedings of ACM MultiMedia'93. August 1-6, 1993. Anaheim, California. pp. 351-357.
[5] Yechezkal-Shimon Gutfreund, Jose Diaz-Gonzalez, Russell Sasnett, and Vincent Phuah. Dynamicity Issues in Broadband Network Computing. In Network and Operating System Support for Digital Audio and Video, Second International Workshop, Heidelberg, Germany, November 1991. Springer-Verlag 1992. pp. 209-216.
[6] Yechezkal-Shimon Gutfreund. ThinkerToy: An Environment for Decision Support. Ph.D. Thesis. Computer and Information Sciences Department, COINS Technical Report 88-72. 1988.
[7] Yechezkal-Shimon Gutfreund. ManiplIcons in ThinkerToy. Proceedings of OOPSLA'87 Object Oriented Programming Systems, Languages, and Applications. Norman Meyrowitz, editor. October 4-8, 1987. Kissimmee, Florida. SIGPLAN Notices, Vol. 22, No. 12 (December 1987). pp. 307-317.
[8] Matt Hodges and Russell Sasnett. Multimedia Computing: Case Studies from MIT Project Athena. Addison-Wesley, Reading, MA. 1993.
[9] Thomas M. Levergood, Andrew C. Payne, James Gettys, G. Winfield Treese, and Lawrence C. Stewart. AudioFile: A Network-Transparent System for Distributed Audio Applications. Proceedings of USENIX Summer Conference, June 1993.
[10] Microsoft. Microsoft Windows for Workgroups. Microsoft, Redmond, WA. 1994.
[11] OMG. The Common Object Request Broker: Architecture and Specification. Object Management Group, Framingham, MA. 1991.
[12] John K. Ousterhout. TCL and the Tk Toolkit. Addison-Wesley, Reading, MA. 1994.
[13] George Phillips. The x-exec: URL scheme. http://www.cs.ubc.ca/doc/world/exec/intro.
[14] Brian C. Smith, Larry A. Rowe, and S. Yen. Tcl Distributed Programming. Proceedings Tcl 1993 Workshop, Berkeley, CA, June 1993.
[15] Barry Vercoe. MIT CSound package. http://www.leeds.ac.uk/music/Man/c_front.html

Biography of Principal Author

Yechezkal-Shimon Gutfreund has been programming for over 26 years. In 1974, while at the University of Illinois/Champaign-Urbana (and working on a BSCS), he discovered PLATO. Since that encounter he has had an abiding interest in bit-mapped graphics, distributed collaboration (upwards of 40 people would interact at one time in PLATO games, e.g. airfight), and multimedia (PLATO had a music synthesizer called a Gooch box). He still receives royalties for some of the software he created for PLATO.

From 1978-1982, he was first a member of the DEC RSTS/e operating system group, and then later a member of the DEC Corporate Research Group. At DEC CRG he created one of the first Ethernet-based distributed OS's and the first implementation of Smalltalk-80 outside of Xerox PARC.

From 1982-1988, he was at the University of Massachusetts/Amherst, where he received a Ph.D. (CS) for his thesis, ThinkerToy [6]. This was a visual programming and simulation environment for creating microworlds. The environment was used to create microworlds to model and explore visually oriented problems in many diverse fields of investigation: GIS, queuing theory, and statistical analysis.

Since 1988 he has been at GTE Laboratories working with a small group on distributed MultiMedia. One of the members of this group is: Russ Sasnett, co-creator of MIT Athena Muse [8].

Email

Comments about this paper can be sent to sgutfreund@gte.com