Copyright is held by the author/owner(s).
WWW2002, May 7-11, 2002, Honolulu, Hawaii, USA.
ACM 1-58113-449-5/02/0005.
Network and server-centric computing paradigms are quickly returning to being the dominant methods by which we use computers. Web applications are so prevalent that the role of a PC today has been largely reduced to a terminal for running a client or viewer such as a Web browser. Implementers of network-centric applications typically rely on the limited capabilities of HTML, employing proprietary ``plug-ins'' or transmitting the binary image of an entire application that will be executed on the client. Alternatively, implementers can develop without regard for remote use, requiring users who wish to run such applications on a remote server to rely on a system that creates a virtual frame buffer on the server and transmits a copy of its raster image to the local client.
We review some of the problems that these current approaches pose, and show how they can be solved by developing a distributed user interface toolkit. A distributed user interface toolkit applies techniques to the high level components of a toolkit that are similar to those used at a low level in the X Window System. As an example of this approach, we present RemoteJFC, a working distributed user interface toolkit that makes it possible to develop thin-client applications using a distributed version of the Java Foundation Classes.
Few would argue that the explosive growth in the computer industry is not closely tied to the rise in popularity of networks and, in particular, the Internet. In 1990, an average person purchased a computer for running application software such as word processing, spreadsheets, and perhaps a drawing program. Today, that same person would undoubtedly purchase a computer to access the Internet. However, while computing is heading in radically new directions, the techniques used to present graphical user interfaces have remained the same.
The vast majority of user interfaces are written using visual components (often called widgets or controls) that are gathered together in libraries that are usually referred to as user interface toolkits (Binding88). The most popular toolkit is MFC (Microsoft Foundation Classes) (MFC), which, as its name implies, is used to construct user interfaces for the various flavors of Microsoft's Windows operating system. A more recent toolkit that is gaining popularity because of its ability to create cross-platform compatible graphical user interfaces is JFC (Java Foundation Classes) (JFC). Complementing these are the more mature user interface toolkits built for the X11 Window System (Scheifler-Gettys86a), such as Athena (McCormack91), Motif (Motif), and Tk (Ousterhout94).
A user interface toolkit provides an abstraction layer for the low-level drawing and interaction routines made available to programmers by the graphics subsystem that is usually bundled with the operating system. This abstraction allows programmers to quickly create commonly used visual components, such as buttons, scrollbars, menus, and text fields. End users also benefit, since most of the applications they run on a particular operating system will have roughly the same ``look and feel'' because the applications are all built out of components from the same user interface toolkit. However, despite these advantages, the tight binding of the user interface toolkit to the underlying graphics subsystem presents significant challenges when creating distributed applications in which the application logic execution and user interface presentation occur on different computers.
Many approaches have been researched academically and deployed commercially to support a distributed computing paradigm in which the network separates the presentation of the user interface from the application logic. Phrases such as ``server-centric computing'' (Lewis95), ``network computing'' (Baratloo98, Golick99), ``thin clients'' (Zukowski97, Schmidt99, Golick99), ``distributed presentation'' (Golick99), and ``remote presentation'' (Rody95) are prevalent in the literature. In the remainder of this paper, we first discuss, in Section 2, two approaches to thin-client computing that are commonly used in both industry and academia: Web-based applications and graphics pipeline interception. We then describe an alternative, distributed user interface toolkits, in Section 3, and introduce in Section 4 our distributed user interface toolkit, RemoteJFC, describing how it addresses many of the issues that arise from using existing systems. Next, in Section 5, we present a performance comparison, and, in Section 6, conclude with some remarks about the directions in which our research is heading.
One of the most widely deployed approaches to thin-client computing uses HyperText Transfer Protocol (HTTP) (RFC1945) to interact with the server and HyperText Markup Language (HTML) (RFC1866, RFC2854) to display documents from the distributed database commonly known as the World Wide Web. The popularity and wide availability of browsers that provide the front-end to view HTTP/HTML content has brought about the rapid development and deployment of applications that are accessible only through this format. The architecture of an application developed using a Web-based methodology is depicted in Figure 1.
One severe limitation of a Web-based application that relies solely on HTTP/HTML is the ``pull-only'' data transfer methodology. Such applications are prevented from generating events and thus cannot provide a rich user experience. For example, when a user executes a search on a Web search engine, the engine must ideally complete the search in its entirety within a few seconds of when the request was made because the user is expecting an immediate response and the engine cannot notify the user that better results have been found after the initial page has been displayed. A second problem is that HTTP is stateless, which makes it difficult for programmers to create even a simplistic notion of persistence between page accesses. In addition, the user interface toolkit (HTML forms (RFC1866)) is also extremely rudimentary, providing only a handful of the most commonly used components.
Many attempts have been made to address these problems, including sending entire applications over HTTP (e.g., Java applets (Applet)), designing browser ``plug-ins'' that interpret their own language to provide a richer user experience (e.g., Macromedia Flash and Shockwave (Flash)), creating a 3D world in which the user can navigate (e.g., VRML (VRML)), and providing an application programmer interface (API) for storing persistent session identification data (e.g., cookies (RFC2109)). All these attempts to address the problems with HTTP/HTML give rise to a host of new problems.
Java applets raise numerous security concerns because HTTP is used to transport executable code to the client. Although the byte codes transmitted across the network are in compiled form, Java decompilers are readily available that will allow any user to have access to the source code of the application. In addition, the use of Java applets typically violates the thin-client principle of not running any application logic on the client. Flash and VRML define richer languages that have been built with user interactivity in mind, but suffer from the problem that mature browsers for anything other than the Microsoft Windows desktop operating systems are generally not available. HTTP cookies raise numerous security concerns because they permit the server program to write data to the permanent storage device on the client. In addition, HTTP cookies have been the target of severe criticism due to a recent surge in public awareness regarding privacy concerns when using the Internet. These issues make HTTP cookies an unattractive method for programmers to add server-side state to the HTTP protocol.
A second approach to delivering applications in a thin-client environment that has been gaining popularity involves intercepting rendering commands sent to the graphics pipeline. One simple way to implement this is to create a virtual frame buffer in the RAM of the server on which the application can render its GUI, and then transport the resulting raster image to the client. A more complicated implementation would try to send higher-level commands (e.g. draw a line from xi, yi to xj, yj) whenever possible in order to reduce network traffic.
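To make the difference between the two implementations concrete, the following sketch compares the size of one uncompressed frame with that of a single encoded drawing command. The 17-byte wire format (one opcode byte plus four 32-bit coordinates) and the class name FrameBufferCost are assumptions made purely for illustration; they are not taken from any of the systems discussed in this paper.

```java
import java.nio.ByteBuffer;

class FrameBufferCost {
    // Hypothetical wire format: 1 opcode byte followed by four 32-bit coordinates.
    static final int LINE_COMMAND_BYTES = 1 + 4 * Integer.BYTES;

    /** Bytes needed to ship one uncompressed raster image of the given size. */
    static long rasterBytes(int width, int height, int bytesPerPixel) {
        return (long) width * height * bytesPerPixel;
    }

    /** Encodes "draw a line from (x1, y1) to (x2, y2)" in the hypothetical format. */
    static byte[] encodeDrawLine(int x1, int y1, int x2, int y2) {
        return ByteBuffer.allocate(LINE_COMMAND_BYTES)
                .put((byte) 1) // assumed opcode for "line"
                .putInt(x1).putInt(y1).putInt(x2).putInt(y2)
                .array();
    }

    public static void main(String[] args) {
        // An 800x600 display at 24 bits per pixel.
        System.out.println("raster:  " + rasterBytes(800, 600, 3) + " bytes");  // raster:  1440000 bytes
        System.out.println("command: " + encodeDrawLine(0, 0, 799, 599).length + " bytes");  // command: 17 bytes
    }
}
```

The comparison covers a single line on an otherwise unchanged display; real systems compress the raster and send only changed regions, so the gap in practice is smaller than these raw numbers suggest.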
In essence, this approach attempts to bring the server's desktop to the user and thereby permits a full range of user interactivity. Products such as Citrix MetaFrame (Citrix), Insignia NTrigue (Insignia), SCO Tarantella (SCO), Graphon RapidX (Graphon) and Symantec PC-Anywhere (Symantec) are among those that have been providing this type of functionality for many years as an extension to the underlying operating system. A recent explosion in the popularity of this approach occurred when AT&T released their cross-platform VNC (Li00) system to the public free of charge. Microsoft has now made this capability a standard part of their Windows 2000 and XP operating systems (TerminalServices). The architecture of an application that employs the remote frame buffer approach for presentation on a thin-client is displayed in Figure 2.
Although the approach of intercepting the graphics pipeline addresses many of the problems with a Web-based approach that uses HTTP/HTML, it also introduces a number of other problems. While the Web-based approach using HTTP/HTML is capable of operating reasonably well over relatively low-speed modem network links, the graphics pipeline interception approach demands high-bandwidth connections. This is because transporting the virtual frame buffer from the server to the client is essentially sending a video stream of computer-generated graphics. Although the use of advanced lossy video compression algorithms (e.g. MPEG (MPEG)) has been proposed (Richardson98), none of the existing systems employ such techniques. This is because real-time encoding of MPEG streams usually requires special hardware that can only handle one or two streams at a time, thereby eliminating the possibility of using the remote frame buffer approach on a current shared server. In addition, the use of lossy compression techniques would introduce unwanted compression artifacts into the display, reducing the system's usability, particularly when working with text and detailed graphics.
Some systems (e.g. RDP-based Citrix MetaFrame and Microsoft Windows Terminal Services) attempt to reduce bandwidth consumption by trying to intercept high-level drawing commands. However, this doesn't necessarily result in better performance. For example, if the GUI employs many image labels for buttons, the bandwidth consumed by transferring the images every time the display needs to be redrawn dwarfs everything else. In addition, these systems often employ other optimizations such as not transferring display updates that are thought to be ``unimportant'' (e.g. animated cursors). Although this technique would seem to reduce the bandwidth consumed in all cases, it almost always causes the display on the client to not be updated properly, ultimately resulting in user frustration. Users who experience this will often start moving the mouse or windows around in order to force the system to update the display. This can actually result in more bandwidth being consumed than if the system had simply sent the updates using the more naive approach. Finally, these systems suffer from the overhead of bringing the user's entire desktop from the server to the client rather than just the application.
The existence of server-side state and asynchronous event generation by the server permits the graphics pipeline interception approach to provide a rich level of user interactivity that a Web-based approach using HTTP/HTML cannot. However, there is a practical limitation caused by network latency. Figure 4 shows a typical client ``viewer'' (the graphics pipeline interception analogue of the Web browser) that displays two mouse pointers. One mouse pointer represents where the cursor should be pointing, and is tied to the local mouse. A second mouse pointer, which typically lags behind the first, displays where the mouse position is on the server. When a simple remote frame buffer system is run on anything other than a high-speed LAN connection, there is always a noticeable difference in position between the client (virtual) and server (real) mouse positions. More advanced graphics pipeline interception implementations (e.g. RDP systems) generally do not have the same mouse pointer lag issue, but still suffer from a similar problem when a window is dragged. On a slow modem link, this makes highly interactive user interfaces difficult to control, and, in extreme cases, may even make the system unusable.
One important advantage of graphics pipeline interception systems is that they tend to be binary-compatible with a large set of existing software packages intended for use with desktop computers. Many industrial and academic institutions employ graphics pipeline interception systems in production environments to provide users with thin-client access to some subset of the enterprise or campus computing infrastructure. Because graphics routines are intercepted at the operating system level, little or no programming effort is involved in deploying the system.
Distributed user interface toolkits address the issues that arise when employing Web-based HTTP/HTML and remote frame buffer approaches by allowing a server to manipulate user interface toolkit components directly on the client. The server can create, modify, and delete any of the components available in the distributed toolkit as if it were working with a local application. One might think of this as an implementation of a remote frame buffer with an extremely efficient, lossless compression algorithm. Instead of sending pixel data rendered on the server across the network, the distributed user interface toolkit sends the semantics necessary to render that pixel data on the client. In addition, since the mouse is handled locally on the client, there is no additional perceived latency beyond that caused by the processing that is necessary to service user requests when the application is running locally.
The concept of creating architectures and toolkits that support the development of distributed applications is not new. For example, the X Window System (Scheifler-Gettys86a) and the Network extensible Window System (NeWS) (Gosling:1989:NBI) were built with the network in mind, although they transport low-level drawing commands. If a high-level user interface toolkit is used with X or NeWS, it appears as if the high-level commands are being transported across the network, although this is not what is happening. Under X, the high-level user interface toolkit commands (e.g., draw button) are actually translated into low-level commands (e.g., lines and rectangles) before being transmitted across the network. In addition, the X Window System stores state on the computer that is presenting the output (unfortunately called the server). Consequently, it is very difficult to ``share'' X Window System sessions between multiple users, and if the X Window System server (running on the client computer) fails, the user session is lost. It is for these reasons that a remote virtual frame buffer system, such as VNC, is often employed to transport an X Window System desktop from a UNIX server to an X Window System viewer running on a UNIX workstation, rather than relying on the built-in networking facilities of X.
Many research efforts have attempted to make systems more network aware. For example, the mobile computing community has looked into modifying the operating system to support network-aware applications (Jing99,Joseph95). The collaborative computing community has also contributed numerous architectures and toolkits (Edwards97, Hill94, Prakash94, Schuckmann96). In addition, the user interface community has been exploring how the network can enhance the functionality of user interface toolkits (Hudson97, Anderson86), while the virtual reality research community has developed distributed user interface toolkits for 3D graphics (Saar99, MacIntyre98). These architectures and toolkits are all designed with the premise that most, if not all, of the application logic executes on the client.
To determine how a distributed user interface could be constructed, and to explore the advantages that it could make possible, we have developed RemoteJFC (RJFC). Our primary goal for RJFC was to create an API that tracks the design pattern and functionality of the standard JFC API as closely as possible, with the exception that the presentation displays on a remote client, rather than the local frame buffer. To accomplish this, we needed to create a well-defined API and corresponding software development kit (SDK), establish a protocol for client-server communication, and develop a Java-based viewer that provides a graphical context on the client. Figure 5 shows the architecture of a RJFC application.
public void registerDisplay(RJFrame d, RJFCFactory f) throws RemoteException {
    // Components are obtained from the factory rather than created with "new".
    RJTextArea TheArea = f.getRJTextArea(20, 20);
    TheArea.addKeyListener(new TextAreaKeyListener());

    RJScrollPane Pane = f.getRJScrollPane();
    Pane.setViewportView(TheArea);

    RJTextField StatusBar = f.getRJTextField();
    StatusBar.setEditable(false);

    // Lay out the remote frame exactly as a local JFrame would be laid out.
    RContainer rc = d.getContentPane();
    rc.setLayout(new BorderLayout());
    rc.add(StatusBar, BorderLayout.SOUTH);
    rc.add(CreateMenu(), BorderLayout.NORTH);
    rc.add(Pane, BorderLayout.CENTER);
}
(a)
public class MyJFrame extends JFrame {
    public MyJFrame() {
        // Components are created locally with "new".
        JTextArea TheArea = new JTextArea(20, 20);
        TheArea.addKeyListener(new TextAreaKeyListener());

        JScrollPane Pane = new JScrollPane();
        Pane.setViewportView(TheArea);

        JTextField StatusBar = new JTextField();
        StatusBar.setEditable(false);

        Container c = this.getContentPane();
        c.setLayout(new BorderLayout());
        c.add(StatusBar, BorderLayout.SOUTH);
        c.add(CreateMenu(), BorderLayout.NORTH);
        c.add(Pane, BorderLayout.CENTER);
    }
}
(b)
When building the RJFC system, we wanted to make sure that the system would be appealing to users. To accomplish that goal, we concluded that we would need to provide an API that was both familiar and rich in functionality. In other words, we needed to emulate as much as possible of the JFC user interface toolkit provided by Sun. Like many contemporary user interface toolkits, the JFC API is extremely complex, with over 600 individual source files, each providing between 10 and 100 methods for the programmer to use. It would be a daunting task for a small research team to write wrappers by hand for all of this code. Since the source code to JFC is readily available, we created a code generator, using the Java Doclet API (Doclet), that reads in the JFC source and produces a corresponding RJFC element for each JFC element. This approach allows us to generate different versions of our RJFC system for various implementations and releases of the Java SDK, making it possible to handle a broad range of supported JVMs.
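The idea of generating wrappers mechanically can be illustrated with a minimal sketch. Note that it is not our generator: it uses runtime reflection on the compiled class instead of the Doclet API, and the class name WrapperGenerator, the placeholder parameter names, and the shape of the emitted interface are all assumptions made for illustration. It emits the source of a remote interface whose name is the component name with ``R'' prepended, with one declaration for each public instance method.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Map;
import java.util.TreeMap;

class WrapperGenerator {
    /** Returns Java source for a remote interface named "R" + the class's simple name. */
    static String generate(Class<?> component) {
        // Keyed by name and parameter list: overloads are kept, duplicates collapse,
        // and the output order is deterministic.
        Map<String, String> declarations = new TreeMap<>();
        for (Method m : component.getMethods()) {
            // Skip static methods, compiler-generated bridges, and java.lang.Object methods.
            if (Modifier.isStatic(m.getModifiers()) || m.isSynthetic()
                    || m.getDeclaringClass() == Object.class) {
                continue;
            }
            StringBuilder params = new StringBuilder();
            Class<?>[] types = m.getParameterTypes();
            for (int i = 0; i < types.length; i++) {
                if (i > 0) {
                    params.append(", ");
                }
                params.append(types[i].getCanonicalName()).append(" arg").append(i);
            }
            String signature = m.getName() + "(" + params + ")";
            declarations.putIfAbsent(signature, "    " + m.getReturnType().getCanonicalName()
                    + " " + signature + " throws java.rmi.RemoteException;");
        }
        StringBuilder source = new StringBuilder();
        source.append("public interface R").append(component.getSimpleName())
              .append(" extends java.rmi.Remote {\n");
        for (String line : declarations.values()) {
            source.append(line).append('\n');
        }
        return source.append("}\n").toString();
    }

    public static void main(String[] args) {
        // Emits "public interface RJButton extends java.rmi.Remote { ... }".
        System.out.print(generate(javax.swing.JButton.class));
    }
}
```

A sketch of this kind ignores several things a production generator must handle, such as declared exceptions, generic signatures, and parameter or return types that cannot be passed across RMI.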
An example of code written using the RJFC API is shown in Figure 6. We tailored our code generator so that the resulting RJFC API is sufficiently similar to JFC that a programmer fluent in creating applications with JFC need only know that a capital ``R'' must be prepended to the name of the toolkit component being referenced. Although we could have defined component creation using the ``new'' keyword, we use a RJFCFactory object for performance reasons (discussed in Section 5).
Manipulation of the RJFC components (e.g., changing the text of a label) and association of event handlers are syntactically identical to the JFC API. Although each RJFC component has an actual associated JFC component that lives in the viewer's memory space, the programmer interacts with the display solely by making calls on the RJFC components. The actual JFC components that are used to create the display on the viewer are hidden from the programmer. Since RJFC components track the JFC API and follow the Sun Java Beans standard, they may also be easily used in graphical user interface builders such as Sun Forte for Java (Forte), Borland JBuilder (JBuilder), and WebGain VisualCafe (VisualCafe).
When a RJFC component is instantiated, modified, or deleted on the server, the RJFC toolkit transparently informs the attached RJFC viewer of the event that has occurred using remote method invocation (RMI). The RJFC viewer then reacts to this by performing the exact same action on the viewer that would have occurred on the server if the JFC API were used. For example, if the server requests that a new RJButton be created, the RJFC toolkit would transmit that command to the viewer Java VM. The viewer Java VM then creates a JButton using the standard JFC API, thus causing the actual button to be rendered on the client. Similarly, when the RJFC server installs an event handler into a RJFC component, the server uses RMI to tell the viewer to install a proxy JFC event handler into the associated JFC component that is actually being displayed.
One key performance optimization in the RJFC Protocol is the use of a RJFCFactory to create JFC components in the viewer's memory space while returning a RJFC reference to the server. The RJFCFactory is a remotely accessible object (it extends UnicastRemoteObject and implements an interface that extends Remote) that lives in the viewer's memory space. When a viewer connects to a RJFC server, the viewer passes a reference to the RJFCFactory into the display registration method implemented on the RJFC server. Once the RJFC server has a reference to the RJFCFactory, the server can create JFC components that live in the viewer's memory space and receive a remote reference to the associated RJFC wrapper object. The alternative would be to create an object in the server's memory space, send the serialized object to the client, and then send a remote reference to the wrapper object back to the server. Our measurements show that an RMI call consumes five Ethernet packets, whereas sending a serialized JButton consumes more than ten times that number.
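The factory pattern described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the RJFC source: the names SketchRJButton, SketchRJFCFactory, ViewerButton, ViewerFactory, and FactoryDemo are hypothetical, only two button methods are wrapped, and the objects are never exported through UnicastRemoteObject, so every call in the sketch is an ordinary local call. The point is the shape of the interaction: the server-side code holds only remote interfaces, while the JButton itself is created and kept on the viewer side.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import javax.swing.JButton;

/** Remote view of a button; the server only ever holds this interface. */
interface SketchRJButton extends Remote {
    void setText(String text) throws RemoteException;
    String getText() throws RemoteException;
}

/** Factory that lives in the viewer and creates components there. */
interface SketchRJFCFactory extends Remote {
    SketchRJButton getRJButton(String text) throws RemoteException;
}

/** Viewer-side wrapper: forwards each call to the real JFC component. */
class ViewerButton implements SketchRJButton {
    // A running viewer would marshal these calls onto the Swing event dispatch thread.
    private final JButton delegate;

    ViewerButton(String text) {
        this.delegate = new JButton(text);
    }

    @Override
    public void setText(String text) {
        delegate.setText(text);
    }

    @Override
    public String getText() {
        return delegate.getText();
    }
}

/** Viewer-side factory: the JButton never leaves the viewer's memory space. */
class ViewerFactory implements SketchRJFCFactory {
    @Override
    public SketchRJButton getRJButton(String text) {
        return new ViewerButton(text);
    }
}

class FactoryDemo {
    /** Server-side logic: it sees only remote interfaces, never a JButton. */
    static SketchRJButton buildDisplay(SketchRJFCFactory factory) throws RemoteException {
        SketchRJButton button = factory.getRJButton("OK");
        button.setText("Save"); // one small call instead of a serialized component
        return button;
    }

    public static void main(String[] args) throws RemoteException {
        System.out.println(buildDisplay(new ViewerFactory()).getText()); // prints "Save"
    }
}
```

With real RMI, the value returned by the factory would be a stub, so only the method name and its small arguments would cross the network on each call.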
The RJFC protocol uses a similar approach to accomplish event handling. When an event handler is installed into a RJFC component on the server, RJFC uses RMI to send a simple message that tells the viewer to install a proxy event handler in the associated JFC object. The proxy event handler makes an RMI call to the server whenever a new event is generated on the client side. The actual semantics of the event handler, as defined by the application logic, are executed on the server when the server receives the RMI call from the client. Server-generated events are supported by simply having the RJFC server retain the reference to the RJFC component returned by the RJFCFactory after the display initialization is completed. With a remote reference to the RJFC component, the server is free to asynchronously generate events at will.
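A proxy event handler of the kind described above might look like the following sketch. The interface RemoteActionHandler and the classes ProxyActionListener, RecordingHandler, and ProxyDemo are hypothetical names introduced for illustration, and passing only the action command string to the server is an assumption about what needs to cross the network. As in any in-process sketch, nothing is exported over RMI here, so the forwarding call is local.

```java
import java.awt.event.ActionEvent;
import java.awt.event.ActionListener;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.util.ArrayList;
import java.util.List;

/** Server-side handler that the viewer can reach remotely (hypothetical interface). */
interface RemoteActionHandler extends Remote {
    void actionPerformed(String actionCommand) throws RemoteException;
}

/** Viewer-side proxy: an ordinary JFC listener that only forwards the event. */
class ProxyActionListener implements ActionListener {
    private final RemoteActionHandler serverHandler;

    ProxyActionListener(RemoteActionHandler serverHandler) {
        this.serverHandler = serverHandler;
    }

    @Override
    public void actionPerformed(ActionEvent event) {
        try {
            serverHandler.actionPerformed(event.getActionCommand());
        } catch (RemoteException ex) {
            // A real viewer would tell the user that the connection was lost.
            System.err.println("server unreachable: " + ex.getMessage());
        }
    }
}

/** Application logic; in a deployed system this part would execute on the server. */
class RecordingHandler implements RemoteActionHandler {
    final List<String> received = new ArrayList<>();

    @Override
    public void actionPerformed(String actionCommand) {
        received.add(actionCommand);
    }
}

class ProxyDemo {
    public static void main(String[] args) {
        RecordingHandler server = new RecordingHandler();
        ActionListener proxy = new ProxyActionListener(server);
        // Simulate a click; in the viewer, the displayed JButton would fire this event.
        proxy.actionPerformed(
                new ActionEvent(new Object(), ActionEvent.ACTION_PERFORMED, "save"));
        System.out.println(server.received); // prints [save]
    }
}
```

Because the proxy is a standard ActionListener, the viewer can install it with the usual addActionListener call on the JFC component that is actually displayed.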
The RJFC viewer, shown in Figure 7, provides a context in which the RJFC server application can manipulate the client frame buffer. The viewer is a hand-coded application that uses JFC and emulates the functionality found in a typical thin-client system. The user of the system invokes the viewer, at which point a JFrame window is created with a form that allows the user to connect to a server. Once a connection is established, another JFrame window is created for the server to manipulate remotely. The server may also request that additional windows be created by asking for dialogue boxes using the RJFC API. Figure 8 shows screen shots of several small applications being run in the RJFC viewer.
Event | To Srv | To Client | To Srv | To Client | To Srv | To Client |
Connect | 724613 | 12553 | 12261 | 27343 | 69181 | 56623 |
Log In | 0 | 0 | 82152 | 4687 | 0 | 0 |
Open Application | 39607 | 9364 | 24613 | 4509 | 0 | 0 |
Idle (1 min, static mouse) | 0 | 0 | 12660 | 6200 | 0 | 0 |
Idle (1 min, anim. mouse) | 0 | 0 | 24810 | 9217 | 0 | 0 |
Idle (1 min, no mouse) | 0 | 0 | 6390 | 2200 | 0 | 0 |
Idle (1 min, full screen) | 0 | 0 | 1709 | 2960 | 0 | 0 |
Typing (1 min, 382 chars) | 377159 | 135392 | 79617 | 74304 | 0 | 0 |
Cut Paragraph | 125969 | 38786 | 79618 | 74304 | 0 | 0 |
Paste Paragraph | 91811 | 29430 | 1437 | 1899 | 658 | 461 |
Copy Paragraph | 153062 | 41224 | 2508 | 2979 | 295 | 460 |
Find in Paragraph | 154306 | 40750 | 5157 | 2312 | 1390 | 1965 |
Save Document | 187768 | 49384 | 10621 | 5684 | 1238 | 2004 |
New Document | 60940 | 19498 | 1360 | 2123 | 689 | 875 |
Open Document | 114686 | 25120 | 6590 | 3144 | 1423 | 1396 |
Resizing from full screen | 741322 | 60056 | 180576 | 16185 | 0 | 0 |
Drag 1/4 size window | 697433 | 64530 | 134016 | 212275 | 0 | 0 |
Drag mouse across screen | 308324 | 99300 | 1471 | 3726 | 0 | 0 |
Tear down | 9618 | 3006 | 1779 | 2097 | 1667 | 2210 |
The Web-based thin-client approach using HTTP/HTML consumes very little bandwidth because HTML represents a presentation's semantics at an extremely high level. While a relatively small amount of information is transported, this approach suffers from the problem that HTTP was not designed for implementing remote applications, but rather for sharing static data. In contrast, the remote frame buffer approach operates on the premise that compatibility with existing applications is paramount at the expense of network bandwidth. This is because many of the remote frame buffer implementations were designed for corporate or lab network environments whose administrators are trying to move users away from desktop computers to a thin-client subsystem with a lower total cost of ownership.
The RemoteJFC distributed user interface toolkit attempts to combine the benefits of both approaches without their performance and usability issues by transmitting the high-level semantics of a display using a standard toolkit API. Intuitively, one would expect the network bandwidth consumed by RJFC to be closer to that of the Web-based approach using HTTP/HTML than that of the remote frame buffer approach, while permitting rich user interaction without artificially introduced latency. Figure 9 is a comparison of the bandwidth consumed by RemoteJFC and the AT&T VNC remote frame buffer system.
All thin-client systems need some kind of software browser or viewer that must reside in permanent storage on the client computer. Because the Web-based approach and the RemoteJFC approach both transmit high-level information across the network, the size of the client software package is much larger than that of the VNC viewer. The size of a typical Web browser download is about 25 megabytes, as compared to the VNC viewer, which can be about 110 kilobytes. The RemoteJFC viewer lies somewhere in between: the library adds 2.5 megabytes to a Java Runtime Environment, which can vary in size from 3 to 15 megabytes. In addition, the VNC viewer memory image when attached to an 800x600 desktop consumes 1.5 megabytes of RAM, whereas both the Web browser and RemoteJFC viewer require approximately ten times that amount. This also results in faster startup times for the VNC viewer than for a Web browser or the RemoteJFC viewer.
Overall, the remote frame buffer approach is much ``thinner'' than the Web-based and RemoteJFC approaches and is capable of running on less powerful hardware, but requires much more network bandwidth to operate effectively.
We believe that the distributed user interface toolkit approach, as embodied in RemoteJFC, represents a powerful competitor to the existing methods of providing thin-client applications to the user. We are currently considering a number of possible directions in which to extend our work.
One attractive possibility is the implementation of another Doclet API code generator to automatically convert desktop JFC applications to thin-client RemoteJFC applications. We will also attempt to add new features to the RemoteJFC protocol. For example, since our protocol is client-side stateless, it is possible to create a collaborative groupware version of the RJFC toolkit. We would then be able to compare our system directly with the rich body of research in that area. In addition, we will consider augmenting the RemoteJFC toolkit to support a hybrid client capable of handling some events locally on the client while transmitting other events to the server.
Finally, we believe that exploring how to optimize the RJFC protocol could provide insight into the information complexity of a user interface. By knowing exactly how much information we are transmitting across the network, we can gain a better understanding of a user interface's efficiency and how to improve it.
This research is supported in part by NSF Grant IIS-98-17434 and gifts from Microsoft and Intel. Any opinions, findings, conclusions, or recommendations expressed in this material are those of the authors, and do not necessarily reflect the views of the NSF or any other organization supporting this work.