Global Cooperative Computing

Andre DeHon

Jeremy Brown

Ian Eslick

Jake Harris

Lara Karbiner

Thomas F. Knight, Jr.

Abstract:

Pervasive network connectivity changes our computational landscape. Each computer now resides within a global computer web rather than existing in isolation. Until recently, use of the network has been manual and primitive, requiring sophistication and effort on the part of the user to exploit the resource. The World-Wide Web transcends this traditional model, providing native support for the global computer web. To date, however, the WWW has been used primarily to publish information. Global cooperative computing represents the next step in exploiting wide-scale network connectivity wherein distributed users and computers globally cooperate on common computing problems. In this paper we focus on how the World-Wide Web can be used to rapidly advance the state of the art in computation and change the way software is developed, maintained, debugged, optimized, and used. The cooperative computing model we advocate supports systems which continually adapt to user needs and global information. This adaptability delivers considerable performance and convenience advantages to both developers and end users.

Overview

As network connectivity becomes an integral part of all computing systems, we can change many of our computing paradigms to take advantage of this resource. Exploiting the resource fully, however, requires a paradigm shift away from the models of computing employed when isolation was the norm rather than the exception. We explicitly consider the benefits of redesigning basic processes to gain advantage from a global web of computers. In particular, we focus on the implications of this model for the way software is developed, maintained, debugged, optimized, and used.

Individual computing users in our world often share common goals. Users want their applications to run robustly and quickly. Some want to extend and improve the applications they use, and almost all users want to take advantage of enhancements others make in the applications they run. To the extent that groups of individuals run common applications, they have common goals. Common applications include editors such as emacs or Microsoft Word, browsers like Lynx or Mosaic, and operating systems such as Linux, OSF/1, MS-DOS, or Windows. Experience with the performance, security, reliability, and robustness of a common application is of interest to the entire user base. Similarly, improvements which increase any of these aspects of an application may benefit the entire user base.

Global cooperative computing considers how to exploit pervasive network connectivity to allow individuals to gain maximum benefit from the community of users who share common computing goals. By changing the computational paradigm to encompass the shared user base, tools can automatically collect and assimilate information about common use and common defects -- e.g. bugs which cause the program to crash. Improvements can be managed in a more automated way, and users can easily transition to enhanced versions of an application. Further, once identified, critical bugs or security issues can rapidly be brought to the attention of a user base, often along with the capability to automatically correct the problem.

Changing Computational Landscape

While computers have been networked for decades, it is only recently that widespread connectivity has left the corporate and university research labs. Network support will soon be part of the least-common-denominator machine configuration, and global network connectivity is not far behind.

Nonetheless, the predominant use of wide-area networking over the past decade has been very manual. The usage is manual in the sense that the network and individual machines are visible to the user, who must be keenly aware of specific features of the network and the individual computers in it to get any benefit. Typical examples of such network use include file transfer, remote login, and electronic mail.

Remote file systems (e.g., NFS and AFS [HKM+88]), printers, and job servers provided slightly more automatic use of the network in local-area domains. To date, however, these automated network uses have been limited to pre-arranged sets of machines within an individual department or organization.

World-Wide Web

The development of the World-Wide Web helped automate network use and began to abstract away the details of network structure and machines. By providing native links which transcend individual machines and local networks, the web deals uniformly with data spread across the global network. The web, in essence, provides a uniform view of ``the global computer'' into which the network connects a user's machine. The combination of native, fine-grained, cross-machine links and common interfaces to distributed services begins to bring the benefits of wide-scale connectivity to the user in a more seamless manner.

To date the World-Wide Web has primarily been used to publish and collect information, and on this front it has made great strides toward building a global web of information. This model, however, has made less progress towards providing a global computation web which brings direct benefits to individual computational processes.

HyperCode

We use HyperCode [BHK+94] as a springboard to facilitate migration from a global web of information to a global web of computation. At the entry level, HyperCode brings world-wide, distributed, HyperText technology to program source code. Program source code is richly annotated with direct links to relevant information including definition sites, program genealogy, quality annotations, and documentation. Basic HyperCode provides explicit links from each function call site to the source of the callee, and these links offer a number of key benefits. They allow programs to precisely indicate their dependencies, making it easier for people to share code. Further, the links identify a canonical home site for each piece of code. This home site serves as a rendezvous location for collection and dissemination of information about the code. At this primitive level, HyperCode fits into the existing model of WWW applications, providing a convenient form for publishing code.
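As a rough illustration of the information such a call-site link carries, consider the following sketch; the record fields, routine names, and URL are invented for this example and do not reflect the actual HyperCode representation.

    # Illustrative sketch only: a made-up record of the information a
    # HyperCode call-site link might carry.  Field names and the URL are
    # hypothetical, not the actual HyperCode format.
    call_site_link = {
        "caller":      "ftp_client.c#parse_reply",
        "callee":      "url_decode",
        "callee_home": "http://code.example.org/net-utils/url_decode",  # canonical home site
        "version":     "1.3",
    }

    def dependencies(links):
        """Collect the canonical home sites a program depends on."""
        return sorted({link["callee_home"] for link in links})

    print(dependencies([call_site_link]))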

Human Cooperative Development

HyperCode can be readily extended to provide a substrate for cooperative computing. As noted above, the HyperCode representation of an application will live on a canonical home server. This server can automatically collect contributions of code alternatives and enhancements. HyperCode links show the relation between the newly submitted code and the existing code. In database fashion, version dependencies are explicitly represented in the HyperCode format. In essence, the HyperCode server acts as a code management system tracking versions, dependencies, and alternatives. Significantly, it facilitates wide-scale collaboration on code development. Much of the tedium associated with collecting and integrating code improvements is automated by the code server.
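A minimal sketch of the bookkeeping such a code server performs might look like the following; the class, its methods, and the version names are hypothetical and serve only to show versions, dependencies, and alternatives being tracked.

    # Hypothetical sketch of a home-site code server's bookkeeping: it
    # records versions with their dependencies and links submitted
    # alternatives to the versions they improve upon.
    class CodeServer:
        def __init__(self):
            self.versions = {}      # version id -> {"source": ..., "depends": [...]}
            self.alternatives = {}  # version id -> list of alternative version ids

        def publish(self, version, source, depends=()):
            self.versions[version] = {"source": source, "depends": list(depends)}

        def submit_alternative(self, base_version, alt_version, source, depends=()):
            """Record a contributed alternative and link it to the original."""
            self.publish(alt_version, source, depends)
            self.alternatives.setdefault(base_version, []).append(alt_version)

    server = CodeServer()
    server.publish("sort-1.0", "...original source...")
    server.submit_alternative("sort-1.0", "sort-1.1-alt", "...improved source...")
    print(server.alternatives["sort-1.0"])   # ['sort-1.1-alt']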

The automatic support for cooperative submission of alternatives removes the burden of coordinating low-level improvements. Individuals who wish to augment a work need only link their contribution as an alternative to the original without involving the original author. At the same time, the improved versions of the code incorporating these changes immediately become available to all consumers.

Comparison with State-of-the-Art

While code management systems at the local network level are not new, HyperCode cooperative code development represents a significant improvement over the current state of the art in shared software development and distribution. Today, contributors make whole programs available via FTP [PR85]. When potential consumers find the code, they must manually retrieve and install it. Such programs often implicitly depend on other programs, and the consumer is burdened with understanding these relationships and making sure all the implicit dependencies are satisfied. Consequently, a huge cognitive burden is placed on potential consumers to properly install the software and prepare it for use.

At the same time, complicated dependencies within a piece of shared code make it impractical to reuse portions of shared or published software without undertaking significant effort to understand the context required for the code to operate. Coordinating bug reports and improvements is also a huge logistical problem. These factors make shared code maintenance and evolution an enormous undertaking. Despite this overhead, many programmers do come to understand the shared code and can provide improvements. Nonetheless, most of these improvements never become available for public use due to the enormous human effort currently required to verify and coordinate the evolution of shared code. A HyperCode system for code management will automate much of the coordination and dependency tracking, simultaneously lowering the transaction cost of using shared code and the effort required to evolve it.

Human and Computer Feedback

The canonical home repository can also collect and integrate both explicit and implicit feedback from users and program usage. Explicit feedback comes from human users specifying supplementary information about existing code. Implicit feedback can be taken directly from applications as they are used.

Explicit Feedback

Users of code fragments can post comments to the HyperCode repository. The server integrates these explicit user comments with the code, making them available for developers and consumers. Such comments may include critiques of the code, warnings about potential assumptions and limitations, or additional documentation on code behavior. These comments allow a user to report on his experience with a piece of code, sharing his observations with the entire community of users. Often, after working with an application or routine for some time, the user has a new perspective on the code and its behavior which will be valuable to other users and code maintainers. By further assigning semantic tags to comments, tools can automatically make use of this user-provided feedback. Some sample tagged comments are sketched below.
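The following is a purely illustrative sketch of what semantically tagged comments might look like; the tag vocabulary and the routine names are invented for this example.

    # Hypothetical examples of semantically tagged user comments.  The tag
    # names (BUG, LIMITATION, DOC, PERFORMANCE) and routine names are
    # invented for illustration.
    tagged_comments = [
        {"tag": "BUG",         "code": "parse_date", "text": "Rejects dates before 1970."},
        {"tag": "LIMITATION",  "code": "fast_sort",  "text": "Assumes fewer than 2^16 elements."},
        {"tag": "DOC",         "code": "open_url",   "text": "Retries three times before giving up."},
        {"tag": "PERFORMANCE", "code": "matmul",     "text": "Slow for matrices larger than 512x512."},
    ]

    # A tool can filter comments by tag, e.g. to surface all reported bugs.
    bugs = [c for c in tagged_comments if c["tag"] == "BUG"]
    print(bugs)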

In this role, HyperCode associates feedback directly with the code, communicating knowledge about applications readily among consumers.

Implicit Feedback

Valuable feedback information can also be taken directly from application use without burdening the user. Every time a program is run, the user generates interesting information about the program: how often and how heavily it is used, whether it crashes, where it spends its time, and what kinds of data it processes.

This kind of information can be used to optimize programs, determine reliability, and decide which programs are most important to users. Since the consumer is online and connected with the canonical home repositories, this information can be seamlessly collected and returned to the home site without involving or impacting the user, as sketched after the examples below.

For example:

  1. Usage -- By simply monitoring invocations, runtime, and resource usage for an application, a computer can determine the actual usage patterns which the application encounters. Each user's machine can keep a local record of its own use to determine the applications which are most relevant to that user, and hence the applications which it should consider most carefully. Each home site can log aggregate usage for the application. This information allows users and potential users to determine the maturity of a piece of software, the size of the user community, and the typical workloads.

  2. Coredumps and program crashes -- In some systems, every time a program crashes it generates useful information which can be used to measure the robustness of the program and to zero in on areas which need improvement. Each user's machine can report a synopsis of the crash to the canonical home site. This alerts developers and consumers to potential problems in the code. These problems may arise from datasets different from those the original developers expected, and hence the feedback can help developers identify and rectify previously unforeseen problems. Further, by normalizing aggregate crash incidents for a program against aggregate runtime or invocations, one can derive crude measures of robustness.

  3. Profiling -- Profiling information, such as that produced by prof or gprof [GKM82, FS93], can be collected and reported, providing a more fine-grained view of program usage and bottlenecks. Such profiling information is useful in improving code. Aggregated across widespread use, it gives developers and performance tuners a fairly comprehensive view of deployed performance. Further, when segregated by machine type, it shows the effects of machine differences on relative efficiency and on the sources of program bottlenecks.

  4. Gross Dataset Statistics -- Datasets can be monitored to determine the common cases actually seen in deployed applications. Again, developers and compilers can produce more highly optimized code when they know the cases of typical use [Lam84]. Further, gross differences in dataset statistics may suggest the need for different versions of code optimized and specialized for different users and usages. If the user's machine also keeps its own statistics on local application usage, the machine can automatically pick the best version of a piece of code to provide the user with the highest performance.
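A minimal sketch of how a user's machine might summarize and report this kind of implicit feedback follows; the report fields, the feedback URL, and the use of JSON over HTTP are assumptions made for illustration rather than a defined protocol.

    # Hypothetical sketch: summarize locally gathered implicit feedback and
    # post it to the application's canonical home site.  The URL and report
    # fields are invented; a real system would also anonymize and batch reports.
    import json
    import urllib.request

    def report_usage(app, invocations, runtime_seconds, crash_synopses, home_site):
        report = {
            "application": app,
            "invocations": invocations,
            "runtime_seconds": runtime_seconds,
            "crashes": crash_synopses,
            # crude robustness measure: crashes per invocation
            "crash_rate": len(crash_synopses) / max(invocations, 1),
        }
        data = json.dumps(report).encode("utf-8")
        request = urllib.request.Request(home_site + "/feedback", data=data,
                                         headers={"Content-Type": "application/json"})
        with urllib.request.urlopen(request) as response:
            return response.status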

Dissemination and Software Evolution

The canonical home site for an application serves as a common point for information collection. In this role it also serves as a central point for dissemination of relevant information back to the consumers of the application. As noted above, the home site collects information on usage, program improvements, quality, and user experience. This information can be beneficial to the entire user base in various ways.

Each application knows its own home site. As applications are run, the user's computer can check in with the home site to see if there is new and relevant information which should come to the attention of the user or, at least, the user's machine. This routine check provides a path for targeting relevant information back to the user base.

Collected information benefits consumers in a number of ways.

One theme that appears throughout is the opportunity to allow a user's software to adopt new developments in a seamless manner. By posting annotations at the home site, developers communicate new developments to the entire user base. With explicit links tracking dependencies among applications and code, the user's computer can automatically acquire and install upgraded versions of software in a consistent manner. When applications depend on other applications, or even on particular versions of other applications, for proper installation and execution, the HyperCode dependency framework tracks these requirements. The framework automatically identifies the proper versions, allowing the system to install them as necessary. The human user need only be involved in this automatic upgrade process to the extent that he must approve the adoption of updates and the installation of new software.
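A minimal sketch of resolving a consistent set of versions before an upgrade is installed appears below; the dependency table and package names are hypothetical and stand in for the explicit HyperCode links described above.

    # Hypothetical sketch: given explicit version dependencies (standing in
    # for HyperCode links), compute everything that must be installed for a
    # requested upgrade, dependencies first.
    DEPENDS = {
        "editor-2.1":     ["regex-lib-1.4", "widget-kit-3.0"],
        "widget-kit-3.0": ["regex-lib-1.4"],
        "regex-lib-1.4":  [],
    }

    def install_order(package, deps=DEPENDS, seen=None):
        """Return the package and its dependencies in installation order."""
        if seen is None:
            seen = []
        for dep in deps[package]:
            install_order(dep, deps, seen)
        if package not in seen:
            seen.append(package)
        return seen

    print(install_order("editor-2.1"))
    # ['regex-lib-1.4', 'widget-kit-3.0', 'editor-2.1']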

At some level human intervention is required to approve the adoption of new software. This need does not imply that every user must be burdened with the approval process. In some cases a consumer may designate a system manager, a vendor, or a remote maintenance site to handle and provide approval for automatic updates. The consumer may then give the system blanket authority to automatically incorporate any code or transforms provided by these designated individuals or sites. Cryptographic authentication and checksums can serve to guarantee the uncorrupted and unforged delivery of code and its associated approval.
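A minimal sketch of the checksum half of that guarantee follows, assuming the home site publishes a known-good digest alongside each update; a real deployment would additionally verify a digital signature from the designated approver.

    # Hypothetical sketch: verify that downloaded code matches the digest
    # published by its home site before installing it.  A real system would
    # also check a digital signature over the digest and the approval.
    import hashlib

    def verify_update(code_bytes, published_sha256_hex):
        actual = hashlib.sha256(code_bytes).hexdigest()
        return actual == published_sha256_hex

    update = b"patched subroutine source ..."
    published_digest = hashlib.sha256(update).hexdigest()   # as published by the home site
    assert verify_update(update, published_digest)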

Implications

The aforementioned model begins a trend toward truly global, cooperative computing. The connected network becomes a source not just of information, but also of cooperating peers who assemble improvements of mutual benefit.

Applications are no longer static entities with no knowledge of their world. The application and its users know about its home site and use it as a global bulletin board to share information. Most of the interaction with the outside network is handled by the user's computer, providing the user with the benefits of network connectivity and cooperation without burdening him with the details of the operation. At the same time, much valuable feedback is extracted transparently and directly from an application's use. The user's application is empowered to evolve quickly, tracking improvements with minimal imposition on the user.

Each machine now becomes a part of a global computer. Each user still maintains autonomy over his machine configuration, machine utilization, applications, and data. When users have common goals, their resources are efficiently pooled to their mutual benefit. In this paper we have explicitly mentioned shared goals to improve application functionality, quality, performance, and security. Cooperative computing endeavors, however, are not limited to code and application development. We expect this cooperative application development to lead the way toward a new class of cooperative computing tasks.

What's Next

The scenarios described here are really only the beginning. There are numerous problems with the following properties: they are computationally hard to solve, proposed solutions can be verified in a reasonable amount of time, and solutions benefit a large community of users.

We further assume that individuals are willing to dedicate their ``idle'' computer time to solving problems which benefit them. Consider, for example, the recent collaborative effort to factor RSA-129 [Leu94]. In these cases, the global computer can co-opt volunteered resources to solve hard problems for everyone's benefit. The fact that solutions are verifiable in reasonable time allows individual computers to double-check the solutions they get from others and ensure that the proposed solutions are, in fact, correct.
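The factoring example makes this asymmetry concrete: finding the factors took months of pooled effort, yet any single machine can verify a proposed factorization instantly, as in the sketch below (shown with small stand-in numbers rather than RSA-129 itself).

    # Verifying a proposed solution is cheap even when finding it is hard:
    # a claimed factorization is checked with a single multiplication.
    # The numbers here are small stand-ins for illustration.
    def verify_factorization(n, p, q):
        return p > 1 and q > 1 and p * q == n

    print(verify_factorization(3337, 47, 71))   # True: 47 * 71 == 3337
    print(verify_factorization(3337, 13, 257))  # False: wrong factors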

Code optimization is a good example of a problem which has these properties. Finding the optimal sequence of instructions to perform a function is NP-hard. As a result, we generally do not even try. However, once an optimal or improved solution is found, it benefits everyone who uses the code, and does so in proportion to their frequency of use. Highly optimized versions of heavily used subroutines are especially good candidates. Work on superoptimization has demonstrated the opportunity for gains on short code segments [Mas87] [GK92]. With feedback on code usage, computers can automatically determine the most beneficial sequences to optimize. A large user base cooperating to cover such a huge search space can reasonably attack far larger problems than a single, isolated computer.

In a similar manner, many optimization problems (e.g., Travelling Salesman, VLSI layout, Boolean factoring and optimization, AI search, planning) have these properties. Global cooperative computing and wide-area databases can attack these problems in the same way. Distributed repositories collect problems, register interest, and coordinate the distributed optimization and search. Individual machines work on the problems which benefit their owners most and communicate the results back to the coordinating repository. Everyone benefits in proportion to the size of the user base interested in solutions to a given problem. Once an optimal solution is found, it becomes part of the world-wide knowledge base available to everyone. In this manner, a world-wide knowledge base evolves, directed by the collective interests of all participants.
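A minimal sketch of such a coordinating repository follows, using a toy traveling-salesman instance; the interface, the verification test, and the cost function are invented for illustration.

    # Hypothetical sketch of a coordinating repository for a shared
    # optimization problem: machines submit candidate solutions, and the
    # repository verifies each one and keeps the best found so far.
    import itertools, math

    class ProblemRepository:
        def __init__(self, is_valid, cost):
            self.is_valid = is_valid   # cheap verification of a candidate
            self.cost = cost           # objective to minimize
            self.best = None

        def submit(self, candidate):
            """Accept a verified candidate if it beats the best known solution."""
            if not self.is_valid(candidate):
                return False
            if self.best is None or self.cost(candidate) < self.cost(self.best):
                self.best = candidate
                return True
            return False

    # Toy instance: find the shortest tour over three fixed cities.
    cities = {"A": (0, 0), "B": (3, 4), "C": (6, 0)}

    def tour_length(order):
        points = [cities[c] for c in order] + [cities[order[0]]]
        return sum(math.dist(p, q) for p, q in zip(points, points[1:]))

    repo = ProblemRepository(is_valid=lambda t: sorted(t) == sorted(cities),
                             cost=tour_length)
    for order in itertools.permutations(cities):
        repo.submit(list(order))
    print(repo.best, tour_length(repo.best))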

References

BHK+94
Jeremy Brown, Jake Harris, Lara Karbiner, Massimiliano Poletto, André DeHon, and Thomas F. Knight, Jr. HyperCode. In The Second International World-Wide Web Conference, October 1994.

FS93
Jay Fenlason and Richard Stallman. GNU gprof. Free Software Foundation, Inc., 675 Massachusetts Avenue, Cambridge, MA 02139, January 1993.

GK92
Torbjörn Granlund and Richard Kenner. Eliminating Branches using a Superoptimizer and the GNU C Compiler. In Proceedings of the ACM SIGPLAN 1992 Conference on Programming Language Design and Implementation, pages 341-352. ACM SIGPLAN, ACM Press, June 1992. SIGPLAN Notices, Volume 27, Number 7.

GKM82
Susan L. Graham, Peter B. Kessler, and Marshall K. McKusick. gprof: a Call Graph Execution Profiler. In Proceedings of the SIGPLAN '82 Symposium on Compiler Construction, pages 120-126. ACM SIGPLAN, ACM, June 1982. SIGPLAN Notices, Volume 17, Number 6.

HKM+88
J. H. Howard, M. J. Kazar, S. G. Menees, D. A. Nichols, M. Satyanarayanan, R. N. Sidebotham, and M. J. West. Scale and Performance in a Distributed File System. ACM Transactions on Computer Systems, 6:55-81, 1988.

Lam84
Butler W. Lampson. Hints for Computer System Design. IEEE Software, 1(1):11-28, January 1984.

Leu94
Kristin Leutwyler. Superhack. Scientific American, 271(1):17-18, July 1994.

Mas87
Henry Massalin. Superoptimizer -- A Look at the Smallest Program. In Proceedings of the Second International Conference on Architectural Support for Programming Languages and Operating Systems, pages 122-126. ACM, IEEE Computer Society Press, October 1987.

PR85
J. Postel and J. Reynolds. File Transfer Protocol (FTP). RFC 959, USC/ISI, Information Sciences Institute, University of Southern California, 4676 Admiralty Way, Marina del Rey, California, 90291, October 1985.

Andre DeHon is a doctoral candidate at the MIT Artificial Intelligence Laboratory working under the direction of Principal Research Scientist Thomas F. Knight, Jr. For PhD research, Andre is developing computer architectures which tightly couple rapidly reconfigurable logic with conventional processing engines. Andre holds S.B. and S.M. degrees from MIT and worked with the MIT Transit Project on the design and construction of high-performance, fault-tolerant interconnection networks for large-scale multiprocessors.

Jeremy Brown is an M.Eng. candidate at MIT, working under Principal Research Scientist Thomas F. Knight, Jr. at the MIT Artificial Intelligence Laboratory. His M.Eng. thesis is aimed at developing automated specialization of C code. Jeremy is supervising the development of HyperCode and is an NSF Graduate Research Fellow. Previously, he worked on the MIT Media Lab's Project ALIVE, a non-intrusive VR project first demonstrated at SIGGRAPH 1993.

Ian Eslick is an M.Eng. candidate at MIT, working under Principal Research Scientist Thomas F. Knight, Jr. at the MIT Artificial Intelligence Laboratory. Ian has been involved with the construction of the multiprocessor prototype in the MIT Transit Project.

Jake Harris is an MIT sophomore who is helping to develop HyperCode as part of the Undergraduate Research Opportunity Program.

Lara Karbiner is an MIT sophomore who is helping to develop HyperCode as part of the Undergraduate Research Opportunity Program.

Thomas F. Knight, Jr. is a Principal Research Scientist at the MIT Artificial Intelligence Laboratory. Knight got his PhD from MIT in 1983. He was an Associate Professor at MIT from 1983-1991. Knight was a founder and technical director at Symbolics, Inc. Knight directed the MIT Transit Project and now oversees several projects at the MIT AI Lab including the Reinventing Computing Project.

Contact Author: andre@ai.mit.edu
