What after Java? From objects to actors

Carlos A. Varela and Gul A. Agha

Open Systems Laboratory, Department of Computer Science,
University of Illinois at Urbana-Champaign, Urbana, IL 61801, U.S.A.

cvarela@uiuc.edu and agha@cs.uiuc.edu

Abstract
In this paper, we discuss some drawbacks of the Java programming language, and propose some potential improvements for concurrent object-oriented software development. In particular, we argue that Java's passive object model does not provide an effective means for building distributed applications, critical for the future of Web-based next-generation information systems. Specifically, we suggest improvements to Java's existing mechanisms for maintaining consistency across multiple threads (e.g. synchronized), sending asynchronous messages (e.g. start/run methods) and controlling resources (e.g. thread scheduling). We drive the discussion with examples and suggestions from our own work on the Actor model of computation.

Keywords
Actors; Concurrent object-oriented programming; Distributed systems; Java

1. Introduction

Java uses a passive object model in which threads and objects are separate entities. As a result, Java objects serve as surrogates for thread coordination and do not abstract over a unit of concurrency. We view this relationship between Java objects and threads as a serious limiting factor in the utility of Java for building concurrent systems. Specifically, while multiple threads may be active in a Java object, Java only provides the low-level synchronized keyword for protecting against multiple threads manipulating an object's state simultaneously, and lacks higher-level linguistic mechanisms for more carefully characterizing the conditions under which object methods may be invoked. Java programmers often overuse synchronized, and the resulting deadlocks are a common bug in multi-threaded Java programs.

Java's passive object model also limits mechanisms for thread interaction. In particular, threads exchange data through objects using either polling or wait/notify pairs to coordinate the exchange. In decoupled environments, where asynchronous or event-based communication yields better performance, Java programmers must build their own libraries which implement asynchronous message passing in terms of these primitive thread interaction mechanisms. Although active objects, or actors [1], can greatly simplify such coordination and are a natural atomic unit for system building, they are not directly supported in the current version of Java.

Finally, we find Java's position on thread scheduling to be inadequate. While it is reasonable to not require applications to use fairly scheduled threads, we believe that system builders should have the option of selecting fair scheduling if necessary. The lack of fairness in thread scheduling is a particularly devious source of race conditions, whereby correct application behaviour depends upon relative rates of progress among threads and nondeterministic thread preemption. Not having fair thread scheduling makes debugging multi-threaded applications even more difficult. For a further discussion on liveness problems in Java, we refer the reader to [7].

In the remainder of this article, we elaborate on each of these criticisms and describe potential solutions.

2. Linguistic support for synchronization

Synchronization in Java is necessary to protect state properties associated with objects. For example, the standard class java.util.Hashtable defines a synchronized put method for adding key-value pairs, and a synchronized get method for looking up the value associated with a key. Both methods are synchronized to avoid corrupting the state when methods are simultaneously invoked by separate threads. This mechanism works well for classes like Hashtable because methods in these classes have relatively simple behavior and do not participate in complex interactions with other classes.

A side-effect of the convenience and simplicity of synchronized, however, is that it tends to be overused by application programmers: when software developers are not certain about the context in which a method may be called, a rule of thumb is to make it synchronized. This approach guarantees safety in Java's passive object model, but does not guarantee liveness and is a common source of deadlocks. Typically, such deadlocks result from interactions between classes with synchronized methods. For example, consider threads t1 and t2 in the following figure. Thread t1 executes the synchronized method m which attempts to invoke the synchronized method n in an object of class B. Similarly, thread t2 executes the synchronized method n which attempts to invoke the synchronized method m in an object of class A. In a trace in which both threads first acquire their local locks, this simple example results in a deadlock.

class A implements Runnable{
  B b;
  synchronized void m() {
    ...b.n();...
  }
  public void run() { m(); }
}
class B implements Runnable{
  A a;
  synchronized void n() {
    ...a.m();...
  }
  public void run() { n(); }
}
class Deadlock {
  public static void main(String[] args){
    A a = new A();
    B b = new B();
    a.b = b;
    b.a = a;
    // Thread.start() returns void, so create each thread first, then start it.
    Thread t1 = new Thread(a);
    Thread t2 = new Thread(b);
    t1.start();
    t2.start();
  }
}
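The trace described above, in which both threads first acquire their local locks, can be forced deterministically. The following is a minimal sketch rather than part of the original example: the class names Peer and DeadlockDemo, the latch used to line up the two threads, and the use of the JVM's ThreadMXBean to observe the deadlock are all assumptions made for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

/** Hypothetical class playing the role of both A and B: each instance calls its peer. */
class Peer {
    Peer other;
    private final CountDownLatch bothLocked;

    Peer(CountDownLatch bothLocked) { this.bothLocked = bothLocked; }

    synchronized void outer() {
        bothLocked.countDown();          // this thread now holds its local lock
        try {
            bothLocked.await();          // wait until the other thread holds its lock too
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return;
        }
        other.inner();                   // blocks forever: the peer's lock is already held
    }

    synchronized void inner() { }
}

class DeadlockDemo {
    /** Returns how many of the two demo threads the JVM reports as deadlocked on monitors. */
    static int countDeadlockedThreads() {
        CountDownLatch bothLocked = new CountDownLatch(2);
        Peer a = new Peer(bothLocked);
        Peer b = new Peer(bothLocked);
        a.other = b;
        b.other = a;
        Thread t1 = new Thread(a::outer);
        Thread t2 = new Thread(b::outer);
        t1.setDaemon(true);              // daemon threads let the JVM exit despite the deadlock
        t2.setDaemon(true);
        t1.start();
        t2.start();

        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long deadline = System.currentTimeMillis() + 5000;
        while (System.currentTimeMillis() < deadline) {
            long[] ids = mx.findMonitorDeadlockedThreads();
            if (ids != null) {
                int found = 0;
                for (long id : ids) {
                    if (id == t1.getId() || id == t2.getId()) found++;
                }
                if (found > 0) return found;
            }
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return 0;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        System.out.println("deadlocked threads: " + countDeadlockedThreads());
    }
}
```

Each method is individually safe; the deadlock arises only from the interaction between the two synchronized classes, which is the point the example is meant to make.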

We view the synchronized keyword as too low-level for effective use by application developers. Specifically, requiring developers to implement sophisticated synchronization constraints in terms of low-level primitives makes programming error prone and resulting code difficult to debug. Synchronizers [3, 4] are linguistic abstractions which describe synchronization constraints over collections of active objects, or actors. In particular, synchronizers allow the specification of message patterns which are associated with rules that enable or disable methods on actors. Synchronizers may also have state and predicates may be defined which use this state in order to enable or disable methods, as shown in Fig. 1.


Fig. 1. Synchronization constraints over a collection of actors.

Note that synchronizers are much more abstract than the low-level synchronization support currently provided in Java. Synchronizers may be placed on individual actors as well as on overlapping collections of actors. Moreover, separating synchronization into a distinct linguistic abstraction, rather than embedding it in class definitions, allows constraints to be reused over different classes. Additionally, synchronization constraints can be inherited in subclasses more cleanly than is currently possible in Java. As a simple example of how synchronizers may be specified linguistically, consider two resource managers, adm1 and adm2, which distribute resources to clients. We wish to place a bound on the total number of resources allocated collectively by both managers. This can be achieved by defining the synchronizer given below. The parameter max bounds the total number of resources allocated by both managers.

AllocationPolicy(adm1,adm2,max)
    { init prev := 0
      prev >= max disables (adm1.request and adm2.request),
      (adm1.request xor adm2.request) updates prev := prev + 1,
      (adm1.release xor adm2.release) updates prev := prev - 1
    }

Distributed environments, in which wide varieties of synchronization properties may be required, argue for an approach more similar to synchronizers than the current Java solution of embedding low-level synchronization within classes.
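To make the AllocationPolicy constraint concrete, the following sketch approximates it in plain Java. It is only an emulation under stated assumptions: Java has no synchronizer construct, so the class names AllocationPolicy, ResourceManager and AllocationPolicyDemo, and the methods tryRequest, awaitRequest and released, are hypothetical. The policy lives in one object shared by both managers, and the managers themselves contain no synchronization code.

```java
/** Hypothetical stand-in for the AllocationPolicy synchronizer: shared state plus enabling rules. */
class AllocationPolicy {
    private final int max;  // bound on resources allocated by all managers together
    private int prev = 0;   // resources currently allocated (the synchronizer's state)

    AllocationPolicy(int max) { this.max = max; }

    /** Non-blocking form: refuses the request while "prev >= max disables request" holds. */
    synchronized boolean tryRequest() {
        if (prev >= max) return false;
        prev++;             // "updates prev := prev + 1"
        return true;
    }

    /** Blocking form, closer to a synchronizer: the request is delayed until it is enabled. */
    synchronized void awaitRequest() throws InterruptedException {
        while (prev >= max) wait();
        prev++;
    }

    /** "updates prev := prev - 1", then wakes any delayed requests. */
    synchronized void released() {
        if (prev > 0) prev--;
        notifyAll();
    }
}

/** Hypothetical resource manager; the constraint is applied from outside the class. */
class ResourceManager {
    private final AllocationPolicy policy;

    ResourceManager(AllocationPolicy policy) { this.policy = policy; }

    boolean request() { return policy.tryRequest(); }

    void release() { policy.released(); }
}

class AllocationPolicyDemo {
    public static void main(String[] args) {
        AllocationPolicy policy = new AllocationPolicy(2);
        ResourceManager adm1 = new ResourceManager(policy);
        ResourceManager adm2 = new ResourceManager(policy);
        System.out.println(adm1.request()); // true
        System.out.println(adm2.request()); // true
        System.out.println(adm1.request()); // false: the collective bound is reached
        adm2.release();
        System.out.println(adm1.request()); // true: a release by adm2 re-enables adm1
    }
}
```

Unlike a true synchronizer, this emulation still requires each manager to call into the policy explicitly; a linguistic mechanism would intercept the request and release messages without any change to the manager classes.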

3. Messages and asynchronous method invocations

Distributed, heterogeneous systems require the ability to asynchronously participate in interactions in order to take advantage of available local concurrency. Because Java uses a passive object model, threads on a single virtual machine may interact either by polling on shared objects, or by using wait/notify. Although these heavily synchronized methods of interaction are the most common in Java applications, asynchronous interactions may be implemented by spawning extra threads to handle interactions (see code excerpt below).

class C {
  void m(){...}
  void am(){
    Runnable r = new Runnable() {
      public void run(){
        m();
      }
    };
    new Thread(r).start();
  }
}
class AsyncCall {
  public static void main(String[] args){
    C c = new C();
    ...
    c.am();   // asynchronous method call
    ...
  }
}

As in the case of programming synchronization, requiring the application developer to explicitly code complex interaction mechanisms is also prone to error. Asynchronous interactions are an important basic service that we believe should be standard in a distributed programming environment. Thus, we argue for higher-level linguistic support in Java to provide such interaction mechanisms.

Asynchronous interactions are best supported by an active object model such as that provided by actors [1]. In such a model, messages (potential method invocations) are buffered in a mailbox and handled in a serialized fashion by a dedicated master thread. Active objects are thus a natural unit of concurrency and synchronization. Moreover, such objects need not be strictly serialized: intra-object concurrency may be added by allowing the master thread to spawn new threads which access specific internal methods. This form of intra-object concurrency differs from that in Java in that the master thread controls the conditions under which multiple methods may be active, rather than allowing arbitrary threads to execute in an object.
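The mailbox and master thread described above can be approximated with standard library classes. This is a minimal sketch, not a proposal for the language extension itself: the class names CounterActor and ActiveObjectDemo and the methods increment and report are hypothetical, and the mailbox is simply a queue of Runnable messages.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.IntConsumer;

/** Hypothetical minimal active object: a mailbox drained by one dedicated master thread. */
class CounterActor {
    private final BlockingQueue<Runnable> mailbox = new LinkedBlockingQueue<>();
    private int count = 0; // touched only by the master thread, so no locking is needed

    CounterActor() {
        Thread master = new Thread(() -> {
            try {
                while (true) {
                    mailbox.take().run(); // handle buffered messages one at a time
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        master.setDaemon(true);
        master.start();
    }

    /** Asynchronous send: the message is buffered and the caller returns immediately. */
    void increment() { mailbox.add(() -> count++); }

    /** Asynchronous query: the current count is passed to the given customer. */
    void report(IntConsumer customer) { mailbox.add(() -> customer.accept(count)); }
}

class ActiveObjectDemo {
    /** Sends 1000 increments from four threads and returns the count the actor reports. */
    static int runDemo() {
        CounterActor actor = new CounterActor();
        Thread[] senders = new Thread[4];
        for (int i = 0; i < senders.length; i++) {
            senders[i] = new Thread(() -> {
                for (int j = 0; j < 250; j++) actor.increment();
            });
            senders[i].start();
        }
        CountDownLatch done = new CountDownLatch(1);
        int[] result = new int[1];
        try {
            for (Thread sender : senders) sender.join();
            actor.report(value -> { result[0] = value; done.countDown(); });
            done.await();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return result[0];
    }

    public static void main(String[] args) {
        System.out.println(runDemo()); // prints 1000
    }
}
```

No method of CounterActor is synchronized, yet the count is never corrupted, because only the master thread ever touches the actor's state.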

Syntactically, we could represent objects and actors as follows:

class ObjectsAndActors {
  public static void main(String[] args){
    Object o1, o2;
    Actor a1, a2;
    ...
    // A traditional object's synchronous method invocation.
    o2 = o1.method1(args);  
    ...
    // A message is sent asynchronously to actor a1 who must reply
    // to its acquaintance actor a2 asynchronously via a second message.
    a1:message1(args)->a2:message2;
    ...
  }
}

Notice that local computation following the call to the object's method1 must wait for that method to complete, whereas the message message1 is sent asynchronously to the actor a1, allowing local computation to proceed immediately.
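Since the syntax a1:message1(args)->a2:message2 is only a suggestion, its intended behavior can be emulated in current Java as a rough sketch. The assumptions here are that each actor is modelled by a single-threaded executor acting as its mailbox, that the body of message1 is an arbitrary placeholder computation, and that the names Actor, ReplyDemo and send are hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical actor: a single-threaded executor serves as mailbox and master thread. */
class Actor {
    private final ExecutorService mailbox = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r);
        t.setDaemon(true); // let the JVM exit when the demo is done
        return t;
    });

    /** Asynchronous send: returns as soon as the message is buffered. */
    void send(Runnable message) { mailbox.execute(message); }
}

class ReplyDemo {
    static String runDemo() {
        Actor a1 = new Actor();
        Actor a2 = new Actor();
        // Used only so that the demo can observe what a2 eventually receives.
        CompletableFuture<String> observed = new CompletableFuture<>();
        int args = 20;

        // Emulates: a1:message1(args)->a2:message2;
        a1.send(() -> {
            int reply = args + 22;                                        // placeholder body of message1
            a2.send(() -> observed.complete("message2(" + reply + ")"));  // reply goes to a2, not to the sender
        });

        // Local computation would proceed here without waiting for a1 or a2.
        return observed.join();
    }

    public static void main(String[] args) {
        System.out.println(runDemo()); // prints message2(42)
    }
}
```

The sender never blocks on a1; the reply is delivered as a second asynchronous message to a2, which is the behavior the arrow notation is meant to express.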

4. Resource control

A final concern with using Java to develop concurrent systems is the lack of effective support for controlling system resources. A particular example is the ability of application programmers to control thread scheduling. While the Java language specification [5] encourages language implementors to write fair schedulers, this rule is not enforced. Therefore, there is no language guarantee that a runnable thread with an appropriate priority will eventually be active. Hence, different environments may provide different schedulers that favor particular kinds of applications. A common solution is to favor threads which are responsible for maintaining graphical user interfaces. However, while such an approach may be logical for certain applications, it may be infeasible for others. Unfortunately, Java provides no mechanism for selecting features of the scheduler, leaving the task of implementing custom scheduling to application developers.

One possible solution is to include standardized thread scheduling libraries which may be invoked by applications desiring more control over scheduling. However, a user-level approach may not apply to certain critical threads in a system. For example, Java's RMI [6] package handles remote invocations using a separate, non-user-controlled thread which invokes methods on user-defined objects. Because this thread is not under user control (and hence not subject to a user-level scheduling solution), unexpected preemption and deadlock may result. It is possible to "hack" around this problem by modifying the RMI-created thread's properties once within a user-defined method. However, this may have unexpected side-effects since the thread was originally created for use by RMI. As a specific solution, we favor the inclusion of lower-level policy selection which allows application developers to specify their scheduling needs. At a more general level, application developers should be able to specify abstract policies which govern more general classes of resources [2].
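As an illustration of what a user-level scheduling library might offer, the sketch below implements a cooperative round-robin scheduler. It is an assumption-laden toy rather than a proposed standard: the names RoundRobinScheduler, spawn and FairSchedulingDemo are hypothetical, tasks are modelled as step functions that return true while they have more work, and there is no preemption. It also shares the limitation noted above, since it can only be fair to tasks that are handed to it.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.function.BooleanSupplier;

/** Hypothetical user-level scheduler: every ready task gets one step per round. */
class RoundRobinScheduler {
    private final Queue<BooleanSupplier> ready = new ArrayDeque<>();

    /** Registers a task; each call to the task runs one step and returns true if more remain. */
    void spawn(BooleanSupplier task) { ready.add(task); }

    /** Runs until all tasks finish; no ready task waits longer than one full round. */
    void run() {
        while (!ready.isEmpty()) {
            BooleanSupplier task = ready.remove();
            if (task.getAsBoolean()) {
                ready.add(task); // unfinished tasks go to the back of the queue
            }
        }
    }
}

class FairSchedulingDemo {
    /** Builds a task that records its name once per step for the given number of steps. */
    static BooleanSupplier counter(String name, int steps, List<String> trace) {
        int[] remaining = {steps};
        return () -> {
            trace.add(name);
            return --remaining[0] > 0;
        };
    }

    static List<String> runDemo() {
        List<String> trace = new ArrayList<>();
        RoundRobinScheduler scheduler = new RoundRobinScheduler();
        scheduler.spawn(counter("A", 3, trace));
        scheduler.spawn(counter("B", 2, trace));
        scheduler.run();
        return trace;
    }

    public static void main(String[] args) {
        System.out.println(runDemo()); // prints [A, B, A, B, A]
    }
}
```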

Acknowledgements

Carlos Varela is a doctoral candidate in Computer Science at the Open Systems Lab, where Professor Gul Agha serves as director. This research has benefited tremendously from discussions with present and past members of the group. In particular, we would like to thank Mark Astley, Nadeem Jamali, Wooyoung Kim and Bill Zwicky for their very fruitful discussions on the Actor model of computation and Java.

References

[1] G. Agha, Actors: A Model of Concurrent Computation in Distributed Systems, MIT Press, 1986.
[2] G.A. Agha, M. Astley, J.A. Sheikh and C. Varela, Modular heterogeneous system development: a critical analysis of Java, in: Proceedings of the Heterogeneous Computing Workshop (HCW '98), March 30, 1998, Orlando, FL (to appear).
[3] S. Frølund, Coordinating Distributed Objects: An Actor-Based Approach to Synchronization, MIT Press, 1996.
[4] S. Frølund and G. Agha, Abstracting interactions based on message sets, in: Object-Based Models and Languages for Concurrent Systems, Lecture Notes in Computer Science, Springer-Verlag, 1995.
[5] J. Gosling, B. Joy and G. Steele, The Java Language Specification, Addison Wesley, 1996.
[6] JavaSoft Team, Remote Method Invocation Specification, available at http://www.javasoft.com/products/jdk/1.1/docs/guide/rmi/spec/rmiTOC.doc.html
[7] D. Lea, Concurrent Programming in Java, Addison Wesley, 1996.