Experiences Converting a C++ Communication Software Framework to Java

By Prashant Jain and Douglas C. Schmidt

pjain@cs.wustl.edu and schmidt@cs.wustl.edu
Department of Computer Science
Washington University, St Louis

This work is supported in part by a grant from Siemens Medical Engineering, Erlangen, Germany. The article appeared in the January 1997 C++ Report magazine.


1. Introduction

Over the past year, the Java programming language has sparked considerable interest among software developers. Its popularity stems from its flexibility, portability, and relative simplicity (compared with other object-oriented programming languages). Many organizations are currently evaluating how to integrate Java effectively into their software development processes.

This article presents our experiences applying the Java language and its run-time features to convert a large C++ communication software framework to Java. We describe the key benefits and limitations of Java we found while converting our C++ framework to Java. Some of Java's limitations arise from differences between language constructs in C++ and Java. Other limitations are due to weaknesses in Java itself. The article explains various techniques we adopted to work around Java's limitations and to mask the differences between Java and C++. If you're converting programs written in C++ to Java (or more likely, if you're converting programmers trained in C++ to Java), you'll invariably face the same issues that we did. Additional insights on this topic are available from Taligent.

1.1. Background

Java is commonly associated with writing applets for World Wide Web (WWW) browsers like Netscape Navigator and Internet Explorer. However, Java is more than a language for writing WWW applets. For instance, Java's Abstract Window Toolkit (AWT) is a GUI framework that provides the core functionality for developing platform-independent client applications [AWT:96]. In addition, Java's native support for multi-threading and networking makes it attractive for building concurrent and distributed server applications.

The Java team at Sun Microsystems intentionally designed the syntax of Java to resemble C++. Their goal was to make it easy for C++ developers to learn Java. Despite some minor differences (e.g., new keywords for inheritance and RTTI), C++ programmers will be able to ``parse'' Java syntax easily. It took us about a day to feel comfortable reading and writing simple Java examples.

Understanding the syntactic constructs of the language, however, is just one aspect of learning Java. To write non-trivial programs, C++ developers must understand Java's semantics (e.g., copy-by-reference assignment of non-primitive types and type-safe exception handling), language features (e.g., multi-threading, interfaces, and garbage collection), and standard library components (e.g., the AWT and sockets). Moreover, to truly master Java requires a thorough understanding of its idioms and patterns [Lea:96], many of which are different from those found in C++ [Coplien:92].
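
Of the semantic differences listed above, copy-by-reference assignment is probably the one most likely to surprise a C++ programmer. The following minimal sketch (the class names Point and ReferenceDemo are hypothetical and chosen purely for illustration) shows that assigning one object variable to another copies the reference, not the object:

```java
// Hypothetical illustration classes; not part of ACE.
class Point {
  int x;
  Point (int x) { this.x = x; }
}

class ReferenceDemo {
  // Returns the value seen through the first variable after the
  // object has been modified through the second variable.
  static int aliasedValue () {
    Point a = new Point (1);
    Point b = a;   // Copies the reference; no new Point is created.
    b.x = 42;      // Modifies the single object that a and b share.
    return a.x;
  }

  public static void main (String[] args) {
    System.out.println (aliasedValue ());
  }
}
```

In C++, the equivalent assignment between two objects held by value would copy the object, so the change made through b would not be visible through a.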

1.2. Converting from C++ to Java

We believe the best way to master Java (or any other programming language) is to write substantial applications and reusable components. Therefore, the material in this article is based on our experiences converting the ACE framework [Schmidt:94] from C++ to Java. ACE is a freely available object-oriented network programming framework that makes considerable use of advanced C++ features (like templates and pointers-to-methods) and operating system features (like multi-threading, shared memory, dynamic linking, and interprocess communication). The following figure illustrates the architecture of the C++ version of ACE:

Figure 1. The C++ ACE Architecture

The ACE source code release contains over 85,000 lines of C++. ACE has been ported to Win32, most versions of UNIX, VxWorks, and MVS OpenEdition. Approximately 9,000 lines of code (i.e., about 10% of the total toolkit) are devoted to the OS Adaptation Layer, which shields the higher layers of ACE from OS platform dependencies.

As part of a project with the Siemens Medical Engineering group in Erlangen, Germany, we've converted most of ACE from C++ to Java. The primary motivation for converting ACE to Java was to enhance Java's built-in support for networking and concurrent programming functionality. Many ACE components (such as Task, Timer Queue, Thread Manager, and Service Configurator), along with traditional concurrency primitives (such as Semaphore, Mutex, and Barrier), were converted from C++ to the Java version of ACE. These ACE components provide a rich set of constructs layered atop Java's existing libraries. In addition, the Java version of ACE includes implementations of several distributed services (such as a Time Service, Naming Service, and Logging Service).

The following figure illustrates the architecture of the Java version of ACE:

Figure 2. The Java ACE Architecture

The Java version of ACE shown in Figure 2 is smaller than the C++ version of ACE shown in Figure 1. There are several reasons for the reduction in size and scope:

In general, converting the ACE framework from C++ to Java was relatively straightforward. Most of our effort focused on (1) how to map C++ features onto Java features and (2) how to modify the design of ACE accordingly. Our goal was to keep the ACE interface as consistent as possible with the original C++ version of ACE. Fortunately, since the C++ version of ACE already existed, we used it to guide the decomposition of classes in the Java version of ACE. This illustrates the reuse of design patterns and software architectures, many of which other developers can use when converting their C++ programs to Java.

1.3. Topics Covered in this Article

The following is a synopsis of the topics covered in this article:

This article focuses on the design and programming issues that arise when converting a large communication software framework written in C++ to Java and how to resolve these issues. Although the performance of Java relative to C++ is a very important topic, it's beyond the scope of this article. We'll present performance measurements of the Java version of ACE compared to the C++ version of ACE in a subsequent article.


2. Key Benefits of Java

The Java programming language provides a rich set of language features and reusable components. The following are the key benefits of Java we identified in converting ACE from C++ to Java:


2.1. Simplicity

One of the primary goals of the Java design team was to keep the language simple and easy to learn. This was done by making it look similar to C++ while omitting many C++ constructs. For example, several error-prone aspects of C++ (such as pointers and operator overloading) have been deliberately omitted from Java to make it smaller, easier to learn, and easier to use. We found Java to be a relatively concise language that offers the benefits of object-oriented programming, plus a useful set of standard library components.

Learning Java and using it to convert ACE from C++ to Java was a relatively smooth process for us. It only took a few days to convert major portions of ACE from C++ to Java. Since many Java constructs are similar to C++, converting some of ACE to Java was mostly a matter of mapping the C++ constructs to their Java counterparts. Having done this, it was relatively easy to build the higher-layer ACE communication services and example applications using Java.

For example, it took us less than half a day to reimplement the ACE distributed Time Service using the Java version of ACE. Of course, one reason for our productivity was that we'd already figured out how to design and implement the distributed Time Service in C++. However, the simplicity of Java also contributed significantly to this rapid development cycle.


2.2. Portability

One of the most time-consuming aspects of implementing the C++ version of ACE has been porting it to dozens of OS platforms and C++ compilers. This is a challenging task since not only must we encapsulate platform-specific system calls, but we must also wrestle with many broken C++ compilers (templates and exception handling are particularly problematic in our experience).

In contrast, Java offers the promise of platform portability. This, of course, is due to the fact that Java is more than just a language -- it defines a Virtual Machine (the JVM) on which programs implemented in the language will execute. Therefore, unlike C++ (which generally treats platform issues outside the scope of the language definition), the Java programming language can make certain assumptions about the environment in which it runs. This is one of the key factors that changes the idioms and patterns used by Java programmers.

Java programs now run on many OS platforms (such as Solaris, Windows NT, Windows 95, and OS/2) without requiring major modifications. However, as new Java development environments are released by different vendors (who add their own bugs and extensions), it may be hard to maintain such a high degree of portability. In addition, since Java doesn't support preprocessors or conditional compilation, it's unclear how to encapsulate the inevitable platform differences that may arise in practice.


2.3. Built-in Threading and Networking Support

Java's built-in support for threading and networking was crucial to our conversion of ACE from C++ to Java. Although the Java environment also provides standard support for other common programming mechanisms (such as event-driven user-interfaces), we concentrate on Java's threading and socket mechanisms in this article since our focus is on converting communication software frameworks from C++ to Java.

The java.net package and java.lang.Thread class provide useful classes for creating concurrent and distributed client/server applications. Using these classes simplifies client/server application development because the Java wrappers shield programmers from tedious and error-prone threading and socket-level details. For example, the following code shows how a simple ``thread-per-connection'' server application can be written in Java:

 
// A simple concurrent server that accepts connections 
// from clients and creates a ServiceHandler to do the 
// required processing for each connection.
// Note that each ServiceHandler runs in its own 
// thread of control. The code for the ServiceHandler
// class has been left out for brevity.

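import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;

public class Server
{
  // Port used when none is given on the command line.
  // (The value is illustrative; the original listing omits it.)
  static final int DEFAULT_PORT = 10000;

  // Main entry point.
  public static void main (String args[]) {
    int port = DEFAULT_PORT;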
    if (args.length == 1) {
      try {
        port = Integer.parseInt (args[0]);
      } catch (NumberFormatException e) {
         System.err.println (e);
         System.exit (1);
      }
    }
    new Server (port);
  }

  public Server (int port) {
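    // Initialized to null so the compiler can see that the
    // variable is definitely assigned before the event loop.
    ServerSocket acceptorSocket = null;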
    try {
      // Create a new server listen socket.
      acceptorSocket = new ServerSocket (port);
    } catch (IOException e) {
       System.err.println (e);
       System.exit (1);
    }
    System.out.println ("Server listening on port " + port);

    try {
      // Event loop for accepting client connections.
      for (;;) {
        Socket s = acceptorSocket.accept ();

        // Create a handler for every accepted connection 
        // and then have the handler run in its own thread 
        // of control.  This assumes that the ServiceHandler 
        // class implements the Java Runnable interface.

        new Thread (new ServiceHandler (s)).start ();
      }
    } catch (IOException e) {
       System.err.println (e);
       System.exit (1);
    }
  }
}

Note the relatively few lines of code required to write a simple concurrent server. Moreover, note how easily the server can be multi-threaded by starting each ServiceHandler in its own thread of control. Although the C++ version of ACE provides equivalent functionality and parsimony, it's harder to use features like exception handling, sockets, and threading portably across multiple OS platforms.

The following code illustrates how a Java client application can be written to communicate with the server:

 
// A simple client that connects to the server.
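import java.io.IOException;
import java.net.Socket;

public class Client
{
  // Port used when none is given on the command line.
  // (The value is illustrative; the original listing omits it.)
  static final int DEFAULT_PORT = 10000;

  public static void main (String args []) {
    String serverName = null;
    int port = DEFAULT_PORT;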

    try {
      switch (args.length) {
      case 2:
        port = Integer.parseInt (args[1]);
        // fall through
      case 1:
        serverName = args[0];
        break;
      default:
        System.err.println ("Incorrect parameters");
        System.exit (1);
      }
    } catch (NumberFormatException e) {
       System.err.println (e);
       System.exit (1);
    }
    new Client (serverName, port);
  }

  public Client (String serverName, int port) {
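    // Declared outside the try block so the socket remains in
    // scope for the send/receive code that follows.
    Socket s = null;
    try {
      // Create a socket to communicate with the server.
      s = new Socket (serverName, port);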
    } catch (IOException e) {
       System.err.println (e);
       System.exit (1);
    }

    // Client can now send and receive data to/from the 
    // server using the underlying data streams of the socket.
    // ...
  }
}

The Java wrappers for sockets play a role similar to that of the C++ socket wrappers in ACE. They both provide a uniform interface that simplifies communication software implementations. In our conversion of ACE from C++ to Java, it was trivial to use the Java socket wrappers to provide a communication interface equivalent to the C++ version of ACE.
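
Neither listing above shows the ServiceHandler class or any use of the socket's data streams. As a minimal sketch of what such a handler might look like, the following code defines a hypothetical EchoServiceHandler that echoes each line it receives, plus a small EchoDemo driver that exercises it over the loopback interface. Both class names, the echo protocol, and the use of an ephemeral port are assumptions made for this illustration; they are not taken from ACE.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical handler: echoes every line it reads back to the peer.
class EchoServiceHandler implements Runnable {
  private final Socket socket_;

  EchoServiceHandler (Socket s) { socket_ = s; }

  public void run () {
    try {
      BufferedReader in = new BufferedReader (
        new InputStreamReader (socket_.getInputStream ()));
      // The second argument enables auto-flush on println().
      PrintWriter out = new PrintWriter (socket_.getOutputStream (), true);
      String line;
      while ((line = in.readLine ()) != null)
        out.println (line);
    } catch (IOException e) {
      System.err.println (e);
    } finally {
      try { socket_.close (); } catch (IOException e) { /* ignore */ }
    }
  }
}

class EchoDemo {
  // Accepts one connection on an ephemeral port, sends one line
  // from a client socket, and returns the line echoed back.
  static String roundTrip (String msg) {
    try {
      ServerSocket acceptor = new ServerSocket (0);
      try {
        Socket client = new Socket (InetAddress.getLoopbackAddress (),
                                    acceptor.getLocalPort ());
        // Run the handler in its own thread of control.
        new Thread (new EchoServiceHandler (acceptor.accept ())).start ();

        PrintWriter out = new PrintWriter (client.getOutputStream (), true);
        BufferedReader in = new BufferedReader (
          new InputStreamReader (client.getInputStream ()));
        out.println (msg);
        String reply = in.readLine ();
        client.close ();
        return reply;
      } finally {
        acceptor.close ();
      }
    } catch (IOException e) {
      throw new RuntimeException (e);
    }
  }

  public static void main (String[] args) {
    System.out.println (roundTrip ("hello"));
  }
}
```

A production handler would also need a shutdown policy and a bound on the number of threads; the sketch omits both to keep the stream handling visible.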


2.4. Standard Libraries

The Java development environment provides several useful collections in the form of interfaces and classes. These are contained in the java.util package and include Enumeration, Vector, Stack, Hashtable, BitSet, and Dictionary. Providing generic collections as part of the standard development environment simplifies application programming. For example, in the C++ version of ACE, we've implemented reusable components (such as the Map_Manager and the Unbounded_Set) to simplify the development of higher-level ACE components.

In contrast to Java, the standard components we used in the C++ version of ACE were developed from scratch since they didn't exist in all our C++ development environments. Although the ANSI/ISO draft standard is nearing completion, most C++ compilers still don't provide standard libraries that are portable across platforms. Therefore, programmers must develop these libraries, port them from public domain libraries (such as GNU libg++ and HP's STL), or purchase them separately from vendors like Rogue Wave and Object Space.

For instance, to build a portable application in C++ that uses dynamic arrays, developers must either buy, borrow, implement, and/or port a dynamic array class (such as the STL vector). In the case of Java, developers can simply use the Vector class provided in the development environment without concern for portability. The ubiquity of Java libraries is particularly important for WWW applets because the standard Java components can be pre-configured into browsers to reduce network traffic. In addition, the unified strategies provided by the Java standard libraries (such as iteration, streaming, and externalization) provide idioms that Java programmers can easily recognize, utilize, and extend.
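
To make the Vector, Hashtable, and Enumeration idioms concrete, here is a small sketch. The class CollectionsDemo, its helper methods, and the port number are hypothetical. The code deliberately uses the Object-based (raw) collections and explicit casts, as in the Java version discussed in this article; later Java releases added generics, and a modern compiler would otherwise warn about the raw types, hence the annotation.

```java
import java.util.Enumeration;
import java.util.Hashtable;
import java.util.Vector;

@SuppressWarnings({"unchecked", "rawtypes"})
class CollectionsDemo {
  // Sums a Vector of Integer objects, iterating with an Enumeration.
  static int sum (Vector v) {
    int total = 0;
    for (Enumeration e = v.elements (); e.hasMoreElements (); )
      total += ((Integer) e.nextElement ()).intValue ();
    return total;
  }

  // Looks up a port by service name; returns -1 if it is absent.
  static int portOf (Hashtable ports, String service) {
    Integer p = (Integer) ports.get (service);
    return p == null ? -1 : p.intValue ();
  }

  public static void main (String[] args) {
    Vector v = new Vector ();
    v.addElement (Integer.valueOf (1));
    v.addElement (Integer.valueOf (2));
    v.addElement (Integer.valueOf (3));
    System.out.println (sum (v));

    Hashtable ports = new Hashtable ();
    ports.put ("time", Integer.valueOf (10000)); // Hypothetical port.
    System.out.println (portOf (ports, "time"));
    System.out.println (portOf (ports, "naming"));
  }
}
```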


2.5. Garbage Collection and Range Checking

The Java run-time system performs garbage collection and range checking on reference and array accesses. This eliminates a common source of errors in C and C++ programs. In Java, programmers don't need to explicitly free memory resources acquired dynamically. Objects are freed by the garbage collector when they are no longer referenced. This avoids many of the traps and pitfalls associated with dangling references and memory leaks. [Editor's note: Robert Martin's article in this issue on ``Java and C++: A Critical Comparison'' discusses some drawbacks with Java's garbage collection model.]
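
The effect of range checking can be seen in a few lines. In this sketch (RangeCheckDemo is a hypothetical name), a write one element past the end of an array raises an exception instead of silently corrupting adjacent memory, as the equivalent C or C++ code might:

```java
class RangeCheckDemo {
  // Returns true if the run-time system rejected the out-of-range write.
  static boolean outOfRangeIsCaught () {
    int[] buf = new int[4];
    int index = 4;  // One past the last valid index (0..3).
    try {
      buf[index] = 1;
      return false; // Not reached: the access above is checked.
    } catch (ArrayIndexOutOfBoundsException e) {
      return true;
    }
  }

  public static void main (String[] args) {
    System.out.println (outOfRangeIsCaught ());
  }
}
```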

There are tools (such as Purify, Bounds Checker, and Great Circle) that reduce the effort of writing memory-safe C++ code. However, it's been our experience that even though these tools exist, C/C++ programmers typically expend considerable effort avoiding memory leaks and other forms of memory corruption.


2.6. Type-safe Exceptions

Exceptions provide a clean way to handle error conditions without cluttering the ``normal'' code path. Java provides an elegant exception handling model that includes both checked exceptions and unchecked exceptions. For checked exceptions, the compiler verifies at compile-time that every method either catches each checked exception it may encounter or declares it in its throws clause, so no checked exception can go unhandled silently. In addition, Java supports unchecked exceptions (run-time exceptions and errors), which methods may raise without declaring them and which developers can still catch when appropriate.
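
The distinction is easiest to see side by side. In the following sketch, the class ExceptionDemo and its methods are hypothetical and stand in for real connection logic: connect() raises a checked IOException that callers must catch or declare, whereas parsePort() raises only unchecked exceptions and so needs no throws clause.

```java
import java.io.IOException;

class ExceptionDemo {
  // Checked: callers must either catch IOException or declare it.
  static void connect (String host) throws IOException {
    if (host == null || host.length () == 0)
      throw new IOException ("no host given");
    // ... a real implementation would open a socket here ...
  }

  // Unchecked: neither NumberFormatException (from parseInt) nor
  // IllegalArgumentException has to appear in a throws clause.
  static int parsePort (String s) {
    int port = Integer.parseInt (s);
    if (port < 0 || port > 65535)
      throw new IllegalArgumentException ("port out of range: " + port);
    return port;
  }

  // The compiler rejects this method if the catch clause is removed
  // and no throws clause is added.
  static String tryConnect (String host) {
    try {
      connect (host);
      return "connected";
    } catch (IOException e) {
      return "failed: " + e.getMessage ();
    }
  }

  public static void main (String[] args) {
    System.out.println (tryConnect (""));
    System.out.println (parsePort ("8080"));
  }
}
```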

We have consciously avoided the use of exceptions in the C++ version of ACE due to portability problems with C++ compilers. In contrast, when converting the ACE framework from C++ to Java, we were able to make extensive use of Java's exception handling mechanisms. The ability to use exception handling significantly improved the clarity of the Java ACE error-handling logic, relative to the C++ version of ACE.

Although Java exceptions are elegant, they also exact a performance penalty. Depending upon how heavily exception handling is used, the impact on performance can vary significantly. As mentioned earlier, performance measurements of the Java version of ACE will be covered in a subsequent article.


2.7. Loading Classes Dynamically

The Java run-time system loads classes ``on-demand'' as they are needed. Moreover, Java can load classes both from the file system and over the network, which is a powerful feature. In addition, Java provides programmers with the hooks to load classes explicitly via the java.lang.ClassLoader mechanism.

The java.lang.ClassLoader is an abstract class that defines the necessary hooks for Java to load classes over the network or from other sources such as the file system. This class made it easy to implement the Service Configurator framework in ACE. The ACE Service Configurator framework provides a flexible architecture that simplifies the development, configuration, and reconfiguration of communication services. It helps to decouple the behavior of these communication services from the point in time at which these services are configured into an application.

Implementing the Service Configurator framework in the C++ version of ACE is challenging because C++ doesn't define a standard mechanism for dynamic linking/loading of classes and objects. Implementing this functionality across platforms, therefore, requires various non-portable mechanisms (such as OS support for explicit dynamic linking). Moreover, since the draft ISO/ANSI C++ standard doesn't address dynamic linking, C++ compilers and run-time systems are not required to support dynamic linking. For instance, many operating systems will not call the constructors of static C++ objects linked in dynamically from shared libraries.

The fact that Java provides standard mechanisms for dynamic linking/loading of classes significantly simplified the implementation of the ACE Service Configurator framework. This reiterates the fact that Java is more than just a programming language. It defines a run-time environment, and can therefore make certain assumptions about the environment in which it runs. Thus, unlike other languages such as C and C++, the Java run-time environment can portably support important run-time features such as explicit dynamic linking/loading.
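
As a rough illustration of explicit dynamic loading, the sketch below loads a service class by name at run time and initializes it. The Service interface, TimeService class, and DynamicLoadDemo driver are hypothetical and are not the Java ACE Service Configurator interfaces, which this article does not show. The sketch uses Class.forName, the simplest standard hook, rather than a custom ClassLoader subclass, and it creates the instance with getDeclaredConstructor().newInstance(), which current Java releases recommend over Class.newInstance().

```java
// Hypothetical service interface, used only for this illustration.
interface Service {
  String init ();
}

class TimeService implements Service {
  public String init () { return "TimeService initialized"; }
}

class DynamicLoadDemo {
  // Loads the named class at run time, creates an instance, and
  // initializes it. Reflection failures become run-time exceptions.
  static String configure (String className) {
    try {
      Class<?> c = Class.forName (className);
      Service s = (Service) c.getDeclaredConstructor ().newInstance ();
      return s.init ();
    } catch (ReflectiveOperationException e) {
      throw new RuntimeException (e);
    }
  }

  public static void main (String[] args) {
    // In a configurable application the name would typically come
    // from a configuration file rather than from the source code.
    System.out.println (configure (TimeService.class.getName ()));
  }
}
```

Because the class name is an ordinary string, the set of services can change without recompiling the code that loads them, which is the decoupling described above.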


2.8. Using JavaDoc for Documentation

The JavaDoc tool generates API documentation in HTML format for the specified package or for individual Java source files specified on the command line. Using JavaDoc is straightforward. The syntax /** documentation */ indicates a documentation comment (a.k.a., a ``doc comment'') and is used by the JavaDoc tool to automatically generate HTML documentation. In addition, doc comments may contain special tags that begin with the @ character. These tags are used by JavaDoc for additional formatting when generating documentation. For example, the tag @param can be used to specify a parameter of a method. JavaDoc extracts all such entries containing the @param tag specified inside a doc comment of a method and generates HTML documentation specifying the complete parameter list of that method.
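
For example, a doc comment using the @param and @return tags might look like the following sketch. The ClockMath class is hypothetical and exists only to carry the comment.

```java
/**
 * A hypothetical utility class used only to illustrate doc
 * comments; it is not part of ACE.
 */
class ClockMath {
  /**
   * Converts a duration to whole seconds.
   *
   * @param minutes the number of minutes
   * @param seconds the number of additional seconds
   * @return the total duration in seconds
   */
  public static int toSeconds (int minutes, int seconds) {
    return minutes * 60 + seconds;
  }
}
```

Since the class in this sketch is not declared public, javadoc omits it by default; running the tool with the -package option includes package-level classes in the generated HTML.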

The C++ version of ACE also provides automatic generation of documentation using a modified version of the freely available OSE tools. The ACE documentation tools produce UNIX-style man pages (in nroff format), as well as JavaDoc-style documentation (in HTML format).


3. Key Limitations of Java and Our Workarounds

This section describes key limitations of Java we identified when converting ACE from C++ to Java. We've ordered our discussion by the impact that each limitation had on the conversion of ACE from C++ to Java.

  1. Lack of Templates
  2. Lack of Enumerated Types
  3. Lack of Pointers to Methods
  4. Lack of ``Conventional'' Concurrency Mechanisms
  5. Lack of Destructors
  6. Lack of Explicit Parameter-passing Modes
  7. Lack of Explicit Inline Functions

Note that some of the limitations arose because we were converting an existing C++ framework to Java. Had we originally developed ACE from scratch using Java, many problems we encountered, and which we describe here as the ``limitations of Java,'' would have been non-issues. However, we suspect that when C++ programmers learn Java, or first convert code written in C++ to Java, they are likely to face the same issues that we did. Therefore, in addition to outlining the limitations with Java, this section describes the workarounds we applied to alleviate the limitations.


3.1. Lack of Templates

The version of Java available to us (version 1.0.2) does not support templates. This was a significant problem when converting ACE from C++ to Java since ACE uses templates extensively. Templates are used in ACE for two purposes:

  1. Common code factoring -- A typical use of templates in ACE is to avoid tedious recoding of algorithms and data structures that differ only by their types. For instance, templates factor out common code for programming abstractions such as the ACE_Map_Manager and ACE_Malloc classes.

    One workaround for Java's lack of templates is to use Object-based containers like the Smalltalk-style collections available from Doug Lea or the Java Generic Library (which is a conversion of the Standard Template Library from C++ to Java) available from Object Space. [Editor's note: Graham Glass' STL in Action column in this issue describes the design of the Java Generic Library.]

    These solutions are not ideal, however, since they require application programmers to insert casts into their code. Although Java casts are strongly-typed (which eliminates a common source of errors in C and C++ programs that use traditional untyped casts), it is hard to optimize away the overhead of run-time type checking.

  2. Signature-based type conformance -- Another use of templates in ACE is to implement signature-based type conformance. For instance, the ACE_Acceptor and ACE_Connector are parameterized with a class that must conform to the ACE_Svc_Handler interface. An ACE_Svc_Handler is created and initialized when a connection is established either actively or passively. It is an abstract class that applications can subclass to provide a concrete service handler implementation. To parameterize the ACE_Acceptor with an ACE_Svc_Handler, (e.g., HTTP_Svc_Handler), we would do the following in C++ ACE:
    
    class HTTP_Svc_Handler : 
      public ACE_Svc_Handler<ACE_SOCK_Stream> 
    { /* ... */ };
    
    class HTTP_Acceptor : 
      public ACE_Acceptor<HTTP_Svc_Handler, 
                          ACE_SOCK_Acceptor> 
    { /* ... */ };
    
    HTTP_Acceptor *acceptor = 
      new HTTP_Acceptor (addr);
    
    The C++ code parameterizes the ACE_Acceptor statically with the HTTP_Svc_Handler. The advantage of using templates is that there's no run-time function call overhead. Another advantage is the ability to parameterize locking mechanisms. For instance, the C++ version of ACE uses templates to select an appropriate synchronization strategy (e.g., mutexes vs. readers/writer locks), as well as to remove all locking overhead when there is no concurrency.

    Although Java lacks templates, it contains some interesting features that enabled us to support signature-based type conformance by using a pattern based on its meta-class facilities. For instance, here's how the Java ACE Acceptor is defined:

     
    package ACE.Connection;
    
    class Acceptor 
    {
      public Acceptor (Class svcHandlerFactory, int port) {
        // Cache the Class factory used to create instances
        // of the handlers using the newInstance() method.
        svcHandlerFactory_ = svcHandlerFactory;
    
        // Cache the port to listen for connections on.
        port_ = port;
      }
    
      // Perform the Acceptor pattern...  
      // newInstance() raises checked exceptions, which are
      // passed on to the caller here.
      public void accept () 
        throws InstantiationException, IllegalAccessException 
      {
        // Create a SvcHandler.
        SvcHandler sh = 
          (SvcHandler) svcHandlerFactory_.newInstance ();
    
        // Accept connection into the SvcHandler.
        SOCKStream sockStream = sockAcceptor_.accept ();
        sh.setHandle (sockStream);
    
        // Activate the SvcHandler.
        sh.open ();
      }
    
      // Factory that accepts client connections.
      private SOCKAcceptor sockAcceptor_ = new SOCKAcceptor ();
    
      // Class object used to create SvcHandler instances.
      private Class svcHandlerFactory_;
    
      // Port to listen for connections on.
      private int port_;
    
      // ...
    }
    
    To achieve the C++ ACE behavior in Java ACE, a Class object can be created using the class name ``HTTPSvcHandler'' and this can then be passed to the constructor of Acceptor, as follows:
    
    class HTTPSvcHandler extends SvcHandler
    { /* ... */ }
    
    Acceptor acceptor = 
      new Acceptor (Class.forName ("HTTPSvcHandler"), 
                    DEFAULT_PORT);
    
    // ...
    acceptor.accept ();
    
    Once the acceptor object is initialized to listen on a well-known port, the accept method will accept connections and create HTTPSvcHandler objects to communicate with clients.

    The Java code uses the Class object created using the string corresponding to the name of the SvcHandler factory as a parameter to the constructor of Acceptor. The Java ACE Acceptor uses this to create a new instance of the SvcHandler when needed. As long as HTTPSvcHandler is a subclass of SvcHandler, a new Class object can be created and passed to the Acceptor. Therefore, the signature will match that expected by the Acceptor factory.
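
One consequence of this pattern is that conformance is checked at run time rather than at compile time, as it would be with a C++ template parameter. The following self-contained sketch shows one way to surface a mismatch early, when the factory is constructed rather than when the first connection arrives. The Handler, HttpHandler, and HandlerFactory classes are simplified, hypothetical stand-ins for the Java ACE classes (sockets are omitted entirely), and the isAssignableFrom check is our own addition, not something the Java ACE Acceptor is known to do.

```java
// Hypothetical stand-ins for SvcHandler and HTTPSvcHandler.
abstract class Handler {
  abstract String open ();
}

class HttpHandler extends Handler {
  String open () { return "HTTP handler opened"; }
}

class HandlerFactory {
  private final Class<?> handlerClass_;

  // Reject classes that do not conform to Handler when the factory
  // is built, instead of failing on the first cast.
  HandlerFactory (Class<?> handlerClass) {
    if (!Handler.class.isAssignableFrom (handlerClass))
      throw new IllegalArgumentException (
        handlerClass.getName () + " is not a Handler");
    handlerClass_ = handlerClass;
  }

  // Create a fresh handler, as an acceptor would per connection.
  Handler make () {
    try {
      return (Handler) handlerClass_.getDeclaredConstructor ().newInstance ();
    } catch (ReflectiveOperationException e) {
      throw new RuntimeException (e);
    }
  }
}

class FactoryDemo {
  public static void main (String[] args) {
    HandlerFactory f = new HandlerFactory (HttpHandler.class);
    System.out.println (f.make ().open ());

    try {
      new HandlerFactory (String.class); // Does not conform.
    } catch (IllegalArgumentException e) {
      System.out.println ("rejected: " + e.getMessage ());
    }
  }
}
```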


3.2. Lack of Enumerated Types

There are three common uses for enumerated types in C++: value abstraction, type safety, and sequential declarations (for use with arrays and switch statements). In C++, enumerated types are strongly typed. That is, unless you use an explicit cast, the C++ compiler won't allow you to assign instances of different enumerated types to each other, nor can you assign integral values to instances of an enumerated type. In addition, enumerated types convert automatically to their integral values, which is commonly used for efficient table-based dispatching of pointers-to-methods.

In Java, there are no enumerated types. This typically isn't a problem when developing new code, or when writing in an orthodox ``object-oriented'' style, because subclasses and Java's typesafe instanceof operator can be used in place of enumerals (as illustrated below). However, there were several situations where lack of enumerated types was a problem when converting ACE from C++ to Java:

  1. Lack of abstraction and type-safety -- Since Java doesn't provide enumerated types, programmers need to use primitive types (such as int) instead. For example, translating the following C++ ACE code:
    
    // C++ code.
    class ACE_Naming_Msg_Block
    {
    public:
      enum Naming_Msg_Type {
        BIND,    // Request for bind
        REBIND,  // Request for rebind
        RESOLVE, // Request for resolve/find
        UNBIND,  // Request for unbind
        // ... rest omitted...
        MAX_ENUM // maximum enumeration
      };
    
      ACE_Naming_Msg_Block (Naming_Msg_Type mt, 
                            const char *data);
      // ...
    };
    
    into Java code is tedious and error-prone. The result looks like this:
    
    // Java code
    package ACE.ASX;
    
    class NamingMsgBlock
    {
      // Request for bind.
      public static final int BIND = 0;
      // Request for rebind.
      public static final int REBIND = 1;    
      // Request for resolve/find
      public static final int RESOLVE = 2;   
      // Request for unbind
      public static final int UNBIND = 3;    
      // ... rest omitted...
      // Maximum value.
      public static final int MAX_ENUM = 11;  
    
      NamingMsgBlock (int mt, String data) { 
        // ...
      }
    }
    
    Not only is this less concise, but it is also more error-prone because any value of type int can be accidentally passed to the NamingMsgBlock constructor. Moreover, enumeral values can be duplicated accidentally. In contrast, the C++ type-system ensures that only Naming_Msg_Type parameters are passed as arguments to the constructor of ACE_Naming_Msg_Block.

    One workaround in Java for the lack of enumerated types is to use subclassing. In this approach, a base class called NamingMsgType is defined and a subclass of NamingMsgType is created for each kind of message. The following code illustrates this common Java pattern:

     
    // Defines the base type for all message types.
    public abstract class NamingMsgType {}
    
    public class BIND extends NamingMsgType {}
    public class REBIND extends NamingMsgType {}
    public class RESOLVE extends NamingMsgType {}
    // ...
    
    Now we can ensure that an argument of type NamingMsgType is passed to the constructor of NamingMsgBlock. Here's how we can do a ``switch'' to determine the type of the message:

     
    public NamingMsgBlock (NamingMsgType mt, 
                             String data)
    {
      // Use instanceof operator to determine 
      // what mt is an instance of...
      if (mt instanceof BIND) { /* ... */ }
      else if (mt instanceof REBIND) { /* ... */ }
      else if (mt instanceof RESOLVE) { /* ... */ }
      // ...
    }
    
    This solves the problem at the expense of creating a large number of subclasses and forcing iterative search using instanceof.
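
The subclassing pattern can be tried out in isolation with the following sketch. The class names MsgType, Bind, Rebind, Resolve, and MsgTypeDemo are hypothetical, shortened versions of the classes above.

```java
// Hypothetical, simplified versions of the message-type classes.
abstract class MsgType {}
class Bind extends MsgType {}
class Rebind extends MsgType {}
class Resolve extends MsgType {}

class MsgTypeDemo {
  // Walks the instanceof chain to find the message type. The cost
  // of the search grows with the number of message types.
  static String describe (MsgType mt) {
    if (mt instanceof Bind) return "bind";
    else if (mt instanceof Rebind) return "rebind";
    else if (mt instanceof Resolve) return "resolve";
    else return "unknown";
  }

  public static void main (String[] args) {
    System.out.println (describe (new Rebind ()));
    // describe (3) would not compile: an int is not a MsgType,
    // which is the type-safety the int constants could not give.
  }
}
```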

  2. Lack of efficient dispatching -- Although the Java subclassing pattern solves the problem of type-safety, Java's lack of enumerated types precludes the common C/C++ pattern of using enums as indices into arrays for efficient method dispatching. For instance, the C++ ACE_Name_Handler implementation dispatches the appropriate method by using the message type to index into a table of pointers to C++ methods. The following code demonstrates how a table of pointers to methods can be created to dispatch efficiently based on enum Naming_Msg_Type literals (to simplify the example, some ACE C++ class names have been changed):
     
    class ACE_Name_Handler
    {
      // Specify an array of pointers to member functions
      typedef int (ACE_Name_Handler::*OPERATION) (void);
      OPERATION op_table_[ACE_Naming_Msg_Block::Naming_Msg_Type::MAX_ENUM];
    
      ACE_Name_Handler (void) {
        // Set up the array of pointers to member functions
        op_table_[ACE_Naming_Msg_Block::Naming_Msg_Type::BIND] = 
          &Name_Handler::bind;
        op_table_[ACE_Naming_Msg_Block::Naming_Msg_Type::REBIND] = 
          &Name_Handler::rebind;
        op_table_[ACE_Naming_Msg_Block::Naming_Msg_Type::RESOLVE] = 
          &Name_Handler::resolve;
          ...  ...
      }
    
      int bind (void) { /* ... */ }
      int rebind (void) { /* ... */ }
      int resolve (void) { /* ... */ }
        ...
    
      int dispatch (ACE_Naming_Msg_Block::Naming_Msg_Type msg_type);
    };
    
    Our dispatch routine is straightforward:
    
    int ACE_Name_Handler::dispatch 
      (ACE_Naming_Msg_Block::Naming_Msg_Type msg_type) 
    {
      // Dispatch based on the index of the message type.
      // A pointer to member function must be invoked on an object.
      return (this->*op_table_[msg_type]) ();
    }
    
    Building this type of efficient dispatch mechanism in Java requires the use of a primitive type like int for the message type, which incurs the drawbacks described above. A workaround for this problem is presented in Section 3.3, along with a workaround for Java's lack of pointers to methods.


3.3. Lack of Pointers to Methods

As the preceding example demonstrated, the lack of enums in Java precluded us from doing efficient dispatching without using primitive integral types. Not only does Java lack enums, however, it also lacks pointers to methods. Pointers to methods are widely used in GUI frameworks and other event-based systems that demultiplex individual methods based on the arrival of messages.

A common workaround for Java's lack of pointers to methods is to build callback objects via subclassing [Lea:96]. This can be tedious, however, as shown by the following Java rewrite of the C++ ACE_Name_Handler class presented earlier. The example below also illustrates another solution to Java's lack of enumerated types:

 
// Specify an interface that can be implemented
// by classes to build callback objects.
public interface Operation
{
  public int invoke ();
}

// Define callback object to handle BIND messages.
public class BINDHandler implements Operation
{
  public int invoke () { /* Handle BIND messages. */ }
}

// Define callback object to handle REBIND messages.
public class REBINDHandler implements Operation
{
  public int invoke () { /* Handle REBIND messages. */ }
}

// Define callback object to handle RESOLVE messages.
public class RESOLVEHandler implements Operation
{
  public int invoke () { /* Handle RESOLVE messages. */ }
}

// ... rest omitted...

Here is how we can define the NameHandler class. This example first uses the class NamingMsgType to emulate the functionality of enums, as well as to provide abstraction and type-safety.
 
final class NamingMsgType
{
  // Request for bind.
  public static final NamingMsgType BIND = 
    new NamingMsgType ("BIND", 0);
  // Request for rebind.
  public static final NamingMsgType REBIND = 
    new NamingMsgType ("REBIND", 1);            
  // Request for resolve/find.
  public static final NamingMsgType RESOLVE = 
    new NamingMsgType ("RESOLVE", 2);           
  // Request for unbind.
  public static final NamingMsgType UNBIND = 
    new NamingMsgType ("UNBIND", 3);            
  // ... rest omitted...

  public String toString () { return name_; }
  public int val () { return value_; }

  private NamingMsgType (String name, int value) {
    name_ = name;
    value_ = value;
  }
  // The int-value corresponding to the enum.
  private int value_;   
  // The string associated with the enum.
  private String name_; 
}

The following NameHandler class stores the callback objects in a Vector, and its dispatch routine uses the value of the message type to select the right callback object:


public class NameHandler
{
  public NameHandler () {
    // Initialize a dynamically resizing vector.
    opTable_ = new Vector();

    // Insert the elements, starting at location 0.
    opTable_.addElement (new BINDHandler ());
    opTable_.addElement (new REBINDHandler ());
    opTable_.addElement (new RESOLVEHandler ());
    // ... other entries omitted ...
  }

  // Specify the dispatching routine.  The
  // message types MUST be NamingMsgType.
  public int dispatch (NamingMsgType msgType) {
    // Dispatch based on the value of the message type.
    int index = msgType.val ();
    Operation op = 
      (Operation) opTable_.elementAt (index);
    return op.invoke ();
  }

  // Table of callback objects, indexed by
  // the value of the message type.
  private Vector opTable_;
}

Note that if the values of the NamingMsgType ``enumeration'' weren't contiguous, we could use the Java Hashtable class rather than a Vector.
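
The Hashtable alternative can be sketched as follows. This is only an illustration under stated assumptions, not code from Java ACE: the SparseNameHandler class, its nested MsgType class, the values 10, 40, and 70, and the stand-in return values are all hypothetical. The table is keyed by the integer value of the message type, so gaps between values cost nothing.

```java
import java.util.Hashtable;

// Hypothetical sketch: dispatching on non-contiguous message type values.
class SparseNameHandler {
  // Same shape as the Operation callback interface shown earlier.
  interface Operation {
    int invoke ();
  }

  // Type-safe "enum" whose values deliberately leave gaps.
  static final class MsgType {
    static final MsgType BIND = new MsgType ("BIND", 10);
    static final MsgType RESOLVE = new MsgType ("RESOLVE", 40);
    static final MsgType UNBIND = new MsgType ("UNBIND", 70);

    private final String name_;
    private final int value_;

    private MsgType (String name, int value) {
      name_ = name;
      value_ = value;
    }

    int val () { return value_; }
    public String toString () { return name_; }
  }

  // Maps the value of a message type to its callback object.
  private final Hashtable<Integer, Operation> opTable_ =
    new Hashtable<Integer, Operation> ();

  SparseNameHandler () {
    // Only BIND and RESOLVE are registered in this sketch.
    opTable_.put (Integer.valueOf (MsgType.BIND.val ()), new Operation () {
      public int invoke () { return 1; } // Stand-in for BIND handling.
    });
    opTable_.put (Integer.valueOf (MsgType.RESOLVE.val ()), new Operation () {
      public int invoke () { return 2; } // Stand-in for RESOLVE handling.
    });
  }

  // Returns the callback's result, or -1 if no callback is registered.
  int dispatch (MsgType msgType) {
    Operation op = opTable_.get (Integer.valueOf (msgType.val ()));
    return op == null ? -1 : op.invoke ();
  }

  public static void main (String[] args) {
    SparseNameHandler handler = new SparseNameHandler ();
    System.out.println (MsgType.RESOLVE + " -> "
                        + handler.dispatch (MsgType.RESOLVE));
    System.out.println (MsgType.UNBIND + " -> "
                        + handler.dispatch (MsgType.UNBIND));
  }
}
```

Compared with the Vector version, a lookup costs a hash computation rather than an index operation, and a missing entry must be handled explicitly because the table no longer guarantees one callback per slot.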


3.4. Lack of ``Conventional'' Concurrency Mechanisms

Java is a multi-threaded language with built-in language support for threads. The java.lang.Thread class contains methods for creating, controlling, and synchronizing Java threads. The main synchronization mechanism in Java is based on Monitors. Monitors are a relatively simple and expressive model that allows threads to (1) implicitly serialize their execution at method-call interfaces and (2) coordinate their activities via explicit wait, notify, and notifyAll operations.

While Java's simplicity is often beneficial, we encountered subtle traps and pitfalls with its concurrency model [Cargill:96]. In particular, Java's Monitor-based concurrency model can be non-intuitive and error-prone for programmers accustomed to developing applications using threading models (such as POSIX Pthreads, Solaris threads, or Win32 threads) supported by modern operating systems. These threading models provide a lower-level, yet often more flexible and efficient, set of concurrency primitives such as mutexes, semaphores, condition variables, readers/writer locks, and barriers.
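
To make the contrast concrete, here is a minimal sketch of how one conventional primitive, a counting semaphore, might be layered on top of Java's Monitor operations. The CountingSemaphore class and its method names are hypothetical and are not taken from Java ACE; the sketch only assumes the standard synchronized, wait, and notify facilities.

```java
// Hypothetical sketch: a counting semaphore layered on Java Monitors.
class CountingSemaphore {
  // Number of permits currently available.
  private int count_;

  CountingSemaphore (int initialCount) { count_ = initialCount; }

  // Block until a permit is available, then take it.
  synchronized void acquire () throws InterruptedException {
    // Re-check the condition after every wakeup.
    while (count_ == 0)
      wait ();
    count_--;
  }

  // Return a permit and wake up one waiting thread, if any.
  synchronized void release () {
    count_++;
    notify ();
  }

  // Report the number of permits currently available.
  synchronized int permits () { return count_; }

  public static void main (String[] args) throws InterruptedException {
    final CountingSemaphore sem = new CountingSemaphore (0);

    // A second thread supplies the permit the main thread waits for.
    Thread producer = new Thread (new Runnable () {
      public void run () { sem.release (); }
    });
    producer.start ();

    sem.acquire (); // Returns once the producer has released a permit.
    producer.join ();
    System.out.println ("permits left: " + sem.permits ());
  }
}
```

The while loop around wait is the essential detail: a woken thread must re-test the condition because another thread may have taken the permit first.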

In general, Java presents a different paradigm for multi-threaded programming. As a result, we had to rethink many of the concurrency control and optimization patterns used in C++ ACE. Converting C++ ACE code that used conventional synchronization mechanisms required careful analysis and often changed the implementation of the corresponding components, because Java's concurrency mechanisms differ from the POSIX Pthreads-like threading mechanisms used in C++ ACE. The following discussion explores some threading challenges we faced when converting C++ ACE to Java:


3.5. Lack of Destructors

C++ destructors are commonly used to manage reusable resources (such as memory, I/O descriptors, locks, window buffers, etc.) that must be released before leaving a scope. This leads to common patterns in C++ where constructors and destructors collaborate to ensure that resources are acquired on entry, and released on exit, from a block of code.

For instance, consider the following simplified version of the C++ ACE_Message_Queue (for simplicity, most error handling code has been omitted):

 
template <class SYNCH_POLICY>
class ACE_Message_Queue
{
public:
  ACE_Message_Queue (void) { /* Initialize queue. */ }
  ~ACE_Message_Queue (void) { /* Delete queue. */ }

  // Other methods omitted...
};

Since the destructor is called automatically once an object of the class goes out of scope, we can ensure that the contents of the queue are released. This ``acquire/release'' protocol is particularly important if the contents of the queue are resources (like locks or sockets) that are relatively scarce, compared with virtual memory.

Java lacks general-purpose destructors. Instead, it provides constructs like finally and finalize to manage the release of resources. The ACE_Message_Queue class shown above can be written in Java using the construct finalize as follows:


class MessageQueue
{
  public MessageQueue () {
    // Initialize the message queue...
  }

  public void close () {
    // Close down the message queue and 
    // release all resources...
  }

  protected void finalize () throws Throwable {
    // Delete queue resources when garbage
    // collection is run on this object...
  }
}

Similarly, the Java finally construct can be used inside a method to ensure that resources are released when the method exits (even if an exception is thrown). Here's an example from ACE in which the svc method uses a MessageQueue to process several messages. To ensure that the MessageQueue is properly closed (and its resources released) when the method exits, we use the Java finally construct.
 
class Task
{
  // Run by daemon thread to do deferred processing.
  public int svc () {
    try {
      for (;;) { // Process the service in a loop.
        // Dequeue a message for processing.
        MessageBlock msg = 
          msgQueue ().dequeue ();

        // Process the msg...
      }
    } catch (Exception e) {
      System.err.println (e);
      return -1;
    } finally { 
      // Note: this block is executed even if an 
      // exception is raised.

      // Shutdown the queue.
      msgQueue ().close ();
    }
  }
}

Unfortunately, Java's finally and finalize constructs exhibit the following two problems:

  1. Manual intervention -- In the case of finally in Java ACE, the cleanup code must be inserted manually into finally blocks, which can be tedious and error-prone. C++ ACE makes this easier, since the compiler ensures that the destructor of a local object (such as a message queue) is called automatically on exit from the svc method's scope. Therefore, C++ programmers need not insert cleanup code at each exit point of each block.

  2. Premature resource exhaustion -- In the case of finalize, the problem centers around ``lazy deletion.'' Since the garbage collector may not get run until the application runs low on memory, this approach is prone to premature resource exhaustion. In particular, a program that queues many locks or socket descriptors will run out of these resources long before running out of memory. Java does provide hooks that allow programmers to explicitly request garbage collection using the Runtime class's gc method. In addition, the Runtime class provides a runFinalization method that runs any pending finalizers when invoked. However, using either hook requires manual intervention by programmers. Moreover, since garbage collection takes time, running it explicitly multiple times may not be desirable in time-critical applications.
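
The practical consequence of both problems is that a scarce resource is usually released through an explicit close call guarded by finally, rather than left to finalize. The sketch below illustrates this under stated assumptions: the ScarceHandle class, its counter of open handles, and the useHandle method are hypothetical stand-ins for a real OS resource, and the code is single-threaded for simplicity.

```java
// Hypothetical sketch: explicit close plus finally for a scarce resource.
class ScarceHandle {
  // Number of handles currently open; stands in for a limited OS
  // resource. Not thread-safe: this sketch is single-threaded.
  private static int open_ = 0;

  private boolean closed_ = false;

  ScarceHandle () { open_++; }

  // Release the resource immediately instead of waiting for the
  // garbage collector. Calling close more than once is harmless.
  void close () {
    if (!closed_) {
      closed_ = true;
      open_--;
    }
  }

  static int openCount () { return open_; }

  // Uses a handle and guarantees its release on both normal and
  // exceptional exit.
  static void useHandle (boolean fail) {
    ScarceHandle handle = new ScarceHandle ();
    try {
      if (fail)
        throw new IllegalStateException ("simulated failure");
      // ... use the handle ...
    } finally {
      handle.close ();
    }
  }

  public static void main (String[] args) {
    useHandle (false);
    try {
      useHandle (true);
    } catch (IllegalStateException e) {
      // The finally block has already closed the handle.
    }
    System.out.println ("open handles: " + openCount ());
  }
}
```

The cost is exactly the manual intervention described in the first problem: every method that acquires a handle must repeat the try/finally pair.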


3.6. Lack of Explicit Parameter Passing Modes

In contrast to other OO languages like C++ and Ada95, Java doesn't support user-specified parameter passing modes. Instead, Java has reference semantics for object parameters, but passes these references by value. For example, in the following code, even after the someMethod invocation, bar still points to the String object associated with the literal string ``original.''

void someMethod (String foo) {
  foo = "changed";
}

void caller () {
  String bar = "original";
  someMethod (bar);
  // bar is unchanged...
}

We encountered this problem in ACE when we tried to implement the class ACE_SOCK_Stream, which encapsulates the OS data transfer mechanisms for sockets. The recv and recv_n methods of C++ ACE rely on the message being passed back to the caller by reference. Since Java only passes references ``by value,'' this became problematic.

To circumvent this limitation when converting ACE from C++ to Java, we identified the following four solutions:

  1. Add an extra level of indirection -- We applied this common pattern and added Java wrappers around classes we wanted to pass by reference. Then, we passed instances of the wrapper class. Thus, in the example above, we created a wrapper string class called CString. We then passed a reference to this wrapper object that contained the actual String. Here's what the CString class looks like:
     
    public class CString
    {
      // Constructor.
      public CString (String s) { s_ = s; }
    
      // Set the underlying string.
      public void rep (String s) { s_ = s; }
    
      // Get the underlying string.
      public String toString () { return s_; }
    
      // Note: other type-specific constructors
      // can be added.
    
      private String s_;
    }
    
    With this change, the String held by the wrapper can now be replaced by both the caller and the callee. Thus, the first someMethod example can now be rewritten as follows:
    
    void someMethod (CString foo) {
      foo.rep ("changed");
    }
    
    void caller () {
      CString bar = new CString ("original");
      someMethod (bar);
    }
    
  2. Use a one-element array -- The second solution passed the String object in a one-element array. Using this solution, the first someMethod example can be rewritten as follows:
     
    void someMethod (String[] foo) {
      foo[0] = "changed";
    }
    
    void caller () {
      String[] bar = new String[1];
      bar [0] = "original";
      someMethod (bar);
    }
    
    Although this syntax is somewhat ugly, it is relatively concise. Moreover, we can generalize this approach to pass in multiple objects by reference via a multiple-element array.

  3. Use mutator operators -- Our third solution used mutator methods. For instance, in our example above, we could use StringBuffer instead of String to pass the messages between the caller and the callee. Using this solution, the above example can be rewritten as follows:
    
    void someMethod (StringBuffer foo) {
      foo.setLength (0);
      foo.append ("changed");
    }
    
    void caller () {
      StringBuffer bar = 
        new StringBuffer ("original");
      someMethod (bar);
    }
    
    Note that this is the solution we used in implementing SOCKStream in ACE Java, because it required the least amount of change to our original design. Unfortunately, the numeric wrappers (e.g., classes Double and Integer) lack mutator operations. Therefore, they aren't useful for returning scalar parameters by reference (despite the claims made by certain Java books...). This forces programmers to write many additional (and incompatible) wrapper classes that allow these types to be passed by reference.

  4. Return objects by value -- The fourth solution relies on the callee returning a modified version of the object as the return value to the caller. The caller then assigns the return value to the original object in order to ``update'' it. Using this solution, the above example can be rewritten as follows:
    
    String someMethod (String foo) {
      return new String ("changed");
    }
    
    void caller () {
      String bar = new String ("original");
    
      // Call someMethod and reassign 
      // bar to the return value.
      bar = someMethod (bar);
    }
    
    However, generalizing this approach to return multiple values requires defining even more helper classes.
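
Solutions 3 and 4 both end up requiring small helper classes once scalars or multiple values are involved. The following sketch shows what two such helpers might look like. All of the names (IntHolder, RecvResult, HolderSketch, countChars, fakeRecv) are hypothetical and belong neither to the JDK nor to Java ACE.

```java
// Hypothetical helper classes, written only for illustration.

// Mutable holder that lets a callee hand an int back to its caller,
// which the immutable Integer wrapper cannot do.
class IntHolder {
  private int value_;

  IntHolder (int value) { value_ = value; }

  void set (int value) { value_ = value; }
  int get () { return value_; }
}

// Bundles two results so that a single return value carries both.
class RecvResult {
  final int bytesRead;
  final String data;

  RecvResult (int bytesRead, String data) {
    this.bytesRead = bytesRead;
    this.data = data;
  }
}

class HolderSketch {
  // The callee reports a count through the holder argument.
  static void countChars (String s, IntHolder out) {
    out.set (s.length ());
  }

  // The callee returns both values at once in a helper object.
  static RecvResult fakeRecv (String incoming) {
    return new RecvResult (incoming.length (), incoming);
  }

  public static void main (String[] args) {
    IntHolder count = new IntHolder (0);
    countChars ("original", count);
    System.out.println ("count: " + count.get ());

    RecvResult result = fakeRecv ("changed");
    System.out.println (result.bytesRead + " bytes: " + result.data);
  }
}
```

Each new combination of result types needs its own class like RecvResult, which is the overhead the fourth solution incurs.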


3.7. Lack of Explicit Inline Functions

The lowest level of ACE is the OS adaptation layer. This layer consists of C++ OS wrappers that shield systems programmers from using the tedious, error-prone, and non-portable native OS APIs written in C. In C++, there is essentially no cost for adding this extra level of abstraction since inline static functions can be used.

In contrast, adding an extra level of abstraction can increase the cost of method calls in Java since it does not support explicit inlining. There are various patterns for handling this problem using the Sun JDK. For instance, the JDK Java compiler allows the user to specify the -O flag, which performs some optimization, but the user has no direct control over the optimization without using convoluted, non-portable techniques. Whenever possible, methods (or entire classes) in Java should be declared final to facilitate inlining.
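
As a minimal sketch of this advice, a wrapper layer might be written as a final class of static methods. The OSWrappers class and its two methods are hypothetical and are not the Java ACE OS layer; whether a particular compiler or virtual machine actually inlines such calls is implementation-dependent.

```java
// Hypothetical OS adaptation wrapper; all names are illustrative.
final class OSWrappers {
  // Not instantiable: the class only groups static helpers.
  private OSWrappers () {}

  // Thin wrapper over a JDK call. The class is final and the method
  // is static, so the call target is fixed and known in advance.
  static long timeMillis () {
    return System.currentTimeMillis ();
  }

  // Small helper of the kind that benefits most from inlining.
  static int minimum (int a, int b) {
    return a < b ? a : b;
  }

  public static void main (String[] args) {
    System.out.println ("minimum: " + minimum (3, 7));
    System.out.println ("time is positive: " + (timeMillis () > 0));
  }
}
```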

In general, Java doesn't provide standard hooks (like the C/C++ register keyword) to programmers to manually suggest optimizations. Instead, Java relies on effective compiler optimizations that, in theory, remove the need for programmers to ``hand-optimize'' performance. In practice, it remains to be seen whether Java compilers can perform sufficient optimizations to compete with the performance of C/C++ compilers.


4. Concluding Remarks

This article presents our experience converting the ACE communication software framework from C++ to Java. Overall, the conversion process has been both exciting and frustrating. It was exciting because we had the opportunity to explore Java features that aren't available in C++ (such as meta-class programming), as well as to validate communication software design patterns we'd originally discovered using C++. In addition, converting ACE to Java provided us with many insights on the benefits and limitations of using Java for industrial-strength framework development, particularly for concurrent network servers.

The following are our recommendations to developers who plan to build systems in Java, either from scratch or by converting code originally written in C++:

As more systems are developed using Java, the strengths and weaknesses of Java will become more apparent. Whether these systems are developed from scratch using Java or are being converted from C++, time will tell how well Java suits the needs of the industry. Our goal in writing this article was to help other C++ developers benefit from our experiences in order to use Java effectively in their projects.

The C++ and Java versions of ACE are freely available via the WWW at URL ACE.html.


5. Acknowledgements

We would like to thank Chris Cleeland, David Holmes, Doug Lea, David Levine, and Greg Wilson for comments on earlier drafts of this article.


6. References

[Lea:96] Doug Lea, Concurrent Programming in Java: Design Principles and Patterns, Addison-Wesley, 1996.

[AWT:96] AWT Components, Source: http://java.sun.com/tutorial/ui/overview/components.html

[Coplien:92] James Coplien, Advanced C++ Programming Styles and Idioms, Reading, MA: Addison-Wesley, 1992.

[Schmidt:94] Douglas C. Schmidt, ACE: an Object-Oriented Framework for Developing Distributed Applications, Proceedings of the 6th USENIX C++ Technical Conference, Cambridge, Massachusetts, April, 1994.

[Cargill:96] Tom Cargill, Specific Notification for Java Thread Synchronization, Proceedings of the 3rd Pattern Languages of Programming Conference, Allerton Park, Illinois, September, 1996.

[GoF:95] E. Gamma, R. Helm, R. Johnson, and J. Vlissides, Design Patterns: Elements of Reusable Object-Oriented Software, Reading, MA: Addison-Wesley, 1995.


This page is maintained by Prashant Jain and Douglas C. Schmidt. If you have any comments or suggestions for improving this document, please send us email.

Last modified 11:34:18 CDT 28 September 2006