Object-Oriented Programming across Native-Virtual Boundaries

Paul Werbicki
Computer Science Department, University of Calgary
2500 University Dr., NW, Calgary, Canada, T2N 1N4
1-403-2841718

Rob Kremer
Computer Science Department, University of Calgary
2500 University Dr., NW, Calgary, Canada, T2N 1N4
1-403-2205112

ABSTRACT
There exist many implementations of object-oriented programming languages that execute on virtual computers which abstract from native host environments. These languages provide developers with a highly mobile platform where applications may easily execute on multiple environments. However, there exist situations where it is not possible to abstract from the host environment, forcing the developer to write at least part of the application for a specific environment. Developing a single application using both native and virtual code allows the mobile portions to remain abstracted from the environment while at the same time providing the required native access.

This paper discusses using object-oriented programming to allow the use of C++ and Java in a single application. A class library, developed as part of the investigation, is used to enhance the native-virtual boundary interface and provide the developer with support for the use of objects. By examining how the library works it is possible to gauge the level of support currently available for this technique. An analysis of the available support highlights where virtual machines need to provide additional functionality to fully enable this method of programming.

Categories and Subject Descriptors
D.3 [Software]: Programming Languages; D.1.5 [Programming Techniques]: Object-oriented Programming; D.2.3 [Software Engineering]: Coding Tools and Techniques – Object-oriented Programming; D.3.4 [Programming Languages]: Processors – Interpreters

General Terms
Experimentation, Standardization, Languages.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Virtual Execution Environments (Vee’05), June 11-12, 2005, Chicago, Illinois, USA. Copyright 2005 ACM 1-58113-000-0/00/0004…$5.00.

Keywords
Native-virtual boundary, interoperability, object integration, C++, Java, Java Virtual Machine, Java Native Interface, design patterns.

1. INTRODUCTION
One of the tenets of interpreted programming languages, Java in particular, is the ability to develop and compile software once and have it execute, without modification, on all environments supported by the virtual machine [4]. To achieve this, the software must generally be written entirely in the interpreted programming language. In some applications, however, it may be necessary to have portions of the software highly mobile, while other portions must take advantage of optimizations only provided on a single native host environment.

A practical example of this type of application exists in the area of Multi-Agent Systems (MAS). In these types of applications, all agent processes need to communicate regardless of the environment in which they execute, but some provide services and use resources only available on specific operating systems and hardware platforms. Some components of each entity, specifically the communications library, benefit from being mobile without having to be ported to each new environment. However, it may not be possible to port the entire agent.

Some virtual machines provide methods for integrating both virtual code and native code within the same program. The Java Virtual Machine (JVM) with its Java Native Interface (JNI) is a good example.
Using the JNI it is possible to make calls into mobile components of the application without having to write the entire program in an interpreted programming language [5]. The JNI exposes a set of C functions that provide a low-level method of manipulating, controlling and executing code compiled to run on the virtual machine. The JVM executes in the same process space and interacts with native code across the native-virtual boundary defined by the JNI.

The procedural approach used by the JNI does not lend itself well to using mobile code written as an object-oriented library. Instead of accessing objects using procedural-style JNI functions, it is highly desirable from a software engineering standpoint to treat these interpreted objects as normal objects within the native portions of the application. To investigate this approach we developed JavaCOM [11], a class library written in C++ for the Microsoft Windows family of operating systems. JavaCOM makes it possible, to a degree, to treat objects within an interpreted library as objects within a C++ application, or as components within the Component Object Model (COM) [7] in languages such as Microsoft Visual Basic and C++.

This paper describes the challenges of trying to integrate interpreted objects and native objects within the same application. The goal of this research is to achieve complete re-use of an object-oriented communication library for Multi-Agent Systems, written in Java, within an agent application developed in C++. In doing so the hope is to discover ways to achieve object-oriented development using interpreted and native code within the same process. Where this is not possible, the goal is to propose requirements of virtual machines needed to support such applications.

1.1 Paper Overview
This paper is structured as follows.
Section 2 specifies the problem further, providing reasons why it is desirable to mix native and interpreted code, and introduces an application that benefits from such a scenario. Section 3 outlines approaches used to contain the JVM in a native application. Section 4 describes the JavaCOM class library and how it abstracts from the JNI using design patterns to achieve a level of integration. Section 5 analyses how JavaCOM's use of proxy classes provides support for object-oriented programming between C++ and Java. Section 6 presents requirements for the JVM, and virtual machines in general, to fully support object-oriented programming. We conclude with the benefits this research has for software engineering in general and future work in this area.

2. PROBLEM SPECIFICATION
The ability to integrate libraries written in an interpreted programming language (such as Java) into an application written in a native programming language¹ presents some interesting opportunities in software development. Interoperability between programming languages impacts many aspects of an application’s development cycle, from initial design decisions to the cost of maintaining the completed application. From a software engineering viewpoint, some obvious advantages include:

¹ Here, native programming languages are considered to be languages that generate machine code specific to a host environment (such as C++ or Visual Basic).

1. Decreased need to choose a single programming language for development. One of the first and often most difficult decisions of any project is the choice of programming language. Developing an application with multiple programming languages reduces this to deciding which programming language will be the primary one, with the other languages being contained by the primary language.

2. Access to more supporting libraries and a larger market of third-party software. It is common for developers to prefer one language over another, often because of the supporting libraries which make the task of programming easier. Combining multiple programming languages into a single application provides the developer with access to the set of all supporting libraries and third-party software provided by all of the programming languages used in the application.

3. Single language for library/API development. Due to the increased access to supporting libraries, researchers and third-party software developers do not need to maintain different versions of their software, one for each programming language they need to support. Instead they can concentrate on creating a single, mobile language API and allow the end-user to integrate this into their application by integrating the programming language in which it was developed.

4. Code reuse without access to the original source code. In situations where applications are being ported from one language to another (such as in legacy systems) the original source code may not be available. In cases like this it is possible to contain the original code and use it directly, instead of re-writing it and potentially introducing errors.

5. Integration of specialized operating system components, hardware platform components and legacy software. Accessing specialized components and integrating with legacy systems is common in Software Engineering. Combining native and virtual code makes it possible to use these existing pieces from newer programming languages.

With the popularity of new object-oriented programming languages, such as Java, that are free from any procedural heritage (unlike C++), more code is being developed and shared in languages that promote only object-oriented development. Integration of these libraries into a native application using a procedural interface, such as the JNI, is awkward. It breaks down the object-oriented paradigm [9], making it difficult for the developer to write a “good” program.
If a language is said to support object-oriented programming then it must provide facilities to make it easy to program using objects. Even though two programming languages both support object-oriented programming, it may be difficult for the flow of control to pass from the first programming language to the second. This interaction may take exceptional effort or skill on behalf of the developer, and so merely enables the use of multiple programming languages within a single application without supporting it. To fully support object-oriented programming using multiple programming languages there must be support for making it easy to integrate objects.

Support for object-oriented development between programming languages follows the same requirements as for object-oriented programming languages. The most significant features include:

1. Type Checking. Object-oriented programming languages are for the most part typed languages. The compiler rejects programs that are not well-typed based on the values that are expected during compilation.

2. Calling Mechanism. For a given object, there must be a method for calling a specific member function. The calling mechanism must respect inheritance, directing the call to the proper object in the inheritance hierarchy.

3. Encapsulation. Combining elements together to create a new entity is important for any object-oriented programming language. A class must be able to contain member functions and variables and protect or expose those elements as dictated by its purpose.

4. Inheritance. Class derivation (subclassing) is an important technique in object-oriented programming. It allows a general concept expressed in a base class to be specialized in a subclass without having to modify the base class.

Including these features in the interface between programming languages provides support for object-oriented programming using multiple programming languages. For example, it would be ideal to have a native language class (in C++, say) inherit directly from a class written in Java.

2.1 Multi-Agent Systems: A Practical Example
Multi-Agent Systems are a good example of applications that benefit from the mixing of interpreted and native code. Multi-Agent Systems consist of computational entities (referred to as agents) that interact with one another to achieve a common goal beyond the capabilities of an individual [8]. Communication is fundamental to these systems, providing the foundation upon which cooperation between agents takes place.

Generally, communication between agents is fairly straightforward, using capabilities within the operating system (inter-process communication, networking, etc.) to transfer information between agents. The method of communication used by every agent in the system must be the same to ensure that they can communicate properly. In addition, the protocols (language, formatting, and turn-taking schemes) of communication must be common among the agents, and these can be quite complex. All this is appropriately handled by libraries written in mobile, interpreted languages.

The internals of an agent, however, vary greatly depending on the purpose of that agent. In the case of “expert” agents, their purpose may be to provide access to a specialized hardware resource, or to perform specialized computations for which they are highly optimized. In the former case it may only be possible to implement the agent using native code; in the latter, native code may be more efficient or perform better. However, if this agent is able to communicate with other agents (using interpreted code) it is able to expose its expertise to the rest of the system, allowing others to take advantage of the specialized services it provides.

The problem arises that for each programming language used to develop an agent in the system, the communications portion of the agent must be ported to that programming language/operating system combination. This decreases code reuse among agents, and increases code duplication, maintenance costs and the potential for bugs to occur between the various versions of the library. However, if the communications portion of the agent were contained in an interpreted library that was used by the agent, irrespective of the programming language used to implement the agent’s internals, the developer could achieve the advantages stated earlier.

3. METHODS OF INTERFACING WITH THE JVM
One of the first implementations to provide support beyond the JNI for integrating Java code into native applications was the Microsoft JVM [10]. In order to have Java accepted at the time by developers as a mainstream Microsoft language, effort was made to make Java as powerful and as flexible as it could be. Extensions were added to the Microsoft JVM to allow it to interoperate with other Microsoft programming languages and operating systems.

Functionality such as Raw Native Interfaces (Microsoft’s version of the JNI) and J/Direct [6] allowed Java developers to call functions contained in dynamically-linked libraries directly from Java applications. Java Callable Wrappers and COM Callable Wrappers [10], concepts similar to a proxy, allowed Java developers to expose their code as COM components and in turn use COM components written in other programming languages as if they were regular Java objects.

These extensions were specific to the Microsoft JVM, which meant that any developer who employed these extensions was only able to run their programs on the Microsoft family of operating systems. This breached part of the license agreement between Microsoft and Sun Microsystems, and the court case that was eventually won by Sun forced Microsoft to abandon the Microsoft JVM [3].

After the Microsoft–Sun dispute it became obvious that changes made directly to the JVM, to support easier interfacing between native and interpreted code, would not be possible.
New approaches would have to sit outside of the JVM, treating it as a black box and using only the JNI as the interface mechanism. Many commercial and open-source replacements appeared to fill the void left by the departure of the Microsoft JVM.

Some of these new approaches included enhancement libraries that provided higher-level interfaces to the JNI. These enhancement libraries wrapped the JNI function set into a class or template library, exposing a more intuitive object-oriented interface to the developer. Proxy class generators were another popular approach. They took Java class files and produced C++ header and source files embedded with the necessary JNI function calls. These utilities could be integrated into a project’s build cycle so that as the Java classes changed, the proxy classes changed as well.

Other approaches went so far as to use Inter-Process Communication (IPC) and Remote Procedure Calls (RPC), relying on existing wire protocols to handle the calls across the native-virtual boundary. These latter approaches were more complicated, requiring multiple processes and greater installation requirements.

4. JAVACOM
Our JavaCOM class library started as a simple enhancement library to make the Java Native Interface easier to use. One of the drawbacks of the JNI is the procedural interface between the Java Virtual Machine and native programming languages: the extensive functionality provided by the JNI to allow for integration between Java and C/C++ increases the complexity of performing simple programming tasks [2]. For example, accessing a member variable in a Java class inside the JVM from C++ requires multiple operations. In many cases, JNI functions require the conversion of data into a form accessible by native programming languages before it can be used. Also, resources created by the JNI must be explicitly freed, a paradigm that is counter to the garbage-collected nature of the Java programming language.

Figure 1. JavaCOM Class Diagram

The complete JavaCOM library (see Figure 1) was designed to take advantage of design patterns as defined by Gamma et al. [1]. Initially, the only class that existed was the CJavaVM singleton class, which is used to start and manage the JVM inside of the current process. Then the CJTypeInfo class was developed to simplify type checking, and the CJObject class to contain Java object references and streamline the calling mechanism. By using the library to create a façade design pattern, the JNI interface was made substantially easier to use.

Using an adapter class it was possible to provide an interface to CJObject that was compatible with other programming languages. Originally the goal of JavaCOM was to allow COM-based languages to use Java objects as if they were simply components of the operating system [10]. Through the adapter, programming languages such as Microsoft Visual Basic were enabled to use Java objects with the IDispatch interface, a late-bound calling mechanism in COM. This exposed functionality to Visual Basic applications that was only available in Java class libraries.

    Dim objJavaVM As JavaCOM.JVM
    Dim objDate As Object
    Dim strTest As String

    Set objJavaVM = CreateObject("JavaCOM.JVM")
    Call objJavaVM.Initialize("")
    Set objDate = objJavaVM.CreateObject("java.util.Date")
    MsgBox "Date= " & objDate.toString()

The preceding Visual Basic example shows how the JavaCOM library was able to adapt the java.util.Date class so that it could be used as a Visual Basic component. This is done using the IDispatch interface, where method calls in Visual Basic are expanded upon execution/compilation so that to the developer the method call uses natural programming language syntax. However, under the covers, at run-time the parameters to the call are being packaged up and a method invocation is being performed by method name.

When using the IDispatch interface in C++, however, the developer does not have the luxury of the compiler performing these steps on their behalf as it does in Visual Basic. The developer is forced to write code each time the CJObject class or IDispatch interface is used, packaging up the parameters and performing a Java member function call by name.

To provide an easy interface for C++ developers, where Java objects can be treated like C++ objects, the use of a proxy was introduced. A utility included with the JavaCOM library translates a given Java class’s interface into a C++ header file with inline expansion of all method calls. Using this header file the developer need only create this object and call a method using C++ syntax and types, and all of the work integrating with Java is performed by the JavaCOM library. Managing the JVM, containing the Java object and converting between C++ types and Java types is performed through the use of the CJavaVM, CJObject and CJTypeInfo classes.

Figure 2. Flow of Control

5. OBJECT-ORIENTED SUPPORT USING PROXIES
The JavaCOM library, along with the proxy generating utility, provides a compact solution for achieving a level of easy integration of C++ and Java. Generated proxy classes provide a compile-time bridge between C++ native code and the JVM, allowing the developer to cross the native-virtual boundary very easily. To examine the level of support provided for object-oriented programming across native-virtual boundaries using this method, we compare how the proxy implements each of the features described in Section 2.

Type checking is handled at compile time by the C++ compiler. Each proxy class is uniquely named by JavaCOM using a common name mangling scheme based on the fully qualified Java class name. Naming proxy classes uniquely allows each class to be treated by the compiler as a separate type. Without the proxy class the developer would be forced to use a CJObject to represent every Java class, which provides no distinction between the various classes and no clue to the compiler that they represent different types.
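The effect of uniquely named proxy classes on type checking can be sketched without a JVM. In the sketch below, GenericJavaObject is a hypothetical stand-in for a generic holder such as CJObject, the underscore-based class names are an assumed mangling of the fully qualified Java names, and Invoke() only returns a description of the forwarded call instead of crossing the JNI. None of these names or signatures is taken from the JavaCOM source.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical stand-in for a generic Java object holder such as CJObject.
// A real holder would forward to the JVM through the JNI; this one only
// describes the request so that the sketch runs without a JVM.
class GenericJavaObject {
public:
    explicit GenericJavaObject(std::string className)
        : className_(std::move(className)) {}

    // Late-bound call: the method is identified by name at run time.
    std::string Invoke(const std::string& method,
                       const std::vector<std::string>& args = {}) const {
        std::string call = className_ + "." + method + "(";
        for (std::size_t i = 0; i < args.size(); ++i) {
            if (i != 0) call += ", ";
            call += args[i];
        }
        return call + ")";
    }

private:
    std::string className_;
};

// Proxy for java.util.Date. The C++ class name is an assumed mangling of
// the fully qualified Java class name.
class java_util_Date : private GenericJavaObject {
public:
    java_util_Date() : GenericJavaObject("java.util.Date") {}
    // Inline forwarding: C++ syntax outside, call-by-name inside.
    std::string toString() const { return Invoke("toString"); }
    std::string setTime(long long millis) const {
        return Invoke("setTime", {std::to_string(millis)});
    }
};

// Proxy for java.lang.Integer, a second and unrelated C++ type.
class java_lang_Integer : private GenericJavaObject {
public:
    java_lang_Integer() : GenericJavaObject("java.lang.Integer") {}
    std::string toString() const { return Invoke("toString"); }
};

// Accepts only the Date proxy; the compiler rejects any other proxy type.
std::string describe(const java_util_Date& date) { return date.toString(); }

// The two proxies are distinct types, and private inheritance hides the
// generic holder from client code.
static_assert(!std::is_convertible<java_lang_Integer, java_util_Date>::value,
              "proxies for different Java classes are not interchangeable");
static_assert(!std::is_convertible<java_util_Date*, GenericJavaObject*>::value,
              "the generic holder is not visible through the proxy");
```

With these definitions, passing a java_lang_Integer to describe() fails to compile, whereas two GenericJavaObject values would be interchangeable regardless of the Java class each one holds.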
The calling mechanism is implemented through inline member functions exposed by the proxy class. A proxy class inherits privately from the CJObject class, which provides the generalized calling mechanism used by the inline functions. Parameters from the stack are packaged up and, along with the method name, are passed to CJObject’s Invoke() method, which performs a method invocation by name on the Java object. Where method names are the same, parameter overloading is assumed and the method is matched based on the type signature of the parameters provided.

Encapsulation of member functions is possible through the proxy class using private, protected and public section qualifiers. However, the encapsulation of member variable access is more difficult. Member variables may not be accessed directly, since there must exist some code in the proxy that forwards the request to the JVM to get or set the value. As such, all member variable access, including access to public member variables, must be performed through accessor functions. By placing the accessor function in the appropriate section of the class it is possible to obtain encapsulation of member variables.

Perhaps the most difficult feature to support is inheritance. Given that a proxy class is defined as a C++ class with inline methods, it is possible to inherit from the proxy and override member functions, essentially inheriting and overriding the contained object at the same time. This works for the most part as long as one strict rule is followed: if a method in the Java superclass calls a function overridden in the C++ subclass, that calling method must itself be overridden and its implementation replaced in the subclass’s programming language. This rule is important because of the callback nature in which method overriding works.

However, the above rule breaks down when the flow of control starts on the interpreted side of the boundary. Looking at Figure 2 we see a C++ class that inherits from a Java class.
Here, calls are able to originate from the native side to the virtual side, but not the other way around. When a message arrives asynchronously from the network, the flow of control starts from the interpreted side and the superclass in Java has no mechanism for calling the overridden function in C++. This is because nothing inside the JVM indicates that a native class outside of it inherits from the interpreted class. In this instance, moving this method to the other side of the native-virtual boundary still does not provide the inheritance mechanism with the information it requires to direct the call to the proper native function.

A developer who understands these significant drawbacks obtains some level of object-oriented support for inheritance. The developer needs to be aware of the internal implementation of the superclass as well as any flow of control originating in the interpreted code. This goes against some of the software engineering advantages stated earlier. Through trial and error it may be possible to determine if and when the above rule can be applied. However, every time the superclass changes, the rule needs to be re-applied to all overridden methods in case a new dependency has been introduced. In the case of flow of control issues, significant redesign of the interpreted library, including the embedding of native methods in the Java class, may be necessary to accomplish the needed callback. This elaborate coding does not make it possible for a developer to easily integrate objects between two object-oriented programming languages.

6. VIRTUAL MACHINE SUPPORT
JavaCOM was designed to work independently of the virtual machine, containing and extending it while treating it as a black box. The JNI, which provides a rich set of functions for working with the JVM, provides the interface to this black box but only up to a certain point.
In order to fully support object-oriented programming across the native-virtual boundary using this approach, an extension to the JNI is required. Through experimentation using the JavaCOM class library and several sample applications, two possible approaches were discovered. The first is an event-based approach that monitors member function calls and re-directs them to the appropriate overridden native method. The second approach uses callbacks at the object level into native code, allowing the overriding of interpreted methods.

The event-based approach would place the burden of control outside of the JVM, providing a way to hook into the inner workings of the virtual machine. A library, such as JavaCOM, would request notification events for member function calls. Before a member function is invoked, an event would be raised by the virtual machine, carrying the object reference, the method in question and its type signature. It would then be up to the library to match the instance of the interpreted object with a native object inheriting from it on the other side of the boundary. If the interpreted object is being inherited, the call would then be made into the object on the native side and the original call would be cancelled; otherwise the event would be released and the call would be made to the interpreted code as normal.

The object-level callback approach is similar to how the vtable works in C++. Each object maintains a table with an entry for each member function in the class of the object. Through the JNI, external libraries like JavaCOM install callbacks in the table for each function being overridden. When a member function is being called by the virtual machine, the callback table is checked first; if there is an entry for the member function being called, the call is re-directed to the native side of the boundary, otherwise the call proceeds as normal.
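A rough sketch of the object-level callback table follows. It simulates the virtual machine's dispatch in plain C++ so that it runs without a JVM; InterpretedObject, dispatch(), installCallback() and makeAgent() are hypothetical names chosen for illustration and do not correspond to any existing JNI function.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

struct InterpretedObject;
using Method = std::function<std::string(InterpretedObject&)>;

// Simulated interpreted object. In the proposed extension the virtual
// machine, not the library, would own both tables.
struct InterpretedObject {
    // Methods defined by the interpreted (Java) class.
    std::unordered_map<std::string, Method> interpreted;
    // Per-object callback table filled in by the native side.
    std::unordered_map<std::string, Method> nativeCallbacks;
};

// Simulated virtual machine dispatch: the callback table is checked first,
// and the interpreted implementation runs only if no callback is installed.
std::string dispatch(InterpretedObject& obj, const std::string& name) {
    auto callback = obj.nativeCallbacks.find(name);
    if (callback != obj.nativeCallbacks.end()) {
        return callback->second(obj);      // re-directed to native code
    }
    return obj.interpreted.at(name)(obj);  // call proceeds as normal
}

// Hypothetical registration call that an extended JNI might offer.
void installCallback(InterpretedObject& obj, const std::string& name,
                     Method nativeImpl) {
    obj.nativeCallbacks[name] = std::move(nativeImpl);
}

// Simulated interpreted superclass: onMessage() starts on the interpreted
// side and calls handle() through the virtual machine's dispatch.
InterpretedObject makeAgent() {
    InterpretedObject agent;
    agent.interpreted["handle"] = [](InterpretedObject&) {
        return std::string("interpreted handle");
    };
    agent.interpreted["onMessage"] = [](InterpretedObject& self) {
        return "onMessage -> " + dispatch(self, "handle");
    };
    return agent;
}
```

Before a callback is installed, dispatch(agent, "onMessage") reaches the interpreted handle(); after installCallback(agent, "handle", ...) the same call, still originating on the interpreted side, reaches the native override. This is the behaviour that the proxy approach alone cannot provide.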
The object-level callback approach places the burden on the virtual machine, but it also provides better integration by making the virtual machine aware of what is actually happening to the flow of control.

There already exists an event mechanism used by the Java Virtual Machine Tool Interface (JVMTI). With some minor changes this interface could be adapted to serve as the event notification mechanism. However, two problems come from such an approach. First, using the JVMTI adds an additional footprint to the JVM, which may be unwanted overhead in many applications. Second, the JVMTI was designed as an interface for debugging and profiling; using it as a way around JNI deficiencies means using it for something it was not designed to do. Ultimately the JNI is the proper location for these approaches to be implemented.

These two approaches demonstrate how virtual machine developers must be aware of objects on the native side that inherit from interpreted objects. There surely exist other approaches to enhancing the JVM and virtual machines in general, but this theme is common among all of them. With virtual machine developers aware of this approach to using their platforms, and with support from them in their interfaces, it is possible to fully support object-oriented programming across the native-virtual boundaries.

7. CONCLUSION
This paper describes the problem that the interface between interpreted object-oriented languages and native object-oriented languages is procedural, not object-oriented, which undermines many of the advantages of working with object-oriented languages. The problem is particularly interesting because our group has built an agent-based system with extensive protocols in Java. However, some agents have components which are difficult or impossible to implement in Java and must be implemented in native languages such as C++ and Visual Basic.
The cost of re-implementing the protocol portions of the agents in various native languages is extremely high, and fraught with maintenance and compatibility problems. Therefore, there is great motivation to program in mixed languages. Unfortunately, the interface between Java and native languages is purely procedural, and our implementation requires native code to inherit from and extend various Java agent classes.

To address the problem, JavaCOM, a class library, has been implemented that seeks to provide a clean, object-oriented interface between Java, the Java Virtual Machine, and native languages. Using JavaCOM has been very successful, providing the ability to generate (from the Java code) native-language proxy classes which cleanly wrap the ugly details of marshaling calls to member functions in the Java-based superclass. However, the JavaCOM solution fails to perfectly follow the tenets of object-oriented programming: if a native subclass of a Java class overrides a method in the Java superclass, and the Java superclass has another method that calls the overridden method, the program will invoke the Java superclass's method, not the native subclass's method as it should.

The reason for this failure is that there exists no way, through the JNI, to inform the JVM that native code has overridden a method in a Java class. There needs to exist a facility within the JVM to receive messages when an object’s member function is called. This requirement may be extended to virtual machines in general. Since changing the JVM is not an option, two possible solution strategies are proposed that, unfortunately, involve extensions to the JNI. The hope is that one of these extensions will eventually be implemented.

8. REFERENCES
[1] Gamma, E., Helm, R., Johnson, R. and Vlissides, J. Design Patterns, Elements of Reusable Object-Oriented Software. Addison Wesley, Boston, MA, USA, 1995.
[2] Gabrilovich, E. and Finkelstein, L. JNI – C++ Integration Made Easy. C/C++ Users Journal. CMP Media LLC, Manhasset, NY, USA, January 2001.
[3] Gilbert, H. The Tragedy of Microsoft and Java. Technology and Planning. Yale University, New Haven, CT, USA, 2003.
[4] Gosling, J. and McGilton, H. The Java Language Environment. A White Paper. Sun Microsystems Inc., Santa Clara, CA, USA, 1996.
[5] Liang, S. The Java™ Native Interface – Programmer’s Guide and Specification. Addison-Wesley, Reading, MA, 1999.
[6] Microsoft Corporation. Writing Windows-Based Applications with J/Direct. Microsoft Developer Network Library. Microsoft Corporation, Redmond, WA, 2005.
[7] Microsoft Corporation and Digital Equipment Corporation. The Component Object Model Specification – Draft Version 0.9. Microsoft Corporation, Redmond, WA, 24 Oct 1995.
[8] Synder, R. D. and Tomlinson, R. S. Robustness Infrastructure for Multi-Agent Systems. Proceedings of the Open Cougaar, New York, N.Y., U.S.A., 2004.
[9] Stroustrup, B. What is Object-Oriented Programming? Proceedings of the 1st European Software Festival, Munich, Germany, February 1991.
[10] Verbowski, C. Microsoft Virtual Machine. Integrating Java and COM – A Technology Overview. Microsoft Corporation, Redmond, WA, USA, Nov 1998.
[11] Werbicki, P. JavaCOM. Master’s Thesis. University of Calgary, Calgary, AB, 2004.