Cross-language Babel structs—making scientific interfaces more efficient

Adrian Prantl; Dietmar Ebner; Thomas G W Epperly

doi:10.1088/1749-4699/6/1/014004

1. Introduction

Babel [1, 2] is an open-source language interoperability framework tailored to the needs of high-performance scientific computing. It addresses widespread interoperability requirements of high-performance scientific applications that are mainly caused by (a) the overwhelming amount of legacy code still in use and (b) the trend to integrate various mathematical models, usually implemented by different teams in different languages, in order to increase simulation precision, e.g. climate models might be combined with social models to predict emissions of carbon dioxide. Developing a common language ecosystem for all components is, for most applications, infeasible, for both technical and economic reasons.

One paradigm to manage this complexity is component-based software design. This approach can greatly facilitate reuse, interoperability and composability of software. Consequently, it has become very popular in the design of business applications and internet technology. There are a large number of widely available frameworks, e.g. CORBA/CCM [3], Microsoft's (D)COM [4, 5] and .Net [6], or Sun's JavaBeans [7]. More recent examples include Mozilla's XPCOM [8], Google Protocol Buffers [9] and Facebook's Apache Thrift [10], with the latter two focusing on serialization protocols for transmitting method calls over a network.

Babel is part of the Common Component Architecture (CCA) [11, 12]—a joint effort by researchers from both academia and U.S. national laboratories to establish and adapt these techniques for scientific computing. The CCA basically mediates how components interact with each other and with the underlying framework.

Babel is based on the scientific interface definition language (SIDL), which builds on previous work such as CORBA and COM by tailoring the idea to the needs of scientific computing. SIDL provides a language-independent, object-oriented programming model and type system. This allows components to share complicated data structures such as multi-dimensional arrays, interfaces or exceptions across various languages. Babel generates the necessary glue code that maps these high-level interfaces to a particular language ecosystem. As such, it can be used stand-alone or as part of the full CCA component framework, which provides additional capabilities such as dynamic composition of applications.

One of Babel's main design principles is to scale well with a growing set of supported languages. Currently, backends are available for most traditional languages relevant to the high-performance computing community including various versions of Fortran, C/C++, Java and Python. While the main focus is on fast in-process communication, there is also full support for transparent remote method invocation (RMI) [13]. In the latter case, caller and callee may reside in a different address space or on different machines.

With its focus on high-performance computing, Babel has first-class support for fundamental numeric types and multi-dimensional arrays, including array strides, dynamic ranges and ordering specifications (row-major versus column-major). Babel has elaborate support for arrays because they are a critical programming construct for scientific computing.

Until the work described here, Babel did not have a good way of representing simple data aggregations other than as an SIDL class. The SIDL class approach has frustrated Babel users because it is much more complex and less efficient than the data structure it tries to replace, the simple struct. By introducing structs, we relieve Babel users of the tedious implementation of SIDL getter/setter classes. The resulting system provides higher performance, greater developer productivity and a more natural-looking interface.

In this paper, we discuss why and how struct data types have been added to this mix and how they map to native language constructs such as structs, records or derived types (sections 2–4). Section 5 discusses performance and code size considerations for supported language bindings.

2. Design principles

We extended Babel's SIDL language with a struct idiom: a user-defined data type that maps to corresponding language constructs found in most imperative programming languages, e.g. structs in C/C++, records in Pascal or derived types in Fortran. They represent a useful alternative to classes whose main purpose is to group semantically related data together.

Structs are important both for scientific programming and for a component architecture that serves it. In scientific programming, there are often data structures that do not have any associated methods, and these are more naturally represented as structs. It is quite natural to think of a three-dimensional coordinate as a collection of three doubles: x, y and z. In addition, structs provide data aggregation without incurring any runtime penalty for access to data. In some cases, legacy software libraries have structs as part of their interface, and by introducing SIDL structs, one can provide a more natural multi-language interface to the legacy library. Adding structs to Babel also makes it easier to support interoperability with other component frameworks that have structs as first-class objects.

Figure 1 shows the contrast between the syntax for an SIDL struct (on the right-hand side) and the syntax for an SIDL class with getter and setter methods (on the left-hand side). The mode attribute `in` that precedes the argument types in the class example specifies that the argument is passed by value. This is analogous to the Fortran INTENT attribute. The other mode attributes are `out` and `inout`.

The SIDL struct syntax shown on the right side of figure 1 is very similar to the struct syntax of C and C++. For simple numeric types, the syntax is identical. The syntax for arrays is different, so structs with arrays look different than they would in C or C++.

The use of an explicit SIDL class to store data has the main advantage that the actual storage layout is hidden in the class and can easily be changed without effects on dependent code. Structs, on the other hand, can be accessed faster in most languages and require less code to be written, thereby reducing development effort.

The particular design choices for the addition of SIDL structs were governed by the following goals: performance, development effort, completeness and compatibility. These goals and their impact on the design are clarified below³.

Performance. For cutting-edge computational codes, lost performance means less accurate scientific models. Reads and writes to main memory are significantly slower than the processor's speed; hence, avoiding copying is critical to achieve optimal performance. In addition, it is important to avoid additional function calls and to declare the struct in a form from which the compiler can generate code that directly accesses the data elements. Using an SIDL class with getter and setter routines involves a virtual function call that, in general, involves dynamic dispatch and marshaling of arguments and return values. For regular Babel classes, this overhead is usually easily amortized over the amount of work in the procedure. However, getter and setter routines execute very small amounts of code. Thus, the overhead compared to natively supported structs can be substantial.
Development effort. Regular Babel classes are more verbose than structs. They (a) require the declaration of access methods for each data member, (b) have to be implemented by the user and (c) are often less concise in their use compared to the clean syntax usually provided for struct field accesses.
Completeness. Babel tries to map SIDL constructs and types to a particular language in a way that makes experienced developers feel at home. A struct often feels more natural for a particular purpose than a fully fledged SIDL class.
Compatibility. Compatibility is important in two aspects. First, related systems such as CORBA [3] or WSDL [14] already support the concept of structs; SIDL structs thus facilitate the development of compatibility layers. The second aspect is compatibility of user-defined SIDL interfaces with existing legacy software. The addition of SIDL structs often allows us to wrap these interfaces using little or no code. This is not only faster; it also feels more natural to people familiar with the existing interface.

SIDL structs have been designed with these goals in mind. C, C++, Chapel and Fortran 2003/2008 are the bindings where these goals are best achieved. For structs involving simple numeric types, these language bindings have fast data exchange that requires no copies between these languages. This is made possible by recent developments in Babel, which now supports Fortran 2003 features such as `bind(C)` interoperability with C and type extension. The other language bindings involve more compromises and trade-offs.

SIDL structs may contain any SIDL type, including arrays and raw arrays (r-arrays). The latter are a special SIDL feature that allows for low-level access to numeric arrays (section 3). Structs may also be nested within other structs. There is currently no support for arrays of structs. While this would be possible, the implementation is non-trivial, and we found that the feature was not heavily requested by users.

For regular classes, memory is automatically allocated and freed by Babel via a reference counting scheme. This is important as Babel applications often contain modules written in a variety of languages with different approaches to memory management and garbage collection. However, there is no reference counting for structs. This choice was made to keep structs as simple as possible. Adding reference counting to structs would require adding an integer value to the struct to store the count, and it requires methods for managing the count. For the common case of structs containing simple numeric types, the reference counting infrastructure adds complexity and runtime costs when it is not needed. For complex structs, it is the responsibility of the programmer to make sure that memory is properly allocated and released, and that there are no dangling references once a struct is freed. Babel generates corresponding support functions in order to do so for languages without support for dynamic memory allocation such as Fortran 77.

All Babel objects support transparent RMI [15]. No code modifications are required to switch from using a local object to using a remote one. For each struct, the Babel compiler therefore generates a serialization and de-serialization routine that assists in marshaling data for wire transfers. This code is automatically generated as a part of the client stub and does not require user modifications.

3. Babel architecture

At its heart, Babel is a compiler that translates SIDL interface definitions into glue code for the supported languages. Babel provides a traditional object-oriented programming model with single inheritance of classes and multiple implementation of interfaces. By default, all functions are virtual, i.e. a function being called always depends on the dynamic type of the associated object rather than the static type of the object's reference. Babel also provides implicit reference counting and memory (de-)allocation.

Restricting Babel to the least common denominator of features across the whole set of supported languages would produce a system missing critical features. Instead, Babel tries to take advantage of native language features such as built-in data types or method overloading whenever possible and provides reasonable alternatives in the remaining cases. For example, overloading symbols is supported in most object-oriented languages, while unique identifiers are required for earlier dialects of Fortran. Across all supported languages, Babel provides sophisticated features such as transparent support for RMI, overloading, inheritance and exception handling. For instance, it is common practice to derive a Python class from a class implemented in Fortran in order to override a subset of the member functions.

In order to achieve this, Babel employs a C-based intermediate object representation (IOR). The IOR is exactly the same no matter which pair of languages is being connected. The term 'object' in IOR refers to all of the supported data structures, including structs and enumerations, rather than just objects in the sense of object-oriented programming. The SIDL language uses the term 'class' to describe the latter.

The IOR is essential to achieve scalability across a growing set of languages. Any language binding essentially needs to translate from and to Babel's IOR, thereby achieving full interoperability with all other supported languages. This hub and spokes architecture avoids the n² possible binary interactions between n supported languages. By encoding data and function dispatch rules in a shared middle layer, Babel also ensures common behavioral semantics across all its supported languages.
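The scaling argument can be made concrete with a small counting sketch. It is illustrative only: the function names are hypothetical, and it assumes that a direct design needs one translator per ordered (caller, callee) language pair, same-language pairs included, whereas the hub design needs one stub generator and one skeleton generator per language.

```python
def direct_translators(n: int) -> int:
    """Translators needed if every (caller, callee) language pair is wired directly."""
    return n * n


def hub_translators(n: int) -> int:
    """Translators needed if each language only converts to and from a shared IOR."""
    return 2 * n  # one stub generator plus one skeleton generator per language


# Compare both designs for a growing number of supported languages.
for n in (2, 4, 8):
    print(n, direct_translators(n), hub_translators(n))
```

Under these assumptions, adding a language to the hub design costs a constant amount of work, while the direct design grows with the number of languages already supported.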

Under the hood, the IOR corresponds mainly to the storage layout and calling conventions used in C. The reasons are twofold. Firstly, C allows for fine-grained control of the memory layout of data structures. Secondly, with few exceptions, most notably earlier Fortran standards (Fortran 95 and earlier), almost all languages support some kind of C compatibility layer, effectively making C the lingua franca among programming languages. The IOR representation for dispatch tables is the so-called entry point vector (EPV), which is a record containing function pointers to all the methods of the object. This is comparable with a virtual function table in C++.

Figure 2 depicts the control flow of a local Babel function call. On the client side, a so-called stub is generated that converts arguments to Babel's IOR representation, calls the proper method entry point from the object's EPV and, if necessary, converts return values to the representation used in the original language. On the server side (skeleton), the inverse operations are performed, i.e. arguments are converted from IOR to the particular implementation language, the user-supplied implementation is called and return values are converted back to Babel's IOR. In addition, the skeleton is responsible for catching exceptions thrown in the implementation and converting them to a language-independent representation.
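The stub, EPV and skeleton roles can be sketched in a few lines. This is a toy model, not Babel's generated code: the EPV is modelled as a dictionary of callables, the IOR result as a small record, and all names (`SidlException`, `make_skeleton`, `stub_call`, `norm_impl`, `fail_impl`) are hypothetical.

```python
class SidlException(Exception):
    """Stand-in for a language-independent exception record (hypothetical)."""


def make_skeleton(impl):
    """Server side: convert IOR arguments, call the implementation, convert back."""
    def skeleton(ior_self, ior_args):
        try:
            native_args = [float(a) for a in ior_args]      # IOR -> implementation types
            result = impl(ior_self, *native_args)
            return {"value": result, "exception": None}      # implementation -> IOR
        except Exception as exc:                             # native exceptions never escape
            return {"value": None, "exception": SidlException(str(exc))}
    return skeleton


def norm_impl(self_data, x, y):
    return (x * x + y * y) ** 0.5


def fail_impl(self_data, x, y):
    raise ValueError("bad input")


# The EPV: one record of entry points shared by all objects of a type.
EPV = {"norm": make_skeleton(norm_impl), "fail": make_skeleton(fail_impl)}


def stub_call(obj, method, *args):
    """Client side: build IOR arguments, dispatch through the EPV, convert the result."""
    ior_args = list(args)
    ior_result = obj["epv"][method](obj["data"], ior_args)
    if ior_result["exception"] is not None:
        raise ior_result["exception"]                        # IOR exception -> client exception
    return ior_result["value"]


obj = {"epv": EPV, "data": {}}
print(stub_call(obj, "norm", 3, 4))
```

The client never names the implementation function directly; it only indexes the EPV, which is what allows the same call site to reach an implementation in any language.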

SIDL and Babel have a rich array-type system. Because arrays are such a critical data structure in scientific computing, computational scientists want a Babel array that closely matches their library's internal interface. Due to direct feedback from users, we introduced two different types of arrays in Babel:

SIDL arrays are managed by the Babel runtime and are available in all variations of shape and dimension and stride. In many languages, access to the array elements is provided via a function interface; some language bindings also provide a native interface to access array elements.
Raw arrays (r-arrays) are a low-level alternative to the fully fledged SIDL arrays that allow direct access to the underlying data structures. They provide a trade-off between comfort and performance. For example, in C, a one-dimensional raw single-precision array will be represented as (float *).

Unlike regular SIDL arrays, r-arrays adhere to several constraints. Among other things, they must be contiguous blocks of memory organized in column-major order. Also, they can only be passed in `in` or `inout` mode and must retain their shape across method invocations. These restrictions also apply to structs containing r-arrays, either directly or indirectly via another nested struct. The Babel compiler makes sure that these limitations are satisfied at compile time.
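To illustrate what contiguous column-major storage implies for element addressing, the following sketch computes the linear offset of a multi-dimensional index. The helper name `column_major_offset` is hypothetical and the code is independent of Babel; it only shows the standard indexing rule in which the first index varies fastest.

```python
def column_major_offset(index, shape):
    """Linear offset of `index` in a contiguous column-major array of `shape`."""
    offset, stride = 0, 1
    for i, extent in zip(index, shape):
        if not 0 <= i < extent:
            raise IndexError("index out of bounds")
        offset += i * stride
        stride *= extent  # the first index varies fastest
    return offset


# In a 3 x 4 array, moving down a column advances by 1, moving along a row by 3.
print(column_major_offset((1, 0), (3, 4)), column_major_offset((0, 1), (3, 4)))
```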

Raw arrays can either be of constant size or dynamically sized. If a dynamically sized array is passed as an argument to a function, Babel requires the size of the array to be a function of fields or arguments defined in the SIDL interface definition. This size expression may only contain simple arithmetic operators, constant values and other (integer) arguments of that function call. If an r-array is a field of an SIDL struct, the requirement is that the size expression may only refer to (integer) fields of the same struct. In this way, structs are self-contained and can be passed as arguments to function calls. Because of the memory management restrictions for r-arrays mentioned above, structs containing r-arrays (either directly or via a nested struct) cannot be used as return values of functions or as out-arguments. The Babel compiler will automatically reject such functions. Regular SIDL arrays can be used instead.
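The restriction on size expressions for r-array fields can be pictured as a simple syntactic check. The sketch below is an assumption-laden illustration, not the Babel compiler's actual algorithm: it parses the expression with Python's `ast` module, and the function name `check_size_expression` and the field table are hypothetical.

```python
import ast

# Node types a size expression may contain: arithmetic, integer constants, names.
ALLOWED_NODES = (ast.Expression, ast.BinOp, ast.UnaryOp, ast.Constant, ast.Name,
                 ast.Load, ast.Add, ast.Sub, ast.Mult, ast.Div, ast.USub)


def check_size_expression(expr, fields):
    """Return the field names used by `expr`, or raise ValueError if it is not allowed.

    `fields` maps the field names of the enclosing struct to their type names.
    """
    tree = ast.parse(expr, mode="eval")
    names = set()
    for node in ast.walk(tree):
        if not isinstance(node, ALLOWED_NODES):
            raise ValueError(f"unsupported construct: {type(node).__name__}")
        if isinstance(node, ast.Constant) and not isinstance(node.value, int):
            raise ValueError("only integer constants are allowed")
        if isinstance(node, ast.Name):
            if fields.get(node.id) != "int":
                raise ValueError(f"{node.id!r} is not an integer field of this struct")
            names.add(node.id)
    return names


fields = {"n": "int", "m": "int", "scale": "double"}
print(sorted(check_size_expression("n * m + 1", fields)))
```

Because the accepted expressions mention only integer fields of the same struct, the array size can always be recomputed from the struct itself, which is what makes such structs self-contained.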

The IOR form for structs is a C struct with the IOR type of each data element followed by its name. The IOR form of the struct example from above is shown in figure 3. It is a normal C struct that is the same as the SIDL declaration except that int has been replaced with the IOR type int32_t, a 32-bit C integer.

**Figure 3.** The IOR format (in C) for the struct example from figure 1.
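Since the IOR is an ordinary C struct, any language with a C compatibility layer can describe the same layout. As a minimal sketch, the following uses Python's `ctypes` to mirror a hypothetical struct (three doubles and a 32-bit integer, not the struct from figure 1, which is not reproduced here) and inspects the resulting field offsets.

```python
import ctypes


class PointIOR(ctypes.Structure):
    """ctypes mirror of a hypothetical C struct:
    struct point { double x; double y; double z; int32_t id; };
    """
    _fields_ = [("x", ctypes.c_double),
                ("y", ctypes.c_double),
                ("z", ctypes.c_double),
                ("id", ctypes.c_int32)]


p = PointIOR(1.0, 2.0, 3.0, 7)

# Field offsets follow the platform's C layout rules, so C code sees the same bytes.
offsets = {name: getattr(PointIOR, name).offset for name, _ in PointIOR._fields_}
print(offsets, ctypes.sizeof(PointIOR))
```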

4. Language bindings

The implementation of the language bindings in Babel differs in respect of performance, convenience and level of integration with the host language ('nativeness'). Table 1 gives a high-level overview of the different approaches. More details on the implementation of arrays inside of structs are given in the comparison chart in table 2. The following sections discuss all the language bindings in more detail.

Table 1. Implementation of SIDL structs in languages supported by Babel.

Language	Appearance	Argument passing	Field access	Implementation approach
C	Native	Fast	Fast	`struct` (direct access to IOR)
C++	Native	Fast	Fast	Fully featured class, inheriting IOR `struct`
Fortran 03/08	Native+functions	Fast	Fast (mostly)	Use C interoperability to access IOR via a derived type
Fortran 90/95	Native	Slow	Fast	IOR is copied into an F90 derived type
Fortran 77	Functions	Fast	Slow	Opaque object with access functions for each field
Python	Native	Fast	Slow	IOR is accessed via a C extension type
Java	Native	Slow	Fast	IOR is copied into a Java object
Chapel	Native	Fast (mostly)	Fast	Passed by reference (using BRAID)

Table 2. Implementation of arrays as struct fields in Babel.

Language	SIDL arrays	R-arrays (size expression)	R-arrays (fixed size)
C	Pointer to IOR + access macros	Array pointer (e.g. `int* a;`)	Embedded array (e.g. `int a[42];`)
C++	Template + overloaded []-operator	Array pointer	Embedded array
Fortran 03/08	`bind(C)`-pointer + access functions	`bind(C)`-pointer (e.g. `type(c_ptr);`)	C-interoperable array (e.g. `int (kind=sidl_int), dimension(1) :: a`)
Fortran 77	Opaque pointer with access functions
Fortran 90/95	Native Fortran array or access functions—depending on data type
Python	Numpy arrays [16] or generic sequence types
Java	JNI array class wrapping the IOR
Chapel	Chapel borrowed arrays

4.1. C

In the C language, the raw IOR is presented to the user (e.g. figure 3). In terms of performance, this is the baseline. Since the IOR coincides with the native representation, no conversions are necessary and no performance penalty needs to be paid.

Figure 4 shows what happens when a C client calls a server also implemented in C. When the user writes a Babel method invocation, the client-side stub is invoked. The stub performs an indirect call of the method via the EPV. Since the stub is so tiny, Babel generates it as an inline-attributed function, such that the only overhead is the cost of the indirect function call, which will be inserted by the C compiler in lieu of the Babel method call written by the user.

For local calls, the server-side skeleton is not needed and the EPV points directly to the server implementation. In a remote call, the EPV points to a function that serializes all arguments and pushes them over the network. On the server side, the reverse actions are performed prior to calling the user's server implementation.

4.2. C++

C++ is, for practical purposes, a superset of C, so the C++ binding can use the IOR directly. For the SIDL struct implementation, this means that no performance penalty is paid for conversion. To provide the programmer with a more object-oriented representation, Babel generates a C++ class⁴ that inherits from the IOR. The class has a constructor/destructor pair and access functions for selected fields, and it defines an assignment operator so that copies of the C++ wrapper class can be created easily. The class also has methods used by the RMI functionality to serialize the struct into a string and to deserialize it from one. An example C++ class for the struct in figure 1 is shown in figure 5. Field access functions are provided for fields whose C++ representation is different from the IOR representation. The use of the field access functions is optional (it is still possible to access the publicly inherited fields directly), but, if used, they convert the IOR data types into their C++ equivalents. For example, (char *)-strings are converted to a C++ std::string object, and SIDL arrays are converted to the appropriate instantiations of the sidl::array<> template.

**Figure 5.** Babel-generated C++ interface for the struct in figure 1.

In contrast to the C binding, C++ servers come with an actual skeleton, which is used to convert C++ exception handling into SIDL exception variables. The client-side stub looks strikingly similar to the one used by the C binding, with additional code that converts any SIDL exceptions thrown by the invoked function to a C++ exception.

For C++ programmers, it is worth noting some of the limitations of SIDL structs compared with C++. SIDL does not support templates in structs or any part of the language. This limitation is because it is difficult to imagine how templates could be efficiently implemented or used from languages other than C++ or Java. SIDL also does not allow inheritance for structs.

4.3. Fortran 77

Fortran 77 and Fortran 90/95 do not have the necessary C compatibility layer to allow direct access to struct data. In Fortran 77, a struct is represented by an opaque integer parameter that holds the address of the struct. Fields can be accessed using automatically generated access functions. While this notation is more verbose than, e.g., the field access operator in C, it is still relatively inexpensive: the only performance penalty is a function call per field access. The access functions are implemented in C and are automatically generated by Babel in a way such that they are callable from Fortran. They take care of converting IOR data types to their Fortran equivalents. Figure 6 shows an example of how a field access is performed in legacy Fortran programs.

**Figure 6.** Opaque pointers and access functions in Fortran 77.

Memory management can be tricky in Fortran 77. The Babel compiler uses tagged pointers to determine the ownership of memory. If the least significant bit of a pointer is set, then the associated memory is borrowed. This implementation detail is completely transparent to the user. There is no difference between SIDL arrays and r-arrays in Fortran 77. Both are accessed through the same function interface.
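Pointer tagging of this kind relies on the fact that suitably aligned addresses always have a zero in their least significant bit, which leaves that bit free to carry a flag. The sketch below shows the bit manipulation on plain integers; the helper names are hypothetical and the code makes no claim about how Babel implements the scheme internally.

```python
BORROWED_BIT = 0x1  # least significant bit marks memory that is borrowed


def tag_borrowed(address):
    """Mark an (at least 2-byte aligned) address as referring to borrowed memory."""
    if address & BORROWED_BIT:
        raise ValueError("address must be at least 2-byte aligned")
    return address | BORROWED_BIT


def is_borrowed(tagged):
    return bool(tagged & BORROWED_BIT)


def untag(tagged):
    """Recover the usable address regardless of the ownership flag."""
    return tagged & ~BORROWED_BIT


t = tag_borrowed(0x1000)
print(hex(t), is_borrowed(t), hex(untag(t)))
```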

4.4. Fortran 90/95

In contrast to Fortran 77, Fortran 90 introduces a native representation for structs: Fortran derived types. The resulting interface is very clean. The downside is that the skeleton needs to copy the IOR C-struct into the binary-incompatible Fortran derived type. This means that while access to the fields is inexpensive, passing a struct to a function always involves copying. The skeleton implementation actually requires two indirections. The 'regular' skeleton (in C) takes care of converting the IOR to Fortran 90/95 data types. It then passes each (converted) field as an argument to the second part of the skeleton (flattening), which is implemented in Fortran 90/95 and copies all the arguments into a Fortran derived type (unflattening), which is then passed to the actual server-side implementation. This also works for structs nested within other structs.
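The flattening and unflattening steps can be sketched as follows. The example uses Python dataclasses as stand-ins for the C struct and the Fortran derived type; the types `Inner` and `Outer` and the helper names are hypothetical, and the sketch ignores the per-field type conversions that the real skeleton performs.

```python
from dataclasses import dataclass, fields, is_dataclass


@dataclass
class Inner:
    a: int
    b: float


@dataclass
class Outer:
    tag: int
    inner: Inner
    weight: float


def flatten(value):
    """List the leaf fields depth-first, in declaration order."""
    flat = []
    for f in fields(value):
        item = getattr(value, f.name)
        flat.extend(flatten(item) if is_dataclass(item) else [item])
    return flat


def unflatten(cls, values):
    """Rebuild a (possibly nested) `cls` from its flattened field values."""
    values = iter(values)

    def build(c):
        args = [build(f.type) if is_dataclass(f.type) else next(values)
                for f in fields(c)]
        return c(*args)

    return build(cls)


o = Outer(1, Inner(2, 3.5), 4.5)
print(flatten(o), unflatten(Outer, flatten(o)) == o)
```

Every call copies each field once in each direction, which is why argument passing is slow in this binding even though field access is fast.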

SIDL arrays are generally passed as an opaque pointer with getter/setter functions. The names of these functions are considerably shorter than their Fortran 77 equivalents because they are declared as module procedures, which has the effect of overloading a generic name, as in val = get(array, i, j), and having the compiler decide which function to invoke based on the arguments. SIDL arrays of numeric types are wrapped into a derived type containing an opaque pointer to the IOR and a Fortran pointer to the array's raw data. Since data structure interoperability with C is not standardized, Babel uses libchasm [17] to generate an array descriptor adhering to the Fortran vendor's specific data layout. By populating the array descriptor, Babel provides Fortran 90/95 with the size and shape meta-data that Fortran requires to directly access array data. In Fortran 90/95, r-arrays are always wrapped into SIDL arrays; there is no user-visible difference between the two.

Another peculiarity of the Fortran 90/95 binding is the generation of type modules. Due to limitations of the language, it is necessary to split some derived type declarations (such as the declaration of the Fortran equivalent of an SIDL class) into a separate module. This is necessary to avoid circular dependences, which would occur in situations where SIDL classes or interfaces are passed as method arguments. Apart from the additional file being generated, this has no practical effect for the user.

4.5. Fortran 2003/2008

Fortran 2003 adds C interoperability via the `bind(C)` attribute and the `iso_c_binding` intrinsic module. Babel uses this feature to generate derived types that are binary compatible with the C representation. This eliminates all of the copying necessary for Fortran 77 and Fortran 90/95 language bindings. An example is shown in figure 7. This combines the performance of direct access with the convenience of a native data type. Since some data types (such as Boolean and Character types) are still not binary compatible, access functions are still generated, but they need only be used for these specific types. In contrast to the older Fortran versions, the Babel compiler generates skeletons for Fortran 2003/2008 servers directly in Fortran instead of C. Because the Fortran 2003/2008 language only allows interoperable functions to return scalar values [18], the Babel compiler generates for these functions an additional wrapper of the Fortran skeleton in C, which converts an out-parameter to the return value.

**Figure 7.** Fortran 2003/2008 `bind(C)`-interoperability with C structs.

4.6. Python

In Python, a C extension module for a Python object resembling the struct is generated. The extension module translates each access to a member of the Python object to an access of the corresponding field in the underlying IOR. The C extension also converts Python objects to the IOR. It is, for instance, possible to assign a Python list to an array field in a struct:

`myStruct.doubleArray = [ 1.0, 2.0, 3.0 ]`

or we can even write

`myStruct.objectArray = [ sidl.BaseClass.BaseClass() ]`

In this example, objectArray is a field of the struct myStruct that holds an array of SIDL objects. We assign to it a list containing a new instance of the generic SIDL base class.

The skeleton performs the necessary type conversions, acquires the Python interpreter's global interpreter lock and starts the interpretation of the server code. By convention, Babel expects server implementations to return a tuple of return value and all out-attributed parameters. Upon completion, the skeleton copies the elements of the return tuple back into their corresponding out-parameters. It also handles the conversion of possible Python exceptions into their SIDL counterparts. The C extension module takes care of object (de-)serialization and of translating Python field accesses into the appropriate actions on the IOR.
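The return-tuple convention can be illustrated with a pure-Python stand-in for the skeleton. This is only a sketch: the real skeleton is generated C code that also acquires the global interpreter lock and converts exceptions, and the names `solve_impl` and `skeleton_call`, as well as the one-element lists used to model out-parameters, are hypothetical.

```python
def solve_impl(a, b):
    """Hypothetical server method, SIDL-like signature:
    bool solve(in double a, in double b, out double x)
    Returns (return value, out-parameter x).
    """
    if a == 0.0:
        return (False, 0.0)
    return (True, -b / a)


def skeleton_call(impl, in_args, out_slots):
    """Call `impl` and copy the tail of its result tuple into the out-parameter slots."""
    result = impl(*in_args)
    retval, outs = result[0], result[1:]
    if len(outs) != len(out_slots):
        raise TypeError("implementation returned the wrong number of out values")
    for slot, value in zip(out_slots, outs):
        slot[0] = value  # a one-element list models an out-parameter
    return retval


x = [None]
ok = skeleton_call(solve_impl, (2.0, -4.0), [x])
print(ok, x[0])
```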

There is no difference between SIDL arrays and r-arrays in Python, but for numeric data types, the Babel compiler uses the more efficient Numpy arrays [16] instead of regular Python sequence types.

4.7. Java

The Java binding uses an approach similar to the Python binding: a copy or reference to the IOR is used to create a Java object using the Java Native Interface (JNI) [19]. This makes passing a struct to/from a Java method more expensive but makes field access inexpensive because it does not go through the JNI. The stub uses the JNI to convert between the IOR and an object residing in the Java Virtual Machine.

As shown in figure 8, an SIDL struct is represented as a Java class with the Java counterparts of all the IOR-struct's fields as public members. The class also contains a public inner Holder class used for out and inout arguments. Since Java does not support pointers, this class can be used to 'hold' the struct in these cases. A code example is shown in figure 9.

**Figure 9.** Using a *Holder* class instead of pointers.

4.8. Chapel

Chapel is a high-level parallel programming language that implements the partitioned global address space model. Its development is led by Cray Inc., originally as part of the DARPA HPCS program [20]. The Chapel compiler is currently implemented as a source-to-source compiler that generates C intermediate code. Our Chapel language binding is currently based on interoperability with the generated C code. Our work targets the development version of Chapel [21] from their source code repository, the 1.4.0 beta branch, and it requires a small patch that we contributed back to the Chapel development team.

Babel does not directly support Chapel; however, the closely related BRAID tool [22, 23] does generate Babel-compatible bindings for this language. BRAID is a new framework for language interoperability that can generate glue that is backwards-compatible with Babel. We extended the BRAID compiler to support struct arguments. We use Chapel's record datatype for this (cf figure 10). If all fields are interoperable with Chapel, structs are passed by reference⁵ to the Chapel implementation. If for any field in a struct Babel's IOR and the Chapel representation are not the same (e.g. bools, objects or arrays), a copy is generated in the stub or skeleton, and the fields are converted to the respective other representation. Arrays are converted to Chapel borrowed arrays, a feature provided by BRAID that wraps new Chapel array meta-data around existing data without copying it.

**Figure 10.** Chapel representation of an SIDL struct.

4.9. Remote method invocation

In section 2, we mentioned that Babel also supports transparent RMI [15]. We fully support RMI with our structs extension for all the languages covered by Babel. A remote method call involves serializing the arguments into a byte-stream, which is transmitted over the network. On the server side, the data are unpacked and the method implementation is invoked. Structs are serialized by packing all fields in a first-to-last, left-to-right order. Structs are essentially fixed-shape trees. A struct nested inside another struct is serialized in place by calling its respective serialization function. The server side converts the byte-stream into a struct by unpacking the struct elements first-to-last, left-to-right.
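The serialization order described above amounts to a depth-first traversal of the struct's fixed shape. The sketch below illustrates it with Python's `struct` module; the schema representation, the function names and the choice of network byte order are assumptions made for the example and do not describe Babel's actual wire format.

```python
import struct

# Hypothetical schema: (field name, struct-module format code or nested schema).
INNER = [("a", "i"), ("b", "d")]
OUTER = [("tag", "i"), ("inner", INNER), ("weight", "d")]


def pack(schema, value):
    """Serialize the fields first-to-last; nested structs are packed in place."""
    out = b""
    for name, kind in schema:
        if isinstance(kind, list):
            out += pack(kind, value[name])
        else:
            out += struct.pack("!" + kind, value[name])  # network byte order
    return out


def unpack(schema, data, offset=0):
    """Rebuild the struct by reading the fields back in the same order."""
    value = {}
    for name, kind in schema:
        if isinstance(kind, list):
            value[name], offset = unpack(kind, data, offset)
        else:
            (value[name],) = struct.unpack_from("!" + kind, data, offset)
            offset += struct.calcsize("!" + kind)
    return value, offset


v = {"tag": 1, "inner": {"a": 2, "b": 3.5}, "weight": 4.5}
data = pack(OUTER, v)
print(len(data), unpack(OUTER, data)[0] == v)
```

Because both sides walk the same schema in the same order, no field names or type tags need to be transmitted in this simplified model.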

5. Experimental evaluation

To determine the performance impact of the design decisions discussed in section 4, we constructed a suite of benchmarks with structs of different sizes and with different data types. The different instances of the benchmarks were automatically generated from a language-independent intermediate representation with the help of BRAID's code generator [23]. For each of the data types $t \in \{\mathrm{bool}, \mathrm{int}, \mathrm{float}, \mathrm{string}\}$, we generated SIDL definitions for structs containing $n \in \{1\ldots 128\}$ fields of that type.

In the 'call/t' benchmark, a struct $A = \{a_0, \ldots , a_n\}$ (with $a_i$ of type $t$) is passed to a no-op function, to measure the argument conversion overhead. In the 'access/t' benchmark, the benchmark function accepts a struct $A = \{a_0, \ldots , a_n\}$ as an in-argument and returns the field-reversed $A' = \{a_n, \ldots , a_0\}$ in an out-argument. Upon start-up, the struct fields are initialized to true, $i$, or a 16-character string filled with 13 spaces and $i$ printed to 3 digits, depending on their data type. The 'bsort/int' benchmark shows what happens if there are many (O(n²)) field accesses in the server function. This benchmark takes a struct of n integer fields as in-argument and returns a sorted struct as out-argument. The sorting algorithm is a naive bubble sort that has a quadratic worst-case behavior. The input is always reverse-sorted. Including the copying operation from input argument to output argument, this results in a total of 2n + n² field accesses.
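The server-side work of the 'access' and 'bsort' benchmarks can be modeled in a few lines of Python. This is only a sketch of the behavior described above, not the generated benchmark code: a list stands in for the struct fields, the function names are hypothetical, fields are indexed from 0 to n-1, and the zero padding and placement of the digits in the string initializer are assumptions.

```python
def init_fields(n, type_name):
    """Initial field values: true, i, or a 16-character string containing i."""
    if type_name == "bool":
        return [True for _ in range(n)]
    if type_name == "int":
        return list(range(n))
    if type_name == "float":
        return [float(i) for i in range(n)]
    if type_name == "string":
        # 13 spaces followed by i in 3 digits (zero padding is an assumption).
        return [" " * 13 + f"{i:03d}" for i in range(n)]
    raise ValueError(f"unsupported type: {type_name}")


def access(fields):
    """'access/t': return the field-reversed struct as the out-argument."""
    return fields[::-1]


def bsort(fields):
    """'bsort/int': copy the in-argument, then sort it with naive bubble sort."""
    out = list(fields)
    n = len(out)
    for i in range(n):
        for j in range(n - 1 - i):
            if out[j] > out[j + 1]:
                out[j], out[j + 1] = out[j + 1], out[j]
    return out


# The benchmark input is always reverse-sorted.
sorted_fields = bsort(access(init_fields(8, "int")))
```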

The client implementation in all the benchmarks is always written in C. Since C always has the least overhead involved, this ensures a fair comparison of the different Babel language bindings.

The plots in figures 11–14 show the number of instructions executed on an x86-64 machine⁶. This number was measured by querying the instructions-performance counter provided by the perf [24] interface of Linux 2.6.32. In order to eliminate the instructions used for start-up and initialization, the instruction count of one execution of the benchmark program with one iteration was subtracted from the median of ten runs with 10⁶ + 1 iterations each. The result was divided by 10⁶ and plotted. The plots are logarithmic in both axes. The x-axis denotes the number of fields in the struct. The y-axis shows the number of instructions executed by the benchmark (lower values are better). Table 3 shows which Babel basic types require marshaling when passing data from the given language to the IOR and reverse. This helps explain the results shown in the figures.
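The start-up correction described above amounts to a short calculation, sketched here in Python. The function name and the sample counter values are made up for illustration; only the formula follows the text.

```python
from statistics import median

ITERATIONS = 10**6


def instructions_per_call(single_run_count, long_run_counts):
    """Per-iteration instruction count with start-up cost removed.

    single_run_count: counter value for one run with a single iteration.
    long_run_counts:  counter values for the runs with ITERATIONS + 1 iterations.
    """
    return (median(long_run_counts) - single_run_count) / ITERATIONS


# Made-up numbers: 50,000 start-up instructions plus 120 per iteration.
startup, per_call = 50_000, 120
single = startup + per_call
long_runs = [startup + per_call * (ITERATIONS + 1)] * 10
result = instructions_per_call(single, long_runs)
```

Subtracting the one-iteration run removes both the start-up cost and one iteration, which is why the long runs use 10⁶ + 1 iterations and the difference is divided by exactly 10⁶.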

**Figure 11.** Passing and accessing a struct of n booleans.

**Figure 12.** Passing and accessing a struct of n floats.

**Figure 13.** Passing and accessing a struct of n strings.

**Figure 14.** Quadratic number of accesses of n integers.

Table 3. Babel basic types and whether they require marshaling for different supported languages. Yes indicates marshaling is required and no indicates it is not. C++ methods refer to accessing struct fields via accessor methods that convert to the C++ representation.

Type	C	C++	C++ methods	Chapel	F77 and F90/95	Fortran 2003	Python	Java
Array	No	No	Yes	Yes	Yes	Yes	Yes	Yes
Bool	No	No	No	Yes	Yes	Yes	Yes	Yes
Char	No	No	No	Yes	Yes	Yes	Yes	Yes
Class	No	No	No	Yes	Yes	Yes	Yes	Yes
Dcomplex	No	No	Yes	Yes	Yes	No	Yes	Yes
Double	No	No	No	No	Yes	No	Yes	Yes
Enum	No	No	No	Yes	Yes	Yes	Yes	Yes
Fcomplex	No	No	No	Yes	Yes	No	Yes	Yes
Float	No	No	No	No	Yes	No	Yes	Yes
Int	No	No	No	No	Yes	No	Yes	Yes
Interface	No	No	Yes	Yes	Yes	Yes	Yes	Yes
Long	No	No	No	No	Yes	No	Yes	Yes
Opaque	No	No	No	No	Yes	No	Yes	Yes
String	No	No	Yes	No	Yes	Yes	Yes	Yes
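As a reading aid, three columns of Table 3 can be written as a lookup that reports which fields of a struct would need conversion in a given language. This is a sketch only: the dictionary transcribes the C, Chapel and Fortran 2003 columns, while the function name and the sample struct are hypothetical and not part of Babel or BRAID.

```python
# Types that require marshaling, transcribed from three columns of Table 3.
NEEDS_MARSHALING = {
    "C": set(),
    "Chapel": {"array", "bool", "char", "class", "dcomplex", "enum",
               "fcomplex", "interface"},
    "Fortran 2003": {"array", "bool", "char", "class", "enum",
                     "interface", "string"},
}


def fields_to_convert(language, field_types):
    """Return the field types of a struct that need conversion in `language`."""
    return [t for t in field_types if t in NEEDS_MARSHALING[language]]


# Hypothetical struct with four fields.
example = ["int", "double", "bool", "string"]
chapel_conversions = fields_to_convert("Chapel", example)
fortran_conversions = fields_to_convert("Fortran 2003", example)
```

An empty result corresponds to the case in which a struct can be handed to the implementation without a converted copy.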

The benchmarks reflect many of the considerations put forward in the previous section:

C is the fastest of the implementations, since it operates directly on the bare IOR.
In C++, a constant overhead⁷ has to be paid due to the way the language binding is implemented, i.e. the method dispatch mechanism goes through a wrapper function that encapsulates the called method in a try/catch-block, where possible C++ exceptions are translated into SIDL exceptions.
Due to the C interoperability, Fortran 2003/2008 is—for most data types—also offset from C only by a constant amount. One exception is the access/bool benchmark, which uses getter/setter functions because of the incompatible binary representation of truth values between the two languages. In the Fortran 2003/2008 case, the overhead is not paid for exception handling but for casting C pointers to their Fortran counterparts, transparently performed by the skeleton wrapper function.
Fortran 77 has a low function call overhead, but a high cost for field accesses. In the 'call'-test cases it is even faster than Fortran 2003/2008, but the cost for the field access (cf the 'access' benchmarks) is higher because of the additional function call.
The copy operation performed by the Fortran 90/95 implementation makes it stand out in all the 'call'-test cases. Although this is obscured by the log/log scale of the plot, the overhead is actually linear (as one would expect from a copy operation). The overhead can be neglected if all the struct fields are accessed, as can be seen in the 'access'-benchmarks.
Python and Java incur the most overhead. In Java, field access is considerably less expensive than in Python, but the function call overhead is higher. Function calls in Java are expensive because of the conversion of the arguments from IOR to JNI objects. For higher⁸ workloads, however, the just-in-time-compiled Java version quickly overtakes the interpreted Python implementation.
The 'bsort' benchmark shows what happens when there are many field accesses. This benchmark makes it clear that the copy overhead incurred by the Fortran 90/95 implementation (the skeleton copies the IOR into a native derived type) becomes negligible when there is a sufficient amount of computation in the function. Particularly interesting is also the performance of the Java language binding, which shows that the asymptotic behavior of Java is closer to that of native languages when the workload becomes significant.
Chapel's peculiar performance (observe the jump at float n ∼ 16) needs to be explained. At the time of writing, the Chapel compiler generates C code that always copies outgoing records from a temporary local copy into the outgoing argument. For this, a helper function is generated that copies the record field by field. If this function grows above a certain threshold, GCC (which was used to compile the C code generated by the Chapel compiler) will no longer inline the helper function and thus will no longer eliminate the unwarranted copy operation. According to the Chapel developers,⁹ proper pass-by-reference semantics will be implemented in a future release to alleviate this issue. The bool benchmarks show the linear effort associated with copying the struct into the native Chapel format, which is not necessary for the float datatype.

Because the conversions from and to IOR often involve copying operations, strings are more expensive than other data types. This is reflected by the benchmarks in figure 13: for all languages other than C and C++, the copy operation dominates the instruction count, making them virtually indistinguishable in performance. Comparing the 'call' with the 'access' test case shows that Fortran 2003/2008 does not copy the strings unless accessed.

In section 1 we argued that because of the direct field access, structs offer a better performance compared to (SIDL) classes. The performance of the Fortran 77 language binding with its getter/setter function interface is also a lower bound for the performance of a class. In a class each member access would have to be another Babel method call.

5.1. Code size considerations

Scientific computing interfaces are often bound to specific mathematical models that render software engineering principles such as encapsulation (data hiding) counterproductive. By using a struct instead of an SIDL class, the interface is directly exposed and accessible and not hidden behind a pointless access function that does little more than wrapping class members.

Compared to using SIDL classes, structs can therefore significantly reduce the amount of glue code in an application: glue code that would have to be written by the user. As a consequence, they effectively reduce code size in most languages.

Structs are slightly more memory efficient than an equivalent SIDL class with getter/setter methods. An SIDL class must also carry meta information such as pointers to its EPV and the EPV of each parent class. A struct occupies the space occupied by its fields plus any padding introduced by the compiler to address alignment requirements of the processor.
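The relation between field sizes and compiler-introduced padding can be observed with Python's `ctypes` module, which lays out a `Structure` according to the platform's C rules. The struct `Sample` and its two fields are hypothetical and unrelated to any Babel-generated type.

```python
import ctypes


class Sample(ctypes.Structure):
    # Hypothetical struct: a 1-byte field followed by an 8-byte field.
    _fields_ = [("flag", ctypes.c_char), ("value", ctypes.c_double)]


# Space taken by the fields themselves.
field_bytes = sum(ctypes.sizeof(ftype) for _name, ftype in Sample._fields_)

# Space taken by the struct as laid out for the current platform.
total_bytes = ctypes.sizeof(Sample)

# Whatever is left over is alignment padding.
padding_bytes = total_bytes - field_bytes
```

On a typical x86-64 ABI the double is aligned to 8 bytes, so padding is inserted after `flag`; the exact amount is platform-dependent. An SIDL class would carry its EPV pointers in addition to this.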

6. A case study: fluxgrid

To demonstrate the real-world practicability of this feature, we now report our experience with a physics code from Los Alamos National Laboratory which was wrapped into a Babel component by Tech-X Corporation [25, 26]. The program describes itself as follows [27]:

fluxgrid is a code for reading in output from Grad–Shafranov equilibrium solvers and producing useful output for other codes. There are interfaces to a large number of commonly-used equilibrium codes, both direct and indirect, and adding interfaces to new codes is usually very simple. It currently can be used as either a standalone code or as a library which can be embedded into other codes such as nimset.

This code is a typical example of the type of programs being componentized via Babel. It consists of about 20 Fortran 90/95 modules, which are connected via a function interface that uses Fortran derived types (structs) to exchange data between the modules. The SIDL file, which is too long for inclusion in this article, contains 15 different struct definitions, some of them nested. The median number of struct fields is 19; the largest struct counts 53 fields, the smallest only 4. Combined, the structs contain 105 (SIDL) arrays and 8 other structs (which are also defined in the same file). The arrays have a dimensionality of up to 5. The most common base types for struct fields are int, double and string. To aid the task of defining the interface, the developers at Tech-X crafted a Python script that parses the Fortran sources and automatically generates the SIDL file with all the derived types used by fluxgrid's interfaces.

An external constraint of the design was that the resulting code must compile with a selection of legacy compilers. The 'babelized' version therefore uses the Fortran 90/95 binding instead of the more efficient Fortran 2003/2008 binding. Since the majority of the data are encapsulated inside of SIDL arrays (cf section 4.4), the overhead is still acceptable. Another design decision was that the existing Fortran 90/95 sources were not to be modified in the process. For this reason, an additional layer of Fortran 90/95 glue code was added, which translates the derived types from the Babel method arguments into the derived types used by the original Fortran implementation. With the Fortran 2003/2008 language binding, the SIDL arrays could be replaced by r-arrays and this copy operation could have been eliminated.

Our measurements include this additional overhead. Using the geqdsk input set provided by Tech-X, we measured a 1.3% overhead for calling the Babel version of fluxgrid from a driver written in C++, whereas the original version is driven by a Fortran program that calls the library functions directly.

This example shows that the Babel struct extension was successfully used in practice to wrap existing code into a well-defined component interface, while retaining the original data layout¹⁰ for input and output. It is now possible to orchestrate the Fortran 90/95 core directly from C++ and Python, which enables a much tighter coupling between components written in different programming languages than previously possible.

7. Outlook

With the addition of struct data types, Babel comes one step closer to providing a full programming ecosystem between multiple languages. Using classes to exchange data is often overkill; structs allow users to write more understandable and compact code that will also have higher performance. Babel's struct support degrades gracefully (to auto-generated classes or function interfaces) for languages that do not support such a feature natively. The highest performance is achieved by the C, C++, Fortran 2003/2008, and (under certain circumstances) the Chapel language bindings. They provide zero-copy, direct access struct implementations that are set off only by a constant factor in most cases (comparable to a native call in C).

By extending Babel with this feature, we provide the computational scientist with another (and much requested) choice of data structures for their interfaces. The measurements included in this paper show the detailed performance trade-offs for different data types and programming languages. We hope that this work makes choosing the most appropriate representation for a specific domain a little easier.

Acknowledgments

We thank Scott Kruger and Roopa Pundaleeka from Tech-X Corporation for providing us with their version of fluxgrid and for sharing their experiences with using the Babel extensions presented in this paper. This work was performed under the auspices of the US Department of Energy by the Lawrence Livermore National Laboratory under contract No. DE-AC52-07NA27344. We gratefully acknowledge funding from the DOE's Office of Advanced Scientific Computing Research.
