Native methods in the JDK

Richard Elkins edited this page Sep 8, 2024 · 7 revisions

As mentioned in the post about the project's three-year anniversary, one of the biggest (and rather unexpected) challenges of writing a JVM is how many of the JDK library methods are native functions. Calling them from golang is quite a challenge.

In an attempt to do this without writing non-golang code, we've examined CGO and the excellent purego library, as well as the JDK facilities: JNI (the Java Native Interface, which is being phased out) and FFM (the Foreign Function and Memory API, previewed in Java 21 and finalized in Java 22 as the replacement for JNI). @texadactyl has written a lot of code testing these solutions, each of which suffers from an important drawback.

As of v. 0.6.0, we're trying a new approach. (Note: This is experimental and might be discarded. If so, we'll update this page.)

The motivation

The fundamental problem is that despite the addition of FFM in Java 21, the JDK still relies significantly on JNI to call C-based functions. These functions are part of the JDK distribution and are located in .so files (on Unix and Linux), .dylib files (on macOS), and .dll files (on Windows). Calling native functions through JNI has all the limitations that were articulated in the various JEPs (JDK Enhancement Proposals) for replacing it: JNI is very finicky, very brittle, and poorly documented.

Until now, we've used gfunctions as a substitute. gfunctions are native functions rewritten in golang that replace native method implementations. This substitution mostly works well, but quite a few native methods are complicated and, frankly, so poorly documented that it's difficult to replace them with a golang equivalent. In addition, let's not overlook the quantity aspect. The JDK has more than 2,500 methods, many of which are native, and there is little joy in rewriting them all. So, the ability to call the native method directly has considerable appeal.
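To make the gfunction idea concrete, here is a minimal sketch of how a golang replacement for a native method could be registered and invoked. The type `gfunction`, the table `gfunctions`, the helper `invokeG`, and the key format are all hypothetical names chosen for this illustration; they are not necessarily what Jacobin uses.

```go
package main

import (
	"fmt"
	"time"
)

// gfunction is a hypothetical shape for a Go replacement of a native
// method: it receives the Java arguments and returns the Java result.
type gfunction func(params []any) any

// gfunctions maps a class, method name, and descriptor to its Go
// replacement. The key format is an assumption made for this sketch.
var gfunctions = map[string]gfunction{
	// System.currentTimeMillis() is declared native in the JDK.
	"java/lang/System.currentTimeMillis()J": func(_ []any) any {
		return time.Now().UnixMilli()
	},
}

// invokeG runs the Go replacement if one is registered and reports
// whether a replacement was found.
func invokeG(method string, params []any) (any, bool) {
	g, ok := gfunctions[method]
	if !ok {
		return nil, false
	}
	return g(params), true
}

func main() {
	if v, ok := invokeG("java/lang/System.currentTimeMillis()J", nil); ok {
		fmt.Println("milliseconds since the epoch:", v)
	}
}
```

A method with no entry in the table is exactly the case where calling the original native function becomes attractive.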

Presently, Jacobin executes two kinds of methods: regular Java methods and gfunctions. We're adding a new category, which, for lack of a better term, we refer to as native functions.

Creating a map of native functions

Our first step is to read the library files and find out which functions they contain. We'll take this data and put it into a function-library cross reference file. The data will consist of a map whose keys are the function names and whose values are the library files in which those functions are located. This file will be stored in the directory pointed to by the environment variable JACOBIN_HOME.
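The shape of that cross reference data can be sketched as follows. This is only an illustration under several assumptions: the type names `xref` and `libSymbols`, the helper names, and the choice of gob encoding are ours for this example, and the symbol listing is made-up sample data rather than the result of scanning a real JDK. (On Linux, Go's `debug/elf` package is one possible way to obtain the exported symbol names.)

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

// xref is a hypothetical layout for the cross reference file.
type xref struct {
	JDKVersion string            // JDK release the libraries came from
	FuncToLib  map[string]string // function name -> library file
}

// libSymbols lists the functions found in one library file.
type libSymbols struct {
	Library string
	Symbols []string
}

// buildXref inverts the per-library listing into function -> library.
// If two libraries export the same name, the first one listed wins.
func buildXref(version string, libs []libSymbols) xref {
	x := xref{JDKVersion: version, FuncToLib: make(map[string]string)}
	for _, l := range libs {
		for _, s := range l.Symbols {
			if _, seen := x.FuncToLib[s]; !seen {
				x.FuncToLib[s] = l.Library
			}
		}
	}
	return x
}

// encode serializes the table; the bytes would be written to a file
// under JACOBIN_HOME.
func encode(x xref) ([]byte, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(x); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// decode restores a table from previously encoded bytes.
func decode(data []byte) (xref, error) {
	var x xref
	err := gob.NewDecoder(bytes.NewReader(data)).Decode(&x)
	return x, err
}

func main() {
	// Illustrative sample data only.
	libs := []libSymbols{
		{Library: "libzip.so", Symbols: []string{"Java_java_util_zip_CRC32_update"}},
		{Library: "libjava.so", Symbols: []string{"Java_java_lang_Runtime_availableProcessors"}},
	}
	data, err := encode(buildXref("21", libs))
	if err != nil {
		panic(err)
	}
	back, err := decode(data)
	if err != nil {
		panic(err)
	}
	fmt.Println(back.FuncToLib["Java_java_util_zip_CRC32_update"])
}
```

Storing the JDK version alongside the map is what lets a later run detect that the file was built for a different JDK.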

When the first native function needs to execute, Jacobin will load the cross reference file into memory, look up which library contains the function, load that library, and execute the function. If the cross reference file does not exist or represents a different version of the JDK, a new cross reference file will be created and stored in JACOBIN_HOME. (We use a similar technique for looking up methods in the standard libraries to determine which jmod file they're stored in. So, this is all familiar ground to us.)
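The lookup flow just described can be sketched like this. It is a simplified illustration, not the actual implementation: `resolver`, `libraryFor`, `errNoFile`, and the `load` and `rebuild` callbacks are hypothetical names, the file access is replaced by stub functions, and the sketch ignores concurrency.

```go
package main

import (
	"errors"
	"fmt"
)

// xref is a hypothetical in-memory form of the cross reference file.
type xref struct {
	JDKVersion string
	FuncToLib  map[string]string
}

// errNoFile stands in for "the cross reference file does not exist".
var errNoFile = errors.New("cross reference file not found")

// resolver caches the table after the first native function lookup.
// It is not safe for concurrent use; a real version would need locking.
type resolver struct {
	jdkVersion string
	load       func() (xref, error) // reads the stored file
	rebuild    func() (xref, error) // rescans the libraries, rewrites the file
	table      *xref
}

// libraryFor returns the library file that contains fn. On the first
// call it loads the table, rebuilding it when it is missing or stale.
func (r *resolver) libraryFor(fn string) (string, error) {
	if r.table == nil {
		x, err := r.load()
		if err != nil || x.JDKVersion != r.jdkVersion {
			x, err = r.rebuild()
			if err != nil {
				return "", err
			}
		}
		r.table = &x
	}
	lib, ok := r.table.FuncToLib[fn]
	if !ok {
		return "", fmt.Errorf("native function %q not found in any library", fn)
	}
	return lib, nil
}

func main() {
	r := &resolver{
		jdkVersion: "21",
		load:       func() (xref, error) { return xref{}, errNoFile },
		rebuild: func() (xref, error) {
			return xref{
				JDKVersion: "21",
				FuncToLib:  map[string]string{"nativeFn": "libexample.so"},
			}, nil
		},
	}
	lib, err := r.libraryFor("nativeFn")
	fmt.Println(lib, err)
}
```

Once the library name is known, loading the library and executing the function is a separate step, handled by whatever call mechanism is chosen.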

Execution

At present, we expect to use the purego library to execute the function from golang. We believe that it will satisfy all our needs except for variadic functions (that is, functions with a variable number of arguments), which purego does not support. Because there is a fair amount of ceremony in a purego call, we expect to write a series of templates, each of which handles all the ceremony for one signature: for example, a function that takes a string and returns a string, or one that takes two ints and returns a boolean. In most cases, we'll then select the template that matches the function's signature and invoke it with the function's name. There will surely be one-off signatures, and we'll probably write those out by hand.
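One way the per-signature templates could look is sketched below. To keep the example runnable without a native library, the actual binding step is abstracted behind a hypothetical `binder` function; in a real build, that is presumably where purego's `RegisterLibFunc` would be called. The names `binder`, `template`, `templates`, `callNative`, and `fakeBind`, and the use of JVM descriptors as template keys, are assumptions made for this illustration.

```go
package main

import "fmt"

// binder fills the typed function pointer fptr with the native function
// called name. Here it is a stand-in for the real binding step.
type binder func(fptr any, name string)

// template adapts untyped arguments to one concrete native signature.
type template func(bind binder, name string, args []any) (any, error)

// templates holds one entry per supported signature.
var templates = map[string]template{
	// Takes two ints and returns a boolean.
	"(II)Z": func(bind binder, name string, args []any) (any, error) {
		if len(args) != 2 {
			return nil, fmt.Errorf("%s: expected 2 arguments, got %d", name, len(args))
		}
		a, okA := args[0].(int32)
		b, okB := args[1].(int32)
		if !okA || !okB {
			return nil, fmt.Errorf("%s: expected two int32 arguments", name)
		}
		var fn func(int32, int32) bool
		bind(&fn, name)
		if fn == nil {
			return nil, fmt.Errorf("%s: could not be bound", name)
		}
		return fn(a, b), nil
	},
	// Takes a string and returns a string.
	"(Ljava/lang/String;)Ljava/lang/String;": func(bind binder, name string, args []any) (any, error) {
		if len(args) != 1 {
			return nil, fmt.Errorf("%s: expected 1 argument, got %d", name, len(args))
		}
		s, ok := args[0].(string)
		if !ok {
			return nil, fmt.Errorf("%s: expected a string argument", name)
		}
		var fn func(string) string
		bind(&fn, name)
		if fn == nil {
			return nil, fmt.Errorf("%s: could not be bound", name)
		}
		return fn(s), nil
	},
}

// callNative selects the template for sig and invokes it with the
// function's name. Signatures without a template need hand-written code.
func callNative(bind binder, sig, name string, args []any) (any, error) {
	t, ok := templates[sig]
	if !ok {
		return nil, fmt.Errorf("no template for signature %s", sig)
	}
	return t(bind, name, args)
}

// fakeBind supplies pure-Go functions so the sketch runs on its own.
func fakeBind(fptr any, name string) {
	switch p := fptr.(type) {
	case *func(int32, int32) bool:
		*p = func(a, b int32) bool { return a < b }
	case *func(string) string:
		*p = func(s string) string { return name + ":" + s }
	}
}

func main() {
	v, err := callNative(fakeBind, "(II)Z", "lessThan", []any{int32(1), int32(2)})
	fmt.Println(v, err)
}
```

The design point is that the argument checking and pointer declaration are written once per signature, so adding another native function with a known signature requires only its name.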

Additional notes

As mentioned earlier, this approach with the cross reference table and purego is entirely new, although @texadactyl has experience with purego and we both have experience with the jmod/class gob file; so we're confident that this will work. Above, we implied that due to the large number of native functions, we would benefit from having this solution, rather than rewriting everything by hand.

However, this conveys an incomplete picture. Ultimately, we expect that a large percentage of the native functions will indeed be rewritten in golang as gfunctions. The gfunction approach has two benefits: 1) calling the functions is easy, 2) it's far easier to debug them in golang than to tunnel through a library to access them and find out why something is not working. However, the proposed map of native functions allows us to get code up and running without sidetracking us every time into writing gfunctions. The map technique will be retained to the extent possible for methods that heavily use JNI conventions.