Native golang functions methods

Andrew Binstock edited this page May 26, 2024 · 5 revisions

While we anticipated many of the difficulties we've faced writing Jacobin, one we didn't foresee is how much the JDK relies on native functions. For the time being, we've replaced many of those native functions with equivalents written in golang. We call these gfunctions. While they're native, in the sense they're not executed inside the JVM proper, we don't call them native functions, because that term is reserved for the JDK's native functions (written in C and C++), which we might eventually call directly from within Jacobin.

gfunction anatomy

gfunctions are called in the interpreter much as Java methods are. The gfunction is looked up in the MTable, a large permanent cache containing all functions/methods that have been called (see mtable.go). Each time a function is called that's not in the MTable, it's first added to the MTable and then executed. So that Jacobin runs gfunctions rather than the JDK's native functions, we preload all the gfunctions into the MTable at Jacobin startup (see MTableLoadNatives(MTable *classloader.MT) in gfunction.go). In the MTable, all functions are defined as follows:

type MTentry struct {
	Meth  MData // the method data
	MType byte  // method type, G = Go method, J = Java method
}

The MData (i.e., method data) for a gfunction is defined this way (see gfunction.go):

type GMeth struct {
	ParamSlots   int
	GFunction    func([]interface{}) interface{}
	NeedsContext bool
}

The items ParamSlots and NeedsContext are explained in the next section. GFunction is the function itself (in actual fact, it's a pointer to a function). As can be seen, all gfunctions have the same signature: they receive a slice of empty interfaces (the golang version of anything) and they return a single empty interface. As you'd expect, gfunctions that take no parameters are passed an empty slice, and gfunctions that are declared void return nil.
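To make the shared signature concrete, here is a minimal sketch of two gfunctions and their GMeth entries. The functions addInts and doNothing are hypothetical names invented for illustration, not functions from Jacobin, and the use of int64 for integer parameters is an assumption.

```go
package main

import "fmt"

// GMeth mirrors the struct shown above.
type GMeth struct {
	ParamSlots   int
	GFunction    func([]interface{}) interface{}
	NeedsContext bool
}

// addInts is a hypothetical gfunction that takes two int64 parameters.
// Each parameter arrives as an empty interface and must be type-asserted.
func addInts(params []interface{}) interface{} {
	a := params[0].(int64)
	b := params[1].(int64)
	return a + b
}

// doNothing is a hypothetical void gfunction with no parameters.
// It ignores the (empty) parameter slice and returns nil.
func doNothing(params []interface{}) interface{} {
	return nil
}

func main() {
	add := GMeth{ParamSlots: 2, GFunction: addInts}
	noop := GMeth{ParamSlots: 0, GFunction: doNothing}

	fmt.Println(add.GFunction([]interface{}{int64(2), int64(3)})) // prints 5
	fmt.Println(noop.GFunction([]interface{}{}))                  // prints <nil>
}
```

Because every gfunction has the same type, entries for very different methods can sit side by side in one table and be called through the same code path.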

When a gfunction is called, Jacobin first looks it up in the MTable. The lookup returns the entry, which has its MType flag set to 'G', identifying it as a gfunction. Jacobin then sets up a special frame for the gfunction (see runGmethod() in goFunctionExec.go) and executes it.
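The preload, lookup, and dispatch sequence can be sketched roughly as follows. This is a simplified illustration, not Jacobin's code: the map-based MT type, the definition of MData as an empty interface, the key "Example.answer()J", and the functions answer, preload, and invoke are all assumptions made for the example. Frame setup is omitted.

```go
package main

import "fmt"

// MData is assumed here to be an empty interface holding any method data.
type MData interface{}

// MTentry mirrors the struct shown earlier.
type MTentry struct {
	Meth  MData
	MType byte // 'G' = Go method, 'J' = Java method
}

type GMeth struct {
	ParamSlots   int
	GFunction    func([]interface{}) interface{}
	NeedsContext bool
}

// MT is a hypothetical stand-in for the MTable, keyed by method name.
type MT map[string]MTentry

// answer is a hypothetical gfunction used only for this sketch.
func answer(params []interface{}) interface{} { return int64(42) }

// preload registers gfunctions before any bytecode runs, so that a later
// lookup finds the Go version first.
func preload(mt MT) {
	mt["Example.answer()J"] = MTentry{
		Meth:  GMeth{ParamSlots: 0, GFunction: answer},
		MType: 'G',
	}
}

// invoke looks the method up and, if the entry is flagged 'G', calls it.
func invoke(mt MT, name string, params []interface{}) (interface{}, error) {
	entry, found := mt[name]
	if !found {
		return nil, fmt.Errorf("%s is not in the table", name)
	}
	if entry.MType != 'G' {
		return nil, fmt.Errorf("%s is not a gfunction", name)
	}
	g, ok := entry.Meth.(GMeth)
	if !ok {
		return nil, fmt.Errorf("%s has unexpected method data", name)
	}
	return g.GFunction(params), nil
}

func main() {
	mt := MT{}
	preload(mt)
	result, err := invoke(mt, "Example.answer()J", []interface{}{})
	fmt.Println(result, err) // prints 42 <nil>
}
```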

Getting execution-context data

Because gfunctions are native, they do not have the ability to access Jacobin's internal data items, unless those are specifically passed to them.

JVM frame stack data

If a gfunction requires access to context data, the boolean NeedsContext (which defaults to false) is set to true. When Jacobin is setting up the frame for execution of the gfunction, it checks this flag and, if it's set to true, appends an additional parameter to the array of passed parameters (see runGmethod() in goFunctionExec.go). The gfunction knows that the context data is there because it is aware of the flag.

The passed context data is a pointer to the frame stack for that particular thread. The presently executing function/method is always the one at the top of the stack (which is implemented in Jacobin as a list). In this way, the gfunction can access the frame data and also see which function/method called it.
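A rough sketch of how the context parameter might be appended and then read back is shown below. It assumes the frame stack is a *list.List from Go's container/list package with the top of the stack at the front; the frame type, newStack, callGfunction, and topFrameName are hypothetical names for this illustration only.

```go
package main

import (
	"container/list"
	"fmt"
)

// frame is a hypothetical, pared-down stand-in for a JVM frame.
type frame struct {
	methName string
}

type GMeth struct {
	ParamSlots   int
	GFunction    func([]interface{}) interface{}
	NeedsContext bool
}

// newStack builds a frame stack; the last name given ends up on top.
func newStack(names ...string) *list.List {
	fs := list.New()
	for _, n := range names {
		fs.PushFront(&frame{methName: n})
	}
	return fs
}

// callGfunction appends the frame stack as the final parameter
// only when the gfunction has asked for context.
func callGfunction(g GMeth, params []interface{}, fs *list.List) interface{} {
	if g.NeedsContext {
		params = append(params, fs)
	}
	return g.GFunction(params)
}

// topFrameName is a hypothetical gfunction registered with NeedsContext set,
// so it can rely on the last parameter being the frame stack.
func topFrameName(params []interface{}) interface{} {
	fs := params[len(params)-1].(*list.List)
	top := fs.Front()
	if top == nil {
		return ""
	}
	return top.Value.(*frame).methName
}

func main() {
	fs := newStack("main", "caller")
	g := GMeth{ParamSlots: 0, GFunction: topFrameName, NeedsContext: true}
	fmt.Println(callGfunction(g, []interface{}{}, fs)) // prints caller
}
```

Walking further down the list from the front would give the gfunction the rest of the call chain.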

Object reference

When functions are invoked by the INVOKEVIRTUAL and INVOKESPECIAL bytecodes, they are invoked as methods of a particular object. For example, the method java.lang.String.toUpperCase() is called on a String object, which is not passed to the method--you will note it is a method that takes no arguments. In Java, this method call would look like:

    String s = "hello, world!";
    String sUpper = s.toUpperCase();

If String.toUpperCase() is a native function (as it is in Jacobin JVM), then it needs to know what object it's being called on. This is done with the use of an object reference, which is a pointer to the instantiated class, here a String. This object reference is passed to the gfunction in params[0]. See case opcodes.INVOKEVIRTUAL: in jvm\run.go for an example of this.

Note that in the definition of the gfunction (see above), the ParamSlots variable contains the number of non-context parameters being passed. This number should match the number of arguments that the equivalent function in the JDK libraries uses.

Let's summarize this by showing the possible layouts of passed parameters to gfunctions:

  1. { p[n] | ... | p[1] | p[0] } Mostly from calls by INVOKESTATIC; here p[0] through p[n] are the actual data parameters.
  2. { p[n] | ... | p[1] | object ref } Calls by INVOKEVIRTUAL or INVOKESPECIAL, where p[0] is the object reference and p[1] through p[n] are data parameters.
  3. { context | p[n] | ... | p[1] | object ref } Like #2, but with the context data appended as the last element.
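One way to picture these layouts is a small helper that separates the three kinds of items in the parameter slice. The helper unpack and its two boolean arguments are hypothetical and written only for this sketch. In Jacobin each gfunction already knows which layout applies to it, so it would index the slice directly.

```go
package main

import "fmt"

// unpack splits a gfunction's parameter slice into its parts.
// hasObjRef says whether element 0 is an object reference (layouts 2 and 3);
// needsContext says whether the last element is context data (layout 3).
func unpack(params []interface{}, hasObjRef, needsContext bool) (objRef interface{}, data []interface{}, ctx interface{}) {
	if needsContext {
		ctx = params[len(params)-1]
		params = params[:len(params)-1]
	}
	if hasObjRef {
		objRef = params[0]
		params = params[1:]
	}
	return objRef, params, ctx
}

func main() {
	// Layout 1: data parameters only.
	obj, data, ctx := unpack([]interface{}{int64(1), int64(2)}, false, false)
	fmt.Println(obj, data, ctx)

	// Layout 2: object reference first, then data parameters.
	obj, data, ctx = unpack([]interface{}{"objRef", int64(1)}, true, false)
	fmt.Println(obj, data, ctx)

	// Layout 3: like layout 2, with context data as the last element.
	obj, data, ctx = unpack([]interface{}{"objRef", int64(1), "context"}, true, true)
	fmt.Println(obj, data, ctx)
}
```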

Exceptions in gfunctions

Very few gfunctions actually throw exceptions. These exceptions cannot be processed within the gfunction itself, because it lacks certain features required by exception handling, such as a constant pool. As a result, when an exception occurs, the gfunction returns a gfunction error block:

type GErrBlk struct {
	ExceptionType int
	ErrMsg        string
}

The gfunction frame is then popped from the stack. At this point, the frame of the function/method that called the gfunction becomes current and it throws the exception. The exception is identified by an integer, ExceptionType, which is an index into an array of possible exception types (see JVMexceptionNames []string in exceptions.go), and the error message, if any, is passed through ErrMsg. The error message as received by the calling method is divided into two parts: the first part is the name of the gfunction throwing the exception, and the second is the actual error message. These are separated by a colon (:). (Note that the error message can contain colons, but the first colon is always the separator between the function name and the actual error message.)
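The error path can be sketched as below. This is an illustration under several assumptions: the gfunction divide, the two-entry exceptionNames slice (standing in for JVMexceptionNames), the constant arithmeticException, and the helpers splitErrMsg and call are all hypothetical. Returning the block as a pointer is also an assumption. Splitting on the first colon uses strings.Cut from the Go standard library (Go 1.18 or later).

```go
package main

import (
	"fmt"
	"strings"
)

// GErrBlk mirrors the struct shown above.
type GErrBlk struct {
	ExceptionType int
	ErrMsg        string
}

// exceptionNames is a hypothetical stand-in for JVMexceptionNames.
var exceptionNames = []string{
	"java.lang.ArithmeticException",
	"java.lang.NullPointerException",
}

const arithmeticException = 0 // index into exceptionNames (assumed)

// divide is a hypothetical gfunction that reports division by zero
// by returning an error block instead of a result.
func divide(params []interface{}) interface{} {
	a := params[0].(int64)
	b := params[1].(int64)
	if b == 0 {
		return &GErrBlk{
			ExceptionType: arithmeticException,
			ErrMsg:        "divide: division by zero: b == 0",
		}
	}
	return a / b
}

// splitErrMsg separates the gfunction name from the message at the
// first colon only; later colons stay in the message.
func splitErrMsg(msg string) (funcName, detail string) {
	name, rest, found := strings.Cut(msg, ":")
	if !found {
		return "", msg
	}
	return strings.TrimSpace(name), strings.TrimSpace(rest)
}

// call plays the role of the calling frame: it inspects the result and
// turns an error block into an error carrying the exception name.
func call(g func([]interface{}) interface{}, params []interface{}) (interface{}, error) {
	switch r := g(params).(type) {
	case *GErrBlk:
		fn, detail := splitErrMsg(r.ErrMsg)
		return nil, fmt.Errorf("%s in %s: %s", exceptionNames[r.ExceptionType], fn, detail)
	default:
		return r, nil
	}
}

func main() {
	fmt.Println(call(divide, []interface{}{int64(6), int64(3)}))
	fmt.Println(call(divide, []interface{}{int64(1), int64(0)}))
}
```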