
How Fields are Handled in the JVM

Andrew Binstock edited this page Dec 31, 2023 · 7 revisions

Overview

Fields in Java can be divided into roughly three categories:

  • local variables (these occur and live inside a method),
  • instance variables (fields that exist inside an instantiated class, but are not local to a single method),
  • class variables (that is, static fields).

Each of the three types is handled differently by Jacobin and most JVMs.
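
To make the three categories concrete, here is a minimal Java sketch. The class Circle and its members are hypothetical names chosen only for illustration; the comments label which category each variable falls into.

```java
// Hypothetical class used only to label the three kinds of variables.
class Circle {
    // Class variable: declared static, one copy shared by the whole class.
    static final double PI_APPROX = 3.14159;

    // Instance variable: every Circle object carries its own copy.
    double radius;

    Circle(double radius) {
        this.radius = radius;
    }

    double area() {
        // Local variable: exists only while this call to area() is running.
        double radiusSquared = radius * radius;
        return PI_APPROX * radiusSquared;
    }
}
```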

It's worth noting that the JVM specification says very little about how objects are implemented and how their fields are handled. As long as the required operational features are delivered, the JVM has wide latitude in how it implements fields. However, most JVMs roughly follow the approaches described here.

Fields contain two types of values:

  • primitives (such as ints, longs, doubles, etc.)
  • objects (which are allocated on the heap; the field holds a reference to the object)

Among objects, there is an additional division: regular objects and arrays. Arrays are objects in all the usual ways except that they are created using a special set of instructions.
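
As a small illustration (a sketch with hypothetical class and method names), the following Java shows an array behaving like any other object even though it is created by dedicated bytecodes. The instruction names in the comments are the ones javac normally emits for these expressions.

```java
// Hypothetical demo class; only the array behavior matters here.
class ArrayIsObject {
    static boolean check() {
        int[] numbers = new int[3];       // primitive array: created by newarray
        String[] names = new String[2];   // reference array: created by anewarray

        // An array can be stored in an Object reference like any other object.
        Object asObject = numbers;

        // It has a class, and that class reports itself as an array type.
        return asObject.getClass().isArray()
                && names instanceof Object
                && numbers.length == 3;
    }
}
```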

Let's look at these various field types in greater detail.

Local variables

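Local variables are local to a method. They are created in the method's stack frame (in the JVM specification, in the frame's local variable array, which is separate from the operand stack used for intermediate results). So, if your method declares int i, then the JVM reserves one slot in the frame, which is where the int will be stored; a long or a double takes two slots. When the method exits, the frame's memory is reclaimed by the JVM and the local variable i ceases to exist.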

If the local variable is a new object, then the object is created on the heap and the stack contains a pointer to the new object, rather than the object itself. (This is not universally true. Some optimizations move the fields of the object onto the local stack, essentially inlining the object's fields.) When the method exits, the pointer to the object is destroyed. If no other pointers to the object exist, the object's memory will be reclaimed by garbage collection.

Creation of local variables is very fast. The amount of memory to allocate on the stack for them is known at compile time, so the entries are created when the method is invoked. No special bytecode instructions are needed for them to be created. (An implementation may zero the entries, but Java code never observes a default value in a local, because the compiler requires each local to be assigned before it is read.)
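
The lifetime rules for locals can be sketched in Java as follows. LocalsDemo and its kept list are hypothetical names; the list exists only to hold a second reference so that one object outlives the method that created it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class illustrating the lifetime of local variables.
class LocalsDemo {
    // Holds extra references so that some objects survive the call that made them.
    static final List<StringBuilder> kept = new ArrayList<>();

    static int sum(int a, int b) {
        int total = a + b;   // primitive local: the value itself lives in the frame
        return total;        // the frame, and total with it, is discarded on return
    }

    static void build(boolean keep) {
        // The object is on the heap; the local sb holds only a reference to it.
        StringBuilder sb = new StringBuilder("hello");
        if (keep) {
            kept.add(sb);    // a second reference that outlives this call
        }
        // On return, sb disappears. The object becomes eligible for garbage
        // collection unless another reference (here, in kept) still points to it.
    }
}
```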

Instance fields

Instance variables are variables declared inside a class, but not local to a single method. These are created when a class is instantiated. Let's look at this process. When a class is instantiated, the class definition is fetched from the .class file, which has been loaded (that is, parsed, validated, and made available to the program). The class definition contains a table of sorts called the constant pool, which holds a wealth of details, including the names and types of all the non-local fields (both static and instance fields). From this data, the NEW bytecode allocates a structure containing the instance fields and references to the static fields. (I'll get into the static fields in the next section.)

The instantiated class consists of a pointer to the struct of fields and a pointer to the class's constant pool. In most JVM implementations, including Jacobin, the instantiated object also contains a set of metadata fields, which are implementation-defined and typically not accessible to the developer. These fields include instantiation-specific data and items like a monitor (used for locking access to an object in concurrent operations).

In an unoptimized situation, accessing an instance field requires several look-ups. The field is referred to in the bytecode instructions by the number of its entry in the constant pool. From that entry, you can get its name. With a pointer to the instantiated object and the name of the field, the JVM looks up the field in the instantiated object and performs some operation on it. The bytecodes for this are GETFIELD to get the value of the field and PUTFIELD to change the value of the field. In Jacobin, this field access uses a map that maps the name of the field (a string) to a pointer to the field. In optimized accesses, many of these dereference steps are eliminated.
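
The unoptimized look-up chain described above (constant pool entry, then field name, then field) can be sketched roughly as follows. This is a simplified model in Java with hypothetical names (FieldLookupSketch, Obj, Field, getField, putField); it is not Jacobin's actual code or data layout, and the constant pool is reduced to a plain number-to-name map.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of name-based field access.
class FieldLookupSketch {
    // One field: a type descriptor ("I" = int, "D" = double) and its value.
    static final class Field {
        final String type;
        Object value;
        Field(String type, Object value) {
            this.type = type;
            this.value = value;
        }
    }

    // Stand-in for an instantiated object: field name -> field.
    static final class Obj {
        final Map<String, Field> fields = new HashMap<>();
    }

    // Stand-in for the constant pool: entry number -> field name.
    static final Map<Integer, String> constantPool = new HashMap<>();

    // Shared look-up: constant pool entry -> name -> field in the object.
    private static Field find(Obj obj, int cpIndex) {
        String name = constantPool.get(cpIndex);
        Field field = (name == null) ? null : obj.fields.get(name);
        if (field == null) {
            throw new IllegalStateException("no field for constant pool entry " + cpIndex);
        }
        return field;
    }

    // GETFIELD-like operation: read the field's current value.
    static Object getField(Obj obj, int cpIndex) {
        return find(obj, cpIndex).value;
    }

    // PUTFIELD-like operation: overwrite the field's value.
    static void putField(Obj obj, int cpIndex, Object value) {
        find(obj, cpIndex).value = value;
    }
}
```

Each access here costs two map look-ups, which is the kind of overhead an optimized implementation would remove, for example by resolving the name to a fixed offset once.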

Class fields

Class fields are static fields. That is, they are created via the static keyword. They are treated uniquely inside the JVM. If a class definition has a static field in it, all instances of that class share one instance of that variable. For example, the Java String class contains the following line:

static final boolean COMPACT_STRINGS;

(This boolean tells the JVM whether Java Strings are compact strings. For many releases now, the default value is true.) All instances of Java Strings refer to this single boolean value. If it were not final, any String could change it and it would be changed for all String instances. Static variables are accessed using the GETSTATIC and PUTSTATIC bytecodes.
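
The sharing behavior is easy to see with a non-final static field. In this sketch, Ticket and its members are hypothetical names; the point is only that every instance reads and writes the same issued field while each has its own number.

```java
// Hypothetical class showing one static field shared by all instances.
class Ticket {
    static int issued = 0;   // one copy for the class (GETSTATIC / PUTSTATIC)
    final int number;        // one copy per Ticket (GETFIELD / PUTFIELD)

    Ticket() {
        issued++;            // every constructor call updates the same shared field
        number = issued;     // each instance records the value it saw
    }
}
```

Creating two Ticket objects in a row raises issued by two, and the second object's number is one greater than the first's.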

Initializing class variables can involve tricky timing. For example, consider the static constant PI, which is a common math value (roughly 3.14159) that is a static double in java.lang.Math. If the first line of your main() method accesses PI, the constant must be initialized prior to that first access. The JVM does this by moving initialization of class variables into a hidden method, called <clinit>. This method is run at class initialization, which happens on the first use of the class, before any other code in the class, including constructors. You can add code to this method by putting it in a static initialization block, which is guaranteed to run before any other code in the class. However, because it runs without any instance (there is no this), it can directly access only static variables, which exist from the time the class is initialized. (Note this is not the only way static fields are initialized. The bytecode of the class can sometimes use a field attribute entitled ConstantValue to initialize the value at the time of class initialization.)
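
The ordering can be observed from ordinary Java. In the sketch below, InitOrder, Widget, and log are hypothetical names; the static block is the code that the compiler places in Widget's <clinit> method, and the log records when it runs relative to the constructors.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo of when a static initialization block runs.
class InitOrder {
    // Event log used only to record the order in which things happen.
    static final List<String> log = new ArrayList<>();

    static class Widget {
        static int created;

        static {
            // Compiled into Widget.<clinit>: runs once, ahead of the first constructor.
            InitOrder.log.add("static block");
        }

        Widget() {
            InitOrder.log.add("constructor");
            created++;
        }
    }

    static List<String> run() {
        log.add("before first use");
        new Widget();   // first use: the static block runs, then the constructor
        new Widget();   // later instances run only the constructor
        return log;
    }
}
```

On a first call, run() records "before first use", "static block", "constructor", "constructor", in that order: the static block runs exactly once and ahead of the first constructor.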

Because class fields are uniquely named and JVM-wide, in Jacobin we store all of them in a map, which maps the name as a string to a struct with two fields: one defining the type of the variable, the other its value (see statics.go). So, for example, PI would be:

java/lang/Math.PI --> "D", 3.141592653589793

When Jacobin instantiates a class with a static variable in it, we prefix the field's type with the letter X. This flags the field as static and tells Jacobin to look up its value in the statics table. In this way, there exists only a single instance of the field across all threads in the JVM.
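
The idea of a JVM-wide statics table with an X-prefixed type can be sketched as follows. This is an illustration in Java under stated assumptions, not Jacobin's implementation (which lives in statics.go and may differ): StaticsTableSketch, Static, addStatic, and resolve are hypothetical names, and the key is assumed to be the class name, a dot, and the field name, following the PI example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of a JVM-wide table of static fields.
class StaticsTableSketch {
    // One entry: the type descriptor (e.g. "D" for double) and the value.
    static final class Static {
        final String type;
        final Object value;
        Static(String type, Object value) {
            this.type = type;
            this.value = value;
        }
    }

    // A single table for the whole JVM; a concurrent map so all threads can share it.
    static final Map<String, Static> statics = new ConcurrentHashMap<>();

    static void addStatic(String className, String fieldName, String type, Object value) {
        statics.put(className + "." + fieldName, new Static(type, value));
    }

    // If the field's type starts with 'X', the field is static, so its value comes
    // from the table; otherwise the value stored in the instance is used.
    static Object resolve(String className, String fieldName,
                          String fieldType, Object instanceValue) {
        if (!fieldType.startsWith("X")) {
            return instanceValue;
        }
        Static entry = statics.get(className + "." + fieldName);
        if (entry == null) {
            throw new IllegalStateException("unknown static: " + className + "." + fieldName);
        }
        return entry.value;
    }
}
```

Because an instance stores only the X marker rather than its own copy of the value, every instance and every thread resolves to the same table entry.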