Unsafe Part 1: sun.misc.Unsafe Helper Classes


I recently came across the sun.misc.Unsafe class, a poorly documented internal API that gives your Java program direct access to the JVM’s memory. Accessing the JVM’s memory directly is, of course, unsafe, but it opens up some exciting opportunities.

You can use Unsafe to inspect and manipulate the layout of your objects in RAM, allocate memory off the heap, do interesting things with threads, or even hack in multiple inheritance. Several people have already written good articles about Unsafe itself, so we won’t cover the basics here.

Using Unsafe is not too difficult, but I found the need for a few helper methods, so I created a collection of classes wrapping the Unsafe code, starting with UnsafeHelper. The main methods of interest are getUnsafe(), sizeOf(), firstFieldOffset(), toByteArray() and hexDump(). The javadoc is the best place to look for documentation, but I’ll quickly explain their use.

To get a sun.misc.Unsafe instance, you have to extract it from a private static field within the sun.misc.Unsafe class. For ease, the UnsafeHelper.getUnsafe() method does that for you.
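
The extraction is usually done with reflection. Below is a minimal sketch, not the library's actual code: the class name UnsafeAccess is made up for illustration, and the field name "theUnsafe" is an assumption that holds on HotSpot/OpenJDK but may differ on other runtimes.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Hypothetical helper for illustration; not part of UnsafeHelper.
final class UnsafeAccess {
    private UnsafeAccess() {}

    /** Reads the private static singleton field via reflection. */
    static Unsafe getUnsafe() {
        try {
            // Assumption: the field is called "theUnsafe" (true on HotSpot/OpenJDK).
            Field field = Unsafe.class.getDeclaredField("theUnsafe");
            field.setAccessible(true);
            return (Unsafe) field.get(null); // static field, so no instance is needed
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not accessible", e);
        }
    }

    public static void main(String[] args) {
        Unsafe unsafe = getUnsafe();
        System.out.println("Got an Unsafe instance: " + (unsafe != null));
    }
}
```

Recent JDKs may print deprecation warnings when code like this is compiled or run, since sun.misc.Unsafe is an unsupported API.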

When accessing an object, you typically need to know the size of the object (in bytes), and be able to find the offset of individual fields. If you understand the memory layout the JVM uses, you’ll know there is a header in front of the object’s fields. Typically it looks like this, but it varies with CPU architecture, platform, JVM settings, etc:

| Bytes 0-7 | Bytes 8-11 | Bytes 12-15 |
|---|---|---|
| mark word (8) | klass pointer (4) | padding |

More information [here][6] and [here][7].

To hide some of the details, headerSize() returns the size of the header, and sizeOf() returns the total size of an object in bytes, including the header. firstFieldOffset() is then useful as it provides the offset to the first field. Note that headerSize() and firstFieldOffset() do not always return identical results, as padding (not part of the header) may be used to correctly align the first field.
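
To make the idea concrete, here is a rough sketch of how a first-field offset and a total size could be derived from Unsafe.objectFieldOffset(). It is not the library's implementation: the class LayoutSketch and its helpers are hypothetical, it assumes 8-byte object alignment (the HotSpot default), and it handles neither arrays nor classes without instance fields.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import sun.misc.Unsafe;

// Hypothetical sketch; names and approach are assumptions, not UnsafeHelper's code.
final class LayoutSketch {
    private static final Unsafe UNSAFE = loadUnsafe();
    private static final long OBJECT_ALIGNMENT = 8; // assumed HotSpot default

    private static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Smallest offset of any instance field, including inherited ones. */
    static long firstFieldOffset(Class<?> type) {
        long min = Long.MAX_VALUE;
        for (Class<?> c = type; c != null; c = c.getSuperclass()) {
            for (Field f : c.getDeclaredFields()) {
                if (!Modifier.isStatic(f.getModifiers())) {
                    min = Math.min(min, UNSAFE.objectFieldOffset(f));
                }
            }
        }
        if (min == Long.MAX_VALUE) {
            throw new IllegalArgumentException("No instance fields in " + type);
        }
        return min;
    }

    /** End of the last field, rounded up to the assumed object alignment. */
    static long sizeOf(Class<?> type) {
        long end = -1;
        for (Class<?> c = type; c != null; c = c.getSuperclass()) {
            for (Field f : c.getDeclaredFields()) {
                if (!Modifier.isStatic(f.getModifiers())) {
                    end = Math.max(end, UNSAFE.objectFieldOffset(f) + fieldSize(f.getType()));
                }
            }
        }
        if (end < 0) {
            throw new IllegalArgumentException("No instance fields in " + type);
        }
        return (end + OBJECT_ALIGNMENT - 1) / OBJECT_ALIGNMENT * OBJECT_ALIGNMENT;
    }

    private static long fieldSize(Class<?> t) {
        if (t == long.class || t == double.class) return 8;
        if (t == int.class || t == float.class) return 4;
        if (t == short.class || t == char.class) return 2;
        if (t == byte.class || t == boolean.class) return 1;
        // References: the element size of an Object[] equals the reference size.
        return UNSAFE.arrayIndexScale(Object[].class);
    }

    static class Class4 { int i = 0x01234567; }
    static class Class8 { long l = 0x0123456789ABCDEFL; }

    public static void main(String[] args) {
        // The printed values depend on the JVM and its settings.
        System.out.println("Class4: first field at " + firstFieldOffset(Class4.class)
                + ", size " + sizeOf(Class4.class));
        System.out.println("Class8: first field at " + firstFieldOffset(Class8.class)
                + ", size " + sizeOf(Class8.class));
    }
}
```

Working from field offsets rather than a hard-coded header size means the sketch picks up any alignment padding the JVM inserts before the first field.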

Next, toByteArray() takes an object and copies it (and its header) into a byte array, which is useful for inspecting and serialising the object. Finally, hexDump() uses toByteArray() to grab an object and print a hex representation of its memory, for example:

/**
 * hexDump(new Class4()) prints:
 * 0x00000000: 01 00 00 00 00 00 00 00  8A BF 62 DF 67 45 23 01
 */
static class Class4 {
    int i = 0x01234567;
}

/**
 * Longs are 8-byte aligned, so 4 bytes of padding follow the header.
 * hexDump(new Class8()) prints:
 * 0x00000000: 01 00 00 00 00 00 00 00  9B 81 61 DF 00 00 00 00
 * 0x00000010: EF CD AB 89 67 45 23 01
 */
static class Class8 {
    long l = 0x0123456789ABCDEFL;
}
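
As a rough idea of how dumps like these can be produced, the sketch below copies an object's bytes with Unsafe.getByte(Object, long) and formats them in the same style as the output above. The class HexDumpSketch and its method signatures are hypothetical and are not taken from UnsafeHelper; in particular, the caller must supply a length no larger than the object's real size.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Hypothetical sketch; not the library's implementation.
final class HexDumpSketch {
    private static final Unsafe UNSAFE = loadUnsafe();

    private static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Copies the first {@code length} bytes of the object, header included. */
    static byte[] toByteArray(Object obj, int length) {
        byte[] bytes = new byte[length];
        for (int i = 0; i < length; i++) {
            // Offsets are relative to the start of the object, so 0 is the header.
            bytes[i] = UNSAFE.getByte(obj, (long) i);
        }
        return bytes;
    }

    /** Formats 16 bytes per row, with an extra space after the eighth byte. */
    static String hexDump(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (int row = 0; row < bytes.length; row += 16) {
            sb.append(String.format("0x%08X:", row));
            for (int i = row; i < Math.min(row + 16, bytes.length); i++) {
                sb.append(i - row == 8 ? "  " : " ");
                sb.append(String.format("%02X", bytes[i] & 0xFF));
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Dump only the first 8 bytes of a plain Object; the content varies per run.
        System.out.print(hexDump(toByteArray(new Object(), 8)));
    }
}
```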

In the Class4 example, a simple class with a single int field takes up 16 bytes of memory: the first 8 bytes are the mark word used by the JVM, the next 4 bytes are a class pointer (basically how the object knows what class it is), and the last 4 hold the value of the field, stored little-endian. The Class8 example shows a similar header, but with bytes 12-15 used as padding so that the long field value is 8-byte aligned, making the object 24 bytes in total.
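
To check that reading of the bytes, the dumps can be decoded with a little-endian ByteBuffer. This is only an illustration: DumpDecoder and its methods are made-up names, and little-endian order is an assumption that matches the dumps shown here (typical for x86).

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper for decoding the hex dumps shown in this article.
final class DumpDecoder {
    /** Parses whitespace-separated hex byte pairs such as "01 00 8A". */
    static byte[] parseHex(String hex) {
        String[] parts = hex.trim().split("\\s+");
        byte[] out = new byte[parts.length];
        for (int i = 0; i < parts.length; i++) {
            out[i] = (byte) Integer.parseInt(parts[i], 16);
        }
        return out;
    }

    static int intAt(byte[] dump, int offset) {
        return ByteBuffer.wrap(dump).order(ByteOrder.LITTLE_ENDIAN).getInt(offset);
    }

    static long longAt(byte[] dump, int offset) {
        return ByteBuffer.wrap(dump).order(ByteOrder.LITTLE_ENDIAN).getLong(offset);
    }

    public static void main(String[] args) {
        byte[] class4 = parseHex("01 00 00 00 00 00 00 00  8A BF 62 DF 67 45 23 01");
        byte[] class8 = parseHex("01 00 00 00 00 00 00 00  9B 81 61 DF 00 00 00 00"
                + " EF CD AB 89 67 45 23 01");

        // The int field sits at offset 12, right after the 12-byte header.
        System.out.println(String.format("0x%08X", intAt(class4, 12)));   // 0x01234567
        // The long field sits at offset 16, after 4 bytes of padding.
        System.out.println(String.format("0x%016X", longAt(class8, 16))); // 0x0123456789ABCDEF
    }
}
```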

These helper methods are available in a new project on GitHub, and downloadable via Maven. Just download the jar file, or include a Maven dependency, and import net.bramp.unsafe.UnsafeHelper.

<dependency>
    <groupId>net.bramp.unsafe</groupId>
    <artifactId>unsafe-helper</artifactId>
    <version>1.0</version>
</dependency>

In the next article, we’ll use this new UnsafeHelper to build a special List which copies objects instead of storing references.