Java’s Unsafe class: “There is certainly also a way without it!”
The sun.misc.Unsafe API, which is used by many libraries and frameworks, will no longer be accessible by default in Java 9. With the debate still brewing, we asked Lucene developer Uwe Schindler for his opinion on this controversial decision.
JAXenter: A discussion is happening on social media that Java 9 may ship without sun.misc.Unsafe. First of all: What’s behind sun.misc.Unsafe?
Uwe Schindler: Unsafe is a private, internal class of the Oracle JDK / OpenJDK platform. It’s used behind the scenes in a lot of public Java APIs to implement operations which are otherwise only available with native C or assembly code. It’s mainly used for the following tasks:
- Atomic compare-and-swap operations: Java bytecode does not have an instruction that can do this natively, so implementations from the package java.util.concurrent like AtomicInteger, LongAdder, or ConcurrentHashMap use Unsafe behind the scenes.
- Volatile array access: There is also no corresponding Java bytecode for this – you can only declare the reference to the array as volatile, but not its elements.
- Direct access to off-heap memory: Unsafe provides something like malloc() – as known from the C programming language. In addition, it provides methods to access allocated memory (which is outside of the main Java heap) by its absolute address. This is similar to pointers in C. This functionality is mainly used by direct buffers in the Java NIO API and to provide access to memory-mapped files.
- Some useful constants about the internal structure of object instances. These constants are usually required to call one of the above methods, but they are also useful to make runtime estimations on how much memory object instances require on Java’s heap.
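As a rough illustration of the first two items in the list above, the sketch below uses only public java.util.concurrent.atomic classes to perform a compare-and-swap and a per-element volatile array access. The class name PublicAtomicsSketch and its helper methods are hypothetical and exist only for this example. How these classes are implemented internally is not shown here.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicIntegerArray;

// Hypothetical demo class; names and values are illustrative only.
final class PublicAtomicsSketch {

    // Compare-and-swap through the public API: stores newValue only if the
    // counter still holds expected, and reports whether the swap happened.
    static boolean swapIfUnchanged(AtomicInteger counter, int expected, int newValue) {
        return counter.compareAndSet(expected, newValue);
    }

    // Per-element volatile semantics: get/set on an AtomicIntegerArray behave
    // like reads and writes of a volatile int for each individual slot.
    static int writeThenRead(AtomicIntegerArray slots, int index, int value) {
        slots.set(index, value);
        return slots.get(index);
    }

    public static void main(String[] args) {
        AtomicInteger counter = new AtomicInteger(5);
        System.out.println(swapIfUnchanged(counter, 5, 6)); // expectation matches, swap succeeds
        System.out.println(swapIfUnchanged(counter, 5, 7)); // stale expectation, no swap
        System.out.println(counter.get());

        AtomicIntegerArray slots = new AtomicIntegerArray(4);
        System.out.println(writeThenRead(slots, 2, 42));
    }
}
```

Application code written this way does not need to touch Unsafe directly, whatever the JDK does underneath.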
Why should sun.misc.Unsafe be removed?
The problem with Unsafe is very easy to explain: The methods in this class can be used to access memory inside and outside of Java’s heap with direct pointers. As there are no checks, they could be used to crash the JVM (SIGSEGV) or to inject the platform’s assembly code into the JVM! This is generally “unsafe” (as the name of the class suggests). Java code should use the “official” APIs provided by the JDK instead: AtomicInteger, ConcurrentHashMap, ByteBuffer, and so on.
Officially, the JDK restricts access to sun.misc.Unsafe because the static factory method getUnsafe() is caller-sensitive: It can only be called from inside the JRE’s own classes (which are loaded by the bootstrap class loader), but not from application code. However, clever developers found a way to get an instance of Unsafe through Java reflection: they read the private field of Unsafe that holds the singleton instance, which works as long as the Java SecurityManager does not prevent it.
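For readers who have not seen the reflection trick described above, here is a minimal sketch of what it typically looks like. It is shown only to illustrate the point, not as a recommendation. It assumes the field name theUnsafe used by OpenJDK-based runtimes; the class name UnsafeReflectionSketch and the method tryGetUnsafe are hypothetical. The sketch returns null whenever the class or field is missing or access is denied.

```java
import java.lang.reflect.Field;

// Hypothetical demo class illustrating the reflection workaround.
final class UnsafeReflectionSketch {

    // Tries to read the private static singleton field of sun.misc.Unsafe.
    // Returns the instance as a plain Object, or null if the class or field
    // does not exist or the runtime refuses access.
    static Object tryGetUnsafe() {
        try {
            Class<?> unsafeClass = Class.forName("sun.misc.Unsafe");
            // Assumption: the singleton lives in a field named "theUnsafe",
            // as in OpenJDK; other runtimes may name it differently.
            Field field = unsafeClass.getDeclaredField("theUnsafe");
            field.setAccessible(true); // may be blocked by a security policy
            return field.get(null);    // static field, so no receiver object
        } catch (ReflectiveOperationException | RuntimeException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        Object unsafe = tryGetUnsafe();
        System.out.println(unsafe != null
                ? "Got an instance of " + unsafe.getClass().getName()
                : "Unsafe is not reachable from this code");
    }
}
```

Because the lookup goes through Class.forName() and a private field, code like this depends entirely on JDK internals and breaks as soon as either is changed or hidden.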
This is where Java 9 tightens the restrictions: As a result of the new module concept, sun.misc.Unsafe is removed from “java.base” and pushed into an internal module. Application code cannot access classes in these internal modules, let alone instantiate them. In effect, sun.misc.Unsafe no longer exists outside the JDK and is completely isolated by the new module system. Class.forName() will therefore throw a ClassNotFoundException. This really prevents any access from application code!
What would removing sun.misc.Unsafe mean for Apache Lucene and Solr?
This would have no impact on Apache Lucene, because our code guidelines prohibit the use of sun.misc.Unsafe. We are a library that is used by many projects, and we cannot compromise the security of our users’ programs. In addition, relying on undocumented APIs like sun.misc.Unsafe would make our code sensitive to changes in the platform’s internal APIs. Programmers who have used Unsafe in the past may realise, when they look at the source code of Apache Lucene, that there is certainly also a way without it!
Apache Solr may eventually have problems, because there are numerous third-party libraries for which we do not know whether they use Unsafe. One of them was replaced recently.
Detractors fear that many mainstream applications will stop working after removing sun.misc.Unsafe. Are they right?
This is, in fact, a correct statement! There are numerous libraries, e.g. Netty, which use Unsafe internally (using the previously described reflection hacks). In lots of cases this isn’t even documented! Some projects also use special libraries as a replacement for features that only appear in later Java versions (javax packages). One example is LongAdder, which only became part of java.util.concurrent with Java 8. Those projects should be updated to require Java 8 and remove the obsolete external library.
What do you think about the proposal to reimplement parts of sun.misc.Unsafe and add it as a public API to the Java specification?
There are actually some parts of sun.misc.Unsafe which are not really “unsafe”. For example, all the constants or the compare-and-swap primitives can quite safely be used if they are properly encapsulated by a public API. However, direct access to absolute memory addresses (as in C code) should really be avoided, as that would completely undermine the whole security of the Java platform. As you can see in the code of Apache Lucene’s ByteBufferIndexInput, one can use ByteBuffer.allocateDirect() and implement off-heap memory access using public APIs in a safe way, but with a small runtime overhead through bounds checks. However, optimisations in the HotSpot VM have recently removed a lot of those slowdowns! A new and safe API as a replacement for ByteBuffer with final bounds and 64-bit access would improve that even more.
Therefore, the developers of these libraries should consider this and possibly release updates to their products that do not require Unsafe. In my opinion, this should be manageable by the release of Java 9. But there will certainly be a small “gap” in the development process, where there may be problems. Because of that, the OpenJDK developers responsible for Java 9 have presented a “command-line switch” as a workaround which makes sun.misc.Unsafe visible again for legacy code.
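The safe off-heap access via ByteBuffer.allocateDirect() mentioned in the answer above might look roughly like the following sketch. It is not taken from Lucene’s ByteBufferIndexInput; the class name DirectBufferSketch and its helper methods are hypothetical. It only shows that an out-of-range offset is rejected by a bounds check instead of touching foreign memory.

```java
import java.nio.ByteBuffer;

// Hypothetical demo class; not related to Lucene's actual implementation.
final class DirectBufferSketch {

    // Allocates memory outside the Java heap through the public NIO API.
    static ByteBuffer allocateOffHeap(int sizeInBytes) {
        return ByteBuffer.allocateDirect(sizeInBytes);
    }

    // Attempts an absolute 8-byte read at the given offset. The buffer's
    // bounds check turns an invalid offset into an exception, which this
    // helper reports as false.
    static boolean isReadable(ByteBuffer buffer, int offset) {
        try {
            buffer.getLong(offset);
            return true;
        } catch (IndexOutOfBoundsException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        ByteBuffer buffer = allocateOffHeap(16);
        buffer.putLong(8, 0xCAFEBABEL);          // absolute write at byte offset 8
        System.out.println(buffer.isDirect());
        System.out.println(Long.toHexString(buffer.getLong(8)));
        System.out.println(isReadable(buffer, 8));  // bytes 8..15 fit into 16 bytes
        System.out.println(isReadable(buffer, 12)); // bytes 12..19 would exceed the buffer
    }
}
```

Note that ByteBuffer offsets are of type int, which is the size limit that the wish for 64-bit access refers to.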