The proliferation of Java Garbage Collection log standards
Garbage Collection logging is not standardized. And as time goes on, differing “standards” proliferate, leading to further fragmentation within the field. How can we fix this? In this article, Ram Lakshmanan goes over the latest attempt at standardization, the Unified JVM Logging framework.
As most of you know, garbage collection (GC) logging is not standardized. The GC log format varies by JVM vendor, Java version, GC algorithm, and the GC-related JVM options that you pass. Logs differ from Oracle to IBM, Java 8 to Java 9, Serial to Shenandoah, and even based on technical considerations. Across these permutations and combinations, there are already over 40 different garbage collection log formats.
Challenges of disparate garbage collection standards
If you have developed a script that parses GC logs to extract certain statistics, trigger alerts when GC time exceeds a certain threshold, or monitor for repeated Full GC events, that script must be customized to cater to each of the different GC log formats.
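To make the cost of this customization concrete, here is a minimal sketch of a pause-time check that has to carry one pattern per log format. The class name `GcPauseAlert`, the 10 ms threshold, and the two sample lines are assumptions for illustration: the lines are hand-written in the style of a Java 8 `-XX:+PrintGC` entry and a Java 9 unified logging entry, not captured from a real run, and the patterns cover only these two shapes.

```java
import java.util.List;
import java.util.OptionalDouble;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: extracts a GC pause from two of the many log formats.
final class GcPauseAlert {
    // Java 8 style: pause reported in seconds, e.g. ", 0.0052341 secs]".
    private static final Pattern LEGACY_SECS =
            Pattern.compile(", (\\d+\\.\\d+) secs\\]");
    // Java 9+ unified style: pause reported in milliseconds at the end of the line.
    private static final Pattern UNIFIED_MS =
            Pattern.compile("(\\d+\\.\\d+)ms\\s*$");

    /** Returns the pause in milliseconds, or empty if neither pattern matches. */
    static OptionalDouble pauseMillis(String line) {
        Matcher legacy = LEGACY_SECS.matcher(line);
        if (legacy.find()) {
            return OptionalDouble.of(Double.parseDouble(legacy.group(1)) * 1000.0);
        }
        Matcher unified = UNIFIED_MS.matcher(line);
        if (unified.find()) {
            return OptionalDouble.of(Double.parseDouble(unified.group(1)));
        }
        return OptionalDouble.empty();
    }

    /** True when the line reports a pause longer than the given threshold. */
    static boolean exceeds(String line, double thresholdMillis) {
        OptionalDouble pause = pauseMillis(line);
        return pause.isPresent() && pause.getAsDouble() > thresholdMillis;
    }

    public static void main(String[] args) {
        // Illustrative lines only; real output depends on vendor, version, collector, and flags.
        List<String> lines = List.of(
                "[GC (Allocation Failure)  33280K->5126K(125952K), 0.0052341 secs]",
                "[0.123s][info][gc] GC(0) Pause Young (G1 Evacuation Pause) 24M->4M(256M) 12.500ms");
        for (String line : lines) {
            System.out.println((exceeds(line, 10.0) ? "ALERT " : "ok    ") + line);
        }
    }
}
```

Every additional format means another pattern and another unit conversion in `pauseMillis`, which is exactly the maintenance burden described above.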
Since there are so many different GC log formats, it’s hard to understand each one of the formats and interpret the results effectively. Most formats don’t have any documentation or literature. Of course, this point can be counter-argued: you can use sophisticated GC log analysis tools such as GCeasy or HPJmeter.
Unification or proliferation?
When we heard the news that GC logging was being re-implemented on top of the ‘Unified JVM logging framework (JEP 158)’ in Java 9, we were thrilled. We thought that all of these many different garbage collection logs would be consolidated into one standard format. Life would have become so much easier. Unfortunately, that’s not what happened. Instead, it ended up creating even more log formats! :-C
The Unified JVM Logging framework’s goal is to introduce a common logging system for all components of the JVM. Specifically, it’s meant to unify components like the compiler, garbage collection, classloader, metaspace, svc, jfr, and more. So, how does it work? In each log line, the following information is printed in addition to the information already present in the older GC logs:
- Component name – compiler, GC, classloader, metaspace, svc, jfr
- Log Level – trace, debug, info, warning, error
- Decorations – time, uptime, timemillis, uptimemillis, timenanos, uptimenanos, pid, tid
On top of these additional parameters, the format of the garbage collection log statements has also changed. Here’s a comparison between Java 8 and Java 9:
If you look closely at Figures 1, 2, 3, and 4, you can see clear differences in how the garbage collection log format changed between Java 8 and Java 9.
But again, the Unified garbage collection logging framework does little to simplify the already complicated GC log format space.
It’s easy to complain about and criticize any implementation. However, we understand and respect the mandate the Oracle engineering team received to migrate to the JVM unified logging framework. As it’s a JVM-wide initiative, GC logging was folded into this complicated situation. My personal solution to this problem is to leverage tools like GCeasy or HPJmeter, which can parse most GC log formats. But in the end, only the community at large can simplify this issue by choosing one standardized approach.