JEP draft: Throughput post-write barrier for G1
Java Platform Team software engineer Man Cao has published a new JEP draft proposing to improve the performance of the G1 garbage collector when concurrent refinement is disabled. He proposes to do this by introducing a simplified post-write barrier. Let’s take a closer look at what could be the future of Java.
Cao’s JEP draft aims to improve the throughput of the G1 garbage collector when concurrent refinement is disabled, and to reduce overall CPU usage. The motivation is that G1’s post-write barrier is currently more complicated than that of traditional garbage collectors such as Parallel and Concurrent Mark Sweep (CMS). This is because G1 supports concurrent refinement, which needs the post-barrier to add dirty cards to a per-thread dirty card queue and ensure proper memory visibility. As a result, G1’s post-barrier has a noticeable overhead on throughput and CPU usage in Java code.
Cao writes, “concurrent refinement builds remembered sets concurrently, in order to reduce card scanning work during a collection pause. However, concurrent refinement offers limited benefit for certain types of workload. Examples include throughput-oriented workload and workload that is tuned to minimize old-generation collections. For these cases, G1 could perform better with a simplified post-barrier and disabling concurrent refinement.”
What would change?
The Java enhancement proposal draft would introduce a throughput post-write barrier for G1. So if concurrent refinement has been disabled via -XX:G1ConcRefinementThreads=0, the compilers and interpreter would issue a simplified post-barrier for a write p.f = q:
    if (p and q in same region) -> exit
    if (q is NULL) -> exit
    if (*card(p) == DIRTY) -> exit
    *card(p) = DIRTY
To ensure correctness, G1 would scan the dirty cards mapped to regions outside the collection set, as well as the remembered sets for regions in the collection set.
Overall, this would reduce both the work involved in handling a dirty card and the compilation work for JIT compilers. Furthermore, smaller remembered sets and the absence of per-thread dirty card queues would decrease the memory footprint.
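The proposed barrier can be modelled in a few lines of Java. This is only a sketch of the pseudocode, not HotSpot code: addresses are plain long values, and the card size (512 bytes), the region size (1 MiB), the CLEAN and DIRTY encodings and all class and method names are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Toy model of the proposed throughput post-write barrier.
class ThroughputBarrierSketch {
    static final int CARD_SHIFT = 9;     // assumed 512-byte cards
    static final int REGION_SHIFT = 20;  // assumed 1 MiB regions
    static final byte CLEAN = 0;         // encodings are arbitrary here
    static final byte DIRTY = 1;
    static final long NULL = 0L;

    final byte[] cardTable;              // all entries start as CLEAN
    int cardStores = 0;                  // number of writes to the card table

    ThroughputBarrierSketch(long heapBytes) {
        cardTable = new byte[(int) (heapBytes >>> CARD_SHIFT)];
    }

    // Runs after a reference store p.f = q, where p and q are addresses.
    void postWrite(long p, long q) {
        if ((p >>> REGION_SHIFT) == (q >>> REGION_SHIFT)) return; // same region
        if (q == NULL) return;                                    // null store
        int card = (int) (p >>> CARD_SHIFT);
        if (cardTable[card] == DIRTY) return;                     // already dirty
        cardTable[card] = DIRTY;
        cardStores++;
    }

    // Dirty cards in regions outside the collection set, i.e. the cards
    // that would have to be scanned.
    List<Integer> dirtyCardsOutside(Set<Long> collectionSetRegions) {
        List<Integer> result = new ArrayList<>();
        for (int card = 0; card < cardTable.length; card++) {
            long region = ((long) card << CARD_SHIFT) >>> REGION_SHIFT;
            if (cardTable[card] == DIRTY && !collectionSetRegions.contains(region)) {
                result.add(card);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        ThroughputBarrierSketch heap = new ThroughputBarrierSketch(4L << 20);
        long p = (1L << 20) + 64;  // object in region 1
        long q = (2L << 20) + 8;   // object in region 2
        heap.postWrite(p, q);      // cross-region store: card is dirtied
        heap.postWrite(p, q);      // same card again: filtered out
        heap.postWrite(p, p + 16); // same region: filtered out
        heap.postWrite(p, NULL);   // null store: filtered out
        System.out.println("card stores: " + heap.cardStores);
        System.out.println("dirty cards to scan: " + heap.dirtyCardsOutside(Set.of(2L)));
    }
}
```

Note that the model has no queue and no memory fence: the only side effect of the barrier is a single byte store into the card table, which is where the expected savings would come from.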
Cao suggests that a different way to implement the throughput post-write barrier would be:
    if (p and q in same region) -> exit
    if (q is NULL) -> exit
    *card(p) = DIRTY
However, of this approach, he wrote, “benchmarks show that this alternative barrier has little or no performance difference compared to the proposed barrier. The proposed barrier shares more similarity with the default post-write barrier, which makes it possible to implement further enhancement such as dynamically switch between default and throughput barrier.”
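The two variants differ only in whether the card is read before it is written. The sketch below runs both over the same sequence of stores in a toy model; the 512-byte cards, 1 MiB regions, long-valued addresses and all names are illustrative assumptions, not details from the JEP draft. It shows that both leave the card table in the same state, while the alternative writes to the card on every qualifying store. It says nothing about real-world performance, which is what the quoted benchmarks address.

```java
import java.util.Arrays;

// Toy comparison of the proposed barrier and the alternative without the dirty check.
class BarrierVariants {
    static final int CARD_SHIFT = 9;     // assumed 512-byte cards
    static final int REGION_SHIFT = 20;  // assumed 1 MiB regions
    static final byte DIRTY = 1;         // 0 means clean; encodings are arbitrary

    final byte[] cardTable = new byte[(4 << 20) >>> CARD_SHIFT]; // 4 MiB toy heap
    final boolean checkDirtyFirst;
    int cardStores = 0;                  // number of writes to the card table

    BarrierVariants(boolean checkDirtyFirst) {
        this.checkDirtyFirst = checkDirtyFirst;
    }

    // Runs after a reference store p.f = q, where p and q are addresses.
    void postWrite(long p, long q) {
        if ((p >>> REGION_SHIFT) == (q >>> REGION_SHIFT)) return; // same region
        if (q == 0L) return;                                      // null store
        int card = (int) (p >>> CARD_SHIFT);
        if (checkDirtyFirst && cardTable[card] == DIRTY) return;  // proposed barrier only
        cardTable[card] = DIRTY;
        cardStores++;
    }

    public static void main(String[] args) {
        BarrierVariants proposed = new BarrierVariants(true);
        BarrierVariants alternative = new BarrierVariants(false);
        long p = (1L << 20) + 64;  // object in region 1
        long q = (2L << 20) + 8;   // object in region 2
        for (int i = 0; i < 1000; i++) {
            proposed.postWrite(p, q);
            alternative.postWrite(p, q);
        }
        System.out.println("same card table: "
                + Arrays.equals(proposed.cardTable, alternative.cardTable));
        System.out.println("card stores: " + proposed.cardStores
                + " vs " + alternative.cardStores);
    }
}
```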
In some use cases, a small pause time goal would be harder to meet with concurrent refinement disabled and the throughput post-write barrier in use. Cao offers the example of workloads with a large heap and a considerable proportion of long-lived objects. He expects such cases to fall outside the scope of his proposal because they would usually keep concurrent refinement enabled.
For more information, read the JEP draft on the throughput post-write barrier for G1.