Garbage Collection in Java  

Garbage collection in Java is a familiar term in the coding world. You will come across it when learning the Java programming language. Because it’s built into Java memory management, the garbage collector is one of Java’s crucial features. It helps prevent serious errors and allows programmers to create new objects without worrying about unwanted objects.

However, since the background thread that performs garbage collection (GC) runs at unpredictable times, it can reduce an application’s performance or slow it down, causing unnecessary delays. Therefore, developers need to clearly understand the right garbage collector to use to avoid being held back when executing a program.

This article gives an in-depth description of what garbage collection in Java entails. It also includes the major benefits of garbage collection and a simple explanation of how it works to help developers and administrators have an easier time using it in their applications.

What Is Garbage Collection?

Garbage collection is the process of reclaiming unused run-time memory automatically. This means developers can create new objects without worrying about memory allocation and deallocation since unused objects are destroyed automatically.

In languages such as C and C++, where the programmer creates and destroys objects manually, it might be challenging to manage and constantly destroy the redundant objects. When too many unwanted objects are stored, the system’s memory may be unable to allocate the latest objects efficiently.

Additionally, manual memory management makes the system susceptible to memory leaks, especially when the programmer does not release memory that is no longer needed, or when an object on the heap can no longer be reached by running code.

Java programs run on a Java virtual machine (JVM) in the form of bytecode. The JVM executes the bytecode and creates objects in the heap, the portion of memory allocated to the program.

As the program runs, it keeps creating objects, and some of them eventually become unneeded. At that point, the heap contains two types of objects, which can be described as:

Dead: Objects that are no longer referenced and can be discarded.

Live: Objects in use and being referenced.

What Are the Advantages of Garbage Collection?

Many programmers believe the performance of explicit storage reclamation is better than automatic garbage collection in memory management. However, several studies showed that well-developed systems under garbage collectors produced better results than those with explicit deallocation. Here are some of the benefits of garbage collection.

  • Automatic deallocation frees developers from releasing memory manually, which improves writeability and saves development time and cost.
  • Allocating objects in a managed heap is efficient.
  • It provides memory safety: an object cannot read or write beyond its own allocation into memory that belongs to another object.
  • By reclaiming objects that are no longer in use and clearing memory, garbage collection keeps memory free and available for future use. Developers don’t need to initialize every field, because newly allocated memory starts with clean content.
  • Garbage collection helps long-running programs keep using memory efficiently.
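
The point about clean content can be seen in a minimal sketch. The Java language guarantees that the fields of a newly created object start at default values (0, false, or null). The Account class and its fields are hypothetical names used only for illustration.

```java
// Hypothetical class used only for illustration.
class Account {
    int balance;     // never assigned: starts at 0
    boolean active;  // never assigned: starts at false
    String owner;    // never assigned: starts at null
}

class DefaultValuesDemo {
    public static void main(String[] args) {
        Account account = new Account();
        // Prints: 0 false null
        System.out.println(account.balance + " " + account.active + " " + account.owner);
    }
}
```

Local variables are a different case: the compiler rejects code that reads a local variable before it has been assigned.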

How Does Garbage Collection Work in Java?

Garbage collection in Java is an automatic operation that manages heap memory by identifying which objects are still in use and which are not, and discarding the unused ones. Objects that the running code of a program cannot reach occupy memory that can be freed for reuse.

Unlike in languages with manual memory management, the programmer does not have to manage the lifetime of every object, because the JVM handles unwanted data. The JVM controls the garbage collector; when the heap fills up, the JVM performs a clean-up automatically.

Now, check out how garbage collection happens in three basic steps:

  1. Mark: The garbage collector identifies which objects are in use and which are not. Unreferenced objects are marked for removal. Objects that the application still holds references to are live and will remain.
  2. Sweep: The garbage collector clears the unreferenced objects to free up space. Live objects stay where they are, and the collector records the freed areas so that the memory they occupied can be reused for new objects.
  3. Compact: In the final step, the surviving objects, which are now scattered across the heap, are moved together toward the beginning of the heap. This turns the fragmented free memory into one contiguous block, so new objects can be allocated quickly.
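
The three steps above can be sketched with a toy model. This is an illustration only, not how a real JVM is implemented: a real collector works on raw memory, while this sketch uses a Java list as the "heap". ToyObject, ToyHeap, and all method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: names and structure are assumptions made for illustration.
class ToyObject {
    final String name;
    final List<ToyObject> refs = new ArrayList<>(); // outgoing references
    boolean marked;

    ToyObject(String name) { this.name = name; }
}

class ToyHeap {
    final List<ToyObject> slots = new ArrayList<>(); // null models a freed hole
    final List<ToyObject> roots = new ArrayList<>(); // starting points for marking

    ToyObject allocate(String name) {
        ToyObject object = new ToyObject(name);
        slots.add(object);
        return object;
    }

    // Mark: flag every object reachable from the roots.
    void mark() {
        for (ToyObject root : roots) markFrom(root);
    }

    private void markFrom(ToyObject object) {
        if (object.marked) return; // already visited, also handles cycles
        object.marked = true;
        for (ToyObject child : object.refs) markFrom(child);
    }

    // Sweep: free the slots of unmarked objects and reset marks for the next cycle.
    void sweep() {
        for (int i = 0; i < slots.size(); i++) {
            ToyObject object = slots.get(i);
            if (object == null) continue;
            if (object.marked) object.marked = false;
            else slots.set(i, null);
        }
    }

    // Compact: slide the survivors together so the free space is contiguous.
    void compact() {
        slots.removeIf(object -> object == null);
    }

    List<String> layout() {
        List<String> names = new ArrayList<>();
        for (ToyObject object : slots) names.add(object == null ? "-" : object.name);
        return names;
    }

    public static void main(String[] args) {
        ToyHeap heap = new ToyHeap();
        ToyObject a = heap.allocate("a");
        heap.allocate("c");               // never referenced: garbage
        ToyObject b = heap.allocate("b");
        heap.roots.add(a);
        a.refs.add(b);
        b.refs.add(a);                    // a cycle, still reachable from the root

        heap.mark();
        heap.sweep();
        System.out.println(heap.layout()); // [a, -, b]
        heap.compact();
        System.out.println(heap.layout()); // [a, b]
    }
}
```

The hole left after the sweep is the fragmentation that the compact step removes.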

Mark, sweep, and compact is the most basic garbage collection algorithm. However, running the full mark, sweep, and compact process over the entire heap every time is inefficient, because most objects are short-lived.

As a result, the JVM uses a more efficient GC algorithm that handles short-lived objects separately.

Generational Garbage Collection

The generational garbage collection algorithm groups objects by age so that garbage can be collected more efficiently. It lets the collector concentrate on short-lived objects by placing objects in different areas depending on how long they have existed. It divides the heap into two major compartments, the young and old generations.

Young Generation

The young generation carries all the new objects. It is subdivided into two partitions, Eden and Survivor. 

Eden: All new objects are allocated here first. Objects that survive a minor garbage collection are moved from Eden to a survivor partition.

Survivor: The survivor partition is divided into S0 and S1, also called FromSpace and ToSpace.

This is how the flow of object allocation takes place:

  • First, all new objects are allocated in the Eden partition, and both survivor partitions are empty.
  • Over time, Eden fills up so that no new allocation can succeed. The JVM then performs a minor garbage collection: it marks the referenced objects and moves them to S0, leaving Eden and S1 empty.
  • The same happens when Eden is full again. The JVM performs a minor garbage collection, identifies the live objects in both Eden and S0, and moves them to S1. This means that either S0 or S1 is always empty.
  • The next minor garbage collection works the same way, except that the survivor spaces switch roles: live objects are moved from Eden and S1 to S0, so the live objects end up in S0.

Objects that keep surviving are copied back and forth between the survivor spaces over several minor collections, as in the third and fourth steps. Once an object has survived enough collections to be considered long-lived, the garbage collector promotes it to the old generation.

Old Generation

Unlike the young generation, which is cleaned by copying live objects between partitions, the old generation is collected by marking and sweeping during major GC operations. A complete (full) garbage collection cleans up both the young and old generations: live objects from the young generation are promoted to the old generation, and the space is compacted.

During a complete garbage collection, the JVM pauses the application so that existing objects are not modified and no new objects are allocated on the heap until the collection finishes.
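
One way to see the generations on a running JVM is through the standard java.lang.management API, which exposes each area as a memory pool. The following is a minimal sketch; the pool names it prints (for example, names containing "Eden", "Survivor", or "Old Gen") depend on the JVM and on the collector in use, so treat them as examples rather than guarantees. The class name MemoryPools is hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

class MemoryPools {
    // Names of the pools that belong to the Java heap.
    static List<String> heapPoolNames() {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP) names.add(pool.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage(); // null if the pool is no longer valid
            if (usage == null) continue;
            System.out.printf("%-35s %-9s used=%d KB%n",
                    pool.getName(), pool.getType(), usage.getUsed() / 1024);
        }
    }
}
```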

Garbage Collection Activities

Garbage collection involves two types of activities:

Minor or Incremental Garbage Collection

This operation runs in the young generation to clear the unreachable objects found on its heap memory.

Major or Full Garbage Collection

Major garbage collection finds and deletes objects that survived minor collections and were copied into the old generation; the surviving objects are then compacted so that new objects can be allocated contiguously. Major collections run less frequently than minor collections in the young generation.
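
How often each kind of collection has run can be read from the standard GarbageCollectorMXBean interface. This is a minimal sketch: the collector names and the counts printed vary by JVM, collector, and heap size, so no particular output is implied. The class name GcActivity and the allocation loop are assumptions made for illustration.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcActivity {
    // Sum of collections reported by all collectors (undefined counts are -1 and skipped).
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) total += count;
        }
        return total;
    }

    public static void main(String[] args) {
        // Create short-lived garbage, which tends to trigger minor collections.
        for (int i = 0; i < 200_000; i++) {
            byte[] junk = new byte[1024];
        }
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: collections=%d, time=%d ms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
    }
}
```

On a typical run, the collector that handles the young generation reports far more collections than the one that handles the old generation.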

Unreferenced Objects

Unreferenced objects are objects that the program can no longer reach. Any object without a reference to it is regarded as garbage. To create space in the heap, the garbage collector destroys these objects, reclaims their memory, and compacts it, readying the space for new allocations.

How Can an Object Be Unreferenced?

Even though a programmer is not responsible for destroying unneeded objects, they should make an object unreachable once it is no longer required. Objects that are no longer needed but are still referenced cannot be collected; they fill up the heap and degrade the program’s performance. This situation is usually referred to as a memory leak. A few simple techniques drop the reference to an object and make it eligible for garbage collection. They include:

  • Re-assigning the reference variable.
  • Making a reference null.
  • Using an anonymous object.
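
The three techniques in the list can be sketched as follows. The Employee class is a hypothetical example. Being eligible for collection does not mean the object is reclaimed immediately; the JVM decides when the collector actually runs.

```java
// Hypothetical class used only for illustration.
class Employee {
    final String name;

    Employee(String name) { this.name = name; }
}

class UnreferenceDemo {
    public static void main(String[] args) {
        // 1. Re-assigning the reference variable
        Employee first = new Employee("Ann");
        Employee second = new Employee("Bob");
        first = second; // the "Ann" object is now unreachable

        // 2. Making a reference null
        Employee third = new Employee("Cy");
        third = null;   // the "Cy" object is now unreachable

        // 3. Using an anonymous object: no variable ever holds it,
        //    so it is unreachable as soon as the statement finishes
        System.out.println(new Employee("Dee").name);
    }
}
```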

Garbage Collection Roots

Garbage collection roots are special objects in the garbage collection process. They are the starting points of the collector’s reachability analysis and are always considered reachable by the JVM. Any object that is referenced directly or indirectly from a root is kept alive. An application must be able to reach an object through a chain of references that starts at a root; otherwise, the object is garbage. Below are the four primary kinds of garbage collection roots in Java.

  1. Local Variables: Local variables and method parameters are kept alive by the stack of the thread that is executing the method.
  2. Active Java Threads: Active threads are always considered live objects and are therefore garbage collection roots.
  3. Static Variables: Static variables are referenced by their classes, so the objects they refer to cannot be collected while the class is loaded. Classes themselves can be garbage collected, which also releases the static variables they reference. This matters when you use custom class loaders or application servers.
  4. JNI References: These are Java objects that native code has created or holds through a JNI call. The JVM cannot know whether native code still uses such an object, so it treats these references as roots and does not collect the objects until the native code releases them.
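
The effect of a root can be observed with a weak reference, which does not keep its target alive by itself. In this minimal sketch, a static field acts as the root. RootsDemo and its members are hypothetical names, and System.gc() is only a request to the JVM, so the second result printed by main is typical but not guaranteed.

```java
import java.lang.ref.WeakReference;

class RootsDemo {
    // A static field keeps its object reachable while the class stays loaded.
    static Object cache;

    static boolean survivesWhileRooted() {
        cache = new Object();
        WeakReference<Object> probe = new WeakReference<>(cache);
        System.gc(); // a hint only; the JVM may ignore it
        // The object is still strongly reachable through the static field,
        // so the collector is not allowed to reclaim it.
        return probe.get() != null;
    }

    public static void main(String[] args) {
        System.out.println("alive while rooted: " + survivesWhileRooted());

        WeakReference<Object> probe = new WeakReference<>(cache);
        cache = null; // drop the only strong reference
        System.gc();
        // Usually false at this point, but the timing of collection is up to the JVM.
        System.out.println("alive after clearing the root: " + (probe.get() != null));
    }
}
```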

Types of Garbage Collectors in the Java Virtual Machine

Garbage collection plays a vital role in Java memory management by removing unreferenced objects from the heap and providing enough space for newly created objects. The JVM offers several garbage collectors. Here are the details of each.

Serial GC

The serial garbage collector was developed for single-threaded environments. When it runs, it pauses all application threads and performs the collection serially on a single thread. After every garbage collection cycle, it compacts memory to avoid fragmentation.

The serial GC is not recommended for applications that need low pause times, because the application cannot do any work until the collection finishes. This pause is commonly known as a “stop-the-world” event.

Parallel GC

The parallel GC suits applications that process large amounts of data on multiprocessor hardware, and it was the default collector in JDK 8 on server-class machines. Minor garbage collection in the young generation uses multiple threads, while major garbage collection in the old generation uses a single thread.

Like the serial GC, the parallel garbage collector is a stop-the-world collector: it pauses the application while garbage collection runs. It suits throughput-oriented workloads that can tolerate longer pauses.

Parallel Old GC

Parallel Old GC has been the default version of the parallel GC since Java 7u4. It differs from the original parallel GC in that it uses multiple threads for both the young and old generations. With this collector, programmers can set the number of garbage collector threads, a maximum pause time goal, and a throughput target.
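
In HotSpot, these three settings correspond to the command-line options -XX:ParallelGCThreads, -XX:MaxGCPauseMillis, and -XX:GCTimeRatio. The following minimal sketch reports which of them the current JVM was started with; the class name GcFlags and the values in the launch command are assumptions chosen for illustration, not recommendations.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Example launch (values are illustrative only):
//   java -XX:+UseParallelGC -XX:ParallelGCThreads=4 \
//        -XX:MaxGCPauseMillis=200 -XX:GCTimeRatio=19 GcFlags.java
class GcFlags {
    // Keep only the three tuning options discussed in the text.
    static List<String> tuningFlags(List<String> jvmArgs) {
        List<String> found = new ArrayList<>();
        for (String arg : jvmArgs) {
            if (arg.startsWith("-XX:ParallelGCThreads=")
                    || arg.startsWith("-XX:MaxGCPauseMillis=")
                    || arg.startsWith("-XX:GCTimeRatio=")) {
                found.add(arg);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Arguments passed to the JVM itself, not to the program.
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("GC tuning flags in effect: " + tuningFlags(jvmArgs));
    }
}
```

The pause time and throughput settings are goals that the collector tries to meet by resizing the heap; they are not hard limits.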

CMS (Concurrent Mark Sweep) GC

The CMS garbage collector is also called the concurrent low-pause collector because it does most of its work concurrently with the application threads instead of stopping them.

Its minor garbage collections use multiple threads, like those of the parallel collector. Because CMS runs alongside the application, it uses more CPU time than the serial and parallel collectors. If you can give the JVM more CPU in exchange for shorter pauses, CMS is a better choice than the parallel collector. However, in some circumstances, overall application throughput might be lower. CMS was deprecated in JDK 9 and removed in JDK 14.

G1 (Garbage First) GC

The G1 garbage collector resembles the CMS garbage collector in that it works both in parallel and concurrently. It was introduced in JDK 7 as the long-term replacement for CMS. G1 marks live objects concurrently and then reclaims space by copying them out of the regions it collects, which also compacts the heap. G1 became the default collector in JDK 9.

It partitions the memory heap into a set of equal-size regions and scans them using multiple threads.

Its name, “garbage first,” comes from the way it prioritizes its work: after marking, G1 knows which regions contain the most garbage and collects those regions first to free as much memory as possible.

Epsilon GC

Epsilon was introduced in JDK 11 as a passive, no-op garbage collector. It handles memory allocation but never reclaims memory, so the JVM shuts down once the Java heap is exhausted. Epsilon therefore only suits short-lived applications, applications that allocate very little, or tests where running out of memory and exiting is acceptable.

Its purpose is to help measure and isolate the overhead that garbage collectors add, for example in performance testing, where it provides a baseline without GC activity. It also makes an application’s memory footprint visible, because the program fails as soon as it exceeds the configured heap. For users who want to analyze and optimize the performance of their applications, the Epsilon garbage collector is a useful tool.

Shenandoah GC

Shenandoah was released as part of JDK 12. It reduces pause times by doing most of its garbage collection work while the Java program keeps running. Shenandoah is an ultra-low-pause collector: it marks live objects and compacts memory concurrently with the application.

Unlike G1, which pauses the application to compact and relocate live objects, Shenandoah performs this work concurrently with the application. This makes it more CPU intensive.

Shenandoah also tracks moved objects through a forwarding pointer associated with every heap object, which makes it possible to move objects between regions while the application is running. Its ability to compact the heap aggressively in parallel with application threads sets it apart from most other garbage collectors.

Z Garbage Collector

ZGC is ideal for applications that require low latency or use a huge heap (multiple terabytes). It uses colored pointers, which are metadata bits stored in object references, to track object state. This lets it do most of its work while application threads are running. ZGC is easy to use and highly scalable.

Its low pause time is a significant improvement over other garbage collectors. Unlike G1’s equal-size regions, ZGC’s heap regions differ in size. A ZGC cycle can be summarized in three phases:

  1. Short stop-the-world phase: ZGC scans the GC roots. Because the set of roots does not grow with the heap, the pause stays short even on large heaps.
  2. Concurrent phase: ZGC traverses the object graph, checks the colored pointers, and marks reachable objects while the application runs.
  3. Relocation phase: ZGC moves live objects to free up contiguous heap memory.
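
Each of the collectors described above is selected with a HotSpot command-line option. The sketch below gathers those options into a small lookup for reference. The class name CollectorFlags is hypothetical, and availability depends on the JDK version and build: the CMS option no longer works from JDK 14 onward, Epsilon needs experimental options unlocked, and Shenandoah is not included in every JDK build.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class CollectorFlags {
    // Collector name -> HotSpot option that selects it.
    static Map<String, String> flags() {
        Map<String, String> flags = new LinkedHashMap<>();
        flags.put("Serial", "-XX:+UseSerialGC");
        flags.put("Parallel", "-XX:+UseParallelGC");
        flags.put("CMS", "-XX:+UseConcMarkSweepGC"); // removed in JDK 14
        flags.put("G1", "-XX:+UseG1GC");
        flags.put("Epsilon", "-XX:+UnlockExperimentalVMOptions -XX:+UseEpsilonGC");
        flags.put("Shenandoah", "-XX:+UseShenandoahGC");
        flags.put("ZGC", "-XX:+UseZGC");
        return flags;
    }

    public static void main(String[] args) {
        flags().forEach((name, flag) -> System.out.printf("%-11s %s%n", name, flag));
    }
}
```

For example, running `java -XX:+UseG1GC MyApp` starts a hypothetical MyApp class with G1.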

Java Memory Management

Java memory management is the process of allocation and deallocation of objects. It’s an automatic process that does not need the direct intervention of a programmer.

However, automation does not guarantee everything. Programmers still need to know how memory management in Java works to write high-performance programs that don’t crash, and to debug them when they do. Careful memory management prevents leaks that hurt performance.

The central concepts in Java memory management are JVM memory structure and the working of garbage collectors.

Having comprehensively discussed the working of garbage collectors, let’s look at the JVM memory structure.

What Is JVM Memory Structure?

The JVM defines various run-time data areas used during the execution of a program. Some are created by the JVM at start-up, and others are created for each thread. The areas created by the JVM are destroyed when it exits; the per-thread areas are destroyed when their thread exits.

The Java memory area consists of several parts, including the heap, method area, JVM stacks, native method stacks, and PC registers.

Each area performs specific tasks, with the ultimate goal being to execute programs successfully.
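
A rough view of these areas is available at run time through the standard MemoryMXBean, which separates heap memory from non-heap memory (the latter includes the method area). This is a minimal sketch; the class name MemoryAreas is hypothetical, and thread stacks and PC registers are not reported by this API.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

class MemoryAreas {
    static long heapUsedBytes() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getUsed();
    }

    public static void main(String[] args) {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        MemoryUsage nonHeap = memory.getNonHeapMemoryUsage();

        // getMax() returns -1 when no maximum is defined for the area.
        System.out.printf("heap:     used=%d KB, committed=%d KB, max=%d%n",
                heap.getUsed() / 1024, heap.getCommitted() / 1024, heap.getMax());
        System.out.printf("non-heap: used=%d KB, committed=%d KB, max=%d%n",
                nonHeap.getUsed() / 1024, nonHeap.getCommitted() / 1024, nonHeap.getMax());
    }
}
```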

Conclusion

Although garbage collection in Java is an automatic process, programmers who want to advance their skills should learn how garbage collection works. This will help them optimize garbage collection activity by ensuring Java heap memory is adequately configured and managed.

Also, it is essential to monitor garbage collection for optimal Java application performance. This makes it easy for users to run programs without interruptions and downtime.