Understanding Garbage Collection in Go

About 3338 wordsAbout 11 min

2024-12-26

If you want to know how Go performs GC (Garbage Collection). I have read a lot about the "three-color marking method" on the Internet, but I think it is very complicated and not simple enough. Then this article is a very good reference for your entry into Go's GC mechanism.

Garbage Collection is a problem that every Go language programmer needs to understand after going deep into it to a certain extent. Although there are many articles on this issue on the Internet, most of them are plagiarized from each other, and many key points are explained vaguely. There are good articles written on the Internet, But I think many people cannot get over the firewall and reading English is definitely not as easy as reading Chinese, so I wrote a document to record my study notes. The foreign materials I have referenced will be listed at the end of the article.

What Is GC?

GC is the abbreviation of garbage collection, which means: garbage collector. In the process of writing code, programmers write many new keywords to create new objects or use many temporary variables, which are constantly occupying the memory space of the operating system. When the program no longer needs these variables (for example, when the life cycle of the variable ends), these memory spaces need to be released again so that they can be reallocated to new programs.

The garbage collection mechanism of mainstream high-level programming languages is divided into two categories:

manual: such as C++. Programmers need to manually release the memory they apply for.
automatic: such as Java, Python, Go, etc. The memory management system automatically recycles during the program running.

The mainstream automatic recycling mechanism has the following two types:

Reference counting method.
Tracking garbage collection.

Go uses three-color marking method. In fact, it is a kind of tracking garbage collection. So how does Go achieve this collection? How to ensure that some objects or variables in use will not be collected by mistake during the collection process? This is what we will introduce next.

I will introduce GC in the following order:

Mark & Sweep: basic algorithm idea.
Tri-color Mark & Sweep: the improvement of Mark & Sweep, the core algorithm currently used by Go.
Write Barrier: guarantee of algorithm correctness.
Stop the World (STW): the core of algorithm optimization.

It doesn’t matter if you don’t understand what the above words mean now. They are listed mainly to have a preliminary impression.

Mark & Sweep

Mark & Sweep is the basis of the three-color marking method of the Go language, which is also called: Mark-Sweep in Chinese. Let’s first understand some basic concepts of Mark & Sweep.

STW

The full name of STW is: Stop the World, That is, at some point in the GC process, we need to stop all applications to determine the reference relationship.

This is the focus of GC algorithm optimization, because the longer the application stops, the worse the performance of the language.

Root object

The root object refers to an object that can be directly accessed without going through other objects. It mainly includes the following three types:

Global object.
Object in the stack.
Object in the heap.

The memory occupied by each object in the entire process space can be regarded as a tree, and the references between objects are regarded as the relationship between nodes in the tree, and the root node of this tree is the root object we are talking about here. The three-color marking method starts from the root object, and through the reference and pointer of the object, the reference tree is traversed in depth first, and other surviving objects are constantly tracked. There will be a detailed explanation below.

After understanding the basic concepts, let's look back at how this Mark & Sweep is implemented.

Stop the World.
Mark: Through Root Continuously recursively search for reachable objects and mark them as active.
Sweep: Iterate over objects and add objects that are not marked in the Mark phase (i.e. inactive objects) to the freelist, which can be used for allocation again.
Start the World.

Warning

Obviously, the biggest problem with the algorithm is that the entire program must be completely paused during GC. In Go 1.1, the STW time may reach seconds, which is intolerable for systems with high concurrency and throughput requirements.

However, the marking process requires STW because if the program is not stopped and is still running during the marking of whether the object is active, the reference relationship may be modified during the marking phase, resulting in incorrect marking results.

The evolution of GC in Go

Versions before 1.3 use the mark-sweep method, and the entire process requires STW.
Version 1.3 separates the marking and sweeping operations, and only the marking process is STW, while the sweeping process uses concurrent execution.
Version 1.5 uses the three-color marking method in the marking process, Marking and sweeping are performed concurrently. However, before and after the marking phase, STW needs a certain amount of time to prepare for GC and re-scan the stack.

Tri-color Mark & Sweep

Based on the understanding of the Mark & Sweep algorithm, let's now look at the core algorithm of the entire Go GC mechanism: the tri-color marking method.

Tri-color refers to marking objects with the following three colors for distinction:

White: Potential garbage.
Black: Active objects, including objects without any external pointers and objects reachable from the root node.
Gray: Active objects, but there are external pointers to white objects.

At the beginning of the algorithm, all objects will be regarded as white, and the objects will be gradually "dyed" during the iteration process. At the end of the marking algorithm, there will only be black nodes and white nodes, and Go will recycle all white objects.

We can summarize the algorithm into the following three steps:

Step 1
Select a grey object from the set of grey objects and mark it black.
Step 2
Mark all objects pointed to by the black object as grey, ensuring that the object and the objects referenced by the object will not be recycled.
Step 3
Repeat the above two steps until there is no grey object in the object graph.

The pseudo code of the algorithm is as follows:

# Maintain three sets of black, white and grey to store objects of different colors
whiteSet = {A,B,C,D,E,F,G,H}
greySet = {}
blackSet = {}

# Change the objects accessible by the root node from white to grey. Note: This step will not be executed recursively.
whiteSet.remove(root.reference)
greySet.add(root.reference)

while True:
    for greyObject in greySet:
        # If the grey node has a pointer to other objects
        if greyObject.reference:

            # Dye the current grey object black
            greySet.remove(greyObject)
            blackSet.add(greyObject)

            # Dye the white object pointed to by the grey node grey
            whiteSet.remove(greyObject.reference)
            greySet.add(greyObject.reference)

    # When all global objects have been traversed, the algorithm ends
    if greySet is empty:
        break

The pseudo code can be viewed with the following example. The following figure shows the results of 4 iterations of the algorithm during execution:

According to step 1: Since the A object can be accessed from the root node on the stack and the F object can be accessed from the root node on the heap, A and F are set to grey.
According to step 2: Select a grey object from the set of all grey objects and mark it black, that is, set A to black and the object B it references to to grey.
Repeat step 2 above, set B to black and set the objects C and E it references to to gray.

Finally, after all the nodes on the reference tree built from the root nodes of the stack and heap are traversed, they will be marked as follows:

In Go 1.5, marking and sweeping are executed concurrently, that is, multiple applications and marking are executed concurrently. To ensure the correctness of marking in concurrency, we need to achieve any of the following two Tri-color invariants:

Strong tri-color invariant: Black objects will not point to white objects, but only to gray objects or gray objects.
Weak tri-color invariant: Black objects pointing to white objects must contain a route from gray objects to multiple white objects.

To better understand these two tri-color invariants, Let's look at the following example:

The diagram on the left shows strong tri-color invariance, and the diagram on the right shows weak tri-color invariance.

Let's first look at the strong tri-color invariance on the left. Consider a situation like this: If we do not perform STW in the Mark phase, a pointer $A \rightarrow D$ may be added during the marking process ( Note: This pointer addition destroys the strong tri-color invariance ). But at this time A has been visited by the goroutine executing the Mark task and has been marked black. According to the tri-color marking algorithm, only the references to the gray nodes will be visited and colored starting from the gray nodes each time, which means that D will not be visited and D will remain white at the end of the algorithm. At this time, the active object A directly references D, but D is mistakenly considered garbage and is recycled. Recycling objects that are still in use will lead to fatal errors in the program, which is unacceptable to GC. Then the significance of strong three-color invariance is: by ensuring the validity of the direct path from the active node (i.e., black node) to the node that may not be visited (i.e., white node), we can avoid the above situation.

Let's look at the weak three-color invariance on the right. Consider such a situation: if we do not perform STW in the Mark phase, a pointer of $A \rightarrow D$ may be added during the marking process. Although the strong three-color invariance is broken, we cannot access D from the black node A. However, since there is a route $B \rightarrow E \rightarrow D$ from the gray B to D, the node D will still be visited and colored, and will not cause the wrong recycling of D. So the significance of weak three-color invariance is: by ensuring the indirect path from the black node to the white node to ensure the correctness of the Mark.

From the above two examples, we can see that Following any of the above two invariants can ensure the correctness of the garbage collection algorithm in the scenario of concurrent execution without STW. So how to ensure any of these two invariants? This involves the following technology: Write Barrier.

Write Barrier

Here I will introduce two write barrier technologies, namely Insert Write Barrier proposed by Dijkstra and Delete Write Barrier proposed by Yuasa. These two write barrier technology guarantees strong three-color invariance and weak three-color invariance respectively.

Insert Write Barrier

Dijkstra proposed inserting write barrier in 1978, which can ensure the correctness of program execution when the user program and the garbage collector work alternately. The following is its pseudo code:

def writePointer(slot, ptr):
    shade(ptr)
    slot = ptr

This code is like a hook method, which will be executed every time the user program updates the object pointer. The two parameters slot and ptr in the pseudo code respectively identify the current variable and the variable that the pointer needs to point to. The shade() function represents coloring. It means that every time we execute an expression like *slot = ptr to update the object pointer, the write barrier will try to change the color of the pointer through the shade function. If the new pointer ptr is white, it will be dyed gray. The following figure shows this process well. When adding a pointer of $A \rightarrow C$ , The write barrier turns the white C gray.

The garbage collector marks the root object pointing to the A object black and marks the object B pointed to by the A object gray.
The user program modifies the pointer of the A object and points the pointer originally pointing to the B object to the C object. At this time, the write barrier is triggered to mark the C object gray.
The garbage collector traverses the other gray objects in the program in turn and marks them black respectively.

Dijkstra's insert write barrier is a relatively conservative barrier technology. It will mark all objects that may survive as gray to meet the strong three-color invariant. But the disadvantages are also very obvious.

Disadvantages of inserting write barriers

High complexity. The process code will be more complicated.
Low performance. Objects on the stack will access or change pointers multiple times in a short period of time due to the principle of locality, Write barriers are triggered frequently.

To compensate for the low performance, Go chooses to insert write barriers only for pointers on the heap (because large objects, global variables or static variables are usually placed on the heap, and they do not change frequently. Local variables or parameters are usually stored on the stack, and they change frequently). However, this will cause the stack to still reference white objects after the scan. Therefore, in order to ensure the correctness of the algorithm, STW is performed at this time and the stack is rescanned to make it black. This process is called re-scan. At this time, the entire algorithm process will become:

Initialize GC tasks: including turning on write barriers and auxiliary GC, counting the number of root object tasks, etc. STW
Scan all root objects, including global pointers and pointers on the goroutine stack (when scanning a goroutine, the goroutine stops), add them to the gray queue, and loop the gray queue until it is empty. Background parallel execution
Rescan global pointers and stacks (re-scan). STW
Reclaim all white objects according to the mark. Background parallel execution

Delete write barrier

Yuasa proposed the delete write barrier in 1990. The pseudo code is as follows:

def writePointer(slot, ptr):
    if isGrey(slot) or isWhite(slot):
        shade(*slot)
    slot = ptr

The above code will paint the white old object gray when the reference of the old object is deleted to ensure weak three-color invariance.

Mark the root object A as black, and mark the object B pointed to by A as gray.
The user program changes the pointer of A that originally pointed to B to point to C, that is, one pointer is deleted and another pointer is added. At this time Trigger the delete write barrier, but because B itself is gray, no changes are made.
The user program deletes the pointer that B originally pointed to C. At this time Trigger the delete write barrier, the white C is painted gray.
The garbage collector traverses other objects and marks them black respectively, and the algorithm ends.

Disadvantages of the delete write barrier

Low precision. Even if an object is deleted and the last pointer pointing to it is deleted, it can still survive this cycle and can only be cleaned up by GC in the next cycle.

Hybrid Write Barrier

Both the insertion barrier and the deletion barrier have their own advantages and disadvantages. The insertion write barrier does not require STW at the beginning, but STW is required to re-scan at the end. The deletion write barrier requires STW to scan the stack at the beginning to record the initial snapshot, but no STW is required at the end. The hybrid write barrier combines the characteristics of these two methods and satisfies the deformed weak three-color invariant: black objects are allowed to reference white objects, and white objects are in a gray protection state, But it is only protected by the grey object on the heap. The pseudo code is as follows:

def writePointer(slot, ptr):
    shade(*slot)
    if current stack is grey:
        shade(ptr)
    *slot = ptr

The mixed barrier only needs to scan the stacks of each goroutine concurrently at the beginning, making it black and keeping it black (this process does not require STW). After the marking is completed, because the stack is always black, there is no need for STW to re-scan. This greatly reduces the STW time. However, STW needs a certain amount of time before and after the marking phase to prepare for GC.

Sweep

Sweep is the sweeping phase, which will let Go know which memory can be reallocated for use.

During the Sweep process, the memory will not be actually zeroed (zeroing the memory), but the bit will be reset when it is allocated for reuse. GC uses the bitmap gcmarkBits to track the memory in use (1: survival; 2: recycling), and Sweep only resets the bitmap. 1 or 0.

Go provides two strategies:

Start a dedicated worker in the background to wait for memory cleanup.
Lazy trigger when requesting memory allocation.

STW

Stop the World is an important stage in the GC mechanism. All currently running programs are paused, the root node of the memory is scanned and a write barrier is added.

All processors P will be marked as stopped and no code will be run. The scheduler will separate each processor M from the processor P in its own formation and put it into the idle list.

Pacing

This is a configurable parameter about GC. It indicates the ratio of how much new memory can be allocated before the next garbage collection must be started. The default value is 100, that is, 100% of the memory can be allocated before the next GC.

Of course, this parameter can only affect GC to a certain extent. If it is not triggered for more than 2 minutes, Go will force GC to be triggered.

Summary

Go's GC process is an automatic process, using the three-color marking method to cyclically mark all objects in memory. At first, all objects are "white", and objects that can be referenced from "white" objects are "gray". After marking, all objects will eventually become "white" and "black". "Black" means objects that are still alive, and "white" means garbage and will be recycled by the system.

In this process, in order to ensure the correctness of the marking and avoid changes in the pointer of the object during the marking process, the STW program pause method will be adopted. This greatly affects the performance of GC. Therefore, in order to improve the performance, two write barriers are proposed: insert write barrier and delete write barrier. These two write barriers guarantee strong three-color invariance and weak three-color invariance respectively.

If you still have questions about Go's GC, please leave a message. The following video is what I think is a clearer explanation of this issue, I hope it can help you understand better.

References for this article