In this post I will summarize the workings of .NET garbage collection. The notes are from the book Writing High Performance .NET Code by Ben Watson.
The purpose of .NET garbage collection (GC) is to manage memory. A process's memory is split into a native heap and a managed heap. The native heap stores objects required by the Win API, the OS, the CLR and so on. The GC works only on the managed heap.
The managed heap is further divided into the small object heap (SOH) and the large object heap (LOH). Each gets its own memory segments. A segment is typically around 100 MB or more, and the LOH can span multiple segments. A size limit (85,000 bytes by default) decides where an object goes: objects at or above the limit, usually large arrays, are allocated on the LOH, and smaller objects go to the SOH.
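The size rule above can be sketched as a tiny routing function. This is a toy model in Python rather than CLR code: `target_heap` and `LOH_THRESHOLD_BYTES` are names made up for illustration, and the threshold is the default 85,000 bytes.

```python
# Toy model of the SOH/LOH size rule; not how the CLR is implemented.
LOH_THRESHOLD_BYTES = 85_000  # default limit; hypothetical constant name


def target_heap(object_size_bytes: int) -> str:
    """Return the managed heap an object of this size would be placed on."""
    if object_size_bytes < 0:
        raise ValueError("object size cannot be negative")
    # Objects at or above the limit go to the large object heap.
    return "LOH" if object_size_bytes >= LOH_THRESHOLD_BYTES else "SOH"


if __name__ == "__main__":
    for size in (24, 84_999, 85_000, 10_000_000):
        print(f"{size:>10} bytes -> {target_heap(size)}")
```

Keep in mind that the size being compared is the whole object including its header, so an array crosses the limit slightly before its element data alone reaches 85,000 bytes.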
SOH is further divided into Gen 0, 1 and 2. Gen 0 and 1 share the same segment. Gen 2 can be split into multiple segments.
Gen 0
- All small objects first go here.
- New objects are usually placed at the end of the segment so that allocation is faster; otherwise they go wherever there is space in the segment.
- If there is no space, a Gen 0 GC is triggered.
- When a GC happens, live objects move to Gen 1 and the rest of the memory is reclaimed.
- An object is alive if the GC can reach it from any root through the graph of object references.
- The GC also compacts the segment to free up space at the end.
- If compaction cannot be done, the boundaries of Gen 0 and Gen 1 are redrawn.
Gen 1
- Works similarly to Gen 0.
- When a Gen 1 GC occurs, it runs a Gen 0 GC as well.
Gen 2
- Objects remain here for the rest of their lifetime.
- If an object is no longer alive, its memory is reclaimed when a Gen 2 GC runs.
- A Gen 2 GC is a full GC because it also runs a Gen 1 GC, which in turn runs a Gen 0 GC.
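The generation rules above can be sketched with a small simulation. This is a toy model in Python, not the CLR's algorithm: `ToyHeap` and `Obj` are hypothetical names, and segments, compaction and the LOH are left out. It only shows reachability from roots, promotion of survivors, and the fact that collecting a generation also collects the younger ones.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # identity-based equality and hashing, like real objects
class Obj:
    name: str
    refs: list = field(default_factory=list)  # outgoing object references
    gen: int = 0                              # every new object starts in Gen 0


class ToyHeap:
    MAX_GEN = 2

    def __init__(self):
        self.objects = []  # everything currently on the toy heap
        self.roots = []    # stand-ins for statics, locals and so on

    def allocate(self, name):
        obj = Obj(name)
        self.objects.append(obj)
        return obj

    def _reachable(self):
        """Walk the reference graph from the roots and return live objects."""
        seen, stack = set(), list(self.roots)
        while stack:
            obj = stack.pop()
            if obj not in seen:
                seen.add(obj)
                stack.extend(obj.refs)
        return seen

    def collect(self, generation):
        """Collect `generation` and all younger ones; return freed names."""
        alive = self._reachable()
        survivors, freed = [], []
        for obj in self.objects:
            if obj.gen > generation:
                survivors.append(obj)  # older generations are not examined
            elif obj in alive:
                obj.gen = min(obj.gen + 1, self.MAX_GEN)  # promote survivor
                survivors.append(obj)
            else:
                freed.append(obj.name)  # unreachable: memory is reclaimed
        self.objects = survivors
        return freed


if __name__ == "__main__":
    heap = ToyHeap()
    cache = heap.allocate("cache")
    heap.roots.append(cache)      # long-lived: reachable from a root
    heap.allocate("temp")         # short-lived: never referenced by anything
    print(heap.collect(0), cache.gen)  # temp is freed, cache moves to Gen 1
    heap.collect(1)                    # cache survives again, now in Gen 2
    heap.roots.remove(cache)           # cache becomes garbage
    print(heap.collect(1))             # nothing freed: Gen 2 is not examined
    print(heap.collect(2))             # only the full GC reclaims cache
```

The last three calls show why objects that die after reaching Gen 2 are expensive: a Gen 0 or Gen 1 collection never looks at them, so their memory is held until the next full GC.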
The rule of thumb for better code is to make sure objects are either cleaned up in Gen 0 or not at all. That means most objects should have an extremely short lifespan, and those that need to stay (cached objects, for example) should reach Gen 2 quickly and stay there for the life of the program. The reason is that a Gen 2 collection is costly.
Tools:
- PerfView - a heap snapshot shows which objects are still alive. You can also force a GC to help debug memory leaks.
- VMMap - shows how much memory a process is using and how much of that is managed versus unmanaged.