• Fri, Jun 2026

Learn how garbage collection works in Python: the core ideas (reference counting and generational GC), the gc module, how to diagnose cyclic references, how to use weakref safely, and practical patterns that keep memory usage healthy in real-world apps.

1) Why Care About Garbage Collection?

Garbage collection (GC) frees memory that your program no longer needs. Most of the time it “just works,” but understanding it helps you:

  • Prevent memory bloat in web services, scripts, and data pipelines.
  • Fix subtle bugs from reference cycles and finalizers.
  • Confidently tune performance for latency-sensitive workloads.

2) The Two Pillars of Python Memory Management

2.1 Reference Counting (Immediate Reclamation)

In CPython (the most widely used Python implementation), every object has a reference count. When the count drops to zero, the object’s memory is reclaimed immediately.

import sys

x = []                          # create a list object
print(sys.getrefcount(x))       # note: adds a temporary ref for the call
y = x                           # add another reference
print(sys.getrefcount(x))
del y                           # drop a reference
print(sys.getrefcount(x))

Pro: Fast reclamation when objects become unreachable. 
Con: Pure reference counting can’t reclaim cycles (objects that reference each other but are otherwise unreachable).
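The limitation can be observed directly. The following is a minimal sketch, assuming CPython; the Node class and the probe variable are hypothetical names used only for illustration. It pauses the cyclic collector so that only reference counting is at work, then checks through a weak reference whether a cycle survives.

```python
import gc
import weakref


class Node:
    """Hypothetical container used only for this demonstration."""

    def __init__(self):
        self.other = None


gc.disable()                 # rely on reference counting alone for now
try:
    a, b = Node(), Node()
    a.other, b.other = b, a  # a <-> b form a cycle
    probe = weakref.ref(a)   # lets us check liveness without keeping a alive

    del a, b                 # no names left, but each object still holds the other
    alive_after_del = probe() is not None

    gc.collect()             # an explicit collection still works while disabled
    alive_after_collect = probe() is not None
finally:
    gc.enable()

print("Alive after del:", alive_after_del)
print("Alive after gc.collect():", alive_after_collect)
```

The first line should report that the cycle is still alive, and the second that the explicit collection reclaimed it.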

2.2 Cycle Detection (Generational Garbage Collector)

To handle cycles, CPython layers a cyclic GC on top of reference counting. It periodically scans containers (lists, dicts, sets, instances, etc.) to find groups that only reference each other.
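Only objects that can hold references to other objects need to be scanned. As a small illustration, assuming CPython, gc.is_tracked() reports whether the cyclic collector is currently tracking a given object:

```python
import gc

print(gc.is_tracked(42))          # False: ints cannot hold references
print(gc.is_tracked("text"))      # False: strings cannot either
print(gc.is_tracked([]))          # True: lists are containers
print(gc.is_tracked({"k": []}))   # True: a dict holding another container
```

Atomic values such as numbers and strings never take part in cycles, so the collector skips them entirely.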


3) Generational Garbage Collection in Practice

CPython organizes container objects into three “generations.” Younger generations are collected more frequently. A collection of the youngest generation starts when the number of container allocations minus deallocations since the last collection exceeds a threshold.

3.1 Generations and Thresholds

The GC tracks counts and triggers collections when certain thresholds are exceeded.

Generation   Typical Role                              Collection Frequency
0            Newly allocated container objects         Most frequent
1            Objects that survived gen-0 collection    Less frequent
2            Long-lived survivors                      Least frequent

If objects survive a collection, they are promoted to an older generation. The assumption is that older objects are more likely to live longer, so collecting them less often saves work.
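One way to watch collections happen is the gc.callbacks list, which the collector calls before and after each run. The sketch below is illustrative only: record_collection and events are hypothetical names, and the number and order of automatic collections you see will vary by Python version and workload.

```python
import gc

events = []  # (phase, generation, collected) tuples recorded by the callback


def record_collection(phase, info):
    # phase is "start" or "stop"; info is a dict that includes "generation".
    # "collected" is only meaningful on "stop", so read it defensively.
    events.append((phase, info["generation"], info.get("collected", 0)))


gc.callbacks.append(record_collection)
try:
    junk = [[i] for i in range(10_000)]  # many new containers: young collections may run
    del junk
    gc.collect()                          # explicit full collection
finally:
    gc.callbacks.remove(record_collection)

for phase, generation, collected in events[-4:]:
    print(phase, "generation", generation, "collected", collected)
```

Removing the callback in a finally block keeps the hook from lingering in later code.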


4) The gc Module: Essential APIs

The gc module exposes controls and introspection for the cyclic collector.

4.1 Quick Status and Manual Collection

import gc

# Is the cyclic GC enabled?
print("GC enabled:", gc.isenabled())

# Manually run a full collection (all generations)
unreachable = gc.collect()
print("Unreachable objects found:", unreachable)

4.2 Inspecting Thresholds and Counters

# Current thresholds per generation (g0, g1, g2)
print("Thresholds:", gc.get_threshold())

# Current counts: (net gen-0 allocations since the last gen-0 run,
#  gen-0 runs since the last gen-1 run, gen-1 runs since the last gen-2 run)
print("Counts:", gc.get_count())

4.3 Adjusting Thresholds

# Set custom thresholds: (gen0, gen1, gen2)
gc.set_threshold(700, 10, 10)
print("New thresholds:", gc.get_threshold())

4.4 Enabling, Disabling, and Context Control

# Temporarily disable GC in a tight loop to reduce overhead
gc.disable()
try:
    # ... create many short-lived objects ...
    pass
finally:
    gc.enable()     # always re-enable
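The try/finally pattern above can be wrapped in a context manager. This is a minimal sketch, not part of the gc module: gc_paused is a hypothetical helper name. Unlike the plain pattern, it restores the previous state rather than unconditionally enabling the collector.

```python
import gc
from contextlib import contextmanager


@contextmanager
def gc_paused():
    """Hypothetical helper: pause the cyclic GC, then restore its prior state."""
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        yield
    finally:
        if was_enabled:       # do not turn GC on if the caller had it off
            gc.enable()


gc.enable()  # start from a known state for the demo
with gc_paused():
    inside = gc.isenabled()
    rows = [[i, i + 1] for i in range(100_000)]
after = gc.isenabled()
print("Enabled inside block:", inside)
print("Enabled after block:", after)
```

Restoring the earlier state matters when the helper is nested or used in code that has already disabled the collector on purpose.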

5) Detecting and Collecting Reference Cycles

5.1 Creating a Simple Cycle

import gc

class Node:
    def __init__(self, name):
        self.name = name
        self.ref = None

a = Node("A")
b = Node("B")
a.ref = b
b.ref = a            # A cycle

del a, b             # Drop our references; objects still reference each other

# Force a collection
unreachable = gc.collect()
print("Collected:", unreachable)

Without the cyclic GC, a and b would never be freed. The collector detects the cycle and reclaims it.

5.2 Debugging Leaks and Cycles

# Save all unreachable objects in gc.garbage for inspection
gc.set_debug(gc.DEBUG_SAVEALL)

# Create cycles, then collect
gc.collect()

# Inspect garbage
for obj in gc.garbage:
    print("Unreachable:", type(obj), getattr(obj, "__dict__", obj))

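With DEBUG_SAVEALL set, gc.garbage receives every unreachable object the collector finds, not only problematic ones. On Python 3.4 and later (PEP 442), cycles whose objects define __del__ are collected normally. On older versions, CPython moved such objects to gc.garbage to avoid an unsafe finalization order, and you had to break those cycles or avoid __del__.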
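When an object stays alive unexpectedly, gc.get_referrers() can show what still points at it. A small sketch follows, where Session and registry are hypothetical stand-ins for a leaked object and a forgotten cache:

```python
import gc


class Session:
    """Hypothetical object we suspect is being kept alive."""


registry = []            # stands in for a forgotten global cache
s = Session()
registry.append(s)

# Ask the collector which tracked containers refer to s.
referrers = gc.get_referrers(s)
holders = [r for r in referrers if isinstance(r, list)]
held_by_registry = any(h is registry for h in holders)
print("Held by registry:", held_by_registry)
```

Expect other entries as well, such as the namespace dict that binds the name s. get_referrers is a debugging aid and is too slow for regular production use.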


6) Finalizers (__del__) and Why They’re Tricky

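__del__ is a destructor-like hook that runs when an object is about to be reclaimed. It complicates cycle collection because there is no safe, well-defined destruction order for cyclic objects that both define __del__. Before Python 3.4 such cycles were not collected at all and ended up in gc.garbage. Since Python 3.4 (PEP 442) they are collected, but the finalizers run in an unspecified order, so one __del__ may observe a partner whose __del__ has already run.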

6.1 A Subtle __del__ Example

import gc

class Resource:
    def __init__(self, name):
        self.name = name
        self.partner = None
    def __del__(self):
        # Potentially problematic if part of a cycle
        print("Cleaning up", self.name)

x = Resource("x")
y = Resource("y")
x.partner = y
y.partner = x  # cycle

del x, y
gc.collect()   # on Python 3.4+ both __del__ methods run, in an unspecified order
print("Garbage size:", len(gc.garbage))  # 0 unless DEBUG_SAVEALL is set

Safer alternative: prefer weakref.finalize for cleanup logic that doesn’t interfere with the collector (see next section).


7) Using weakref and weakref.finalize for Safer Cleanup

A weak reference does not increase an object’s reference count. This is useful when you need to refer to objects without preventing their collection, or to avoid forming cycles.

7.1 Weak References

import weakref

class Expensive:
    pass

obj = Expensive()
r = weakref.ref(obj)          # does not increment refcount
print("Alive?", r() is not None)

del obj
print("Alive after del?", r() is not None)  # becomes None when collected
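A common use is a back-reference from a child to its parent. The sketch below uses a hypothetical TreeNode class: links down the tree are strong, while the link up is weak, so no cycle is formed. It assumes CPython, where the parent is reclaimed as soon as its last strong reference goes away.

```python
import weakref


class TreeNode:
    """Hypothetical tree node: strong links down, weak link up."""

    def __init__(self, name, parent=None):
        self.name = name
        self.children = []
        # Store the parent weakly so child -> parent does not form a cycle.
        self._parent = weakref.ref(parent) if parent is not None else None
        if parent is not None:
            parent.children.append(self)

    @property
    def parent(self):
        # Returns None for the root or once the parent has been reclaimed.
        return self._parent() if self._parent is not None else None


root = TreeNode("root")
leaf = TreeNode("leaf", parent=root)
print(leaf.parent.name)

del root                      # last strong reference to the root
print(leaf.parent)
```

The trade-off is that callers must handle a None parent, since the weak link does not keep the parent alive.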

7.2 weakref.finalize for Cleanup

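import weakref

def close_connection(name):
    # Receives plain data only; a callback that referenced the Connection
    # (for example the bound method c.close) would keep it alive forever
    print("Connection closed:", name)

class Connection:
    def __init__(self, name):
        self.name = name
        self._finalizer = weakref.finalize(self, close_connection, name)
    def close(self):
        self._finalizer()     # explicit cleanup; runs at most once

c = Connection("db")

# When c becomes unreachable, the finalizer calls close_connection("db")
del c
# On CPython this happens right away; inside a cycle it waits for the GC

Using finalize avoids __del__-related issues in cycles and lets you trigger cleanup explicitly by calling the finalizer. The callback and its arguments must not reference the object itself, because the finalizer holds them strongly and the object would then never become unreachable.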
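The object returned by weakref.finalize can also be inspected and called directly. A brief sketch, with Handle, release, and log as hypothetical names: the callback receives plain data rather than the object, and calling the finalizer early runs cleanup exactly once.

```python
import weakref


class Handle:
    """Hypothetical owner of some external resource."""


log = []  # records cleanup calls so we can see what ran


def release(label):
    # Takes plain data, not the Handle, so it holds no reference to it.
    log.append(label)


h = Handle()
fin = weakref.finalize(h, release, "handle-1")
print(fin.alive)      # True: not yet run

fin()                 # run cleanup now, deterministically
print(fin.alive)      # False: a finalizer runs at most once

del h                 # nothing further happens; cleanup already ran
print(log)            # ['handle-1']
```

If cleanup should be cancelled instead, fin.detach() unregisters the callback without running it.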


8) Diagnosing Leaks and Memory Growth

Spotting memory growth early is crucial for servers and batch jobs. Here are practical tools and patterns:

8.1 Track Object Counts

import gc

def snapshot():
    counts = {}
    for obj in gc.get_objects():
        t = type(obj)
        counts[t] = counts.get(t, 0) + 1
    return counts

before = snapshot()
# ... run workload ...
gc.collect()
after = snapshot()

for t in sorted(after, key=lambda k: after[k] - before.get(k, 0), reverse=True)[:10]:
    delta = after[t] - before.get(t, 0)
    if delta != 0:
        print(f"{t.__name__:<30s}  Δ={delta:+d}")

8.2 Watch Generational Counts

import gc, time

for _ in range(5):
    print("Counts:", gc.get_count())  # (gen0, gen1, gen2)
    time.sleep(1)
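Alongside the live counters, gc.get_stats() reports cumulative totals per generation since the interpreter started. A short sketch of reading them:

```python
import gc

gc.collect()  # make sure at least one full collection has been recorded
for generation, stats in enumerate(gc.get_stats()):
    print(
        f"gen {generation}: collections={stats['collections']} "
        f"collected={stats['collected']} uncollectable={stats['uncollectable']}"
    )
```

A steadily rising collected value for the oldest generation suggests the program keeps producing long-lived cycles. A non-zero uncollectable value deserves a closer look.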

8.3 Keep an Eye on gc.garbage

import gc

gc.set_debug(gc.DEBUG_SAVEALL)
gc.collect()
print("Garbage objects:", len(gc.garbage))

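With DEBUG_SAVEALL set, gc.garbage holds every unreachable object the collector found, so a non-empty list shows that cycles exist, not that anything is uncollectable. Without the flag, entries in gc.garbage indicate truly uncollectable objects, which is rare on Python 3.4 and later.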


9) Tuning GC Thresholds and When to Disable GC

9.1 Threshold Tuning

If your application creates many short-lived objects, you can raise the gen-0 threshold to reduce collection frequency and overhead:

import gc

old = gc.get_threshold()
gc.set_threshold(1200, 10, 10)   # example: raise gen-0 threshold
print("Old:", old, "New:", gc.get_threshold())

9.2 Disabling GC in Hot Loops

For short, compute-heavy sections, temporarily disable the GC to reduce pauses:

import gc

gc.disable()
try:
    # tight loop creating lots of small objects
    data = [tuple(range(20)) for _ in range(1_000_000)]
finally:
    gc.enable()

Always re-enable GC. While it is disabled, reference cycles are never reclaimed, so leaving it off can let memory grow unnoticed.


10) Best Practices and Actionable Checklist

  • Avoid unnecessary global caches (they keep objects alive). Use bounded LRU caches or weak references.
  • Prefer composition over cyclical graphs. If you must link back, use weakref to avoid cycles.
  • Don’t rely on __del__ for critical resource cleanup. Use context managers or weakref.finalize.
  • Use context managers (with) to deterministically release files, sockets, and locks.
  • Consider calling gc.collect() at stage boundaries in batch jobs and at safe points in long-running services, and measure whether it helps.
  • Monitor gc.get_count() and process RSS (via psutil) in production for early signals of leaks.
  • Only tune thresholds after measuring. Make one change at a time and benchmark latency/throughput.
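The first item in the checklist can be sketched with two standard-library tools. The function load_profile and the class Image are hypothetical and exist only for illustration. The weak-reference part assumes CPython, where an entry disappears as soon as its value loses its last strong reference.

```python
import weakref
from functools import lru_cache


@lru_cache(maxsize=128)          # bounded: least recently used entries are evicted
def load_profile(user_id):
    """Hypothetical expensive lookup, used only for illustration."""
    return {"id": user_id, "name": f"user-{user_id}"}


for uid in range(1_000):
    load_profile(uid)
info = load_profile.cache_info()
print("LRU entries:", info.currsize, "of", info.maxsize)


class Image:
    """Hypothetical cached value; instances support weak references."""


images = weakref.WeakValueDictionary()   # entries vanish with their values
logo = Image()
images["logo"] = logo
del logo                                  # last strong reference gone
print("Still cached:", "logo" in images)
```

The bounded cache caps memory at a fixed number of entries. The weak-value cache never keeps an object alive on its own.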

11) FAQ: Quick Answers

Is GC behavior the same across all Python implementations?

No. This guide focuses on CPython. Other interpreters (like PyPy) use different GC strategies and heuristics.

Does del free memory immediately?

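del unbinds a name, which drops one reference. If that was the last reference, the count hits zero and CPython reclaims the object immediately. Objects kept alive only by a reference cycle never reach a count of zero and wait for the next run of the cyclic collector.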
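A quick way to see this, assuming CPython, is to watch an object through a weak reference while its names are deleted one at a time. Payload is a hypothetical class used only for the demonstration.

```python
import weakref


class Payload:
    """Hypothetical object; plain lists and dicts cannot be weakly referenced."""


first = Payload()
second = first                 # two names, one object
probe = weakref.ref(first)

del first                      # removes a name; one reference remains
print(probe() is not None)     # True: `second` still refers to the object

del second                     # the count reaches zero
print(probe() is None)         # True: reclaimed immediately on CPython
```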

Should I call gc.collect() manually?

Usually not. It’s helpful for diagnostics, at the end of large batch stages, or before memory-sensitive tasks—measure the impact.

Is __del__ bad?

Not inherently, but it complicates cycles and finalization order. Prefer context managers and weakref.finalize for safer cleanup.


12) Complete, Runnable Examples

12.1 End-to-End: Create, Detect, and Break a Cycle

import gc

class Node:
    def __init__(self, name):
        self.name = name
        self.next = None

def make_cycle():
    a = Node("a")
    b = Node("b")
    a.next = b
    b.next = a
    return a, b

gc.set_debug(gc.DEBUG_SAVEALL)
a, b = make_cycle()

# Drop strong references; only the cycle remains
a_id, b_id = id(a), id(b)
del a, b

# Force collection and inspect results
unreachable = gc.collect()
print("Unreachable:", unreachable)
print("Garbage objects:", len(gc.garbage))

# Manually break the cycle if needed (example: when you still hold refs)
# for obj in list(gc.garbage):
#     if isinstance(obj, Node):
#         obj.next = None

# Clear the garbage list and reset the debug flags once done inspecting
gc.garbage.clear()
gc.set_debug(0)

12.2 Safer Resource Cleanup with weakref.finalize

import weakref

def release(name):
    # Receives plain data only; referencing the TempFile here would keep it alive
    print("Closed", name)

class TempFile:
    def __init__(self, name):
        self.name = name
        print("Opened", self.name)
        # Runs once: on close(), on reclamation, or at interpreter exit
        self._finalizer = weakref.finalize(self, release, name)

    def close(self):
        self._finalizer()

    @property
    def open(self):
        return self._finalizer.alive

def use_tempfile():
    return TempFile("session.tmp")

t = use_tempfile()
# Drop the last strong reference; on CPython the finalizer runs immediately.
# If the object were part of a cycle, it would run at the next cyclic collection.
del t

12.3 Context Managers Beat Finalizers for Deterministic Release

from contextlib import contextmanager

@contextmanager
def resource(name):
    print("Acquired", name)
    try:
        yield
    finally:
        print("Released", name)

with resource("db-connection"):
    print("Do work with connection")

12.4 Measured Tuning of Thresholds

import gc
import time

def workload(n=500_000):
    # Allocate many small container objects
    data = []
    for i in range(n):
        data.append([i, i+1])
    return data

def timed_run(thresholds):
    gc.set_threshold(*thresholds)
    start = time.perf_counter()
    data = workload()
    del data
    gc.collect()
    end = time.perf_counter()
    return end - start

baseline = gc.get_threshold()
print("Baseline thresholds:", baseline)

for t in [(700, 10, 10), (1200, 10, 10), (2000, 10, 10)]:
    dur = timed_run(t)
    print("Thresholds", t, "Duration:", round(dur, 3), "s")

# restore
gc.set_threshold(*baseline)

Summary

Python’s memory management blends immediate reference counting with a generational cyclic collector. Most applications never need manual intervention, but understanding the model pays off when debugging leaks, handling long-running services, or tuning performance. Reach first for context managers and weakref.finalize, monitor with gc diagnostics, and only tune thresholds when measurements justify it.

 
