Why doesn't memory get released to the system after large queries (or a series of queries) in Django?

26👍

I decided to move my comments into an answer to make things clearer.

Since Python 2.5, CPython's memory allocator has tracked the memory used by its small object allocator and attempts to return completely free arenas to the underlying OS. This works most of the time, but because objects can’t be moved around in memory, fragmentation can be a serious problem: an arena can only be returned once every object in it has been freed.

Try the following experiment (I used Python 3.2, but 2.5+ should behave similarly if you use xrange in place of range):

# Create the big lists in advance to avoid skewing the memory counts
seq1 = [None] * 10**6 # Big list of references to None
seq2 = seq1[::10]

# Create and reference a lot of smaller lists
seq1[:] = [[] for x in range(10**6)] # References all the new lists
seq2[:] = seq1[::10] # Grab a second reference to 10% of the new lists

# Memory fragmentation in action
seq1[:] = [None] * 10**6 # 90% of the lists are no longer referenced here
seq2[:] = seq1[::10] # But their memory can only be released once the last 10% are dropped too

Note, even if you drop the references to seq1 and seq2, the above sequence will likely leave your Python process holding a lot of extra memory.
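To see the effect rather than take it on faith, here is a rough sketch that replays the experiment and samples the process's resident memory after each phase. The helper names `rss_kib` and `run_experiment` are my own, and the measurement assumes Linux (it reads `VmRSS` from `/proc/self/status`); on other systems the readings simply come back as `None`.

```python
import gc


def rss_kib():
    """Return this process's resident set size in KiB, or None if unavailable.

    Assumption: Linux, where /proc/self/status contains a "VmRSS:" line.
    """
    try:
        with open("/proc/self/status") as status:
            for line in status:
                if line.startswith("VmRSS:"):
                    return int(line.split()[1])
    except OSError:
        pass
    return None


def run_experiment(n=10**6):
    """Replay the experiment, recording (label, rss_kib()) after each phase."""
    samples = []

    # Create the big lists in advance to avoid skewing the memory counts
    seq1 = [None] * n
    seq2 = seq1[::10]
    samples.append(("baseline", rss_kib()))

    # Create and reference a lot of smaller lists
    seq1[:] = [[] for _ in range(n)]
    seq2[:] = seq1[::10]
    samples.append(("allocated", rss_kib()))

    # Drop 90% of the small lists; the survivors are scattered across arenas
    seq1[:] = [None] * n
    gc.collect()
    samples.append(("90% dropped", rss_kib()))

    # Drop the remaining 10%
    seq2[:] = seq1[::10]
    gc.collect()
    samples.append(("all dropped", rss_kib()))
    return samples
```

Printing the result of `run_experiment()` lets you compare the "90% dropped" reading with the "allocated" one. The exact numbers depend on your Python version, platform, and malloc implementation, so treat them as indicative only.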

When people talk about PyPy using less memory than CPython, this is a major part of what they’re talking about. Because PyPy doesn’t use direct pointer references under the hood, it is able to use a compacting GC, thus avoiding much of the fragmentation problem and more reliably returning memory to the OS.

6👍

Lots of applications, language runtimes, and perhaps even some system memory allocators will keep deallocated memory in place for as long as possible with a view to re-using it, purely for performance purposes. In a complex system like Django it could be any number of extensions, possibly implemented in C, which exhibit this behaviour, or it might be Python with some sort of memory pool, or lazy garbage collection.

It could even be the underlying malloc implementation doing this, or your operating system keeping a certain amount of memory space allocated to your process even though the process isn’t explicitly using it. Don’t quote me on this though – it’s been a while since I looked into such things.

On the whole though, if repeating the allocation process after the initial alloc and dealloc does not double the amount of memory used, what you’re seeing is not a memory leak but memory pooling. It’s only likely to be a problem if you have a lot of processes that contend for a limited amount of memory on that machine.
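That check is easy to automate. The following is only a sketch under a few assumptions: the names `peak_rss`, `growth_per_round` and `looks_like_pooling` are made up for illustration, the 25% tolerance is arbitrary, and the default measurement uses the Unix-only `resource` module (peak RSS is reported in KiB on Linux and bytes on macOS, which doesn't matter here because only ratios are compared).

```python
import resource


def peak_rss():
    """Peak resident set size of this process (units vary by platform)."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def growth_per_round(workload, rounds=4, measure=peak_rss):
    """Run workload() repeatedly and return the measurement after each round."""
    readings = []
    for _ in range(rounds):
        workload()  # the result is discarded, so its memory becomes reusable
        readings.append(measure())
    return readings


def looks_like_pooling(readings, tolerance=0.25):
    """True if every later round stays within `tolerance` of the first round.

    Memory that is pooled gets reused, so repeating the workload should not
    push the reading much higher. A leak keeps climbing round after round.
    """
    first = readings[0]
    return all(r <= first * (1 + tolerance) for r in readings[1:])
```

You would pass in a function that performs the large query (or series of queries) and throws the result away, then feed the readings to `looks_like_pooling`. If the readings keep climbing with every round, you are probably looking at a real leak rather than pooling.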
