[Django]-How to iterate a large table in Django without running out of memory?


The first thing to try is using the iterator() method on the queryset before iterating over it:

for ii in MyModel.objects.all().iterator():
    ...  # process one row at a time
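
iterator() helps because rows are streamed from the database in chunks instead of being cached on the queryset. Here is a rough sketch of that same pattern using only the standard library's sqlite3 module. It is not Django's actual implementation; `iter_rows`, the `item` table and the chunk sizes are made up for illustration:

```python
import sqlite3


def iter_rows(conn, query, chunk_size=2000):
    """Yield rows one at a time, fetching at most chunk_size rows per call."""
    cursor = conn.cursor()
    cursor.execute(query)
    try:
        while True:
            rows = cursor.fetchmany(chunk_size)
            if not rows:
                break
            yield from rows
    finally:
        cursor.close()


# Hypothetical in-memory table standing in for a large one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, value INTEGER)")
conn.executemany(
    "INSERT INTO item (value) VALUES (?)", [(i,) for i in range(10_000)]
)

total = 0
for _id, value in iter_rows(conn, "SELECT id, value FROM item", chunk_size=500):
    total += value  # Python only holds one chunk of rows at a time
```

The loop never builds a list of all 10,000 rows, which is the property you want from iterator() on a large table.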


The correct answer is to use Django's iterator() method on the queryset before iterating over it. However, you must also wrap your query in a transaction.

    from django.db import transaction
    with transaction.atomic():
        for ii in MyModel.objects.all().iterator():
            ...  # process one row at a time

This is because Django runs in "autocommit" mode by default, which means the database cursor behind iterator() is declared with the WITH HOLD argument. That causes common databases such as PostgreSQL to copy the whole result set into a large temporary file.


If you are using Python 3.x, you can try some async tasks.

It may be useful to create a scheduled fetch.

Something like this:

async def _fetch_all(self):
    if self._result_cache is None:
        self._result_cache = await list(self.iterator())  # <<<< this guy!
    if self._prefetch_related_lookups and not self._prefetch_done:
        await self._prefetch_related_objects()

To run your code:

import asyncio
my_model = MyModel()

But if you are using Python 2.7, you will need to create an async task with Celery, or try a tool built for that, such as Django Async.

Hope it helps.
