[Django]-Django filter. What is more efficient โ€“ .filter(date__range) or .all() and filter via Python

5๐Ÿ‘

โœ…

In theory, since .all() creates a lazy QuerySet, it might perform better than range filtering in database with 1M+ rows. Also would be interesting to know about memory consumption in both cases.

A QuerySet is lazy in the sense that it will only retrieve the objects if necessary, but once it has to do that, it will fetch all items. Not only .all() is lazy by the way, all QuerySets are lazy, so if you define a Result.objects.filter(โ€ฆ) queryset, that QuerySet is lazy as well.

But regardless how it is implemented, filtering at the database side is more efficient, since databases are designed to do this, and it results in less bandwidth from the database to the Python/Django layer.

In case there are memory issues, it likely means that your QuerySet, even if filtered, is too large to store in memory. You can work with the .iterator(โ€ฆ) method [Django-doc] to load batches of items that then can be processed:

results = Result.objects.filter(date__range=(date_from, date_to)).iterator()

for result in results:
    # โ€ฆ
    pass

If will each time load a chunk of records into memory that then can be processed. If you do not store the items (for example in a list), then Python can reuse the memory for the next chunk.

Leave a comment