[Fixed]-Is it a good practice to add extra field to database model to speed up database queries?

1đź‘Ť

id_list = [dev.id for dev in device_list]
devtests = DeviceTest.objects.filter(
    device_id__in=id_list).order_by('-created_at').distinct('device')

That should give you, in one database call, in devtests only the latest entries for each device_id by create_at value.

Then do your loop and take the values from the list, instead of calling the database on each iteration.

However, it could also be a good idea to denormalize the database, like you suggested. Using “redundant fields” can definitely be good practice. You can automate the denormalization in the save() method or by listening to a post_save() signal from the related model.

Edit

First a correction: should be .distinct('device') (not created_at)

A list comprehension to fetch only the id values from the device_list. Equivalent to Device.objects.filter(...).values_list('id', flat=True)

id_list = [dev.id for dev in device_list]

Using the list of ids, we fetch all related DeviceTest objects

devtests = DeviceTest.objects.filter(device_id__in=id_list)

and order them by created_at but with the newest first -created_at. That also means, for every Device, the newest related DeviceTest will be first.

.order_by('-created_at')

Finally, for every device we only select the first related value we find (that would be the newest, because we sorted the values that way).

.distinct('device')

Additionally, you could also combine the device id and DeviceTest lookups

devtests = DeviceTest.objects.filter(device_in=Device.objects.filter(...))

then Django would create the SQL for it to do the JOIN in the database, so you don’t need to load and loop the id list in Python.

👤C14L

Leave a comment