[Django]-Django: remove duplicates (group by) from queryset by related model field

7👍

By using .distinct() with a field name

Django has also a .distinct(..) function that takes as input column the column names that should be unique. Alas most database systems do not support this (only PostgreSQL to the best of my knowledge). But in PostgreSQL we can thus perform:

# Limited number of database systems support this
addresses = (Address.objects
                    .filter(user__group__id=1)
                    .order_by('-id')
                    .distinct('user_id'))

By using two queries

Another way to handle this is by first having a query that works over the users, and for each user obtains the largest address_id:

from django.db.models import Max

address_ids = (User.objects
                   .annotate(address_id=Max('address_set__id'))
                   .filter(address_id__isnull=False)
                   .values_list('address_id'))

So now for every user, we have calculated the largest corresponding address_id, and we eliminate Users that have no address. We then obtain the list of ids.

In a second step, we then fetch the addresses:

addresses = Address.objects.filter(pk__in=address_ids)

Leave a comment