Django ORM filter queryset for duplicates

A single queryset that spans models without a common parent class is a code smell. I'd stick with a simple list instead, since queryset logic doesn't translate to such a construct.

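The approach below is quite fast, since it makes use of the indexes. Mind the empty .order_by(): it clears any default ordering, which would otherwise be added to the query and interfere with .distinct(). If your models don't specify ordering in their Meta class, you can skip it.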
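To see why the empty .order_by() matters, here is a small sketch that uses plain SQL through Python's sqlite3 module rather than the ORM. The track table and its created column are made up for illustration, and the two queries only approximate what Django might generate with and without a default ordering on created.

```python
import sqlite3

# Hypothetical table standing in for the Track model.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE track (name TEXT, created INTEGER)")
conn.executemany(
    "INSERT INTO track VALUES (?, ?)",
    [("Halo", 1), ("Halo", 2), ("Blue", 3)],
)

# With a default ordering, the ordering column ends up in SELECT,
# so DISTINCT compares (name, created) pairs and keeps both "Halo" rows.
with_ordering = conn.execute(
    "SELECT DISTINCT name, created FROM track ORDER BY created"
).fetchall()

# With the ordering cleared, DISTINCT applies to name alone.
without_ordering = conn.execute("SELECT DISTINCT name FROM track").fetchall()

print(len(with_ordering))     # 3
print(len(without_ordering))  # 2
```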

Get all Track names:

track_names = Track.objects.order_by().values_list('name', flat=True).distinct()

Get all Artist names:

artist_names = Artist.objects.order_by().values_list('name', flat=True).distinct()

Then get an intersection:

duplicated_names = set(track_names) & set(artist_names)

That gives you the names that appear in both the Artist and Track models.

To fetch the matching Track objects, filter on those names:

Track.objects.filter(name__in=duplicated_names)
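If you want the matching rows from both models in one place, the steps above can be wrapped in a small helper that returns a plain list, in line with the advice to avoid a single cross-model queryset. This is only a sketch: rows_with_shared_names is a hypothetical name, and it assumes both model classes are passed in and each has a name field.

```python
from itertools import chain


def rows_with_shared_names(track_model, artist_model):
    """Return Track and Artist rows whose name occurs in both tables, as one list."""
    # Empty order_by() clears any Meta ordering so distinct() applies to name only.
    track_names = track_model.objects.order_by().values_list("name", flat=True).distinct()
    artist_names = artist_model.objects.order_by().values_list("name", flat=True).distinct()

    # Names present in both models.
    duplicated_names = set(track_names) & set(artist_names)

    # A plain list rather than one queryset, since the models share no parent.
    return list(chain(
        track_model.objects.filter(name__in=duplicated_names),
        artist_model.objects.filter(name__in=duplicated_names),
    ))
```

Calling rows_with_shared_names(Track, Artist) runs four queries in total: two to collect the names and two to fetch the rows.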

To find duplicates within a single model, use the method you've already quoted.
