[Django]-How can I find the intersection of two Django querysets?

4đź‘Ť

âś…

Refactor

class FeatureManager(models.Manager):

    @staticmethod
    def _test_cases_eq_0( qs ):
       return qs.annotate( num_test_cases=models.Count('testcase_set') ).filter(num_test_cases=0)

    @staticmethod
    def _standardized_gt_0( qs ):
        return qs.annotate( standardised=Count('documentation_set__standard') ).filter(standardised__gt=0)

    def without_test_cases(self):
        return self._test_cases_eq_0( self.get_query_set() )

    def standardised(self):
        return self._standardized_gt_0( self.get_query_set() )

    def intersection( self ):
        return self._test_cases_eq_0( self._standardized_gt_0( self.get_query_set() ) )
👤S.Lott

75đź‘Ť

In most cases you can just write (exploiting the “Set” part of QuerySet) :

intersection = Model.objects.filter(...) & Model.objects.filter(...)

This isn’t very well documented, but should behave almost exactly like using AND conditions on conditions from both queries. Relevant code: https://github.com/django/django/blob/1.8c1/django/db/models/query.py#L203

👤lqc

28đź‘Ť

You can just do something like this:

intersection = queryset1 & queryset2

To do a union just replace & by |

👤Caumons

27đź‘Ť

As per Django 1.11, now it’s available the function intersection()

>>> qs1.intersection(qs2, qs3)
👤renno

5đź‘Ť

I believe qs1.filter(pk__in=qs2) should work (usually). It seems to work for a similar case for me, it makes sense that it would work, and the generated query looks sane. (If one of your querysets uses values() to not select the primary key column or something weird, I can believe it’d break, though…)

👤Alex Dehnert

3đź‘Ť

If you want to do it in python, not in the database:

intersection = set(queryset1) & set(queryset2)

The problems is that if you use different annotations in the queriesdue to the added annotations the objects might look different…

0đź‘Ť

One way may be to use the python sets module and just do an intersection:

make a couple of query sets that overlap at id=5:

In [42]: first = Location.objects.filter(id__lt=6)
In [43]: last = Location.objects.filter(id__gt=4)

“import sets” first (gets a deprecation warning… ummm… oh well). Now build and intersect them – we get one element in the set:

In [44]: sets.Set(first).intersection(sets.Set(last))
Out[44]: Set([<Location: Location object>])

Now get the id of the intersection elements to check it really is 5:

In [48]: [s.id for s in sets.Set(first).intersection(sets.Set(last))]
Out[48]: [5]

This obviously hits the database twice and returns all the elements of the query set – better way would be to chain the filters on your managers and that should be able to do it in one DB hit and at the SQL level. I cant see a QuerySet.and/or(QuerySet) method.

👤Spacedman

0đź‘Ť

If you really are just using annotation to filter based on whether the count is zero or not, then this should work instead:

class FeatureManager(models.Manager):

    def without_test_cases(self):
        return self.get_query_set().filter(testcase__pk__isnull=True)

    def standardised(self):
        return self.get_query_set().filter(documentation_set__standard__isnull=False)

Since you no longer are worrying about annotation, the two queries should intersect very smoothly.

👤Jordan Reiter

Leave a comment