[Fixed]-Django query difference in group

1👍

I want all A’s where any of the related model C’s X attributes have a difference of 20% or more

If any of the C.x for A has difference of %20 then surely MAX and MIN C.x for that A has at least %20 or more difference. You can construct your query on that fact. Using aggregate expressions you can do something like this:

A.objects.annotate(
      max_diff=(Max('b__c__x') - Min('b__c__x')) * 100 / Min('b__c__x')
    ).filter(max_diff__gte=20)

Of course b, c represents the related names for the foreign or many to many relations. This will first annotate A objects with max_diff in percent, then we filter on that value. Depending on your field types you might need to specify an output_field too.

I don’t know the specifics for your problem but I’d also recommend you to check available aggregation functions, maybe standard deviation or variance would help.

Reference:

0👍

This may be doable in the orm, but it sounds like the sort of thing that would be a mess (need to make sure the count of those within 20% of the average of all of them is the same as the count of call of them). It seems like a great opportunity for a prefetch. Maybe something like

for a in A.objects.prefetch_related("b_set__c"):
    cs = [b.c for b in a.b_set.all()]

and then you can apply your filtering in python which will make it a little more inspectable and should only run a finite number of queries. The only issue is that your queryset hopefully isn’t too large to iterate over entirely for this result.

If this method doesn’t work out well, you may be better served by adding a column for indicate whether this fits your low-variability criterion and updating that when saving instances of B and C.

Leave a comment