219
Conditional aggregation in Django 2.0+ cuts down a lot of the faff this used to involve. It also uses Postgres’ filter logic, which is somewhat faster than a sum-case (I’ve seen numbers like 20-30% bandied around).
Anyway, in your case, we’re looking at something as simple as:
from django.db.models import Q, Count

events = Event.objects.annotate(
    paid_participants=Count('participants', filter=Q(participants__is_paid=True))
)
There’s a separate section in the docs about filtering on annotations. It’s the same stuff as conditional aggregation but closer to my example above. Either way, this is a lot healthier than the gnarly subqueries I was doing before.
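To make the "filter logic" concrete, here is a rough sketch of the kind of SQL a filtered Count boils down to, run against an in-memory SQLite database instead of Django. The table and column names (event, participant, event_id) are made up for the illustration, and it assumes SQLite 3.30 or newer for the FILTER clause.

```python
import sqlite3

# Hypothetical tables standing in for the Event / Participant models.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE event (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE participant (
        id INTEGER PRIMARY KEY,
        event_id INTEGER REFERENCES event(id),
        is_paid INTEGER
    );
    INSERT INTO event (id, title) VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO participant (event_id, is_paid) VALUES
        (1, 1), (1, 1), (1, 0),
        (2, 0);
""")

# COUNT(...) FILTER (WHERE ...) only counts the joined rows that match,
# while the LEFT JOIN keeps events that have no participants at all.
paid_counts = conn.execute("""
    SELECT e.id, COUNT(p.id) FILTER (WHERE p.is_paid = 1) AS paid_participants
    FROM event AS e
    LEFT JOIN participant AS p ON p.event_id = e.id
    GROUP BY e.id
    ORDER BY e.id
""").fetchall()

print(paid_counts)
```

Events 2 and 3 still show up with a count of 0, which is usually what you want from an annotation.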
97
Just discovered that Django 1.8 has a new conditional expressions feature, so now we can do this:
from django.db import models

events = Event.objects.all().annotate(paid_participants=models.Sum(
    models.Case(
        models.When(participant__is_paid=True, then=1),
        default=0, output_field=models.IntegerField()
    )))
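For comparison, the sum-case pattern looks roughly like this in plain SQL. This is only a sketch on an in-memory SQLite database; the event and participant tables are hypothetical stand-ins for the models.

```python
import sqlite3

# Hypothetical tables standing in for the Event / Participant models.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE event (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE participant (
        id INTEGER PRIMARY KEY,
        event_id INTEGER REFERENCES event(id),
        is_paid INTEGER
    );
    INSERT INTO event (id, title) VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO participant (event_id, is_paid) VALUES
        (1, 1), (1, 1), (1, 0),
        (2, 0);
""")

# Each joined row contributes 1 if it is a paid participant, else 0;
# summing those per event gives the paid count.
paid_counts = conn.execute("""
    SELECT e.id,
           SUM(CASE WHEN p.is_paid = 1 THEN 1 ELSE 0 END) AS paid_participants
    FROM event AS e
    LEFT JOIN participant AS p ON p.event_id = e.id
    GROUP BY e.id
    ORDER BY e.id
""").fetchall()

print(paid_counts)
```

The default=0 in the Case matters: it is what makes unpaid rows (and the empty row from the LEFT JOIN) count as 0 rather than NULL.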
51
UPDATE
The sub-query approach I mentioned is supported as of Django 1.11 via subquery expressions:
from django.db import models
from django.db.models import Count, OuterRef, Subquery

Event.objects.annotate(
    num_paid_participants=Subquery(
        Participant.objects.filter(
            is_paid=True,
            event=OuterRef('pk')
        ).values('event')
        .annotate(cnt=Count('pk'))
        .values('cnt'),
        output_field=models.IntegerField()
    )
)
I prefer this over aggregation (sum+case), because it should be faster and easier to optimize (with proper indexing).
For older versions, the same can be achieved using .extra:
Event.objects.extra(select={'num_paid_participants': "\
    SELECT COUNT(*) \
    FROM `myapp_participant` \
    WHERE `myapp_participant`.`is_paid` = 1 AND \
    `myapp_participant`.`event_id` = `myapp_event`.`id`"
})
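One detail worth checking when picking between the two forms: the Subquery version groups inside the subquery, so as far as I can tell an event with no paid participants gets no row back and ends up as NULL (None in Python), while the plain COUNT(*) of the .extra version gives 0. A small sketch of the two SQL shapes on an in-memory SQLite database, reusing the table names from the .extra snippet (the data is made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE myapp_event (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE myapp_participant (
        id INTEGER PRIMARY KEY, event_id INTEGER, is_paid INTEGER
    );
    INSERT INTO myapp_event (id, title) VALUES (1, 'a'), (2, 'b');
    INSERT INTO myapp_participant (event_id, is_paid) VALUES (1, 1), (1, 1), (2, 0);
""")

# Shape of the Subquery version: grouped count, one row per matching event.
grouped = conn.execute("""
    SELECT e.id,
           (SELECT COUNT(p.id)
            FROM myapp_participant AS p
            WHERE p.is_paid = 1 AND p.event_id = e.id
            GROUP BY p.event_id) AS num_paid_participants
    FROM myapp_event AS e
    ORDER BY e.id
""").fetchall()

# Shape of the .extra version: ungrouped COUNT(*), always exactly one row.
plain = conn.execute("""
    SELECT e.id,
           (SELECT COUNT(*)
            FROM myapp_participant AS p
            WHERE p.is_paid = 1 AND p.event_id = e.id) AS num_paid_participants
    FROM myapp_event AS e
    ORDER BY e.id
""").fetchall()

print(grouped)  # event 2 comes back as None
print(plain)    # event 2 comes back as 0
```

If you need zeros from the Subquery version, wrapping it in Coalesce from django.db.models.functions with a fallback of 0 should do it.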
6
I would suggest using the .values method of your Participant queryset instead.
In short, what you want to do is given by:
Participant.objects\
    .filter(is_paid=True)\
    .values('event')\
    .distinct()\
    .annotate(models.Count('id'))
A complete example is as follows:

- Create 2 Events:

  event1 = Event.objects.create(title='event1')
  event2 = Event.objects.create(title='event2')

- Add Participants to them:

  part1l = [Participant.objects.create(event=event1, is_paid=((_%2) == 0))\
            for _ in range(10)]
  part2l = [Participant.objects.create(event=event2, is_paid=((_%2) == 0))\
            for _ in range(50)]

- Group all Participants by their event field:

  Participant.objects.values('event')
  > <QuerySet [{'event': 1}, {'event': 1}, {'event': 1}, {'event': 1}, {'event': 1}, {'event': 1}, {'event': 1}, {'event': 1}, {'event': 1}, {'event': 1}, {'event': 2}, {'event': 2}, {'event': 2}, {'event': 2}, {'event': 2}, {'event': 2}, {'event': 2}, {'event': 2}, {'event': 2}, {'event': 2}, '...(remaining elements truncated)...']>

  Here distinct is needed:

  Participant.objects.values('event').distinct()
  > <QuerySet [{'event': 1}, {'event': 2}]>

  What .values and .distinct are doing here is creating two buckets of Participants grouped by their event field. Note that those buckets contain Participants.

- You can then annotate those buckets, as they contain the set of original Participants. Here we want to count the number of Participants, which is simply done by counting the ids of the elements in those buckets (since those are Participants):

  Participant.objects\
      .values('event')\
      .distinct()\
      .annotate(models.Count('id'))
  > <QuerySet [{'event': 1, 'id__count': 10}, {'event': 2, 'id__count': 50}]>

- Finally, you want only the Participants with is_paid being True. You can just add a filter in front of the previous expression, which yields the expression shown above:

  Participant.objects\
      .filter(is_paid=True)\
      .values('event')\
      .distinct()\
      .annotate(models.Count('id'))
  > <QuerySet [{'event': 1, 'id__count': 5}, {'event': 2, 'id__count': 25}]>
The only drawback is that you have to retrieve the Event afterwards, as you only have the id from the method above.
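The bucket idea above is essentially a GROUP BY on the participant table. Here is a rough sketch of both steps, the grouped count and the follow-up lookup of the events, using an in-memory SQLite database with the same 10 and 50 participants. The event and participant table names are hypothetical stand-ins for the models.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE event (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE participant (
        id INTEGER PRIMARY KEY, event_id INTEGER, is_paid INTEGER
    );
    INSERT INTO event (id, title) VALUES (1, 'event1'), (2, 'event2');
""")

# Same data as the walkthrough: every other participant has paid.
rows = [(1, int(i % 2 == 0)) for i in range(10)]
rows += [(2, int(i % 2 == 0)) for i in range(50)]
conn.executemany("INSERT INTO participant (event_id, is_paid) VALUES (?, ?)", rows)

# Step 1: filter, then bucket by event and count ids in each bucket.
counts = conn.execute("""
    SELECT event_id, COUNT(id)
    FROM participant
    WHERE is_paid = 1
    GROUP BY event_id
    ORDER BY event_id
""").fetchall()

# Step 2: the drawback mentioned above, a second lookup to get the events.
titles = dict(conn.execute("SELECT id, title FROM event"))
counts_by_title = [(titles[event_id], n) for event_id, n in counts]

print(counts)
print(counts_by_title)
```

In Django the second step could be something like Event.objects.in_bulk(...) on the collected ids. Also keep in mind that an event with no paid participants has no bucket at all, so it will simply be missing from the result.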
3
For Django 3.x, just write the filter after the annotate:
(
    User.objects.values('user_id')
    .annotate(sudo_field=models.Count('likes'))
    .filter(sudo_field__gt=100)
)
In the above, sudo_field is not a model field on the User model; it only exists as an annotation. Here we are filtering for the users who have more than 100 likes (or xyz).
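Filtering on an annotated aggregate like this corresponds to a HAVING clause in SQL, since the condition applies to the grouped count rather than to individual rows. A minimal sketch with an in-memory SQLite database (the likes table and its columns are made up for the illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE likes (id INTEGER PRIMARY KEY, user_id INTEGER)")

# User 1 has 150 likes, user 2 has 20.
conn.executemany(
    "INSERT INTO likes (user_id) VALUES (?)",
    [(1,)] * 150 + [(2,)] * 20,
)

# The condition on the count goes in HAVING, after the grouping.
popular = conn.execute("""
    SELECT user_id, COUNT(id) AS sudo_field
    FROM likes
    GROUP BY user_id
    HAVING COUNT(id) > 100
""").fetchall()

print(popular)
```

The order matters: a filter placed before the annotate restricts the rows that get counted, while a filter placed after it restricts the groups that come back.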
1
The result I am looking for:
- People (assignees) who have tasks added to a report: the total unique count of people.
- People who have tasks added to a report, but only counting tasks whose billability is more than 0.
In general, I would have to use two different queries:
Task.objects.filter(billable_efforts__gt=0)
Task.objects.all()
But I want both in one query. Hence:
Task.objects.values('report__title').annotate(
    withMoreThanZero=Count('assignee', distinct=True, filter=Q(billable_efforts__gt=0))
).annotate(
    totalUniqueAssignee=Count('assignee', distinct=True)
)
Result:
<QuerySet [{'report__title': 'TestReport', 'withMoreThanZero': 37, 'totalUniqueAssignee': 50}, {'report__title': 'Utilization_Report_April_2019', 'withMoreThanZero': 37, 'totalUniqueAssignee': 50}]>
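For anyone wondering what the two counts look like side by side in SQL, here is a rough sketch on an in-memory SQLite database. The task table, its columns, and the data are all hypothetical, and the filtered count is written with CASE (the portable spelling) rather than a FILTER clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE task (
        id INTEGER PRIMARY KEY,
        report_title TEXT,
        assignee TEXT,
        billable_efforts INTEGER
    );
    INSERT INTO task (report_title, assignee, billable_efforts) VALUES
        ('R1', 'alice', 5),
        ('R1', 'alice', 0),
        ('R1', 'bob',   0),
        ('R1', 'carol', 2),
        ('R2', 'dave',  0);
""")

# One pass over the table, two distinct counts per report:
# the CASE yields NULL for non-billable tasks, and COUNT ignores NULLs.
report_counts = conn.execute("""
    SELECT report_title,
           COUNT(DISTINCT CASE WHEN billable_efforts > 0 THEN assignee END)
               AS withMoreThanZero,
           COUNT(DISTINCT assignee) AS totalUniqueAssignee
    FROM task
    GROUP BY report_title
    ORDER BY report_title
""").fetchall()

print(report_counts)
```

Alice has one billable and one non-billable task in R1, and she is counted once in each column, which is the point of distinct=True.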