[Django]-Django aggregation over annotated query

1👍

I wasn’t able to do it with pure Django code, but this is the best I could do, depending as much as possible on Django code instead of raw sql.

from django.db import connection
from django.db.models import Count

def get_average_count(klass, field_name):
    foo = klass.objects.values(field_name).annotate(countval=Count('id'))
    query = "SELECT AVG(subquery.countval) FROM (%s) subquery" % str(foo.query)
    cursor = connection.cursor()
    cursor.execute(query)
    return float(cursor.fetchone()[0])

This will execute the exact SQL statement you said you wanted to generate. It is also fully independent of the SQL backend you’re using, and fully reusable (yay DRY) for all classes with reverse ForeignKey or ManyToMany relationships.

If you really don’t want to use raw SQL, another option is to calculate the average in Django:

from __future__ import division # no need to cast to float now

def get_average_count(klass, field_name):
    counts = klass.objects.values(field_name).annotate(countval=Count('id')).\
        values_list('countval', flat=True)
    return reduce(lambda x, y: x + y / len(counts), counts, 0)

You might want to check for any performance differences if you’re planning to have large datasets in your database.

👤knbk

Leave a comment