761π
If you mean to do aggregation you can use the aggregation features of the ORM:
from django.db.models import Count
result = (Members.objects
    .values('designation')
    .annotate(dcount=Count('designation'))
    .order_by()
)
This results in a query similar to
SELECT designation, COUNT(designation) AS dcount
FROM members GROUP BY designation
and the output would be of the form
[{'designation': 'Salesman', 'dcount': 2},
{'designation': 'Manager', 'dcount': 2}]
If you don't include the order_by(), you may get incorrect results if the default sorting is not what you expect.
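To see why an extra ordering column matters, here is a minimal sketch that runs the same kind of GROUP BY statement against an in-memory SQLite database through Python's standard sqlite3 module. The members table layout, the name column, and the sample rows are assumptions made up for illustration. Whether Django actually adds an ordering column to the grouping depends on your model and Django version, so treat this only as a picture of the effect.

```python
import sqlite3

# In-memory database standing in for the Django-managed one (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE members (id INTEGER PRIMARY KEY, name TEXT, designation TEXT)"
)
conn.executemany(
    "INSERT INTO members (name, designation) VALUES (?, ?)",
    [("Ann", "Salesman"), ("Bob", "Manager"), ("Cid", "Salesman"), ("Dee", "Manager")],
)

# Grouping on designation alone: one row per designation.
grouped = conn.execute(
    "SELECT designation, COUNT(designation) AS dcount "
    "FROM members GROUP BY designation ORDER BY designation"
).fetchall()

# If an ordering column (here: name) ends up in GROUP BY, every distinct
# (designation, name) pair becomes its own group.
split = conn.execute(
    "SELECT designation, COUNT(designation) AS dcount "
    "FROM members GROUP BY designation, name ORDER BY designation, name"
).fetchall()

print(grouped)
print(split)
```

The first query returns one row per designation, while the second returns one row per member, which is the kind of "incorrect" result an unexpected ordering can produce.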
If you want to include multiple fields in the results, just add them as arguments to values, for example:
.values('designation', 'first_name', 'last_name')
References:
- Django documentation: values(), annotate(), and Count
- Django documentation: Aggregation, and in particular the section entitled Interaction with default ordering or order_by()
79π
An easy solution, though not the proper way, is to use raw SQL:
results = Members.objects.raw('SELECT * FROM myapp_members GROUP BY designation')
Another solution is to use the group_by property:
from django.db.models.query import QuerySet

query = Members.objects.all().query
query.group_by = ['designation']
results = QuerySet(query=query, model=Members)
You can now iterate over the results variable to retrieve your results. Note that group_by is not documented and may change in a future version of Django.
And... why do you want to use group_by? If you don't use aggregation, you can use order_by to achieve a similar result.
59π
You can also use the regroup template tag to group by attributes. From the docs:
cities = [
{'name': 'Mumbai', 'population': '19,000,000', 'country': 'India'},
{'name': 'Calcutta', 'population': '15,000,000', 'country': 'India'},
{'name': 'New York', 'population': '20,000,000', 'country': 'USA'},
{'name': 'Chicago', 'population': '7,000,000', 'country': 'USA'},
{'name': 'Tokyo', 'population': '33,000,000', 'country': 'Japan'},
]
...
{% regroup cities by country as countries_list %}
<ul>
{% for country in countries_list %}
<li>{{ country.grouper }}
<ul>
{% for city in country.list %}
<li>{{ city.name }}: {{ city.population }}</li>
{% endfor %}
</ul>
</li>
{% endfor %}
</ul>
Looks like this:
- India
- Mumbai: 19,000,000
- Calcutta: 15,000,000
- USA
- New York: 20,000,000
- Chicago: 7,000,000
- Japan
- Tokyo: 33,000,000
It also works on QuerySets, I believe.
source: https://docs.djangoproject.com/en/2.1/ref/templates/builtins/#regroup
edit: note that the regroup tag does not work as you would expect if your list of dictionaries is not sorted by the grouping key. It works iteratively, so sort your list (or queryset) by the grouper's key before passing it to the regroup tag.
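The sorting caveat can be reproduced outside a template. The sketch below uses itertools.groupby as a rough stand-in for the tag's consecutive-item grouping; the regroup helper and the shortened cities list are hypothetical and written only for this illustration.

```python
from itertools import groupby
from operator import itemgetter

cities = [
    {"name": "Mumbai", "country": "India"},
    {"name": "New York", "country": "USA"},
    {"name": "Calcutta", "country": "India"},  # India appears again, out of order
]


def regroup(items, key):
    """Group consecutive items sharing `key` (hypothetical helper, not Django's code)."""
    return [
        (grouper, list(group))
        for grouper, group in groupby(items, key=itemgetter(key))
    ]


# Unsorted input: "India" shows up as two separate groups.
unsorted_groups = [grouper for grouper, _ in regroup(cities, "country")]

# Sorted input: each country appears exactly once.
sorted_cities = sorted(cities, key=itemgetter("country"))
sorted_groups = [grouper for grouper, _ in regroup(sorted_cities, "country")]

print(unsorted_groups)
print(sorted_groups)
```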
11π
Django does not support free-form GROUP BY queries. I learned this the hard way. The ORM is not designed to support what you want to do without custom SQL. You are limited to:
- Raw SQL (i.e. MyModel.objects.raw())
- cr.execute statements (and hand-made parsing of the result).
- .annotate() (the GROUP BY is performed on the child model for .annotate(), in examples like aggregating lines_count=Count('lines')).
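As a rough illustration of the second option, the hand-made parsing usually means turning cursor rows into dictionaries. The sketch below uses the standard sqlite3 module as a stand-in for a Django database cursor, since both follow the Python DB-API. The members table, the salary column, and the dictfetchall helper name are assumptions for this example.

```python
import sqlite3


def dictfetchall(cursor):
    """Return all remaining rows of a DB-API cursor as a list of dicts."""
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


# In-memory database standing in for the real one (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE members (id INTEGER PRIMARY KEY, designation TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO members (designation, salary) VALUES (?, ?)",
    [("Salesman", 100), ("Manager", 300), ("Salesman", 200)],
)

cursor = conn.cursor()
cursor.execute(
    "SELECT designation, SUM(salary) AS total "
    "FROM members GROUP BY designation ORDER BY designation"
)
totals = dictfetchall(cursor)
print(totals)
```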
On a queryset qs you can call qs.query.group_by = ['field1', 'field2', ...], but this is risky if you don't know what query you are editing, and there is no guarantee that it will work without breaking the internals of the QuerySet object. Besides, it is an internal (undocumented) API that you should not access directly, since doing so risks the code no longer being compatible with future Django versions.
9π
The following module allows you to group Django models and still work with a QuerySet in the result: https://github.com/kako-nawao/django-group-by
For example:
from django.db.models import Avg, Count, DecimalField, ForeignKey, Model, QuerySet, TextField
from django.views.generic import ListView
from django_group_by import GroupByMixin

class BookQuerySet(QuerySet, GroupByMixin):
    pass

class Book(Model):
    # Expose group_by() on the default manager.
    objects = BookQuerySet.as_manager()

    title = TextField(...)
    author = ForeignKey(User, ...)
    shop = ForeignKey(Shop, ...)
    price = DecimalField(...)

class GroupedBookListView(PaginationMixin, ListView):
    template_name = 'book/books.html'
    model = Book
    paginate_by = 100

    def get_queryset(self):
        # Order by the grouped fields (the model has 'title', not 'name').
        return Book.objects.group_by('title', 'author').annotate(
            shop_count=Count('shop'), price_avg=Avg('price')).order_by(
            'title', 'author').distinct()

    def get_context_data(self, **kwargs):
        return super().get_context_data(total_count=self.get_queryset().count(), **kwargs)
'book/books.html'
<ul>
{% for book in object_list %}
<li>
<h2>{{ book.title }}</h2>
<p>{{ book.author.last_name }}, {{ book.author.first_name }}</p>
<p>{{ book.shop_count }}</p>
<p>{{ book.price_avg }}</p>
</li>
{% endfor %}
</ul>
The difference from basic Django annotate/aggregate queries is that you can use the attributes of a related field, e.g. book.author.last_name.
If you need the PKs of the instances that have been grouped together, add the following annotation:
.annotate(pks=ArrayAgg('id'))
NOTE: ArrayAgg is a Postgres-specific function, available from Django 1.9 onwards: https://docs.djangoproject.com/en/3.2/ref/contrib/postgres/aggregates/#arrayagg
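To picture what collecting the grouped PKs looks like without a Postgres server, here is a sketch using SQLite's group_concat through the standard sqlite3 module. group_concat is only a loose analogue of ArrayAgg (it returns a comma-separated string, not an array), and the book table and its rows are assumptions for this example.

```python
import sqlite3

# In-memory database with a made-up book table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT, shop TEXT)")
conn.executemany(
    "INSERT INTO book (title, shop) VALUES (?, ?)",
    [("Dune", "A"), ("Emma", "A"), ("Dune", "B")],
)

rows = conn.execute(
    "SELECT title, group_concat(id) FROM book GROUP BY title ORDER BY title"
).fetchall()

# group_concat yields a comma-separated string; convert it to sorted integer ids.
pks_by_title = {
    title: sorted(int(pk) for pk in ids.split(","))
    for title, ids in rows
}
print(pks_by_title)
```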
9π
You could also use Python's built-in itertools.groupby directly:
from itertools import groupby

designation_key_func = lambda member: member.designation
# groupby() only merges consecutive items, so order by the grouping key first.
queryset = Members.objects.all().select_related("designation").order_by("designation")

for designation, member_group in groupby(queryset, designation_key_func):
    print(f"{designation} : {list(member_group)}")
No raw SQL, subqueries, third-party libraries, or template tags are needed, and in my eyes it is Pythonic and explicit.
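If sorting first is inconvenient, a plain dictionary of lists gives the same grouping in one pass regardless of input order. This is a sketch on in-memory objects; the Member dataclass is a hypothetical stand-in for the Django model, and in a real project you would iterate over the queryset instead.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Member:  # hypothetical stand-in for the Django model
    name: str
    designation: str


members = [
    Member("Ann", "Salesman"),
    Member("Bob", "Manager"),
    Member("Cid", "Salesman"),
]

# Collect names under their designation; no prior sorting is required.
by_designation = defaultdict(list)
for member in members:
    by_designation[member.designation].append(member.name)

grouped = dict(by_designation)
print(grouped)
```

The trade-off is that everything is held in memory at once, which is fine for small result sets but worth keeping in mind for large tables.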
7π
The documentation says that you can use values to group the queryset.
class Travel(models.Model):
    interest = models.ForeignKey(Interest, on_delete=models.CASCADE)
    user = models.ForeignKey(User, on_delete=models.CASCADE)
    time = models.DateTimeField(auto_now_add=True)

# Find the travels and group them by interest:
>>> Travel.objects.values('interest').annotate(Count('user'))
<QuerySet [{'interest': 5, 'user__count': 2}, {'interest': 6, 'user__count': 1}]>
# interest(id=5) has been visited 2 times,
# and interest(id=6) has been visited only once.
>>> Travel.objects.values('interest').annotate(Count('user', distinct=True))
<QuerySet [{'interest': 5, 'user__count': 1}, {'interest': 6, 'user__count': 1}]>
# interest(id=5) has been visited by only one person (but this person
# visited it 2 times).
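The difference between the two counts can be checked with plain SQL. The following sketch uses the standard sqlite3 module with a made-up travel table whose rows are chosen to match the numbers above; the table and column names are assumptions, not what Django would necessarily generate.

```python
import sqlite3

# In-memory table mirroring the example data (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE travel (id INTEGER PRIMARY KEY, interest_id INTEGER, user_id INTEGER)"
)
conn.executemany(
    "INSERT INTO travel (interest_id, user_id) VALUES (?, ?)",
    [(5, 1), (5, 1), (6, 2)],  # user 1 visited interest 5 twice
)

# Every visit counts, like Count('user').
visits = conn.execute(
    "SELECT interest_id, COUNT(user_id) FROM travel "
    "GROUP BY interest_id ORDER BY interest_id"
).fetchall()

# Each user counts once per interest, like Count('user', distinct=True).
visitors = conn.execute(
    "SELECT interest_id, COUNT(DISTINCT user_id) FROM travel "
    "GROUP BY interest_id ORDER BY interest_id"
).fetchall()

print(visits)
print(visitors)
```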
You can find all the books and group them by name using this code:
Book.objects.values('name').annotate(Count('id')).order_by() # ensure you add the order_by()
You can find a cheat sheet here.
4π
You need to do custom SQL as exemplified in this snippet:
Or in a custom manager as shown in the online Django docs:
2π
This is a little complex, but it gives the asker what they expected with only one DB hit.
from django.db.models import Subquery, OuterRef
member_qs = Members.objects.filter(
    pk__in=Members.objects.values('designation').distinct().annotate(
        pk=Subquery(
            Members.objects.filter(
                designation=OuterRef("designation")
            )
            .order_by("pk")  # you can order by another column, e.g. -pk, create_date...
            .values("pk")[:1]
        )
    )
    .values_list("pk", flat=True)
)
1π
If, in other words, you need to just "remove duplicates" based on some field, and otherwise just to query the ORM objects as they are, I came up with the following workaround:
from django.db.models import OuterRef, Exists
qs = Members.objects.all()
qs = qs.annotate(is_duplicate=Exists(
    Members.objects.filter(
        id__lt=OuterRef('id'),
        designation=OuterRef('designation'))))
qs = qs.filter(is_duplicate=False)
So, basically we're just annotating the is_duplicate value by using some convenient filtering (which might vary based on your model and requirements), and then simply using that field to filter out the duplicates.
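The same idea can be written as a NOT EXISTS filter in plain SQL: a row is kept only if no row with a smaller id shares its designation. This sketch runs it with the standard sqlite3 module; the members table and its rows are assumptions for illustration, and the SQL is hand-written, not what the ORM emits.

```python
import sqlite3

# In-memory table with made-up rows (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE members (id INTEGER PRIMARY KEY, name TEXT, designation TEXT)"
)
conn.executemany(
    "INSERT INTO members (name, designation) VALUES (?, ?)",
    [("Ann", "Salesman"), ("Bob", "Manager"), ("Cid", "Salesman"), ("Dee", "Manager")],
)

# Keep a row only when no earlier row (smaller id) has the same designation.
first_per_designation = conn.execute(
    "SELECT m.id, m.name, m.designation FROM members AS m "
    "WHERE NOT EXISTS ("
    "  SELECT 1 FROM members AS d "
    "  WHERE d.id < m.id AND d.designation = m.designation"
    ") ORDER BY m.id"
).fetchall()

print(first_per_designation)
```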
1π
If you want the model objects, and not just plain values or dictionaries, you can do something like this:
members = Member.objects.filter(foobar=True)
designations = Designation.objects.filter(member__in=members).order_by('pk').distinct()
Replace member__in with the lowercase version of your model name, followed by __in. For example, if your model name is Car, use car__in.
1π
For some reason, the above-mentioned solutions did not work for me. This is what worked:
from django.db.models import Count

dupes_query = MyModel.objects.all().values('my_field').annotate(
    count=Count('id')
).order_by('-count').filter(count__gt=1)
I hope it helps.
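For a quick sanity check of what a duplicates query should return, the same logic can be mimicked on a plain list with collections.Counter: count each value, keep those seen more than once, most frequent first. The sample values are made up for this sketch.

```python
from collections import Counter

# Made-up field values standing in for MyModel.my_field (assumption).
values = ["a", "b", "a", "c", "b", "a"]

counts = Counter(values)

# most_common() sorts by count descending, mirroring order_by('-count');
# the n > 1 test mirrors filter(count__gt=1).
dupes = [(value, n) for value, n in counts.most_common() if n > 1]
print(dupes)
```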
-7π
from django.db.models import Sum
Members.objects.annotate(total=Sum('designation'))
First you need to import Sum, then ..