5π
This Will help you
from django.db.models import Count, Max
MyClass.objects.values('my_integer').annotate(count=Count("my_integer"),latest_date=Max('created_ts'))
Data in table
my_integer created_ts
- -----------
1 2015-09-08 20:05:51.144321+00:00
1 2015-09-08 20:08:40.687936+00:00
3 2015-09-08 20:08:58.472077+00:00
2 2015-09-08 20:09:08.493748+00:00
2 2015-09-08 20:10:20.906069+00:00
Output
[
{'count': 2, 'latest_date': datetime.datetime(2015, 9, 8, 20, 8, 40, 687936, tzinfo=<UTC>), 'my_integer': 1},
{'count': 2, 'latest_date': datetime.datetime(2015, 9, 8, 20, 10, 20, 906069, tzinfo=<UTC>), 'my_integer': 2},
{'count': 1, 'latest_date': datetime.datetime(2015, 9, 8, 20, 8, 58, 472077, tzinfo=<UTC>), 'my_integer': 3}
]
0π
Try this;
from django.db.models import Max
MyClass.objects.values('my_integer').annotate(Max('created_ts'))
- Django python-rq β DatabaseError SSL error: decryption failed or bad record mac
- Django.db.utils.IntegrityError: FOREIGN KEY constraint failed
- Django: efficient template/string separation and override
- How to declare variables inside Django templates
0π
This here is a really basic way to do it. Assuming the amount of data you have isnβt super large, this will work decently. You can use this in your views by overriding the get_queryset
function and just returning filtered. Or you can use it as a static method on your class if you plan on using it everywhere.
values = MyClass.objects.order_by('-created_ts').all()
filtered = []
existing = []
for value in values:
if value.my_integer not in existing:
existing.append(value.my_integer)
filtered.append(value)
Since the list is ordered by the most recent first they will get added to the existing first for that integer. I did some basic testing with it, but not much so there may be a flaw or two in there. Tested with sqlite.
Edit
Here is a much faster version.
def iter_tools():
import itertools
qs = MyClass.objects.all()
filtered = []
group_by = itertools.groupby(qs, lambda x: x.my_integer)
for x in group_by:
filtered.append(sorted(x[1], key=lambda x: x.created_ts, reverse=True)[0])
return filtered
Essentially the way this is working is getting all of your objects from your db, grouping them by the integer, then sorting each group based on the timestamp and getting just the first one from each group. Speeding this even more is beyond my skills but I am sure there is some ways.
Here is the timeit
of this one vs the one earlier with only like 6 entries in the db:
In[]: timeit.timeit(manual, number=1500)
Out[]: 0.5577559471130371
In[]: timeit.timeit(iter_tools, number=1500)
Out[]: 0.39012885093688965
-----------------------------------------------
In[]: timeit.timeit(manual, number=5000)
Out[]: 1.770777940750122
In[]: timeit.timeit(iter_tools, number=5000)
Out[]: 1.2411231994628906
Edit 2:
I created 60000 objects to the database to try it out with some data. I generated the data with django-fixtureless so the integers are completely random and the timestamp on all of them is a new datetime.now()
for each object.
In[]: timeit.timeit(manual, number=1)
Out[]: 11.946185827255249
In[]: timeit.timeit(iter_tools, number=1)
Out[]: 0.7811920642852783
In[]: timeit.timeit(iter_tools, number=100)
Out[]: 77.93837308883667
In[]: MyClass.objects.all().count()
Out[]: 60000
A note about the DB: In the above examples I was using sqlite3 just on my local machine. I just now setup a quick little mysql server as a vm and received a much better speed result.
In[16]: MyClass.objects.all().count()
Out[16]: 60000
In[17]: timeit.timeit(iter_tools, number=100)
Out[17]: 49.636733055114746
In[18]: timeit.timeit(iter_tools, number=1)
Out[18]: 0.4923059940338135
Either way, you get the same objects returned. If performance is an issue I would recommend either using the itertools one or a custom sql query.
0π
You can either do a raw query:
MyClass.objects.raw("""
SELECT m1.id, m1.my_integer, m1.created_ts
FROM app_myclass AS m1, (
SELECT my_integer, MAX(created_ts) AS max_created_ts
FROM app_myclass
GROUP BY my_integer
) AS m2
WHERE m1.my_integer = m2.my_integer AND m1.created_ts = m2.max_created_ts
"""))
Or use the Django ORM:
MyClass.objects.filter(
created_ts__in=MyClass.objects.values(
"my_integer"
).annotate(
created_ts=models.Max(
"created_ts"
)
).values_list("created_ts", flat=True)
)
Note that this requires only a single SQL request, as you can see by printing len(django.db.connection.queries)
before and after the query.
However, note that the latter solution only works if your created_ts
attribute is guaranteed to be unique, which might not be your case.
If youβre not willing to use raw queries or an index on created_ts
, then you should probably start using PostgreSQL and its DISTINCT ON
feature, as suggested by other answers.
- ProgrammingError: column "product" is of type product[] but expression is of type text[] enum postgres
- Django render template within a template
- How to integrate postgresql 10/11 declarative table partitioning (i.e. PARTITION BY clause) in a Django model?
- Can django-tastypie display a different set of fields in the list and detail views of a single resource?
- Google Streaming Speech Recognition on an Audio Stream Python
-1π
untested
results = MyClass.objects.all().distinct('my_integer').order_by('created_ts')
- How to paginate with filters in django rest framework
- Django ORM: See if a model has no foreign key entry in another model
- Fatal error: libmemcached/memcached.h: no such file or directory
- Django β How ModelChoiceField queryset's works?
- Adding Autoincrement field to existing model with Django South?
-1π
MyClass.objects.order_by('my_integer', '-created_ts').distinct('my_integer')
According to distinct, you need to call distinct on attributes, in the same order as in order_by
. Hence order the elements based on integer and then in reverse timestamp, and call distinct on them, which returns the latest instance for every integer.
- Passing arguments to management.call_command on a django view
- Django Heroku Server Error (500) when I set Debug β False on True it is working fine
- Custom field's to_python not working? β Django
- Why use "through" argument for ManyToManyField in Django models?
- Django Storages using s3boto ignoring MEDIA_URL