[Django]-How filter latest m2m objects with a QuerySet in Django

2đź‘Ť

âś…

tl;dr )

The current important answer is the EDIT at the end.

Yes, it is important to ignore all older related objects (maintenances), even those which are still valid, because there may exist a renewed maintenance.

I think that you simplified your real model so much, that it doesn’t work good.
You have two chained ManyToMany relations with anonymous (not explicit) relationship table. This makes impossible to write right query sets.

Bugs:

1) You use the same name (“licenses” and also “maintenances”) for a field and its related_name. It is a nonsense because: docs

related_name
The name to use for the relation from the related object back to this one. It’s also the default value for related_query_name (the name to use for the reverse filter name from the target model).

It is not useful that you will see a reverse queryset maintenances on object Maintenance to License. Similarly a queryset `licenses’ on License to Customer. You can easily rename the related_name because it does’t change the database and doesn’t cause a migration.

2) Is the License a common or individual object? If it is individual then it doesn’t need many-to-many relationship to Customer object. If it is common then you can’t track a paid maintenance to individual customers through it. (You also don’t mean that two customers are co-owners of one license! Are they? 🙂 You probably mean a common LicensedProduct and an individual License that connects the customer with that product. I understand that the user can buy one maintenace for more licenses and and many-to-many is here good.

First I fix the model (some way I guess before I can ask you)

class Customer(SomeInheritedModel):
    # "licenses" is the reverse query to License
    # optionally you can enable many-to-many relation to licensed products
    # lic_products = models.ManyToManyField(

class Product(models.Model):
    pass  # licensed product details

class License(SomeInheritedModel):
    customer = models.ForeignKey(Customer, related_name='licenses')
    product = models.ForeignKey(Product, related_name='+')  # used '+' because not deeded
    maintenances = models.ManyToManyField(
        Maintenance,
        through='LicenseMaintenance',
        through_fields=('license', 'maintenance'),
        related_name='licenses')

class Maintenance(SomeInheritedModel):
    start_date = DateTimeField(null=True)
    expiration_date = DateTimeField(null=True)

class LicenseMaintenance(models.Model):
    license = models.ForeignKey(License, on_delete=models.CASCADE)
    maintenance = models.ForeignKey(Maintenance, on_delete=models.CASCADE)

querysets: (can be simplified much by removing order_by and related fields)

remind_start = datetime.datetime.now(tz=TIMEZONE)
remind_end = remind_start + datetime.timedelta(days=30)

expiring_lic_maintenances = (
    LicenseMaintenance.objects.values('license',
                                      'license__customer',
                                      'license__customer__name')
    .annotate(max_exp_date=models.Max('maintenance__expiration_date'))
    .filter(max_exp_date__lte=remind_start, max_exp_date__gte=remind_end)
    .order_by('license__customer__name', 'license__customer', 'license')
)   # some small detail can be used like e.g. customer name in the example, not used later

expiring_licenses = (
    License.objects.filter(
        license__in=expiring_lic_maintenances.values_list('license', flat=True))
    .select_related('customer', 'product')
    .order_by('license__customer__name', 'license__customer', 'license')
)   # that queryset with subquery is executed by one SQL command

Not more than two SQL requests are executed by running these querysets:

# but I prefer a simple map and queryset with subquery:
expiration_map = {x.license_id: x.max_exp_date for x in expiring_lic_maintenances}


for lic in expiring_licenses:
    print("{name}, your maintenance for {lic_name} is expiring on {exp_date}".format(
        name=lic.customer.name,
        lic_name=lic.product.name,
        exp_date=expiration_map[lic.id],
    ))

I expect, that it is a new project and you don’t need migrations yet for modified models. I wrote similar code so many times enough that I didn’t verified it now. A mistake could happen and you can inform me sufficient time before the end of bounty easily.


EDIT after edited question:
Aggregation functions work correctly on many-to-many fields without explicit model for join table in the current Django version:

>>> expiring = (
...     License.objects.values('id',
...                            'account__customer',
...                            'account__customer__name')
...     .annotate(max_exp_date=models.Max('maintenance__expiration_date'))
...     .filter(max_exp_date__gte=remind_start, max_exp_date__lte=remind_end)
... )

and look at compiled SQL:

>>> str(expiring.query)
SELECT app_license.id, app_account.customer_id, app_customer.name, MAX(app_maintenance.expiration_date) AS max_exp_date
    FROM app_license INNER JOIN app_account ON (app_license.account_id = app_account.id)
    INNER JOIN app_customer ON (app_account.customer_id = app_customer.id)
    LEFT OUTER JOIN app_license_maintenance ON (app_license.id = app_license_maintenance.license_id)
    LEFT OUTER JOIN app_maintenance ON (app_license_maintenance.maintenance_id = app_maintenance.id)
    GROUP BY app_license.id, app_account.customer_id, app_customer.name
    HAVING (MAX(app_maintenance.expiration_date) >= 2017-04-07T13:45:35.485755 AND
            MAX(app_maintenance.expiration_date) <= 2017-03-08T13:45:35.485755
            )

In generally this is compiled by two outer joins.


If you find a more complicated case, where it doesn’t work or the query is slow because it is more complicated for some database engines to optimize with outer joins, you can everytimes get the implicit model and run the query on it because it is the top model in the relation hierarchy:

we can explore the implicit intermediate model of the table:

>>> License.maintenance.through
app.models.License_maintenance
>>> LicenseMaintenance = License.maintenance.through
>>> LicenseMaintenance._meta.fields
(<django.db.models.fields.AutoField: id>,
 <django.db.models.fields.related.ForeignKey: license>,
 <django.db.models.fields.related.ForeignKey: maintenance>)

and use it: (all joins are automatically compiled to inner joins)

>>> expiring = (
...     LicenseMaintenance.objects.values('license',
...                                       'license__account__customer',
...                                       'license__account__customer__name')
...     .annotate(max_exp_date=models.Max('maintenance__expiration_date'))
...     .filter(max_exp_date__lte=remind_start, max_exp_date__gte=remind_end)
... )
👤hynekcer

1đź‘Ť

In this situation you have 2 options:
First is to use prefetch_related:

from django.db.models import Prefetch

now = timezone.now()
maintenance_qs = Maintenance.objects.filter(expiry_date__lte=now).order_by('-expire_date')
license_qs = License.objects.filter(maintenances__expiry_date__lte=now).\
    prefetch_related(
        Prefetch('maintenances', queryset=maintenance_qs)
    ).order_by(-'maintenances__expiry_date')
customers = Customer.objects.prefetch_related(Prefetch('licenses', queryset=license_qs))

It will hit database 3 times, you can read more about prefetch_related and Prefetch object. It will return all licenses and all maintenance but it will be sorted and you can take only 1 item. You can use it like this.

for customer in customers:
    last_license = customer.licenses.all()[0]
    last_maintenance = last_license.maintenances.all()[0]

Or you can try to use raw SQL. Your query look like:

customers = Customer.objects.raw(
'''
SELECT * FROM (
    SELECT "yourapp_customer"."id", 
           "yourapp_license"."id", 
           "yourapp_maintenance"."id",
           "yourapp_maintanance"."start_date",
           "yourapp_maintanance"."expiration_date",
           MAX("yourapp_maintanance"."expiration_date") over (partition by "yourapp_customer"."id") as last_expired
    FROM "yourapp_customer"
    LEFT OUTER JOIN "yourapp_customer_licenses" ON
        "yourapp_customer"."id" = "yourapp_customer_licenses"."customer_id"
    LEFT OUTER JOIN "yourapp_license" ON
        "yourapp_license"."id" = "yourapp_customer_licenses"."license_id"
    LEFT OUTER JOIN "yourapp_license_maintenances" ON
        "yourapp_license"."id" = "yourapp_license_maintenances"."license_id"
    LEFT OUTER JOIN "yourapp_maintanance" ON
        "yourapp_maintanance"."id" = "yourapp_license_maintenances"."maintanance_id"
    WHERE "yourapp_maintanance"."expiration_date" < NOW()
) AS T
where expiration_date = last_expired
'''
)

It should work much faster, but using this query you can’t construct License and Maintenance objects. All properties will be stored in Customer model. You can read more about window functions

👤Dima Kudosh

Leave a comment