[Django]-How to use subquery in django?

6đź‘Ť

âś…

This may not be exactly what you’re looking for, but it might get you closer. Take a look at Django’s annotate.

Here is an example of something that may help:

  from django.db.models import Max
  Customer.objects.all().annotate(most_recent_purchase=Max('purchase__date'))

This will give you a list of your customer models each one of which will have a new attribute called “most_recent_purchase” and will contain the date on which they made their last purchase. The sql produced looks like this:

SELECT "demo_customer"."id", 
       "demo_customer"."user_id", 
       MAX("demo_purchase"."date") AS "most_recent_purchase"
FROM "demo_customer"
LEFT OUTER JOIN "demo_purchase" ON ("demo_customer"."id" = "demo_purchase"."customer_id")
GROUP BY "demo_customer"."id",
         "demo_customer"."user_id"

Another option, would be adding a property to your customer model that would look something like this:

  @property
  def latest_purchase(self):
    return self.purchase_set.order_by('-date')[0]

You would obviously need to handle the case where there aren’t any purchases in this property, and this would potentially not perform very well (since you would be running one query for each customer to get their latest purchase).

I’ve used both of these techniques in the past and they’ve both worked fine in different situations. I hope this helps. Best of luck!

5đź‘Ť

Whenever there is a difficult query to write using Django ORM, I first try the query in psql(or whatever client you use). The SQL that you want is not this:

SELECT * FROM (
  SELECT DISTINCT ON 
    "shop_purchase.customer_id" "shop_purchase.id" "shop_purchase.date" 
  FROM "shop_purchase" 
  ORDER BY "shop_purchase.customer_id" ASC, "shop_purchase.date" DESC;
  ) AS result 
ORDER BY date DESC;

In the above SQL, the inner SQL is looking for distinct on a combination of (customer_id, id, and date) and since id will be unique for all, you will get all records from the table. I am assuming id is the primary key as per convention.

If you need to find the last purchase of every customer, you need to do something like:

SELECT  "shop_purchase.customer_id", max("shop_purchase.date")
FROM shop_purchase
GROUP BY 1 

But the problem with the above query is that it will give you only the customer name and date. Using that will not help you in finding the records when you use these results in a subquery.

To use IN you need a list of unique parameters to identify a record, e.g., id

If in your records id is a serial key, then you can leverage the fact that the latest date will be the maximum id as well. So your SQL becomes:

SELECT  max("shop_purchase.id") 
FROM shop_purchase
GROUP BY "shop_purchase.customer_id";

Note that I kept only one field (id) in the selected clause to use it in a subquery using IN.

The complete SQL will now be:

SELECT * 
FROM shop_customer 
WHERE "shop_customer.id" IN 
    (SELECT  max("shop_purchase.id") 
     FROM shop_purchase
     GROUP BY "shop_purchase.customer_id");

and using the Django ORM it looks like:

(Purchase.objects.filter(
    id__in=Purchase.objects
                   .values('customer_id')
                   .annotate(latest=Max('id'))
                   .values_list('latest', flat=True)))

Hope it helps!

👤Ajay Yadav

3đź‘Ť

I have a similar situation and this is how I’m planning to go about it:

query = Purchase.objects.distinct('customer').order_by('customer').query
query = 'SELECT * FROM ({}) AS result ORDER BY sent DESC'.format(query)
return Purchase.objects.raw(query)

Upside it gives me the query I want. Downside is that it is raw query and I can’t append any other queryset filters.

👤Dustin

1đź‘Ť

This is my approach if I need some subset of data (N items) along with the Django query. This is example using PostgreSQL and handy json_build_object() function (Postgres 9.4+), but same way you can use other aggregate function in other database system. For older PostgreSQL versions you can use combination of array_agg() and array_to_string() functions.

Imagine you have Article and Comment models and along with every article in the list you want to select 3 recent comments (change LIMIT 3 to adjust size of subset or ORDER BY c.id DESC to change sorting of subset).

qs = Article.objects.all()
qs = qs.extra(select = {
    'recent_comments': """
    SELECT
        json_build_object('comments',
            array_agg(
              json_build_object('id', id, 'user_id', user_id, 'body', body)
            )
        )
    FROM (
        SELECT
          c.id,
          c.user_id,
          c.body
        FROM app_comment c
        WHERE c.article_id = app_article.id
        ORDER BY c.id DESC
        LIMIT 3
    ) sub
    """
})

for article in qs:
    print(article.recent_comments)

# Output:
# {u'comments': [{u'user_id': 1, u'id': 3, u'body': u'foo'}, {u'user_id': 1, u'id': 2, u'body': u'bar'}, {u'user_id': 1, u'id': 1, u'body': u'joe'}]}
# ....
👤darklow

Leave a comment