[Django]-Django: Ordering objects by their children's attributes

0👍

✅

Lastly, would the option of just adding a new field to each one to show the date of the last book and just updating that the whole time be better?

Actually it would! This is a normal denormalization practice and can be done like this:

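from django.db import models


class Author(models.Model):
    name = models.CharField(max_length=200, unique=True)
    latest_pub_date = models.DateTimeField(null=True, blank=True)

    def update_pub_date(self):
        try:
            # Store the newest book's pub_date, not the Book object itself.
            self.latest_pub_date = self.book_set.order_by('-pub_date')[0].pub_date
        except IndexError:
            # No books (left): clear the date instead of keeping a stale one.
            self.latest_pub_date = None
        self.save()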

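class Book(models.Model):
    pub_date = models.DateTimeField()
    # on_delete is mandatory from Django 2.0 on; CASCADE matches the old default.
    author = models.ForeignKey(Author, on_delete=models.CASCADE)

    def save(self, *args, **kwargs):
        super(Book, self).save(*args, **kwargs)
        self.author.update_pub_date()  # refresh the denormalized date

    def delete(self, *args, **kwargs):
        super(Book, self).delete(*args, **kwargs)
        self.author.update_pub_date()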

This is the third common option you have, besides the two already suggested:

  • doing it in SQL with a join and grouping
  • fetching all the books on the Python side and removing duplicates

Both of these options compute the pub_dates from normalized data at the time you read them. Denormalization does this computation for each author at the time you write new data. The idea is that most web apps do reads more often than writes, so this approach is preferable.
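To make the trade-off concrete, here is a minimal plain-Python sketch (no Django, no database). The sample `books` list and both function names are made up for illustration; the only point is to show where the per-author maximum gets computed in each approach.

```python
from datetime import date

# Hypothetical sample data: (author, pub_date) pairs standing in for Book rows.
books = [
    ("alice", date(2009, 1, 10)),
    ("bob", date(2009, 6, 1)),
    ("alice", date(2009, 8, 15)),
]


def latest_by_author_at_read_time(rows):
    """Normalized approach: scan every book each time the ordering is needed."""
    latest = {}
    for author, pub_date in rows:
        if author not in latest or pub_date > latest[author]:
            latest[author] = pub_date
    return latest


# Denormalized approach: keep the per-author maximum up to date on every write.
latest_cache = {}


def add_book(rows, author, pub_date):
    """Store the book and refresh the cached latest date for its author."""
    rows.append((author, pub_date))
    if author not in latest_cache or pub_date > latest_cache[author]:
        latest_cache[author] = pub_date


stored = []
for author, pub_date in books:
    add_book(stored, author, pub_date)

# Both approaches agree; only the moment the work is done differs.
read_time = latest_by_author_at_read_time(stored)
ordered = sorted(latest_cache, key=latest_cache.get, reverse=True)
```

Reading the ordering from `latest_cache` is a cheap sort, while the read-time version has to walk all the books on every request.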

One perceived downside is that you have the same data in two places and must keep it in sync. This usually horrifies database people :-). But it is usually not a problem as long as you only work with the data through your ORM models (which you probably do anyway). In Django it's the app that controls the database, not the other way around.

Another (more realistic) downside is that, with the naive code I've shown, bulk book updates may be much slower, since every save pings the author to recompute its date whether or not anything changed. This is usually solved by having a flag that temporarily disables the call to update_pub_date, and calling it manually afterwards. Basically, denormalized data requires more maintenance than normalized data.
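One possible shape for such a flag is a context manager. This is only a sketch under assumptions: the `Author` class below is a plain-Python stand-in for the model (so it runs without a database), and `updates_enabled`, `add_book`, `recalculations` and `pub_date_updates_suspended` are hypothetical names, not Django API.

```python
from contextlib import contextmanager


class Author:
    """Plain-Python stand-in for the Author model (no database involved)."""

    updates_enabled = True  # hypothetical class-level flag

    def __init__(self, name):
        self.name = name
        self.pub_dates = []          # stands in for book_set
        self.latest_pub_date = None
        self.recalculations = 0      # counts how often the expensive step ran

    def update_pub_date(self):
        if not Author.updates_enabled:
            return  # bulk operation in progress, skip the per-book refresh
        self.latest_pub_date = max(self.pub_dates) if self.pub_dates else None
        self.recalculations += 1

    def add_book(self, pub_date):
        """Stands in for Book.save(): store the book, then ping the author."""
        self.pub_dates.append(pub_date)
        self.update_pub_date()


@contextmanager
def pub_date_updates_suspended(authors):
    """Disable per-book refreshes, then refresh each author once at the end."""
    Author.updates_enabled = False
    try:
        yield
    finally:
        Author.updates_enabled = True
        for author in authors:
            author.update_pub_date()


author = Author("alice")
with pub_date_updates_suspended([author]):
    for year in (2001, 2005, 2003):
        author.add_book(year)
```

Three books are added but the author's date is recomputed only once, when the block exits. In a real project the same idea would wrap the bulk import, with the final loop calling the model's update_pub_date for each affected author.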

👀isagalaev

3👍

from django.db.models import Max
Author.objects.annotate(max_pub_date=Max('books__pub_date')).order_by('-max_pub_date')

This requires Django 1.1 or later.

I also assumed you will add a 'related_name' to the author field in the Book model, so that the books are reached via Author.books instead of Author.book_set. It's much more readable.

👀Ofri Raviv

1👍

Or, you could play around with something like this:

Author.objects.filter(book__pub_date__isnull=False).order_by('-book__pub_date')

👀ayaz

0👍

def remove_duplicates(seq):
    """Return the items of seq in order, keeping only the first occurrence of each."""
    seen = set()
    result = []
    for item in seq:
        if item in seen:
            continue
        seen.add(item)
        result.append(item)
    return result


# Get the author ids of all books, most recent first
query_result = Book.objects.order_by('-pub_date').values_list('author', flat=True)
# Remove duplicate authors, keeping each author's first (most recent) position
recent_authors = remove_duplicates(query_result)
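As a quick check of the de-duplication step on its own, here is a small run on invented data. The function is restated so the snippet runs by itself, and the author ids are hypothetical, standing in for what a newest-first book query might return.

```python
def remove_duplicates(seq):
    """Return the items of seq in order, keeping only the first occurrence of each."""
    seen = set()
    result = []
    for item in seq:
        if item in seen:
            continue
        seen.add(item)
        result.append(item)
    return result


# Hypothetical author ids, as they might come back for books sorted newest first.
author_ids_newest_first = [7, 3, 7, 1, 3]
recent_authors = remove_duplicates(author_ids_newest_first)
print(recent_authors)  # [7, 3, 1]
```

Each author keeps the position of their newest book, which is why the input has to be sorted newest first.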

👀DevDevDev

0👍

Building on ayaz's solution, what about:

Author.objects.filter(book__pub_date__isnull=False).distinct().order_by('-book__pub_date')

👀Josh Ourisman
