Django Serializer Nested Creation: How to avoid N+1 queries on relations

6👍

The DRF serializer is not the place (in my opinion) to optimize a DB query. A serializer has 2 jobs:

  1. Deserialize input data and check its validity.
  2. Serialize output data.

Therefore the correct place to optimize your query is the corresponding view.
For forward foreign keys, Django offers the select_related method, which:

Returns a QuerySet that will “follow” foreign-key relationships, selecting additional related-object data when it executes its query. This is a performance booster which results in a single more complex query but means later use of foreign-key relationships won’t require database queries.

select_related only follows single-valued relations (ForeignKey and OneToOneField), though. Book.tags is the reverse side of the Tag.book foreign key (many tags per book), so it has to be loaded with prefetch_related, which runs one extra query per relation and joins the results in Python. Either way, the goal is to avoid the N+1 database queries.

You will need to modify the part of your view code that creates the corresponding queryset so that it includes a prefetch_related call.
This assumes that Tag.book is declared with related_name='tags'. No related_name is needed on the Tag.category field definition.

Example:

# In your Tag model:
category = models.ForeignKey(
    'app.TagCategory', on_delete=models.PROTECT
)

# In the queryset-defining part of your view:
class BookViewSet(viewsets.ModelViewSet):

    # 'tags' is the reverse relation Book -> Tag, so it must be prefetched;
    # 'tags__category' then loads every tag's category in one more query.
    queryset = Book.objects.all().prefetch_related(
        'tags', 'tags__category'
    )
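To make the query counts concrete, here is a minimal sketch that uses plain sqlite3 instead of Django, so it can be run anywhere. It is only an illustration of the idea behind a prefetch (one query for the parents, one for all children, joined in Python), not what the ORM literally executes. The table layout and the names CountingConnection, tags_n_plus_one and tags_prefetched are assumptions made up for this example.

```python
import sqlite3


def build_db():
    """Create an in-memory database with three books and four tags."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT);
        CREATE TABLE tag (
            id INTEGER PRIMARY KEY,
            book_id INTEGER REFERENCES book(id),
            page INTEGER
        );
        INSERT INTO book (id, title) VALUES (1, 'A'), (2, 'B'), (3, 'C');
        INSERT INTO tag (book_id, page) VALUES (1, 10), (1, 11), (2, 20), (3, 30);
    """)
    return conn


class CountingConnection:
    """Hypothetical helper: wraps a connection and counts SELECTs issued."""

    def __init__(self, conn):
        self.conn = conn
        self.count = 0

    def query(self, sql, params=()):
        self.count += 1
        return self.conn.execute(sql, params).fetchall()


def tags_n_plus_one(db):
    """One query for the books, then one more query per book (N+1)."""
    books = db.query("SELECT id, title FROM book ORDER BY id")
    result = {}
    for book_id, title in books:
        rows = db.query(
            "SELECT page FROM tag WHERE book_id = ? ORDER BY page", (book_id,)
        )
        result[title] = [page for (page,) in rows]
    return result


def tags_prefetched(db):
    """One query for the books, one for all their tags, joined in Python."""
    books = db.query("SELECT id, title FROM book ORDER BY id")
    ids = [book_id for book_id, _ in books]
    if not ids:
        return {}
    placeholders = ",".join("?" * len(ids))
    rows = db.query(
        f"SELECT book_id, page FROM tag WHERE book_id IN ({placeholders}) "
        "ORDER BY page",
        ids,
    )
    by_book = {book_id: [] for book_id in ids}
    for book_id, page in rows:
        by_book[book_id].append(page)
    return {title: by_book[book_id] for book_id, title in books}
```

With three books, the first function issues four queries and the second issues two, and the second stays at two no matter how many books there are.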

If you want to try a different approach that also uses the serializer to cut down the number of queries, you can check this article.

1👍

I know this question has been around for a long time, but I had the same problem and spent several days looking for a solution. In the end I found another one that worked for me.

I leave it here in case it helps someone. This way the serializer no longer makes a query for each relationship. Instead it makes a single query for all category ids, and to_internal_value validates the foreign key against them.

class TagSerializer(serializers.ModelSerializer):
    ...
    # A plain IntegerField: PrimaryKeyRelatedField would look up each
    # category with its own query, which is what we are trying to avoid.
    category_id = serializers.IntegerField(write_only=True)
    ...

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Lazy queryset: evaluated on the first membership test, then cached.
        self.categories = Category.objects.values_list('id', flat=True)

    def to_internal_value(self, data):
        validated = super().to_internal_value(data)

        if validated['category_id'] not in self.categories:
            raise serializers.ValidationError({
                'category_id': 'Category does not exist'
            })
        return validated
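The core of this approach (fetch the known ids once, then check every submitted tag against that set in memory) can be sketched without Django. This is only an illustration under assumptions: CategoryIdValidator and fetch_ids are hypothetical names, and fetch_ids stands in for a single values_list('id', flat=True) query.

```python
class CategoryIdValidator:
    """Validate many category ids against one lazily fetched set of ids."""

    def __init__(self, fetch_ids):
        # fetch_ids: hypothetical callable that returns all valid ids,
        # standing in for one database query.
        self._fetch_ids = fetch_ids
        self._known_ids = None

    def _ids(self):
        # Fetch on first use only; later calls reuse the cached set.
        if self._known_ids is None:
            self._known_ids = frozenset(self._fetch_ids())
        return self._known_ids

    def validate(self, tags):
        """Return {index: message} for every tag with an unknown category."""
        errors = {}
        for index, tag in enumerate(tags):
            if tag.get("category_id") not in self._ids():
                errors[index] = "Category does not exist"
        return errors
```

However many tags are validated, fetch_ids is called at most once per validator instance, which mirrors the single query described above.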

0👍

I think the issue here is that the Tag constructor is automatically converting the category id that you pass in as category into a TagCategory instance by looking it up from the database. The way to avoid that is by doing something like the following if you know that all of the category ids are valid:


    def create(self, validated_data):
        with transaction.atomic():
            tags = validated_data.pop('tags')
            book = Book.objects.create(**validated_data)
            # Passing book_id / category_id sets the raw foreign-key columns,
            # so no Book or TagCategory instance has to be fetched per tag.
            tag_instances = [
                Tag(book_id=book.id, page=x['page'], category_id=x['category'])
                for x in tags
            ]
            Tag.objects.bulk_create(tag_instances)
        return book
👤2ps
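For readers who want to see the shape of that create() outside Django, here is a rough sqlite3 sketch: insert the parent row, then insert all children from raw ids, inside one transaction. The schema and the function names are assumptions for illustration. executemany is only a stand-in here, since bulk_create usually batches the rows into as few INSERT statements as possible.

```python
import sqlite3


def make_db():
    """Hypothetical schema, loosely mirroring Book and Tag."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
        CREATE TABLE tag (
            id INTEGER PRIMARY KEY,
            book_id INTEGER NOT NULL,
            page INTEGER NOT NULL,
            category_id INTEGER NOT NULL
        );
    """)
    return conn


def create_book_with_tags(conn, title, tags):
    """Insert a book and its tags atomically, using raw ids only."""
    # "with conn" commits on success and rolls back if an exception escapes,
    # playing the role of transaction.atomic().
    with conn:
        cursor = conn.execute("INSERT INTO book (title) VALUES (?)", (title,))
        book_id = cursor.lastrowid
        conn.executemany(
            "INSERT INTO tag (book_id, page, category_id) VALUES (?, ?, ?)",
            [(book_id, tag["page"], tag["category"]) for tag in tags],
        )
    return book_id
```

No category row is ever read in this sketch; the ids are written as given, which is only safe if they are known to be valid.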

0👍

I’ve come up with an answer that gets things working (but that I’m not thrilled about): Modify the Tag Serializer like this:

class TagSerializer(serializers.ModelSerializer):

    category_id = serializers.IntegerField()

    class Meta:
        model = Tag
        exclude = ['id', 'book', 'category']

This allows me to read/write a category_id without having the overhead of validations. Adding category to exclude does mean that the serializer will ignore category if it’s set on the instance.

0👍

The problem is that you don’t set the created tags on the book instance, so the serializer tries to fetch them from the database when it builds the response.

You need to set them on the book as a list:

def create(self, validated_data):
    # Pop 'tags' before creating the book: it is not something
    # Book.objects.create() accepts. None is the default, so a request
    # without 'tags' does not raise an error.
    tags = validated_data.pop('tags', None)

    with transaction.atomic():
        book = Book.objects.create(**validated_data)

        tags_to_create = []

        if tags:
            tags_to_create = [Tag(book=book, **tag) for tag in tags]
            Tag.objects.bulk_create(tags_to_create)

        # Here I set tags to the book instance.
        # Note: Django 2.0+ rejects direct assignment to a reverse relation
        # with a TypeError, so this line only works on older versions.
        setattr(book, 'tags', tags_to_create)

    return book

Provide a Meta.fields tuple for TagSerializer (it’s weird that this serializer doesn’t raise an error saying that the fields tuple is required):

class TagSerializer(serializers.ModelSerializer):
    class Meta:
        model = Tag
        fields = ('category', 'page',)

Prefetching tag.category should NOT be necessary in this case, because only its id is used.

You will need to prefetch Book.tags for the GET method. The simplest solution is to create a static method on the serializer and use it in the viewset’s get_queryset method like this:

class BookSerializer(serializers.ModelSerializer):
    ...
    @staticmethod
    def setup_eager_loading(queryset):  # It can be named anything you like
        queryset = queryset.prefetch_related('tags')

        return queryset

class BookViewSet(viewsets.ModelViewSet):
    ...
    def get_queryset(self):
        # Every GET request will prefetch 'tags' for every book by default
        queryset = super().get_queryset()

        return BookSerializer.setup_eager_loading(queryset)
👤mon io

0👍

The select_related function checks the ForeignKey the first time it is used.
This is really a foreign-key check in the relational database, and in MySQL you can turn it off with SET FOREIGN_KEY_CHECKS=0;.
