[Django]-Order of Serializer Validation in Django REST Framework

125👍

Since most likely your username field has unique=True set, Django REST Framework automatically adds a validator that checks to make sure the new username is unique. You can actually confirm this by doing repr(serializer()), which will show you all of the automatically generated fields, which includes the validators.

Validation is run in a specific, undocumented order

  1. Field deserialization called (serializer.to_internal_value and field.run_validators)
  2. serializer.validate_[field] is called for each field
  3. Serializer-level validators are called (serializer.run_validation followed by serializer.run_validators)
  4. serializer.validate is called

So the problem that you are seeing is that the field-level validation is called before your serializer-level validation. While I wouldn’t recommend it, you can remove the field-level validator by setting extra_kwargs in your serilalizer’s meta.

class Meta:
    extra_kwargs = {
        "username": {
            "validators": [],
        },
    }

You will need to re-implement the unique check in your own validation though, along with any additional validators that have been automatically generated.

41👍

I was also trying to understand how the control flows during serializer validation and after carefully going through the source code of djangorestframework-3.10.3 I came up with below request flow diagram. I have described the flow and what happens in the flow to the best of my understanding without going into too much detail as it can be looked up from source.

Ignore the incomplete method signatures. Only focusing on what methods are called on what classes.

DRF_Validation_Control_Flow

Assuming you have an overridden is_valid method on your serializer class (MySerializer(serializers.Serializer)) when you call my_serializer.is_valid() the following takes place.

  1. MySerializer.is_valid() is executed.
  2. Assuming you are calling the super class (BaseSerializer) is_valid method (like: super(MySerializer, self).is_valid(raise_exception) in your MySerializer.is_valid() method, that will be called.
  3. Now since MySerializer is extending serializers.Serializer, the run_validation() method from serializer.Serializers is called. This is validating only the data dict the first. So we haven’t yet started field level validations.
  4. Then the validate_empty_values from fields.Field gets called. This again happens on the entire data and not a single field.
  5. Then the Serializer.to_internal_method is called.
  6. Now we loop over each fields defined on the serializer. And for each field, first we call the field.run_validation() method. If the field has overridden the Field.run_validation() method then that will be called first. In case of a CharField it is overridden and calls the run_validation method of Field base class. Step 6-2 in the figure.
  7. On that field we again call the Field.validate_empty_values()
  8. The to_internal_value of the type of field is called next.
  9. Now there is a call to the Field.run_validators() method. I presume this is where the additional validators that we add on the field by specifying the validators = [] field option get executed one by one
  10. Once all this is done, we are back to the Serializer.to_internal_value() method. Now remember that we are doing the above for each field within that for loop. Now the custom field validators you wrote in your serializer (methods like validate_field_name) are run. If an exception occurred in any of the previous steps, your custom validators wont run.
  11. read_only_defaults()
  12. update validate data with defaults I think
  13. run object level validators. I think the validate() method on your object is run here.

4👍

I don’t believe the above solutions work any more. In my case, my model has fields ‘first_name’ and ‘last_name’, but the API will only receive ‘name’.

Setting ‘extra_kwargs’ and ‘validators’ in the Meta class seems to have no effect, first_name and last_name are allways deemed required, and validators are always called. I can’t overload the first_name/last_name character fields with

anotherrepfor_first_name = serializers.CharField(source=first_name, required=False)

as the names make sense. After many hours of frustration, I found the only way I could override the validators with a ModelSerializer instance was to override the class initializer as follows (forgive the incorrect indentation):

class ContactSerializer(serializers.ModelSerializer):
name = serializers.CharField(required=True)

class Meta:
    model = Contact
    fields = [ 'name', 'first_name', 'last_name', 'email', 'phone', 'question' ]

def __init__(self, *args, **kwargs):
    self.fields['first_name'] = serializers.CharField(required=False, allow_null=True, allow_blank=True)
    self.fields['last_name'] = serializers.CharField(required=False, allow_null=True, allow_blank=True)
    return super(ContactSerializer, self).__init__(*args, **kwargs)

def create(self, validated_data):
    return Contact.objects.create()

def validate(self, data):
    """
    Remove name after getting first_name, last_name
    """
    missing = []
    for k in ['name', 'email', 'question']:
        if k not in self.fields:
            missing.append(k)
    if len(missing):
        raise serializers.ValidationError("Ooops! The following fields are required: %s" % ','.join(missing))
    from nameparser import HumanName
    names = HumanName(data['name'])
    names.capitalize()
    data['last_name'] = names.last
    if re.search(r'\w+', names.middle):
        data['first_name'] = ' '.join([names.first, names.middle]) 
    else:
        data['first_name'] = names.first
    del(data['name'])

    return data

Now the doc says that allowing blank and null with character fields is a no no, but this is a serializer, not a model, and as the API gets called by all kinds of cowboys, I need to cover my bases.

1👍

Here’s the approach that worked for me.

  • Use a sentinel error type that gets caught in an overridden view function
  • The sentinel is raised from the custom serializer

The sentinel error type:

from django.core.exceptions import ValidationError

class CustomValidationErrors(ValidationError):
    """ custom validation error for the api view to catch the status code """

And in the serializer we override errors, _errors, validated_data, and _validated_data, as well as is_valid:

class CustomSerializer(serializers.ModelSerializer):
    # fields that usually run validation before parent serializer validation
    child_field1 = Child1Serializer()
    child_field2 = Child2Serializer()

    # override DRF fields
    errors = {}
    _errors = None
    validated_data = {}
    _validated_data = []

    def is_valid(self, *args, **kwargs):
        # override drf.serializers.Serializer.is_valid
        # and raise CustomValidationErrors from parent validate
        self.validate(self.initial_data)
        return not bool(self.errors)

    def validate(self, attrs):

        self._errors = {}

        if len(attrs.get("child_field1", {}).get("name", "")) > 100:
            self._errors.update({"child_field1": {"name": "child 1 name > 100"}})

        if len(attrs.get("child_field2", {}).get("description", "")) > 1000:
            self._errors.update({"child_field2.description": "child 2 description > 100"})

        if len(self._errors):
            # set the overriden DRF values
            self.errors = self._errors
            # raise the sentinel error type
            raise CustomValidationErrors(self._errors)

        # set the overriden DRF serializer values
        self._errors = None
        self.validated_data = attrs
        self._validated_data = [[k, v] for k, v in attrs.items()]
        return attrs

    class Meta:
        model = CustomModel
    

And in the view we can override the default method, and catch the sentinel error type:

class CustomSerializerView(ListCreateAPIView):
    serializer_class = CustomeSerializer

    def post(self, *args, **kwargs):
        try:
            # if this fails for any exception
            # other than CustomValidationErrors
            # it will return the default error

            return super().post(*args, **kwargs)

        except CustomValidationErrors as e:
            ############
            # returns a 400 with the following
            # {"child_field1": 
            #       [[{"name": "child 1 name > 100"}]],
            #  "child_field2.description": 
            #       [["child 2 description > 100"]]
            # }
            ############
            return Response(e.error_dict, status=400)

drf version:

  • djangorestframework==3.11.0

0👍

As of django rest

versions:

Django                    4.2.6
djangorestframework       3.14.0

serializer validation goes like this:

def run_validation(self, data=empty):
    """
    Validate a simple representation and return the internal value.

    The provided data may be `empty` if no representation was included
    in the input.

    May raise `SkipField` if the field should not be included in the
    validated data.
    """
    (is_empty_value, data) = self.validate_empty_values(data)
    if is_empty_value:
        return data
    value = self.to_internal_value(data)
    self.run_validators(value)
    return value

If the field is not present in the request body and you have a custom validation. Validdation such as field validation or validators won’t run

def validate_empty_values(self, data):
    """
    Validate empty values, and either:

    * Raise `ValidationError`, indicating invalid data.
    * Raise `SkipField`, indicating that the field should be ignored.
    * Return (True, data), indicating an empty value that should be
      returned without any further validation being applied.
    * Return (False, data), indicating a non-empty value, that should
      have validation applied as normal.
    """
    if self.read_only:
        return (True, self.get_default())

    if data is empty:
        if getattr(self.root, 'partial', False):
            raise SkipField()
        if self.required:
            self.fail('required')
        return (True, self.get_default())

    if data is None:
        if not self.allow_null:
            self.fail('null')
        # Nullable `source='*'` fields should not be skipped when its named
        # field is given a null value. This is because `source='*'` means
        # the field is passed the entire object, which is not null.
        elif self.source == '*':
            return (False, None)
        return (True, None)

    return (False, data)

See the fail function raised ValidationError

def fail(self, key, **kwargs):
    """
    A helper method that simply raises a validation error.
    """
    try:
        msg = self.error_messages[key]
    except KeyError:
        class_name = self.__class__.__name__
        msg = MISSING_ERROR_MESSAGE.format(class_name=class_name, key=key)
        raise AssertionError(msg)
    message_string = msg.format(**kwargs)
    raise ValidationError(message_string, code=key)

Leave a comment