[Django]-Store Subtitles in a Database

Now, I don't understand how I should store this object in the SQL database. Initially, I thought I would store the whole object as a string in a database.

If the data has a clear structure, you should not store it as a JSON blob in a relational database. Relational databases have some support for JSON nowadays, but it is still not very efficient, and it normally means you cannot effectively filter, aggregate, or manipulate the data, nor can you enforce referential integrity.

You can work with two models that look like:

from django.db import models
from django.db.models import F, Q


class Subtitle(models.Model):
    text = models.CharField(max_length=128)
    language = models.CharField(max_length=128)


class Segment(models.Model):
    startTimestamp = models.DurationField()
    endTimestamp = models.DurationField()
    subtitle = models.ForeignKey(
        Subtitle, on_delete=models.CASCADE, related_name='segments'
    )
    text = models.CharField(max_length=512)

    class Meta:
        ordering = ('subtitle', 'startTimestamp', 'endTimestamp')
        constraints = [
            models.CheckConstraint(
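                # Only allow rows whose start is strictly before their end.
                # (Django 5.1+ names this argument `condition` instead of `check`.)
                check=Q(startTimestamp__lt=F('endTimestamp')),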
                name='start_before_end',
            )
        ]

This also guarantees, for example, that startTimestamp comes before endTimestamp, and that these fields store durations (and not "foo").
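To make those guarantees concrete, here is a rough sketch using only Python's built-in sqlite3 module. It is a hand-written approximation of a comparable schema, not the SQL that Django generates: the table names, the start_us/end_us columns and the sample rows are all assumptions made for illustration. Durations are kept as integer microseconds, which is how Django documents DurationField storage on backends other than PostgreSQL and Oracle.

```python
import sqlite3
from datetime import timedelta

MICROSECOND = timedelta(microseconds=1)


def to_us(duration: timedelta) -> int:
    """Convert a timedelta to whole microseconds for storage."""
    return duration // MICROSECOND


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript("""
    CREATE TABLE subtitle (
        id INTEGER PRIMARY KEY,
        text TEXT NOT NULL,
        language TEXT NOT NULL
    );
    CREATE TABLE segment (
        id INTEGER PRIMARY KEY,
        subtitle_id INTEGER NOT NULL REFERENCES subtitle(id),
        start_us INTEGER NOT NULL,
        end_us INTEGER NOT NULL,
        text TEXT NOT NULL,
        CONSTRAINT start_before_end CHECK (start_us < end_us)
    );
""")

SEGMENT_INSERT = (
    "INSERT INTO segment (subtitle_id, start_us, end_us, text) "
    "VALUES (?, ?, ?, ?)"
)

# Hypothetical sample data: one subtitle track with two segments.
sub_id = conn.execute(
    "INSERT INTO subtitle (text, language) VALUES (?, ?)",
    ("Example film", "en"),
).lastrowid
conn.executemany(SEGMENT_INSERT, [
    (sub_id, to_us(timedelta(seconds=0)), to_us(timedelta(seconds=2)), "Hello."),
    (sub_id, to_us(timedelta(seconds=2)), to_us(timedelta(seconds=5)), "How are you?"),
])


def insert_is_rejected(params) -> bool:
    """Return True if the database refuses to insert the segment row."""
    try:
        conn.execute(SEGMENT_INSERT, params)
    except sqlite3.IntegrityError:
        return True
    return False


# A segment that ends before it starts violates the CHECK constraint.
backwards_rejected = insert_is_rejected(
    (sub_id, to_us(timedelta(seconds=9)), to_us(timedelta(seconds=7)), "backwards")
)
# A segment pointing at a subtitle that does not exist violates the foreign key.
orphan_rejected = insert_is_rejected(
    (999, to_us(timedelta(seconds=0)), to_us(timedelta(seconds=1)), "orphan")
)

# Filtering: which segment should be on screen three seconds in?
at = to_us(timedelta(seconds=3))
visible = [row[0] for row in conn.execute(
    "SELECT text FROM segment "
    "WHERE subtitle_id = ? AND start_us <= ? AND ? < end_us "
    "ORDER BY start_us",
    (sub_id, at, at),
)]

# Aggregating: total time covered by the subtitle's segments.
total_us = conn.execute(
    "SELECT SUM(end_us - start_us) FROM segment WHERE subtitle_id = ?",
    (sub_id,),
).fetchone()[0]
total = timedelta(microseconds=total_us)
```

With the whole object stored as one string, none of these checks or queries would be available to the database; the application would have to load and parse every blob itself.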

You can convert from and to JSON with serializers [drf-doc]:

from rest_framework import serializers


class SegmentSerializer(serializers.ModelSerializer):
    class Meta:
        model = Segment
        fields = ['startTimestamp', 'endTimestamp', 'text']


class SubtitleSerializer(serializers.ModelSerializer):
    segments = SegmentSerializer(many=True)

    class Meta:
        model = Subtitle
        fields = ['text', 'language', 'segments']
