[Answer]-Django running a migration on multi-tenant DB

0đź‘Ť

âś…

Fine, not so many people seem to have experience or to be concerned with this quite specific problem. I have tried some things here and there and I also got some support from the South mailing-list that helped me to understand some points.

Basically, the solution I implemented is the following:

I have a quite normal migration file autogenerated via South’s schemamigration. But I have changed the table name of the add_column and delete_column to schema.table_name. The schema is provided by importing the multi-tenant middleware.

The migration is then applied only if the schema is not run against the public schema. It is actually not meant to be run standalone, or only with database and schema kwargs, but rather from a migration runner that is a new django command.

The runner has unfortunately to call the migration externally, in order to go through the middleware each time again. One other trick is that we have to get the previous state of migration, in order to fake it back to the previous state for south after each tenant migration.

Here is my snippet :

from subprocess import call
import os
from django.core.management.base import BaseCommand    
from south.models import MigrationHistory
from myapp.models import MyModel

class Command(BaseCommand):

    def handle(self, *args, **options):
        #the only allowed arg is the prefix version and it should have a length of 4 (i.e. 0002)
        applied = MigrationHistory.objects.filter(app_name='myapp').latest('applied')
        current_version = applied.migration[:4]
        call_args = ['python', os.path.join('bin', 'manage.py'), 'migrate', 'myorderbird.app.backups']
        if len(args) == 1 and len(args[0]) == 4:
            call_args.append(args[0])

        obje_call_args = None
        for obje in MyModel.objects.all():
            if obje.schema_exists:
                # fake the migration of the previous venue back to the current version
                if obje_call_args:
                    obje_call_args = obje_call_args[:4] + [current_version, '--fake'] + obje_call_args[len(obje_call_args)-3:]
                    call(obje_call_args)
                # migrate the venue in the loop
                obje_call_args = list(call_args)
                obje_call_args.extend(['--database={}'.format(obje.db), '--schema={}'.format(obje.schema)])
                call(venue_call_args)
👤Rmatt

1đź‘Ť

This is an outline of a solution, as posted on the South mailing list. The question as phrased is a little different from the one that was posted on the list: There, it was also mentioned that there are “common” tables, shared between all tenants, in a separate schema. Rmatt’s own answer refers to this as the public schema.

The basic idea of my solution: Save the migration history for each database (schema) in the schema. To do this, we need to employ some database and Django tricks.

This implies that history records for migrations of apps on the public schema are saved in the public schema, while history for migrations of tenant apps is saved in the tenant schema — effectively sharding the migration history table. Django does not really support this kind of sharding; it is easy enough to set up the writing by instance content, but there’s no way to set up the reading.

So I suggested to create, per tenant, a “tenant-helper” schema, containing one view, named south_migrationhistory, which is a union of the south_migrationhistory tables from the public and tenant schemata. Then, set up a database router for the South MigrationHistory model, instructing it to:

  • syncdb to both public and tenant schemata
  • read always from tenant-helper schema
  • write to public or tenant schema, according to the app the migration belongs to

The result allows proper treatment of dependencies from the tenant app migrations to the public app migrations; and it means all you need to do, to migrate forwards, is loop on the migrate --all (or syncdb --migrate) command — no need to fake backward migrations. The migrations for the public schema will be run with the migrations for the first tenant in the loop, and all other tenants will “see” them.

As an afterthought, it is probably possible to do this also without a helper schema – by renaming the south_migrationhistory table in the tenant schema, and installing a view with that name in the schema that returns the above-mentioned union when queried, and has an “instead-of insert” trigger to write to the renamed table.

👤Shai Berger

Leave a comment