[Django]-Database migrations on django production

23👍

✅

I think there are two parts to this problem.

The first is managing the database schema and its changes. We do this using South, keeping both the working models and the migration files in our SCM repository. For safety (or paranoia), we take a dump of the database before (and if we are really scared, after) running any migrations. South has been adequate for all our requirements so far.

The second is deploying the schema change, which goes beyond just running the migration file generated by South. In my experience, a change to the database normally requires a change to deployed code. If you have even a small web farm, keeping the deployed code in sync with the current version of your database schema may not be trivial, and it gets worse once you consider the different caching layers and the effect on users who are already active on the site. Different sites handle this problem differently, and I don’t think there is a one-size-fits-all answer.


Solving the second part of this problem is not necessarily straightforward, and there is not enough information about your website and environment to suggest the solution most suitable for your situation. However, I think there are a few considerations that can help guide deployment in most situations.

Taking the whole site (web servers and database) offline is an option in some cases. It is certainly the most straightforward way to manage updates. But frequent downtime (even when planned) can be a good way to go out of business quickly, makes it tiresome to deploy even small code changes, and might take many hours if you have a large dataset and/or a complex migration. That said, for the sites I help manage (which are all internal and generally only used during working hours on business days) this approach works wonders.

Be careful if you make the changes on a copy of your master database. The main problem here is that your site is still live, and presumably accepting writes to the database. What happens to data written to the master database while you are busy migrating the clone for later use? Your site has to either be down the whole time or be put into some read-only state temporarily, otherwise you’ll lose those writes.

If your changes are backwards compatible, and you have a web farm, sometimes you can get away with updating the live production database server (which I think is unavoidable in most situations) and then incrementally updating nodes in the farm by taking them out of the load balancer for a short period. This can work OK. The main problem, however, is that if a node that has already been updated sends a request for a URL which an older node doesn’t support, the request will fail, because you can’t manage that at the load balancer level.

I’ve seen/heard a couple of other ways work well.

The first is wrapping all code changes in a feature lock which is configurable at run-time through some site-wide configuration option. This essentially means you can release code with all your changes turned off, and then, after you have made all the necessary updates to your servers, change the configuration option to enable the feature. But this makes for quite heavy code…
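A minimal sketch of such a run-time switch. Everything here is hypothetical and only for illustration: the in-memory `FLAGS` store (a real site would keep it in a shared database table or cache so that every node sees the same value) and the names `feature_enabled`, `render_profile` and `show_phone`.

```python
# Hypothetical site-wide flag store; in production this would live in a
# shared database table or cache so every node sees the same value.
FLAGS = {"show_phone": False}


def feature_enabled(name):
    """Return True only if the flag exists and is switched on."""
    return FLAGS.get(name, False)


def render_profile(user):
    """Build the profile fields, using the new column only when the flag is on."""
    fields = {"name": user["name"]}
    if feature_enabled("show_phone"):
        # New code path: depends on the migrated schema.
        fields["phone"] = user.get("phone")
    return fields


user = {"name": "alice", "phone": "555-0100"}
before = render_profile(user)   # code deployed with the feature still off
FLAGS["show_phone"] = True      # flipped once every node and the schema are updated
after = render_profile(user)
```

The cost mentioned above shows up in the `if feature_enabled(...)` branch: every change guarded this way carries both code paths until the flag is removed.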

The second is letting the code manage the migration. I’ve heard of sites where changes to the code are written in such a way that they handle the migration at runtime. The code is able to detect the version of the schema being used and the format of the data it got back: if the data is from the old schema it does the migration in place, and if the data is already from the new schema it does nothing. Through natural site usage a high proportion of your data will be migrated by people using the site; the rest you can do with a migration script whenever you like.
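A rough sketch of that idea, using plain dictionaries in place of database rows. The `schema_version` field, the name-splitting change, and the helpers `migrate_record` and `load_user` are all assumptions made up for the example, not taken from any particular site.

```python
CURRENT_VERSION = 2


def migrate_record(record):
    """Upgrade one record in place to the current format; safe to call repeatedly."""
    if record.get("schema_version", 1) >= CURRENT_VERSION:
        return False  # already in the new format, nothing to do
    # Hypothetical change: version 1 stored a single "name" field,
    # version 2 stores "first_name" and "last_name".
    first, _, last = record.pop("name", "").partition(" ")
    record["first_name"] = first
    record["last_name"] = last
    record["schema_version"] = CURRENT_VERSION
    return True


def load_user(store, user_id):
    """Fetch a record, migrating it on first access after the deploy."""
    record = store[user_id]
    migrate_record(record)  # no-op for records already migrated
    return record


store = {
    1: {"name": "Ada Lovelace"},  # old format
    2: {"first_name": "Alan", "last_name": "Turing", "schema_version": 2},
}
ada = load_user(store, 1)
alan = load_user(store, 2)
```

A background script can later call `migrate_record` on whatever records nobody has touched, since calling it on an already migrated record does nothing.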

But I think at this point Google becomes your friend, because as I say, the solution is very context specific and I’m worried this answer will start to get meaningless… Search for something like “zero down time deployment” and you’ll get results such as this with plenty of ideas…

4👍

I use South for a production server with a codebase of ~40K lines and we have had no problems so far. We have also been through a couple of major refactors for some of our models and we have had zero problems.

One thing that we also have is version control on our models, which helps us revert any changes we make to models on the software side, with South being more for the actual data. We use Django Reversion.

3👍

I have sometimes taken an unconventional approach (reading the other answers, perhaps it’s not that unconventional) to this problem. I had never tried it with Django, so I just did some experiments with it.

In short, I let the code catch the exception resulting from the old schema and apply the appropriate schema upgrade. I don’t expect this to be the accepted answer – it is only appropriate in some cases (and some might argue never). But I think it has an ugly-duckling elegance.

Of course, I have a test environment which I can reset back to the production state at any point. Using that test environment, I update my schema and write code against it – as usual.

I then revert the schema change and test the new code again. I catch the resulting errors, perform the schema upgrade and then re-try the erring query.

The upgrade function must be written so it will “do no harm” so that if it’s called multiple times (as may happen when put into production) it only acts once.

Actual Python code – I put this at the end of my settings.py to test the concept, but you would probably want to keep it in a separate module:

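from django.db.models.sql.compiler import SQLCompiler
from MySQLdb import OperationalError

# Keep a reference to the original method so the wrapper can delegate to it.
orig_exec = SQLCompiler.execute_sql

def new_exec(self, *args, **kw):
    try:
        return orig_exec(self, *args, **kw)
    except OperationalError as e:
        if e.args[0] != 1054:  # anything other than "unknown column"
            raise
        # The query hit the old schema: upgrade it, then retry the query once.
        upgradeSchema(self.connection)
        return orig_exec(self, *args, **kw)

SQLCompiler.execute_sql = new_exec

def upgradeSchema(conn):
    # Safe to call repeatedly: a "duplicate column" error means it already ran.
    cursor = conn.cursor()
    try:
        cursor.execute("alter table users add phone varchar(255)")
    except OperationalError as e:
        if e.args[0] != 1060:  # anything other than "duplicate column name"
            raise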

Once your production environment is up to date, you are free to remove this self-upgrade code from your codebase. But even if you don’t, the code isn’t doing any significant unnecessary work.

You would need to tailor the exception class (MySQLdb.OperationalError in my case) and numbers (1054 “unknown column” / 1060 “duplicate column” in my case) to your database engine and schema change, but that should be easy.

You might want to add some additional checks to ensure the SQL being executed is actually erring because of the schema change in question rather than some other problem, but even if you don’t, this should re-raise unrelated exceptions. The only penalty is that in that situation you’d be trying the upgrade and the bad query twice before raising the exception.
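One such additional check could look at the error message as well as the error number. This is only a sketch: the helper name `is_missing_column_error` is made up, and it assumes MySQL’s message for error 1054 quotes the column name (for example "Unknown column 'users.phone' in 'field list'"), which you should confirm against your own server. A stand-in exception class is used so the snippet runs without MySQL installed.

```python
UNKNOWN_COLUMN = 1054  # MySQL error code for "unknown column"


def is_missing_column_error(exc, column):
    """True if exc looks like MySQL's unknown-column error for this column."""
    args = getattr(exc, "args", ())
    if len(args) < 2 or args[0] != UNKNOWN_COLUMN:
        return False
    # Assumption: the message quotes the column name, optionally qualified
    # with a table name, e.g. "Unknown column 'users.phone' in 'field list'".
    message = str(args[1])
    return "'%s'" % column in message or ".%s'" % column in message


class FakeOperationalError(Exception):
    """Stand-in for MySQLdb.OperationalError so the sketch runs without MySQL."""


ours = FakeOperationalError(1054, "Unknown column 'users.phone' in 'field list'")
other = FakeOperationalError(1054, "Unknown column 'users.fax' in 'field list'")
unrelated = FakeOperationalError(1205, "Lock wait timeout exceeded")
```

With a helper like this, the wrapper would re-raise whenever `is_missing_column_error(e, "phone")` is false, instead of comparing the error number alone.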

One of my favorite things about Python is one’s ability to easily override system methods at run-time like this. It provides so much flexibility.

👤Julian

2👍

If your database is non-trivial and runs on PostgreSQL, you have a whole bunch of excellent options SQL-wise, including:

  • snapshotting and rollback
  • live replication to a backup server
  • trial upgrade then live

The trial upgrade option is nice (but best done in combination with a snapshot):

su postgres
pg_dump <db> > $(date "+%Y%m%d_%H%M").sql
psql template1
# create database upgradetest template current_db;
# \c upgradetest
# \i upgrade_file.sql
...assuming all well...
# \q
pg_dump <db> > $(date "+%Y%m%d_%H%M").sql # we're paranoid
psql <db>
# \i upgrade_file.sql

If you like the above arrangement, but you are worried about the time it takes to run the upgrade twice, you can lock db for writes and then, if the upgrade to upgradetest goes well, rename db to dbold and upgradetest to db. There are lots of options.

If you have an SQL file listing all the changes you want to make, an extremely handy psql command is \set ON_ERROR_STOP 1. This stops the upgrade script in its tracks the moment something goes wrong. And, with lots of testing, you can make sure nothing does.
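If you drive the upgrade from a deploy script rather than an interactive session, the same variable can be set on the psql command line with -v. A small sketch, where the function names are made up and the database and file names are the ones used above:

```python
import subprocess


def build_psql_command(db_name, sql_file):
    """Build a psql invocation that aborts on the first failing statement."""
    return [
        "psql",
        "-v", "ON_ERROR_STOP=1",  # same effect as \set ON_ERROR_STOP 1 in the script
        "-f", sql_file,
        db_name,
    ]


def run_upgrade(db_name, sql_file):
    """Run the upgrade; raises CalledProcessError if psql exits non-zero."""
    subprocess.run(build_psql_command(db_name, sql_file), check=True)


command = build_psql_command("upgradetest", "upgrade_file.sql")
```

Calling `run_upgrade("upgradetest", "upgrade_file.sql")` needs psql on the PATH and a reachable server, so it is left out of the snippet itself.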

There are a whole host of database schema diffing tools available, with a number noted in this StackOverflow answer. But it is basically pretty easy to do by hand …

pg_dump --schema-only production_db > production_schema.sql
pg_dump --schema-only upgraded_db > upgrade_schema.sql
vimdiff production_schema.sql upgrade_schema.sql
or
diff -Naur production_schema.sql upgrade_schema.sql > changes.patch
vim changes.patch (to check/edit)
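The same comparison can be scripted with Python’s standard difflib module if you would rather not shell out to diff. A sketch only: the `schema_diff` helper is made up, and the two short strings stand in for real pg_dump --schema-only output.

```python
import difflib


def schema_diff(old_sql, new_sql):
    """Return a unified diff between two schema dumps given as strings."""
    return list(difflib.unified_diff(
        old_sql.splitlines(),
        new_sql.splitlines(),
        fromfile="production_schema.sql",
        tofile="upgrade_schema.sql",
        lineterm="",
    ))


# Tiny stand-ins for the output of pg_dump --schema-only.
production = "CREATE TABLE users (\n    id integer\n);"
upgraded = "CREATE TABLE users (\n    id integer,\n    phone varchar(255)\n);"
changes = schema_diff(production, upgraded)
```

Printing "\n".join(changes) gives the same kind of patch text as the diff -Naur command above.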
👤rorycl

1👍

South isn’t used everywhere. In my organization, for example, we have three levels of code testing: a local dev environment, a staging dev environment, and production.

Local dev is in the developer’s hands, where he can play according to his needs. Then comes staging dev, which is kept identical to production, of course, until a db change has to be made on the live site. In that case we make the db changes on staging first, check that everything is working fine, and then manually change the production db, making it identical to staging again.

👤iamkhush

0👍

If it’s not trivial, you should have a pre-prod database/app that mimics the production one, to avoid downtime on production.

👤araldhafeeri
