[Fixed]-Using Django DB connection in custom threaded scripts

0👍

✅

Each item in the connections object returns a thread-local connection to that database. By default, these connections cannot be shared between threads; attempting to do so will result in a DatabaseError.

Always use connections[alias] within the thread that executes your queries. Never access connections[alias] in the parent thread and pass the object to the child thread. This will ensure that every connection object you use is local to the current thread, avoiding any threading issues.
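The thread-local behaviour described above can be illustrated without Django. The following is a minimal standard-library sketch of the pattern, not Django's actual implementation: `get_connection`, `worker`, and `run_workers` are hypothetical names, and an in-memory SQLite database stands in for the real one.

```python
import sqlite3
import threading

# Hypothetical stand-in for Django's `connections` handler: each thread
# lazily gets its own connection, stored in thread-local storage.
_local = threading.local()


def get_connection():
    """Return the calling thread's own connection, creating it on first use."""
    if not hasattr(_local, "conn"):
        _local.conn = sqlite3.connect(":memory:")
    return _local.conn


def worker(seen, lock):
    # Look the connection up inside the thread that runs the query.
    conn = get_connection()
    # A second lookup in the same thread returns the same object.
    assert conn is get_connection()
    conn.execute("SELECT 1").fetchone()
    with lock:
        seen.append(conn)  # keep a reference so the objects can be compared
    conn.close()  # close in the thread that opened the connection


def run_workers(n=3):
    """Start n threads and return the connection object each one used."""
    seen, lock = [], threading.Lock()
    threads = [threading.Thread(target=worker, args=(seen, lock)) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return seen
```

Calling `run_workers(3)` returns three different connection objects, one per thread, because each thread performed its own lookup instead of receiving an object from the parent.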

To fix your code and make it thread-safe, you would change it like this:

from django.db import connections
import threading

class Transform(object):
    def transform_data(self, listing):
        # Access the database connection on the global `connections` object
        # from within the child thread.
        cursor = connections['legacy'].cursor()
        # Query parameters must be passed as a sequence, not a bare value.
        cursor.execute('SELECT ... WHERE id = %s', [listing.id])
        data = cursor.fetchall()
        ...

    def run(self):
        for listing in listings:
            # A Thread does nothing until start() is called.
            threading.Thread(target=self.transform_data, args=[listing]).start()
👤knbk

1👍

Ideally, each thread should use its own connection. If it does, the select query you execute inside transform_data essentially gives you a snapshot of the data at that point in time. You can then retrieve the rows without worrying about other threads updating or deleting them, provided that those threads use their own connections.

If all threads share the same connection, what happens depends heavily on which database you are using and on its transaction isolation level.
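As one concrete data point, Python's standard-library `sqlite3` driver refuses cross-thread sharing by default. The sketch below uses only the standard library, not Django, and the function names are hypothetical. Other databases and drivers may behave differently.

```python
import sqlite3
import threading


def use_shared_connection(conn, errors):
    """Try to query a connection that was opened in a different thread."""
    try:
        conn.execute("SELECT 1")
    except sqlite3.ProgrammingError as exc:
        # With the default check_same_thread=True, sqlite3 rejects the call.
        errors.append(exc)


def share_across_threads():
    """Open a connection here, use it in a child thread, return any errors."""
    conn = sqlite3.connect(":memory:")  # opened in the calling thread
    errors = []
    t = threading.Thread(target=use_shared_connection, args=(conn, errors))
    t.start()
    t.join()
    conn.close()
    return errors
```

Here the child thread's query fails with `sqlite3.ProgrammingError` because the connection object was created in the parent thread and then handed over.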

👤e4c5
