[Django]-How to tell if a task has already been queued in django-celery?

1👍

You can cheat a bit by storing the task state and result yourself in the database. Here is how that helps.

For example, with an RDBMS (a table with the columns task_id, state, result):

View part:

  1. Use transaction management.
  2. Use SELECT FOR UPDATE to get the row where task_id == "long-task-%d" % user_id. SELECT FOR UPDATE blocks other requests for the same row until this one COMMITs or ROLLBACKs.
  3. If the row doesn't exist: set the state to PENDING, start 'some_long_task', COMMIT, and end the request.
  4. If the state is PENDING: inform the user.
  5. If the state is SUCCESS: set the state to PENDING, start the task, return the file pointed to by the 'result' column, and COMMIT. This assumes that you want to re-run the task whenever the result is fetched.
  6. If the state is ERROR: set the state to PENDING, start the task, inform the user, and COMMIT.

Task part:

  1. Prepare the file inside a try/except block.
  2. On success: UPDATE the proper row with state = SUCCESS and the result.
  3. On failure: UPDATE the proper row with state = ERROR.
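To make the flow concrete, here is a rough sketch of both parts. It is not Django or Celery code: it uses the standard-library sqlite3 module so it runs anywhere, and since SQLite has no SELECT FOR UPDATE, `BEGIN IMMEDIATE` stands in for the row lock. The table name `task_state`, the function names, and the `start_task` / `build_file` callbacks are all made up for illustration. The sketch also queues the task right after the COMMIT rather than before it, which is my own choice so that a worker never looks for a row that is not committed yet.

```python
import sqlite3

SCHEMA = """CREATE TABLE IF NOT EXISTS task_state (
    task_id TEXT PRIMARY KEY, state TEXT NOT NULL, result TEXT)"""


def open_db(path=":memory:"):
    # isolation_level=None puts sqlite3 in autocommit mode, so the explicit
    # BEGIN/COMMIT statements below are what control the transaction.
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute(SCHEMA)
    return conn


def handle_request(conn, user_id, start_task):
    """View part. Returns (message, previous_result)."""
    task_id = "long-task-%d" % user_id
    # Stand-in for SELECT ... FOR UPDATE: take the write lock up front.
    conn.execute("BEGIN IMMEDIATE")
    try:
        row = conn.execute(
            "SELECT state, result FROM task_state WHERE task_id = ?", (task_id,)
        ).fetchone()
        state, result = row if row else (None, None)
        if state == "PENDING":
            conn.execute("COMMIT")
            return ("already queued", None)
        # No row yet, SUCCESS, or ERROR: mark the task PENDING again.
        if row is None:
            conn.execute(
                "INSERT INTO task_state (task_id, state) VALUES (?, 'PENDING')",
                (task_id,),
            )
        else:
            conn.execute(
                "UPDATE task_state SET state = 'PENDING' WHERE task_id = ?",
                (task_id,),
            )
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
    start_task(task_id)  # hypothetical callback, e.g. something that calls .delay()
    return ("queued", result if state == "SUCCESS" else None)


def run_task(conn, task_id, build_file):
    """Task part. build_file is a hypothetical callable returning a file path."""
    try:
        path = build_file()
    except Exception:
        conn.execute(
            "UPDATE task_state SET state = 'ERROR' WHERE task_id = ?", (task_id,)
        )
        return
    conn.execute(
        "UPDATE task_state SET state = 'SUCCESS', result = ? WHERE task_id = ?",
        (path, task_id),
    )
```

In a real Django project the same steps would go through the ORM and your actual database's row locking; the point here is only the state machine.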

5👍

I solved this with Redis. Just set a key in Redis for each task, then remove the key in the task's after_return method. Redis is lightweight and fast.
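Roughly, the idea looks like the sketch below. The key format, the `ttl` expiry, and the names `queue_once` and `release` are assumptions of mine, not part of the original setup. `InMemoryRedis` is a tiny stand-in so the example runs without a server; it only mimics the `set(name, value, nx=..., ex=...)` and `delete(name)` calls that a redis-py client provides.

```python
import time


class InMemoryRedis:
    """Stand-in for a Redis client; supports only set(nx=, ex=) and delete()."""

    def __init__(self):
        self._data = {}

    def set(self, name, value, ex=None, nx=False):
        now = time.monotonic()
        expiry = self._data.get(name, (None, None))[1]
        live = name in self._data and (expiry is None or expiry > now)
        if nx and live:
            return None  # redis-py also returns None when NX blocks the write
        self._data[name] = (value, now + ex if ex else None)
        return True

    def delete(self, name):
        return 1 if self._data.pop(name, None) is not None else 0


def queue_once(client, user_id, enqueue, ttl=3600):
    """Queue the task only if no key exists for it yet."""
    key = "long-task-%d" % user_id
    # SET with NX succeeds only when the key is absent, so the check and
    # the write happen in one step. The expiry (an assumption here) keeps a
    # crashed worker from leaving the key behind forever.
    if not client.set(key, "queued", nx=True, ex=ttl):
        return False
    try:
        enqueue(user_id)  # hypothetical callback that actually queues the task
    except Exception:
        client.delete(key)
        raise
    return True


def release(client, user_id):
    # Intended to be called from the task's after_return method.
    client.delete("long-task-%d" % user_id)
```

With a real server you would pass a redis-py client instead of `InMemoryRedis`; the two functions do not change.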

5👍

I don't think (as Tomek and others have suggested) that using the database is the way to do this locking. Django has a built-in cache framework, which should be sufficient to accomplish this locking, and is much faster. See:

http://docs.celeryproject.org/en/latest/tutorials/task-cookbook.html#cookbook-task-serial

Django can be configured to use memcached as its cache backend, and this can be distributed across multiple machines. This seems better to me. Thoughts?
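For what it's worth, a lock in the spirit of that cookbook recipe could look something like this. It relies on the cache's `add(key, value, timeout)` only storing the value when the key is absent, which is how Django's cache API behaves. `DictCache` is a made-up stand-in so the sketch runs without Django, and `task_lock`, `export_for_user`, and the 600-second timeout are my own placeholders. Note that this guards the task while it runs, rather than at the moment it is queued.

```python
import time
from contextlib import contextmanager


class DictCache:
    """Stand-in for a Django cache backend; supports only add() and delete()."""

    def __init__(self):
        self._data = {}

    def add(self, key, value, timeout=None):
        now = time.monotonic()
        entry = self._data.get(key)
        if entry is not None and (entry[1] is None or entry[1] > now):
            return False  # key already present and not expired
        self._data[key] = (value, None if timeout is None else now + timeout)
        return True

    def delete(self, key):
        self._data.pop(key, None)


@contextmanager
def task_lock(cache, lock_id, timeout=600):
    # add() stores the key only if it is missing, so at most one caller
    # sees acquired == True until the key is deleted or expires.
    acquired = cache.add(lock_id, "locked", timeout)
    try:
        yield acquired
    finally:
        if acquired:
            cache.delete(lock_id)


def export_for_user(cache, user_id, build_file):
    """Body of a hypothetical task; build_file is a placeholder callable."""
    with task_lock(cache, "long-task-%d" % user_id) as acquired:
        if not acquired:
            return None  # someone else is already working on this user
        return build_file()
```

In a Django project you would pass `django.core.cache.cache` instead of `DictCache`. The timeout matters: if a worker dies while holding the lock, the key expires on its own.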
