[Django]-What's the best way to implement parallel tasks with Django and Elastic beanstalk?

3👍

Update:
I found an alternative solution for this that is simpler and more stable. See my answer in this question: How do you run a worker with AWS Elastic Beanstalk?

I just needed to figure this out for a project I am working on. It took some tinkering, but in the end the solution is quite easy to implement. You can add three files “dynamically” to the server using the files: directive in an ebextensions config. The three files are:

  1. A script that starts the daemon (located in /etc/init.d/)
  2. A config file, configuring the daemon starting script, located in /etc/default/
  3. A shell script that copies the env vars from your app to the environment of celeryd and starts the service (post deployment)

The start script can be the default from the repository, so it is sourced directly from GitHub.

The config has to be adapted to your project. You need to add your own app’s name into the CELERY_APP setting, and you can pass additional arguments to the worker through the CELERYD_OPTS setting (for instance, the concurrency value could be set there).

Then you also need to pass your environment variables for your project to the worker daemon, as it needs the same environment variables as the main app. One example is the AWS credentials that the celery worker needs in order to connect to SQS and possibly S3. You can do that by simply appending the env vars from the current app to the configuration file:

cat /opt/python/current/env | tee -a /etc/default/celeryd
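To see what that command does, here is a self-contained sketch using stand-in files under /tmp instead of the real paths (/opt/python/current/env and /etc/default/celeryd); the key value is made up:

```shell
# Stand-in demo of the env-copy step; uses /tmp paths instead of the
# real /opt/python/current/env and /etc/default/celeryd.
mkdir -p /tmp/eb_demo
# The EB env file contains shell export lines for the app's variables.
printf 'export AWS_ACCESS_KEY_ID="example-key"\n' > /tmp/eb_demo/env
# The celeryd config as written by the ebextension.
printf 'CELERY_APP="yourappname"\n' > /tmp/eb_demo/celeryd
# Same effect as: cat /opt/python/current/env | tee -a /etc/default/celeryd
cat /tmp/eb_demo/env | tee -a /tmp/eb_demo/celeryd
# The config file now ends with the app's env vars.
cat /tmp/eb_demo/celeryd
```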

Finally, the celery worker should be started. This step needs to happen after the codebase has been deployed to the server, so it needs to be activated “post” deployment. You can do that by using the undocumented post-deploy hooks. Any shell file in /opt/elasticbeanstalk/hooks/appdeploy/post/ will be executed by Elastic Beanstalk post deployment. So you can add a service celeryd restart command into a script file in that folder. For convenience, I placed both the copying of environment variables and the start command in one file.
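As a rough illustration of the hook mechanism (using a mock directory in place of /opt/elasticbeanstalk/hooks/appdeploy/post/, and an echo in place of the real restart), Elastic Beanstalk simply executes every script it finds in that folder after a deploy:

```shell
# Mock of the post-deploy hook directory; EB executes each script in it
# after the new codebase is in place.
mkdir -p /tmp/hooks_demo/post
cat > /tmp/hooks_demo/post/myapp_restart_celeryd.sh <<'EOF'
#!/usr/bin/env bash
echo "would run: service celeryd restart"
EOF
chmod +x /tmp/hooks_demo/post/myapp_restart_celeryd.sh
# EB runs the scripts in the directory one by one, roughly like this:
for script in /tmp/hooks_demo/post/*.sh; do "$script"; done
```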

Note that you cannot use the services: directive directly to start the daemon, as this would try to start the celeryd worker before the codebase is deployed to the server, so it won’t work (hence the “post” deploy script).

OK, putting all that together, the only thing needed is to create a file ./ebextensions/celery.config in the main directory of your codebase with the following content (adapted to your codebase, of course):

files:
  "/etc/init.d/celeryd":
    mode: "000755"
    owner: root
    group: root
    source: https://raw2.github.com/celery/celery/22ae169f570f77ae70eab03346f3d25236a62cf5/extra/generic-init.d/celeryd

  "/etc/default/celeryd":
    mode: "000755"
    owner: root
    group: root
    content: |
      CELERYD_NODES="worker1"
      CELERY_BIN="/opt/python/run/venv/bin/celery"
      CELERY_APP="yourappname"
      CELERYD_CHDIR="/opt/python/current/app"
      CELERYD_OPTS="--time-limit=30000"
      CELERYD_LOG_FILE="/var/log/celery/%N.log"
      CELERYD_PID_FILE="/var/run/celery/%N.pid"
      CELERYD_USER="ec2-user"
      CELERYD_GROUP="ec2-user"
      CELERY_CREATE_DIRS=1

  "/opt/elasticbeanstalk/hooks/appdeploy/post/myapp_restart_celeryd.sh":
    mode: "000755"
    owner: root
    group: root
    content: |
      #!/usr/bin/env bash
      # Copy env vars to celeryd and restart service
      su -c "cat /opt/python/current/env | tee -a /etc/default/celeryd" $EB_CONFIG_APP_USER
      su -c "service celeryd restart" $EB_CONFIG_APP_USER

services: 
  sysvinit:
    celeryd:
      enabled: true
      ensureRunning: false

Hope this helps.

👤yellowcap
