10👍
You are processing ~2.78 tasks/second (5000 tasks in 30 mins) which I can agree isn’t that high. You have 3 nodes each running with a concurrency of 8 so you should be able to process 24 tasks in parallel.
Things to check:
CELERYD_PREFETCH_MULTIPLIER
– This is set to 4 by default but if you have lots of short tasks it can be worthwhile to increase it. It will reduce the impact of the time to take the messages from the broker at the cost that tasks will not be as evenly distributed across workers.
DB connection/queries – I count 5+ DB queries being executed for the successful case. If you are using the default result backend for django-celery there are additional queries for storing the task result in the DB. django-celery will also close and reopen the DB connection after each task which adds some overhead. If you have 5 queries and each one takes 100ms then your task will take at least 500ms with or without celery. Running the queries by themselves is one thing but you also need to ensure that nothing in your task is locking the table/rows preventing other tasks from running efficiently in parallel.
Gateway response times – Your task appears to call a remote service which I’m assuming is an SMS gateway. If that server is slow to respond then your task will be slow. Again the response times might be different for a single call vs when you are doing this at peak load. In the US, long-code SMS can only be sent at a rate of 1 per second and depending on where the gateway is doing that rate-limiting then it might be slowing down your task.