It sounds like you have one worker per celeryd. That seems wrong. You should have dozens of workers per celeryd. Keep raising the number of workers (and lowering the number of celeryd instances) until your system is very busy and very slow.



S. Lott is right. The main instance consumes messages and delegates them to worker pool processes. There is probably no point in running 300 pool processes on a single machine! Try 4 or 5 multiplied by the number of CPU cores. You may gain something by running more than one celeryd with a few processes each, as some people have, but you would have to experiment for your application.

See http://celeryq.org/docs/userguide/workers.html#concurrency
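As a rough illustration of the "4 or 5 per core" rule of thumb above, here is a minimal sketch that computes a starting pool size. The helper name `suggested_concurrency` and its parameters are made up for this example; only `multiprocessing.cpu_count()` and celeryd's `--concurrency` option are real. Treat the result as a starting point to tune, not a recommendation.

```python
import multiprocessing


def suggested_concurrency(per_core=4, cores=None):
    """Return a starting pool size: per_core processes for each CPU core.

    Hypothetical helper for illustration; not part of Celery.
    """
    if cores is None:
        cores = multiprocessing.cpu_count()
    return per_core * cores


if __name__ == "__main__":
    n = suggested_concurrency()
    # celeryd takes the pool size via --concurrency (short form: -c).
    print("celeryd --concurrency=%d" % n)
```

On an 8-core machine with `per_core=5` this gives 40 pool processes, far below the 300 mentioned above.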

For the upcoming 2.2 release we’re working on Eventlet pool support. This may
be a good alternative for IO-bound tasks, since it will let you run 1000+ threads
with minimal memory overhead, but it’s still experimental and bugs are being fixed
for the final release.

See http://groups.google.com/group/celery-users/browse_thread/thread/94fbeccd790e6c04

The upcoming 2.2 release also has support for autoscale, which adds and removes processes on demand. See the Changelog:
(this changelog is not completely written yet)



The natural number of workers is close to the number of cores you have. The workers are there so that CPU-intensive tasks can use an entire core efficiently. The broker is there so that requests that don’t have a worker on hand to process them are kept queued. The number of queues can be high, but that doesn’t mean you need a high number of brokers. A single broker should suffice, or you could shard queues to one broker per machine if it later turns out that fast worker-queue interaction is beneficial.

Your problem seems unrelated to that. I’m guessing that your agencies don’t provide a message queue API, and you have to keep lots of requests around. If so, you need a few (emphasis on not many) evented processes, for example ones based on Twisted or Node.js.
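To show what "a few evented processes" buys you, here is a minimal sketch that uses Python's standard `asyncio` in place of Twisted or Node.js. The function `poll_agency` and the 50 ms delay are assumptions that stand in for a slow network call. The point is only that one process can keep hundreds of waiting requests in flight at once.

```python
import asyncio
import time


async def poll_agency(request_id, delay=0.05):
    # Hypothetical stand-in for a slow network call to an agency.
    # asyncio.sleep yields control, so other requests progress meanwhile.
    await asyncio.sleep(delay)
    return request_id


async def handle_all(count):
    # All requests are in flight at once inside one process and one thread.
    return await asyncio.gather(*(poll_agency(i) for i in range(count)))


def run_demo(count=300):
    start = time.monotonic()
    results = asyncio.run(handle_all(count))
    return results, time.monotonic() - start


if __name__ == "__main__":
    results, elapsed = run_demo()
    print("handled %d requests in %.2fs" % (len(results), elapsed))
```

Because the waits overlap, the total time stays close to a single request's delay rather than 300 times that, and no pool of 300 processes is needed.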



Use autoscaling. This allows the number of workers under each celeryd instance to be increased or decreased as needed. http://docs.celeryproject.org/en/latest/userguide/workers.html#autoscaling
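The worker takes its limits as `--autoscale=max,min`, for example `--autoscale=10,3`. The sketch below illustrates the general idea of growing and shrinking a pool between two bounds. It is a simplified illustration, not Celery's actual autoscaler, and the function name `target_pool_size` is made up for this example.

```python
def target_pool_size(pending_tasks, min_procs, max_procs):
    """Clamp the number of pending tasks to the [min_procs, max_procs] range.

    Hypothetical helper for illustration; not Celery's implementation.
    """
    if min_procs > max_procs:
        raise ValueError("min_procs must not exceed max_procs")
    return max(min_procs, min(pending_tasks, max_procs))


if __name__ == "__main__":
    # With bounds of 3 and 10: an idle worker keeps 3 processes,
    # a moderate load gets one process per task, a burst is capped at 10.
    for pending in (0, 7, 50):
        print(pending, target_pool_size(pending, 3, 10))
```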

