[Django]-Celery worker concurrency


First of all, that command will not start 50 workers, but one worker with 50 processes. I'd also recommend using only as many processes as you have cores available. (Let's say 8 for the rest of my answer.)
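For an 8-core machine the worker would then be started with something like the following (proj is just a placeholder for your own Celery app module):

celery -A proj worker --concurrency=8 --loglevel=info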

My guess here is that the other processes are idle because you only perform one task. If you want to do concurrent work, you'll have to split your work into parts that can be executed concurrently. The easiest way to do this is to create a separate task for every link you want to scrape. The worker will then start scraping 8 links at once, and whenever it finishes one it will pick up the next until all 150 have been scraped.

So the calling code for your task should look roughly like this:

for link in links:
    scrape_link.delay(link)

where scrape_link is your task function, which will look something like this:

@app.task
def scrape_link(link):
    # scrape the link and its sub-links here
    ...
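If the calling code also needs to know when all of the links have been scraped, Celery's group primitive can launch the tasks and wait for the whole batch; a minimal sketch:

from celery import group

# queue one scrape_link task per link and wait for the whole batch to finish
job = group(scrape_link.s(link) for link in links)
result = job.apply_async()
result.get()  # blocks until every link has been scraped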
