You should be able to do this with gevent, something like:
```python
import gevent

from django.core.management.base import BaseCommand

# PexelCrawler and MagdeleineCrawler are your crawler classes;
# import them from wherever they live in your project.


class Command(BaseCommand):
    def handle(self, *args, **options):
        pexel_crawler = PexelCrawler()
        magdeleine_crawler = MagdeleineCrawler()
        pexel_job = gevent.spawn(pexel_crawler.crawl)
        magdeleine_job = gevent.spawn(magdeleine_crawler.crawl)
        gevent.joinall([pexel_job, magdeleine_job])
```
I believe that will work, and keep the management command running in the foreground for as long as both crawlers are running. I would be careful though, because if this works as expected, it will truly be an infinite loop and never stop.
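If you'd rather avoid the gevent dependency, the same spawn-then-join pattern can be sketched with stdlib threads. The crawler class below is a stand-in for the `PexelCrawler`/`MagdeleineCrawler` from the question, and it records a single page rather than looping forever:

```python
import threading


# Stand-in for the crawler classes from the question; a real crawler's
# crawl() would loop indefinitely, this one just records one page.
class DummyCrawler:
    def __init__(self, name):
        self.name = name
        self.pages = []

    def crawl(self):
        self.pages.append(f"{self.name}/page-1")


crawlers = [DummyCrawler("pexels"), DummyCrawler("magdeleine")]

# One thread per crawler, then block until all finish,
# mirroring gevent.spawn + gevent.joinall.
threads = [threading.Thread(target=c.crawl) for c in crawlers]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The same caveat applies: if the `crawl` methods never return, `join()` blocks forever, so the management command will run until you kill it.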
I suggest you use Celery for this.
A crawl can take a long time. Invoking it from the command line is fine while you are driving it by hand, but in production it will be triggered from cron, a view, etc., so it is better to have control over the task life cycle.
Install Celery and the djcelery Django integration package:

```shell
pip install celery
pip install djcelery
```
As a message broker I suggest installing RabbitMQ:

```shell
apt-get install rabbitmq-server
```
In the settings.py of your Django project, add:

```python
import djcelery

djcelery.setup_loader()

# Use the database-backed scheduler so crawls can run on a schedule.
CELERYBEAT_SCHEDULER = 'djcelery.schedulers.DatabaseScheduler'
```
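For reference, a schedule can also be declared statically in settings. This sketch assumes a task registered as `tasks.run_task` and an hourly interval; both the name and the interval are illustrative, and with `DatabaseScheduler` you would normally create periodic tasks from the Django admin instead:

```python
# Illustrative static schedule: run the (hypothetical) tasks.run_task
# task every hour. Intervals given as plain numbers are in seconds.
CELERYBEAT_SCHEDULE = {
    'run-crawlers-hourly': {
        'task': 'tasks.run_task',
        'schedule': 60 * 60.0,  # every hour, in seconds
    },
}
```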
Create a file tasks.py in your project and put this code in it:

```python
from __future__ import absolute_import

from celery import shared_task
from django.core.management import call_command


@shared_task
def run_task():
    call_command('your_management_command', verbosity=3, interactive=False)
```
To monitor your tasks, install flower (it is distributed on PyPI, not as an apt package):

```shell
pip install flower
```
Now start everything up. First the RabbitMQ server:

```shell
service rabbitmq-server start
```

Then the Celery worker:

```shell
service celeryd start
```

And then flower (launched through the celery command line, as it does not ship with an init service) to monitor the execution of your tasks:

```shell
celery flower
```
That's it: you can now run your crawler tasks, and you shouldn't have any trouble with them.