I haven't actually tested the integration with Django REST Framework, but the following snippet lets you run a spider from a Python script and collect the resulting items so you can handle them later.
    from scrapy import signals
    from scrapy.crawler import Crawler, CrawlerProcess

    from ... import MysiteSpider

    items = []

    def collect_items(item, response, spider):
        items.append(item)

    crawler = Crawler(MysiteSpider)
    crawler.signals.connect(collect_items, signals.item_scraped)

    process = CrawlerProcess()
    process.crawl(crawler)
    process.start()  # the script will block here until the crawling is finished

    # at this point, the "items" variable holds the scraped items
For the record, this works, but there may be better ways to do it.
Source: stackexchange.com