[Django]-Use Django's models in a Scrapy project (in the pipeline)


I have created a sample project that runs Scrapy inside a Django project and uses Django models and the ORM in one of the pipelines.

https://github.com/bipul21/scrapy_django

The directory structure starts with your Django project.
In this case the project name is django_project.
Inside that base project, create your Scrapy project (scrapy_project here).

In your Scrapy project's settings, add the following lines to initialize Django:

import os
import sys
import django

sys.path.append(os.path.join(os.path.dirname(os.path.dirname(os.path.abspath(__file__))), ".."))
os.environ['DJANGO_SETTINGS_MODULE'] = 'django_project.settings'

django.setup()
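
To see which directory the sys.path.append line actually adds, here is a small sketch of the same path arithmetic. The function name django_root_for and the /srv/... layout are made up for illustration; the real location depends on where your settings file lives.

```python
import os


def django_root_for(settings_file):
    """Return the directory that the sys.path.append(...) line would add
    for a Scrapy settings file at the given path (hypothetical helper)."""
    # Directory containing settings.py, i.e. the inner Scrapy package.
    scrapy_package_dir = os.path.dirname(os.path.abspath(settings_file))
    # One level up: the Scrapy project directory.
    scrapy_project_dir = os.path.dirname(scrapy_package_dir)
    # ".." from there: the Django project root (normalized for readability).
    return os.path.normpath(os.path.join(scrapy_project_dir, ".."))


# Assumed layout: django_project/scrapy_project/scrapy_project/settings.py
example = "/srv/django_project/scrapy_project/scrapy_project/settings.py"
root = django_root_for(example)
```

With that assumed layout, root ends in django_project, which is the directory that has to be importable for 'django_project.settings' to resolve.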

In the pipeline, I make a simple query against the Questions model:

from questions.models import Questions

class ScrapyProjectPipeline(object):
    def process_item(self, item, spider):
        # Skip items whose identifier is already stored.
        try:
            Questions.objects.get(identifier=item["identifier"])
            print("Question already exists")
            return item
        except Questions.DoesNotExist:
            pass

        # Otherwise build a new Questions row from the item and save it.
        question = Questions()
        question.identifier = item["identifier"]
        question.title = item["title"]
        question.url = item["url"]
        question.save()
        return item
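
As a possible alternative, the lookup-then-save logic can be collapsed with the ORM's get_or_create, which returns an (object, created) pair. This is only a sketch and not what the linked project does: the class name GetOrCreateQuestionPipeline is made up, and the local import is an illustrative choice (a top-level import as in the original works just as well once django.setup() has run).

```python
import logging

logger = logging.getLogger(__name__)


class GetOrCreateQuestionPipeline(object):
    """Hypothetical variant of ScrapyProjectPipeline using get_or_create."""

    def process_item(self, item, spider):
        # Local import: keeps the model import out of module load time.
        from questions.models import Questions

        # The lookup uses identifier only; the defaults are applied
        # solely when a new row has to be created.
        question, created = Questions.objects.get_or_create(
            identifier=item["identifier"],
            defaults={"title": item["title"], "url": item["url"]},
        )
        if not created:
            logger.info("Question already exists: %s", item["identifier"])
        return item
```

Like the original, this leaves an existing row untouched; it just lets the ORM handle the "does it exist?" check.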

See the project repository for further details, such as the model schema.
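
One wiring detail worth spelling out: Scrapy only runs a pipeline that is listed in the ITEM_PIPELINES setting. A minimal sketch for the Scrapy settings file, assuming the pipeline class lives in scrapy_project/pipelines.py (that module path is my assumption, adjust it to your layout):

```python
# Enable the pipeline in the Scrapy settings.
# Keys are dotted paths to pipeline classes; values are integers
# (conventionally 0-1000) and lower numbers run first.
ITEM_PIPELINES = {
    "scrapy_project.pipelines.ScrapyProjectPipeline": 300,  # assumed path
}
```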

👤Bipul Jain
