[Django]-How to import django models in scrapy pipelines.py file



from ..models import MyModel


from ...models import MyModel

Each leading dot is one package level: a single dot is the current package, and every extra dot moves one level further up.
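To see which module a dotted import points at, you can ask the standard library to resolve it. This is only a sketch: the package name `myproject.mybot.mybot` is a hypothetical location for pipelines.py, not something taken from the question.

```python
from importlib.util import resolve_name

# Hypothetical package that contains pipelines.py
package = "myproject.mybot.mybot"

# Two dots: the parent of the current package
one_up = resolve_name("..models", package)
# Three dots: the grandparent of the current package
two_up = resolve_name("...models", package)

print(one_up)
print(two_up)
```

Relative imports only work when the Django app and the Scrapy project live inside one common package; going above the top-level package raises an ImportError.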


In the pipelines you don't import Django models directly; you use Scrapy items bound to a Django model.
You have to configure the Django settings inside the Scrapy settings file, not afterwards.

To use Django models in a Scrapy project you have to use DjangoItem:
https://github.com/scrapy-plugins/scrapy-djangoitem (make it importable from your PYTHONPATH)

My recommended file structure is:

     |     |-Djangoproject
     |     |-DjangoAPP

Then in your Scrapy project you have to add the full path of the Django project to the Python path:

# Set up the Django project's full path.
import sys
sys.path.insert(0, '/home/PycharmProject/scrap/DjangoProject')

# Set the Django settings module name.
import os
os.environ['DJANGO_SETTINGS_MODULE'] = 'DjangoProject.settings'

# Django 1.7+ needs the app registry loaded before any model is imported.
import django
django.setup()
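If you don't want a hard-coded absolute path, the same two steps can be wrapped in a small helper. This is a minimal sketch, not part of scrapy-djangoitem: `configure_django` is a made-up name and the sibling-directory layout is an assumption. Inside a real settings.py you would probably start from `Path(__file__)` instead of the current directory, and you still need to call `django.setup()` afterwards (left out here so the snippet runs without Django installed).

```python
import os
import sys
from pathlib import Path


def configure_django(project_dir, settings_module):
    """Put project_dir on sys.path and point Django at settings_module."""
    project_path = str(Path(project_dir).resolve())
    if project_path not in sys.path:
        sys.path.insert(0, project_path)
    # setdefault keeps a value that was already exported in the shell.
    os.environ.setdefault("DJANGO_SETTINGS_MODULE", settings_module)
    return project_path


# Assumed layout: the Django project sits next to the current directory.
django_dir = Path.cwd().parent / "DjangoProject"
added_path = configure_django(django_dir, "DjangoProject.settings")
```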

Then in your items.py you can bind your Django models to Scrapy items:

from DjangoProject.models import Person, Job
from scrapy_djangoitem import DjangoItem

# Item classes get their own names so they don't shadow the Django models.
class PersonItem(DjangoItem):
    django_model = Person

class JobItem(DjangoItem):
    django_model = Job

Then you can use the .save() method in the pipeline once the spider has yielded an item:


import scrapy
from mybot.items import PersonItem

class ExampleSpider(scrapy.Spider):
    name = "example"
    allowed_domains = ["dmoz.org"]
    start_urls = ['http://www.dmoz.org/World/Espa%C3%B1ol/Artes/Artesan%C3%ADa/']

    def parse(self, response):
        # do stuff
        yield PersonItem(name='zartch')


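from myapp.models import Person

class MybotPipeline:
    def process_item(self, item, spider):
        # get_or_create returns an (object, created) tuple
        obj, created = Person.objects.get_or_create(name=item['name'])
        # Scrapy expects the item back so later pipelines can use it
        return item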
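Keep in mind that Scrapy only calls a pipeline that is enabled in the Scrapy settings. A minimal sketch, assuming the class lives in mybot/pipelines.py (that module path is a guess based on the `mybot.items` import above):

```python
# settings.py of the Scrapy project
ITEM_PIPELINES = {
    # Assumed module path; the number sets the order (lower runs first, 0-1000)
    "mybot.pipelines.MybotPipeline": 300,
}
```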

I have a repository with minimal working code; you just have to set the path of your Django project in the Scrapy settings.

You have to change my Django Project path to your DjangoProject path:

sys.path.insert(0, '/home/zartch/PycharmProjects/Scrapy-Django-Minimal/myweb')
