[Django]-How to import django models in scrapy pipelines.py file

0👍

Try:

from ..models import MyModel

OR

from ...models import MyModel

Each leading dot stands for one package level: one dot is the current package, two dots its parent, three dots the grandparent.
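To check which module a given number of dots points at, a small sketch with the standard library's importlib.util.resolve_name can help. The package name myproject.scraper and the module name models are hypothetical here; substitute your own layout.

```python
from importlib.util import resolve_name

# Hypothetical layout: pipelines.py lives in the package "myproject.scraper".
PACKAGE = "myproject.scraper"


def explain(relative_name, package=PACKAGE):
    """Return the absolute module name a relative import would target."""
    try:
        return resolve_name(relative_name, package)
    except (ImportError, ValueError):
        # Too many dots: the import would climb above the top-level package.
        return None


for name in (".models", "..models", "...models"):
    print(name, "->", explain(name))
```

If a name comes back as None, the target sits outside the package tree and a relative import cannot reach it.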

1👍

In the pipelines you don’t import Django models directly; you use Scrapy items bound to a Django model.
You have to configure the Django settings inside the Scrapy settings module, not afterwards.

To use Django models in a Scrapy project, use DjangoItem from the scrapy-djangoitem package:
https://github.com/scrapy-plugins/scrapy-djangoitem (install it or add it to your PYTHONPATH)

My recommended file structure is:

Projects
 |-DjangoScrapy
     |-DjangoProject
     |     |-Djangoproject
     |     |-DjangoAPP
     |-ScrapyProject
            |-ScrapyProject
                 |-Spiders

Then, in your Scrapy project settings, add the full path of the Django project to the Python path:

# Setting up django's project full path.
import sys
sys.path.insert(0, '/home/PycharmProject/scrap/DjangoProject')

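# Setting up django's settings module name.
import os
os.environ['DJANGO_SETTINGS_MODULE'] = 'DjangoProject.settings'

# Django 1.7+ needs the app registry loaded before models are imported.
import django
django.setup()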
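A hard-coded absolute path breaks as soon as the project moves to another machine. As a minimal sketch, assuming the recommended layout above (the Scrapy settings.py two directories below the folder that also holds DjangoProject), the path can be derived from the settings file itself. The helper name add_django_project and its default arguments are hypothetical.

```python
import os
import sys
from pathlib import Path


def add_django_project(settings_file, django_dir="DjangoProject",
                       settings_module="DjangoProject.settings"):
    """Put the sibling Django project on sys.path and name its settings."""
    # settings_file: .../DjangoScrapy/ScrapyProject/ScrapyProject/settings.py
    # parents[2]:    .../DjangoScrapy
    django_path = Path(settings_file).absolute().parents[2] / django_dir
    if str(django_path) not in sys.path:
        sys.path.insert(0, str(django_path))
    os.environ["DJANGO_SETTINGS_MODULE"] = settings_module
    return django_path


# Intended use inside the Scrapy settings.py (assumption, not run here):
#     add_django_project(__file__)
```

The folder and module names are passed as parameters so a differently named project only needs different arguments, not a different path string.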

Then in your items.py you can bind your Django models to Scrapy items:

from DjangoProject.models import Person, Job
from scrapy_djangoitem import DjangoItem

# Item classes get their own names so they don't shadow the imported models.
class PersonItem(DjangoItem):
    django_model = Person

class JobItem(DjangoItem):
    django_model = Job

Then you can save the object in a pipeline (with the item's .save() method or through the model manager) after the spider yields it:

spider.py

from scrapy import Spider  # BaseSpider is the old name of this class
from mybot.items import PersonItem

class ExampleSpider(Spider):
    name = "example"
    allowed_domains = ["dmoz.org"]
    start_urls = ['http://www.dmoz.org/World/Espa%C3%B1ol/Artes/Artesan%C3%ADa/']

    def parse(self, response):
        # do stuff
        return PersonItem(name='zartch')

pipelines.py

from myapp.models import Person

class MybotPipeline(object):
    def process_item(self, item, spider):
        # get_or_create returns an (object, created) tuple
        Person.objects.get_or_create(name=item['name'])
        # hand the item on to any later pipeline
        return item
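Two details of the pipeline contract are easy to miss: get_or_create returns an (object, created) tuple, and process_item should hand the item back so later pipelines still receive it. The sketch below illustrates both. FakeManager is a hypothetical in-memory stand-in for Person.objects so the example runs without Django, and the name field is assumed from the spider above.

```python
class FakeManager:
    """Hypothetical stand-in for a Django manager such as Person.objects."""

    def __init__(self):
        self.rows = {}

    def get_or_create(self, **fields):
        # Mirrors Django's return shape: (object, created_flag).
        key = tuple(sorted(fields.items()))
        if key in self.rows:
            return self.rows[key], False
        self.rows[key] = dict(fields)
        return self.rows[key], True


class PersonPipeline:
    # In a real project this would be Person.objects.
    manager = FakeManager()

    def process_item(self, item, spider):
        obj, created = self.manager.get_or_create(name=item["name"])
        if created:
            print("stored new person:", obj["name"])
        # Return the item (not the model instance) for the next pipeline.
        return item


pipeline = PersonPipeline()
first = pipeline.process_item({"name": "zartch"}, spider=None)
second = pipeline.process_item({"name": "zartch"}, spider=None)
```

Processing the same name twice leaves a single stored row, which is the reason to prefer get_or_create over a plain create when a spider may revisit pages.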

I have a repository with minimal working code (you just have to set the path of your Django project in the Scrapy settings):
https://github.com/Zartch/Scrapy-Django-Minimal

In
https://github.com/Zartch/Scrapy-Django-Minimal/blob/master/mybot/mybot/settings.py
change my Django project path to the path of your own Django project:

sys.path.insert(0, '/home/zartch/PycharmProjects/Scrapy-Django-Minimal/myweb')
👤Zartch
