26
If anyone else is having the same problem, this is how I solved it.
I added this to my scrapy settings.py file:
def setup_django_env(path):
    import imp, os
    from django.core.management import setup_environ

    f, filename, desc = imp.find_module('settings', [path])
    project = imp.load_module('settings', f, filename, desc)
    setup_environ(project)

setup_django_env('/path/to/django/project/')
Note: the path above is to your django project folder, not the settings.py file.
Now you will have full access to your django models inside of your scrapy project.
21
The opposite solution (set up Scrapy inside a Django management command):
# -*- coding: utf-8 -*-
# myapp/management/commands/scrapy.py
from __future__ import absolute_import

from django.core.management.base import BaseCommand


class Command(BaseCommand):

    def run_from_argv(self, argv):
        self._argv = argv
        self.execute()

    def handle(self, *args, **options):
        from scrapy.cmdline import execute
        execute(self._argv[1:])
and in Django's settings.py:
import os
os.environ['SCRAPY_SETTINGS_MODULE'] = 'scrapy_project.settings'
Then, instead of scrapy foo, run ./manage.py scrapy foo.
UPD: fixed the code to bypass Django's options parsing.
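To see why the command hands self._argv[1:] to Scrapy instead of the full list, here is a minimal sketch of just the slicing, with no Django or Scrapy required. It assumes that scrapy.cmdline.execute treats the first element of the list as the program name, the same way sys.argv[0] is treated. The helper name scrapy_argv and the sample arguments are made up for illustration.

```python
def scrapy_argv(manage_argv):
    """Drop the leading './manage.py' so that 'scrapy' becomes the program name."""
    return list(manage_argv[1:])


# What Django passes to run_from_argv for: ./manage.py scrapy crawl foo -o out.json
django_argv = ['./manage.py', 'scrapy', 'crawl', 'foo', '-o', 'out.json']

print(scrapy_argv(django_argv))
```

Because run_from_argv is overridden, Django never parses the remaining arguments itself, so Scrapy-only flags such as -o are not rejected as unknown options.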
16
Set the DJANGO_SETTINGS_MODULE environment variable in your Scrapy project's settings.py:
import os
os.environ['DJANGO_SETTINGS_MODULE'] = 'your_django_project.settings'
Now you can use DjangoItem in your scrapy project.
Edit:
You have to make sure that your_django_project's settings.py is importable, i.e. that the directory containing your_django_project is on PYTHONPATH.
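If you are not sure whether the settings module is actually reachable, a small check like the one below can fail early with a clearer message. This is only a sketch: the helper names settings_importable and point_scrapy_at_django are hypothetical, and it uses importlib.util.find_spec from the standard library (Python 3.4+). Note that find_spec imports the parent package, so your_django_project/__init__.py will be executed.

```python
import importlib.util
import os
import sys


def settings_importable(module_name):
    """Return True if module_name can be located on the current sys.path."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except ModuleNotFoundError:
        # Raised when the parent package itself cannot be found.
        return False


def point_scrapy_at_django(project_parent, settings_module):
    """Put the Django project on sys.path, verify it, then export its name."""
    if project_parent not in sys.path:
        sys.path.append(project_parent)
    if not settings_importable(settings_module):
        raise ImportError(
            '%s not found under %s' % (settings_module, project_parent))
    os.environ['DJANGO_SETTINGS_MODULE'] = settings_module
```

In the Scrapy settings.py you would then call something like point_scrapy_at_django('/path/to/django', 'your_django_project.settings'), where the first argument is the directory that contains the your_django_project package.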
2
For Django 1.4, the project layout has changed. Instead of /myproject/settings.py, the settings module is in /myproject/myproject/settings.py.
I also added path's parent directory (/myproject) to sys.path to make it work correctly.
def setup_django_env(path):
    import imp, os, sys
    from django.core.management import setup_environ

    f, filename, desc = imp.find_module('settings', [path])
    project = imp.load_module('settings', f, filename, desc)
    setup_environ(project)

    # Add path's parent directory to sys.path
    sys.path.append(os.path.abspath(os.path.join(path, os.path.pardir)))

setup_django_env('/path/to/django/myproject/myproject/')
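The sys.path line is the part that is easy to get wrong, so here it is in isolation. This sketch only shows how the parent directory is derived from the settings directory. The function name project_root is made up for illustration, and the path is the same placeholder as above.

```python
import os


def project_root(settings_dir):
    """Return the directory that contains the inner settings package."""
    # Joining with '..' and normalising handles a trailing slash as well.
    return os.path.abspath(os.path.join(settings_dir, os.path.pardir))


# The inner myproject/ holds settings.py; the outer myproject/ is what
# needs to be on sys.path so that 'myproject.settings' can be imported.
print(project_root('/path/to/django/myproject/myproject/'))
```

With the outer directory on sys.path, imports such as myproject.settings or myapp.models resolve the same way they do under manage.py.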
1
Check out django-dynamic-scraper, it integrates a Scrapy spider manager into a Django site.
0
Why not create an __init__.py file in the Scrapy project folder and add it to INSTALLED_APPS? Worked for me. I was able to simply use:
pipelines.py
from my_app.models import MyModel
Hope that helps.
0
setup_environ is deprecated as of Django 1.4. On newer versions (django.setup() needs Django 1.7 or later), you may need to do the following in Scrapy's settings file:
def setup_django_env():
    import sys, os, django

    sys.path.append('/path/to/django/myapp')
    os.environ['DJANGO_SETTINGS_MODULE'] = 'myapp.settings'
    django.setup()

# The function has to be called for the setup to take effect
setup_django_env()
0
Minor update to the management command above to fix a KeyError, for Python 3 / Django 1.10 / Scrapy 1.2.0:
from django.core.management.base import BaseCommand


class Command(BaseCommand):
    help = 'Scrapy commands. Accessible from: "Django manage.py". '

    def __init__(self, stdout=None, stderr=None, no_color=False):
        super().__init__(stdout=None, stderr=None, no_color=False)

        # Optional attribute declaration.
        self.no_color = no_color
        self.stderr = stderr
        self.stdout = stdout

        # Actual declaration of CLI command
        self._argv = None

    def run_from_argv(self, argv):
        self._argv = argv
        self.execute(stdout=None, stderr=None, no_color=False)

    def handle(self, *args, **options):
        from scrapy.cmdline import execute
        execute(self._argv[1:])
The SCRAPY_SETTINGS_MODULE declaration is still required.
os.environ.setdefault('SCRAPY_SETTINGS_MODULE', 'scrapy_project.settings')
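Note that setdefault behaves differently from the plain assignment used in the earlier answer: it only sets the variable when nothing has set it already. A quick sketch of the difference, using only the standard library (other_project.settings is a made-up value standing in for something exported by the shell):

```python
import os

# Pretend the shell already exported a value (hypothetical module name).
os.environ['SCRAPY_SETTINGS_MODULE'] = 'other_project.settings'

# setdefault leaves an existing value alone ...
os.environ.setdefault('SCRAPY_SETTINGS_MODULE', 'scrapy_project.settings')
kept = os.environ['SCRAPY_SETTINGS_MODULE']

# ... while plain assignment always overwrites it.
os.environ['SCRAPY_SETTINGS_MODULE'] = 'scrapy_project.settings'
forced = os.environ['SCRAPY_SETTINGS_MODULE']

print(kept, forced)
```

Which one you want depends on whether an externally exported SCRAPY_SETTINGS_MODULE should be allowed to override the value in Django's settings.py.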