51👍
Make sure you are not keeping global references to data. That prevents the python garbage collector from releasing the memory.
Don’t use mod_python. It loads an interpreter inside Apache. If you need to use Apache, use mod_wsgi instead. It is not tricky to switch. It is very easy. mod_wsgi is way easier to configure for Django than brain-dead mod_python.
If you can remove Apache from your requirements, that would be even better for your memory usage. Spawning seems to be the new fast, scalable way to run Python web applications.
EDIT: I don’t see how switching to mod_wsgi could be “tricky”. It should be a very easy task. Please elaborate on the problem you are having with the switch.
28👍
If you are running under mod_wsgi, and presumably spawning since it is WSGI compliant, you can use Dozer to look at your memory usage.
Under mod_wsgi just add this at the bottom of your WSGI script:
from dozer import Dozer
application = Dozer(application)
Then point your browser at http://domain/_dozer/index to see a list of all your memory allocations.
I’ll also just add my voice of support for mod_wsgi. It makes a world of difference in terms of performance and memory usage over mod_python. Graham Dumpleton’s support for mod_wsgi is outstanding, both in terms of active development and in helping people on the mailing list to optimize their installations. David Cramer at curse.com has posted some charts (which I can’t seem to find now unfortunately) showing the drastic reduction in cpu and memory usage after they switched to mod_wsgi on that high traffic site. Several of the django devs have switched. Seriously, it’s a no-brainer 🙂
15👍
These are the Python memory profiler solutions I’m aware of (not Django related):
Disclaimer: I have a stake in the latter.
The individual project’s documentation should give you an idea of how to use these tools to analyze memory behavior of Python applications.
The following is a nice “war story” that also gives some helpful pointers:
5👍
Additionally, check that you are not using any of the known leakers. MySQLdb is known to leak enormous amounts of memory with Django due to a bug in its unicode handling. Other than that, Django Debug Toolbar might help you track down the memory hogs.
4👍
In addition to not keeping around global references to large data objects, try to avoid loading large datasets into memory at all wherever possible.
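As a rough sketch of what avoiding large in-memory datasets can look like, the example below streams rows through a generator instead of building a list first. The function names and the sample CSV data are made up for illustration. On the Django side, `QuerySet.iterator()` serves a similar purpose for database results.

```python
import csv
import io


def iter_rows(fileobj):
    """Yield one parsed row at a time instead of building a list of all rows."""
    reader = csv.DictReader(fileobj)
    for row in reader:
        yield row


def total_amount(fileobj):
    """Aggregate while streaming, so only the current row needs to be in memory."""
    total = 0
    for row in iter_rows(fileobj):
        total += int(row["amount"])
    return total


# Small in-memory sample standing in for a large export file.
sample = io.StringIO("id,amount\n1,10\n2,20\n3,30\n")
print(total_amount(sample))
```

The same shape applies to any source you can iterate over lazily: files, cursors, or paginated API results.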
Switch to mod_wsgi in daemon mode, and use Apache’s worker MPM instead of prefork. This latter step can allow you to serve many more concurrent users with much less memory overhead.
4👍
Webfaction actually has some tips for keeping django memory usage down.
The major points:
- Make sure debug is set to false (you already know that).
- Use “ServerLimit” in your apache config
- Check that no big objects are being loaded in memory
- Consider serving static content in a separate process or server.
- Use “MaxRequestsPerChild” in your apache config
- Find out and understand how much memory you’re using
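For the last point, one simple way to see how much memory a Python process has used is the standard library `resource` module. This is only a sketch: it is Unix-only, the helper name `peak_rss_mb` is made up, and it reports the peak resident set size of the current process rather than a per-request figure.

```python
import resource
import sys


def peak_rss_mb():
    """Return this process's peak resident set size in megabytes (Unix only)."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


before = peak_rss_mb()
data = [bytes(1024) for _ in range(50_000)]  # allocate roughly 50 MB
after = peak_rss_mb()
print(f"peak RSS before: {before:.1f} MB, after: {after:.1f} MB")
```

Logging this value from each server process at intervals gives a baseline to compare against after changing settings such as “MaxRequestsPerChild”.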
3👍
Another plus for mod_wsgi: set a maximum-requests parameter in your WSGIDaemonProcess directive and mod_wsgi will restart the daemon process every so often. There should be no visible effect for the user, other than a slow page load the first time a fresh process is hit, as it’ll be loading Django and your application code into memory.
But even if you do have memory leaks, that should keep the process size from getting too large, without having to interrupt service to your users.
3👍
Here is the script I use for mod_wsgi (called wsgi.py, and put in the root of my Django project):
import os
import sys
import django.core.handlers.wsgi
from os import path
sys.stdout = open('/dev/null', 'a+')
sys.stderr = open('/dev/null', 'a+')
sys.path.append(path.join(path.dirname(__file__), '..'))
os.environ['DJANGO_SETTINGS_MODULE'] = 'myproject.settings'
application = django.core.handlers.wsgi.WSGIHandler()
Adjust myproject.settings and the path as needed. I redirect all output to /dev/null since mod_wsgi by default prevents printing. Use logging instead.
For apache:
<VirtualHost *>
    ServerName myhost.com
    ErrorLog /var/log/apache2/error-myhost.log
    CustomLog /var/log/apache2/access-myhost.log common
    DocumentRoot "/var/www"
    WSGIScriptAlias / /path/to/my/wsgi.py
</VirtualHost>
Hopefully this should at least help you set up mod_wsgi so you can see if it makes a difference.
1👍
Caches: make sure they’re being flushed. It’s easy for something to land in a cache but never be GC’d because of the cache reference.
Swig’d code: make sure any memory management is being done correctly. It’s really easy to miss these issues in Python, especially with third-party libraries.
Monitoring: If you can, get data about memory usage and hits. Usually you’ll see a correlation between a certain type of request and memory usage.
1👍
We stumbled over a bug in Django with big sitemaps (10,000 items). It seems Django tries to load them all into memory when generating the sitemap: http://code.djangoproject.com/ticket/11572. This effectively kills the Apache process when Google pays a visit to the site.