At first glance the code looks OK, but checking for security vulnerabilities should not be taken lightly, and doing it yourself takes some time.
For instance, test whether a string such as <script>alert('hello')</script> gets executed when you submit it. Beyond this simplistic test there are many more things to check, and there is plenty of documentation on the subject.
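As a rough illustration of automating that kind of test, here is a minimal sketch using only the standard library. It feeds a few classic payloads through a sanitizer and reports any script element, on* handler or javascript: URL that survives. The names PAYLOADS, DangerFinder, find_dangers and sanitize are my own, and sanitize is only a hypothetical stand-in (html.escape) for whatever function you are testing. Note that Python's html.parser is not a browser-grade parser, so treat a clean result as a smoke test, not as proof of safety.

```python
from html import escape
from html.parser import HTMLParser

# A few classic payloads; real test suites use far longer lists.
PAYLOADS = [
    "<script>alert('hello')</script>",
    "<img src=x onerror=alert(1)>",
    "<a href=\"javascript:alert(1)\">click</a>",
]


class DangerFinder(HTMLParser):
    """Records script elements, on* handlers and javascript: URLs."""

    def __init__(self):
        super().__init__()
        self.findings = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.findings.append("script element")
        for name, value in attrs:
            if name.startswith("on"):
                self.findings.append(f"event handler {name}")
            if value and value.strip().lower().startswith("javascript:"):
                self.findings.append(f"javascript: URL in {name}")


def find_dangers(html_text):
    """Return a list of suspicious constructs found in html_text."""
    finder = DangerFinder()
    finder.feed(html_text)
    finder.close()
    return finder.findings


def sanitize(text):
    # Hypothetical stand-in: swap in the sanitizer under test.
    return escape(text)


for payload in PAYLOADS:
    print(repr(payload), "->", find_dangers(sanitize(payload)) or "nothing found")
```

If your own sanitizer lets any of these through, find_dangers will list what it found; an empty list only means these particular payloads were neutralised.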
Furthermore, as mentioned in my comment, I would strongly recommend using an established library for sanitizing input. One such library is bleach:
Bleach is a whitelist-based HTML sanitization and text linkification library. It is designed to take untrusted user input with some HTML.
Because Bleach uses html5lib to parse document fragments the same way browsers do, it is extremely resilient to unknown attacks, much more so than regular-expression-based sanitizers.
This way you can at least be sure that your attack surface is smaller: the library is far more thoroughly tested than your own code, so you would only have to worry about which HTML tags you allow rather than whether your sanitizer works.
Usage example:
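import bleach

# ALLOWED_TAGS, ALLOWED_ATTRIBUTES and ALLOWED_STYLES are whitelists you define yourself.
mystring = bleach.clean(
    form.cleaned_data['mystring'],
    tags=ALLOWED_TAGS,
    attributes=ALLOWED_ATTRIBUTES,
    styles=ALLOWED_STYLES,  # Bleach < 5.0 only; newer releases take css_sanitizer instead
    strip=False,            # escape disallowed markup rather than removing it
    strip_comments=True,    # drop HTML comments
)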
1
This is likely unsafe. BeautifulSoup defaults to the lxml HTML parser when lxml is installed, and an attacker can likely exploit differences between that parser and the browsers’ parsers (which all follow the HTML spec) to smuggle through strings that the browser will treat as an element but your code will not. Using BeautifulSoup with html5lib would reduce that attack surface, since html5lib implements the same parsing algorithm that browsers use.
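For reference, selecting html5lib is just a matter of naming it when constructing the soup. This is a minimal sketch, assuming both beautifulsoup4 and html5lib are installed; parse_fragment and the sample markup are my own illustrations, not part of the question's code.

```python
from bs4 import BeautifulSoup  # third-party: pip install beautifulsoup4 html5lib


def parse_fragment(markup):
    # Naming the parser explicitly avoids BeautifulSoup's default choice,
    # which depends on what happens to be installed.
    return BeautifulSoup(markup, "html5lib")


# Hypothetical input with unclosed tags, as untrusted markup often has.
soup = parse_fragment("<p>hello <b>world")
print([tag.name for tag in soup.find_all(True)])
```

Keep in mind that html5lib builds a full document around the fragment, so any whitelist check should walk the resulting tree rather than the raw input string.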
You probably don’t want to allow width, height, and class, as they would let an attacker make their image fill the entire page.
Still, in general, I would agree with Wtower’s answer that using an established third-party library is likely safer.