[Answer]-Replace terms introduced by TinyMCE when the text is viewed


Manipulating raw HTML as plain text is rarely a good idea. If you simply do a string replace on the HTML, you can run into problems like this:

<img src="static.example.com/jinja-templating"/>

becoming:

<img src="static.example.com/<a href='/glossary?word=jinja'>jinja</a>-templating"/>

which is destructive: the link markup is injected into the src attribute, so the image URL is broken.

So what can I do?

HTML Parser

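I highly recommend learning and using an HTML parser such as BeautifulSoup. A parser separates text nodes from tags and attributes, so you can replace terms in the visible text only.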
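As a rough illustration of the parser-based approach, here is a minimal sketch. To keep it runnable without third-party packages it uses the standard library's html.parser rather than BeautifulSoup, so treat it as a stand-in for the idea, not as BeautifulSoup usage. The names GlossaryLinker and link_terms are hypothetical, and the /glossary?word= URL is taken from the example above. It assumes a well-formed HTML fragment and re-emits end tags in lowercase.

```python
import re
from html import escape
from html.parser import HTMLParser
from urllib.parse import quote


class GlossaryLinker(HTMLParser):
    """Re-emit the HTML, linking whole-word matches in text nodes only."""

    SKIP_TAGS = {"a", "script", "style"}  # never link inside these elements

    def __init__(self, word):
        # convert_charrefs=False keeps entities such as &amp; as written
        super().__init__(convert_charrefs=False)
        self.pattern = re.compile(r"\b%s\b" % re.escape(word))
        self.link = '<a href="/glossary?word=%s">%s</a>' % (quote(word), escape(word))
        self.out = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self.skip_depth += 1
        self.out.append(self.get_starttag_text())  # raw tag, attributes untouched

    def handle_startendtag(self, tag, attrs):
        self.out.append(self.get_starttag_text())  # e.g. <img .../>

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self.skip_depth:
            self.skip_depth -= 1
        self.out.append("</%s>" % tag)

    def handle_data(self, data):
        if not self.skip_depth:
            data = self.pattern.sub(lambda m: self.link, data)
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append("&%s;" % name)

    def handle_charref(self, name):
        self.out.append("&#%s;" % name)

    def handle_comment(self, data):
        self.out.append("<!--%s-->" % data)


def link_terms(html, word):
    parser = GlossaryLinker(word)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


html = '<img src="static.example.com/jinja-templating"/><p>jinja is a templating engine</p>'
print(link_terms(html, "jinja"))
```

Because only handle_data rewrites anything, the src attribute from the earlier example passes through unchanged, while the word in the paragraph text gets its link.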

Regex

Regex is also not considered safe for working directly on HTML; however, at
times it can get the job done. For your case, I came up with a regular
expression that might work.

import re

html = '<div id="term"><span style="term:10px">term</span><img src="static.example.com/term"/></div><div>the technology term is amazing</div>'
# Only match "term" inside a text run between '>' and '<', so attribute values stay untouched.
# Caveat: this links at most one occurrence per text run and also matches inside longer words.
glossaried = re.sub(r'>([^<>]*)term([^<>]*)<', r'>\1<a href="/glossary?word=term">term</a>\2<', html)
print(glossaried)

which prints:

<div id="term"><span style="term:10px"><a href="/glossary?word=term">term</a></span><img src="static.example.com/term"/></div><div>the technology <a href="/glossary?word=term">term</a> is amazing</div>
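The pattern above links at most one occurrence per text run (the greedy first group skips to the last one) and would also match inside a longer word such as "terminal". A possible refinement, sketched here with a hypothetical helper named link_term, first matches each text run between tags and then replaces every whole-word occurrence inside it. It is still a regex working on HTML, so it assumes attribute values never contain '<' or '>', it skips text that is not enclosed by tags, and it will happily link inside an existing <a> or a <script>.

```python
import re


def link_term(html, word):
    """Link every whole-word occurrence of `word` found in text between tags."""
    word_re = re.compile(r"\b%s\b" % re.escape(word))
    link = '<a href="/glossary?word=%s">%s</a>' % (word, word)
    # Lookarounds match a text run without consuming the surrounding '>' and '<'.
    text_run = re.compile(r"(?<=>)[^<>]+(?=<)")
    return text_run.sub(lambda m: word_re.sub(lambda _: link, m.group(0)), html)


html = '<div id="term"><img src="static.example.com/term"/>term and terminal, then term again</div>'
print(link_term(html, "term"))
```

Here both standalone occurrences of "term" are linked, while "terminal", the id attribute, and the image URL are left alone.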
