[Answer]-Element 'html_block' was not found in OrderedDict


You have discovered a bug in Python-Markdown. Specifically, there is an incompatibility between setting safe_mode to “escape” and using the “extra” extension. Under the hood, when safe_mode is set to “escape”, the parser simply does not insert the preprocessor that finds all of the HTML blocks (named ‘html_block’). Any HTML is then escaped later, as it is not specifically marked as known-safe HTML. However, the “extra” extension attempts to modify the behavior of the ‘html_block’ preprocessor (to enable the “markdown=1” behavior), and because you are in safe_mode that preprocessor doesn’t exist and it fails.

Update: This is not the problem, as this situation is accounted for. There is no bug, just user error, as described below.

Interestingly, I noticed that you have both “extra” and “footnotes” listed as extensions. However, the “footnotes” extension is part of “extra”. In other words, by loading “extra” you already get “footnotes” and don’t need to load it a second time. The same can be said for “attr_list”. This is what is causing the error: trying to load the same extensions twice. In fact, the complete list of extensions that “extra” includes for you is:

  • Abbreviations
  • Attribute Lists
  • Definition Lists
  • Fenced Code Blocks
  • Footnotes
  • Tables
  • Smart Strong
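As a rough illustration of the fix, the sketch below filters a configured extension list so that nothing bundled by “extra” is loaded a second time. The helper name `dedupe_extensions` and the `EXTRA_BUNDLE` set are hypothetical; the short names are assumed to be the ones used by the Python-Markdown release in question and may differ in other releases.

```python
# Short names of the extensions bundled by "extra".
# Assumption: these mirror the list above for the Python-Markdown release
# being discussed; other releases may bundle a different set.
EXTRA_BUNDLE = {
    "abbr",
    "attr_list",
    "def_list",
    "fenced_code",
    "footnotes",
    "tables",
    "smart_strong",
}


def dedupe_extensions(extensions):
    """Return the list without repeats and, if "extra" is present,
    without the extensions that "extra" already loads."""
    has_extra = "extra" in extensions
    seen = set()
    result = []
    for name in extensions:
        if name in seen:
            continue  # exact duplicate
        if has_extra and name in EXTRA_BUNDLE:
            continue  # already provided by "extra"
        seen.add(name)
        result.append(name)
    return result


print(dedupe_extensions(["extra", "footnotes", "attr_list", "toc"]))
# prints ['extra', 'toc']
```

The order of the remaining extensions is preserved, so the cleaned list can be passed on as the extensions setting unchanged.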

The only things you lose by loading each of those extensions individually rather than all together as part of “extra” are the shorter extension list and the “markdown=1” feature (which allows Markdown to be parsed inside raw HTML blocks). Interestingly, if you are using safe_mode, then the “markdown=1” feature of “extra” is of no use to you. Therefore, rather than load “extra”, you could simply load each of the individual extensions listed above, and then safe_mode would still work.
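A minimal sketch of that alternative configuration follows. The contents of `MARKDOWN_KWARGS` here are an assumption (your actual settings were not shown in full), and the short extension names are assumed to match the Python-Markdown release you are running.

```python
# Hypothetical settings: each extension from the list above is named
# individually, and "extra" itself is left out so nothing is loaded twice.
MARKDOWN_KWARGS = {
    "extensions": [
        "abbr",          # Abbreviations
        "attr_list",     # Attribute Lists
        "def_list",      # Definition Lists
        "fenced_code",   # Fenced Code Blocks
        "footnotes",     # Footnotes
        "tables",        # Tables
        "smart_strong",  # Smart Strong
    ],
    # safe_mode only exists in releases that have not yet removed it.
    "safe_mode": "escape",
}
```

These settings would then be passed along as `markdown.markdown(text, **MARKDOWN_KWARGS)`.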

That said, safe_mode is being deprecated and will no longer be available in the next release of Python-Markdown. As explained in the release notes, rather than using safe_mode you should pass untrusted content through an HTML sanitizer (like Bleach) after it has been converted to HTML by Markdown:

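import bleach
import markdown
# The Markdown options belong to markdown.markdown(); bleach.clean() then sanitizes the resulting HTML.
html = bleach.clean(markdown.markdown(text, **MARKDOWN_KWARGS))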

If you do that, you still get some security when parsing Markdown from an untrusted source, you won’t encounter the above-mentioned error, and your code will continue to work in future releases of Python-Markdown.

👤Waylan
