Changed
Refactor TOC Sanitation
- All postprocessors are now run on heading content.
- Footnote references are now stripped from heading content. Fixes #660.
- A more robust `striptags` is provided to convert headings to plain text.
  Unlike the `markupsafe` implementation, HTML entities are not unescaped.
- The plain text `name`, rich `html`, and unescaped raw `data-toc-label` are
  saved to `toc_tokens`, allowing users to access the full rich text content of
  the headings directly from `toc_tokens`.
- The value of `data-toc-label` is sanitized separately from heading content
  before being written to `name`. This fixes a bug which allowed markup through
  in certain circumstances. To access the raw unsanitized data, retrieve the
  value from `token['data-toc-label']` directly.
- An `html.unescape` call is made just prior to calling `slugify` so that
  `slugify` only operates on Unicode characters. Note that `html.unescape` is
  not run on `name`, `html`, or `data-toc-label`.
- The functions `get_name` and `stashedHTML2text` defined in the `toc` extension
  are both deprecated. Instead, third-party extensions should use some
  combination of the new functions `run_postprocessors`, `render_inner_html`, and
  `striptags`.
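
The ordering described above (strip tags without unescaping, then unescape only for the slug) can be sketched with the standard library alone. The helpers `strip_tags_sketch` and `slugify_sketch` are hypothetical stand-ins written for this illustration; they are not the `striptags` or `slugify` functions shipped with the `toc` extension, and the regex-based tag stripping is deliberately naive.

```python
import html
import re
import unicodedata


def strip_tags_sketch(text: str) -> str:
    """Drop HTML comments and tags, collapse whitespace, leave entities untouched."""
    text = re.sub(r"<!--.*?-->", "", text, flags=re.DOTALL)  # remove comments
    text = re.sub(r"<[^>]*>", "", text)  # remove tags (naive: no '>' inside attributes)
    return " ".join(text.split())


def slugify_sketch(value: str, separator: str = "-") -> str:
    """ASCII-only slug; a simplified stand-in for a real slugify function."""
    value = unicodedata.normalize("NFKD", value)
    value = value.encode("ascii", "ignore").decode("ascii")
    value = re.sub(r"[^\w\s-]", "", value).strip().lower()
    return re.sub(r"[-\s]+", separator, value)


heading_html = "<em>Tom</em> &amp; <code>Jerry</code><!-- note -->"

# Plain-text name: tags and comments are gone, the entity is still escaped.
name = strip_tags_sketch(heading_html)

# Slugging the escaped text leaks the entity name into the slug.
slug_without_unescape = slugify_sketch(name)

# Unescaping first means the slug function only sees real characters.
slug = slugify_sketch(html.unescape(name))

print(name)                   # Tom &amp; Jerry
print(slug_without_unescape)  # tom-amp-jerry
print(slug)                   # tom-jerry
```

Keeping `name` escaped means it can be placed into HTML output as-is, while the unescaped copy exists only for the duration of the slug computation.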
Fixed
- Include `scripts/*.py` in the generated source tarballs (#1430).
- Ensure lines after heading in loose list are properly detabbed (#1443).
- Give smarty tree processor higher priority than toc (#1440).
- Permit carets (`^`) and square brackets (`]`) but explicitly exclude
  backslashes (`\`) from abbreviations (#1444).
- In attribute lists (`attr_list`, `fenced_code`), quoted attribute values are
  now allowed to contain curly braces (`}`) (#1414).