jgm/pandoc 2.11.3 on GitHub

With --bibliography (or bibliography in metadata), a URL may now be provided, and pandoc will fetch the resource. In addition, if a file path is provided and it is not found relative to the working directory, the resource path will be searched (#6940).
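The lookup order described above can be pictured roughly as follows. This is an illustrative Python model only, not pandoc's implementation; `locate_bibliography` and its parameters are hypothetical names, and the actual fetching of a URL is left out.

```python
from pathlib import Path
from typing import Iterable, Optional
from urllib.parse import urlparse


def locate_bibliography(spec: str, resource_path: Iterable[str], cwd: str = ".") -> Optional[str]:
    """Return where a bibliography would be read from, or None if it is not found.

    Hypothetical helper modelling the order: URL, working directory, resource path.
    """
    # A URL is passed through unchanged (the download step is omitted here).
    if urlparse(spec).scheme in ("http", "https"):
        return spec
    # A file path is tried relative to the working directory first ...
    candidate = Path(cwd) / spec
    if candidate.is_file():
        return str(candidate)
    # ... and then in each directory of the resource path, in order.
    for directory in resource_path:
        candidate = Path(directory) / spec
        if candidate.is_file():
            return str(candidate)
    return None
```

With this model, a `refs.bib` that exists only in a resource-path directory is still found, while a copy in the working directory takes precedence.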
Add sourcepos extension for commonmark, gfm, commonmark_x (#4565). With the sourcepos extension set, data-pos attributes are added to the AST by the commonmark reader. No other readers are affected. The data-pos attributes are put on elements that accept attributes; for other elements, an enclosing Div or Span is added to hold the attributes.
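A filter or script could collect these data-pos attributes from pandoc's JSON output. The sketch below assumes the usual JSON encoding of the AST (elements as `{"t": ..., "c": ...}` with an `[id, classes, key-value pairs]` attribute triple); `source_positions` and the `ATTR_INDEX` table are hypothetical names, and only a few element types are covered.

```python
from typing import Any, Iterator, Tuple

# Position of the attribute triple inside "c" for some element types (not exhaustive).
ATTR_INDEX = {"Div": 0, "Span": 0, "Link": 0, "Image": 0, "Code": 0, "CodeBlock": 0, "Header": 1}


def source_positions(node: Any) -> Iterator[Tuple[str, str]]:
    """Yield (element type, data-pos value) pairs found in a JSON AST fragment."""
    if isinstance(node, list):
        for child in node:
            yield from source_positions(child)
    elif isinstance(node, dict):
        kind = node.get("t")
        if kind in ATTR_INDEX:
            _ident, _classes, pairs = node["c"][ATTR_INDEX[kind]]
            for key, value in pairs:
                if key == "data-pos":
                    yield kind, value
        # Descend into the element's contents; strings and None are ignored.
        yield from source_positions(node.get("c"))
```

Pass the document's `blocks` list (not the whole document object) to walk every block in order.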
Change extensions for commonmark_x: replace auto_identifiers with gfm_auto_identifiers (#6863). commonmark_x never actually supported auto_identifiers (it didn’t do anything), because the underlying library implements gfm-style identifiers only. Attempts to add the auto_identifiers extension to commonmark will now fail with an error.
HTML reader:
- Split module into several submodules (Albert Krewinkel). Reducing module size should reduce memory use during compilation.
- Support advanced table features (Albert Krewinkel): block-level content in captions, rowspans and colspans, body headers, row head columns, footers, attributes.
- Disable round-trip testing for tables. Information for cell alignment in a column is not preserved during round-trips (Albert Krewinkel).
- Allow finer grained options for tag omission (Albert Krewinkel).
- Simplify list attribute handling (Albert Krewinkel).
- Pay attention to lang attributes on body element (#6938). These (as well as lang attributes on the html element) should update lang in metadata.
- Retain attribute prefixes and avoid duplicates (#6938). Previously we stripped attribute prefixes, reading xml:lang as lang for example. This resulted in duplicate lang attributes when xml:lang and lang were both used. This commit causes the prefixes to be retained, and also avoids invalid duplicate attributes.
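The attribute-prefix change in the last item can be illustrated with a small model. This is a sketch of the described behavior, not the reader's code; both function names are hypothetical, and keeping the first of two identically named attributes is an assumption made here.

```python
from typing import List, Tuple

Attribute = Tuple[str, str]


def strip_prefixes(attrs: List[Attribute]) -> List[Attribute]:
    """Old behavior as described: 'xml:lang' is read as 'lang'."""
    return [(name.split(":", 1)[-1], value) for name, value in attrs]


def retain_prefixes(attrs: List[Attribute]) -> List[Attribute]:
    """New behavior as described: names are kept and repeated names are dropped.

    Which duplicate survives is an assumption of this sketch (the first one).
    """
    seen = set()
    result = []
    for name, value in attrs:
        if name not in seen:
            seen.add(name)
            result.append((name, value))
    return result
```

For an element carrying both `xml:lang` and `lang`, the first function yields two `lang` entries, while the second keeps the two distinct names.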
Commonmark reader:
- Refactor specFor.
- Set input name to "" to avoid clutter in sourcepos output.
Org reader:
- Parse #+LANGUAGE into lang metadata field (#6845, Albert Krewinkel).
- Preserve targets of spurious links (#6916, Albert Krewinkel). Links with (internal) targets that the reader doesn’t know about are converted into emphasized text. Information on the link target is now preserved by wrapping the text in a Span of class spurious-link, with an attribute target set to the link’s original target. This makes it possible to recover and fix broken or unknown links with filters.
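As an illustration of the kind of filter the spurious-link Spans make possible, the sketch below rewrites them into Links on pandoc's JSON AST. It is not part of pandoc: `restore_spurious_links` is a hypothetical name, the `resolve` callback (which maps an Org target to a URL) must be supplied by the user, and the handling of a single Emph wrapper inside the Span is an assumption about the Span's contents.

```python
from typing import Any, Callable


def restore_spurious_links(node: Any, resolve: Callable[[str], str]) -> Any:
    """Return a copy of a JSON AST fragment with spurious-link Spans turned into Links."""
    if isinstance(node, list):
        return [restore_spurious_links(child, resolve) for child in node]
    if not isinstance(node, dict):
        return node
    if node.get("t") == "Span":
        (ident, classes, pairs), inlines = node["c"]
        target = dict(pairs).get("target")
        if "spurious-link" in classes and target is not None:
            # Drop a single Emph wrapper, if present, so the link text is not italic.
            if len(inlines) == 1 and inlines[0].get("t") == "Emph":
                inlines = inlines[0]["c"]
            attr = [ident,
                    [c for c in classes if c != "spurious-link"],
                    [kv for kv in pairs if kv[0] != "target"]]
            return {"t": "Link", "c": [attr, inlines, [resolve(target), ""]]}
    if "c" in node:
        return {**node, "c": restore_spurious_links(node["c"], resolve)}
    return node
```

A caller might pass `lambda target: "#" + target` to point each recovered link at an anchor of the same name; any other mapping works equally well.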
DocBook reader:
- Table text width support (#6791, Nils Carlson). Table width in relation to text width is not natively supported by DocBook, but it is supported by the DocBook FO stylesheets through an XML processing instruction, <?dbfo table-width="50%"?>.
LaTeX reader:
- Improve parsing of command options (#6869, #6873). In cases where we run into trouble parsing inlines until the closing ], e.g. quotes, we return a plain string with the option contents. Previously we mistakenly included the brackets in this string.
- Preserve center environment (#6852, Igor Pashev). The contents of the center environment are put in a Div with class center.
- Don’t parse \rule with width 0 as horizontal rule. These are sometimes used as spacers in LaTeX.
- Don’t apply theorem default styling to a figure inside (#6925). If we put an image in italics, then when rendering to Markdown we no longer get an implicit figure.
Dokuwiki reader:
- Handle unknown interwiki links better (#6932). DokuWiki lets users define their own interwiki links. Previously pandoc reacted to these by emitting a Google search link, which is not helpful. Instead, we now just emit the full URL including the wikilink prefix, e.g. faquk>FAQ-mathml. This at least gives users the ability to modify the links using filters.
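A filter could then expand such prefixed URLs itself. The helper below is a minimal sketch of that idea; `expand_interwiki` is a hypothetical name, and the shortcut table (including the example.org base URL) is invented for illustration and would have to mirror the wiki's own interwiki configuration.

```python
from typing import Dict


def expand_interwiki(url: str, shortcuts: Dict[str, str]) -> str:
    """Expand 'prefix>name' using a user-supplied table; leave other URLs untouched."""
    prefix, separator, name = url.partition(">")
    if separator and prefix in shortcuts:
        return shortcuts[prefix] + name
    return url


# Hypothetical table: the base URL is a placeholder, not a real interwiki target.
SHORTCUTS = {"faquk": "https://example.org/faq/"}
```

Applied to every Link target in the AST, this turns `faquk>FAQ-mathml` into a regular URL and leaves links with unknown prefixes unchanged.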
Markdown writer:
- Properly handle boolean values in writing YAML metadata (#6388).
- Ensure that a new csl-block begins on a new line (#6921). This just looks better and doesn’t affect the semantics.
RST writer:
- Better image handling (#6948). An image alone in its paragraph (but not a figure) is now rendered as an independent image, with an alt attribute if a description is supplied. An inline image that is not alone in its paragraph will be rendered, as before, using a substitution. Such an image cannot have a “center”, “left”, or “right” alignment, so the classes align-center, align-left, or align-right are ignored. However, align-top, align-middle, align-bottom will generate a corresponding align attribute.
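The alignment rule for inline images can be summarized in a few lines. This is a sketch of the rule as stated above, not the writer's code; `inline_image_align` is a hypothetical name, and taking the first matching class is an assumption.

```python
from typing import List, Optional

# Classes that translate to an align value on an inline (substitution) image.
INLINE_ALIGN = {"align-top": "top", "align-middle": "middle", "align-bottom": "bottom"}


def inline_image_align(classes: List[str]) -> Optional[str]:
    """Return the align value for an inline image, or None if no class applies.

    align-center, align-left and align-right are ignored for inline images.
    """
    for cls in classes:
        if cls in INLINE_ALIGN:
            return INLINE_ALIGN[cls]
    return None
```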
Docx writer:
- Keep raw openxml strings verbatim (#6933, Albert Krewinkel).
- Use Content instead of Element. This allows us to inject raw OpenXML into the document without reparsing it into an Element, which is necessary if you want to inject an open tag or close tag.
- Fix bullets/lists indentation, so that the first level is slightly indented to the right instead of right on the margin (cholonam).
- Support bold and italic in “complex script” (#6911). Previously bold and italics didn’t work properly in RTL text. This commit causes the w:bCs and w:iCs attributes to be used, in addition to w:b and w:i, for bold and italics respectively.
ICML writer:
- Fix image bounding box for custom widths/heights (Mauro Bieg, #6936).
LaTeX writer:
- Improve table spacing (#6842, #6860). Remove the \strut that was added at the end of minipage environments in cells. Replace \tabularnewline with \\ \addlinespace.
- Improve calculation of column spacing (#6883).
- Extract table handling into separate module (Albert Krewinkel).
- Fix bug with nested csl- display Spans (#6921).
- Improve longtable output (#6883). Don’t create minipages for regular paragraphs. Put width and alignment information in the longtable column descriptors.
OpenDocument writer:
- Support for table width as a percentage of text width (#6792, Nils Carlson).
- Implement Div and Span ident support (#6755, Nils Carlson). Spans and Divs containing an ident in the Attr will become bookmarks or sections with idents in OpenDocument format.
- Add two extensions, xrefs_name and xrefs_number (#6774, Nils Carlson). Links to headings, figures, and tables inside the document are replaced with cross-references that use the name or caption of the referenced item (xrefs_name) or its number (xrefs_number). For xrefs_number to be useful, heading numbers must be enabled in the generated document, and table and figure captions must be enabled using, for example, the native_numbering extension. For numbers and reference text to be updated, the generated document must be refreshed.
JATS writer:
- Support advanced table features (Albert Krewinkel).
- Support author affiliations (#6687, Albert Krewinkel).
Docbook writer:
- Use correct id attribute consistently (Jan Tojnar). DocBook5 should always use xml:id instead of id.
- Handle admonition titles better (Jan Tojnar). The DocBook reader produces a Div with class title for a <title> element within an “admonition” element. The Markdown writer then turns this into a fenced div with a title class attribute. Since fenced divs are block elements, their content is recognized as a paragraph by the Markdown reader. This is an issue for the DocBook writer, because it would produce an invalid DocBook document from such an AST: the <title> element can only contain “inline” elements. Handle this special case separately by unwrapping the paragraph before creating the <title> element.
- Add XML namespaces to top-level elements (#6923, Jan Tojnar). Previously, we only added xmlns attributes to chapter elements, even when running with --top-level-division=section. These namespaces are now added to part and section elements too, when they are the selected top-level divisions. We do not need to add namespaces to documents produced with --standalone flag, since those will already have xmlns attribute on the root element in the template.
HTML writer:
- Fix handling of nested csl- display spans (#6921). Previously inner Spans used to represent CSL display attributes were not rendered as div tags as intended.
EPUB writer:
- Include title page in landmarks (#6919). Note that the toc is also included if --toc is specified.
- Add frontmatter type on body element for nav.xhtml (#6918).
EPUB templates: use preserveAspectRatio="xMidYMid" for cover image (#6895, Shin Sang-jae). This change affects both the epub2 and the epub3 templates. It avoids distortion of the cover image by requiring that the aspect ratio be preserved.
LaTeX template:
- Include csquotes package if csquotes variable set.
- Put back amssymb. We need it for checkboxes in todo lists, and maybe for other things. In this location it seems compatible with the cases that prompted #6469 and PR #6762.
- Disable language-specific shorthands in babel (#6817, #6887). Babel defines “shorthands” for some languages, and these can produce unexpected results. For example, in Spanish, 1.22 gets rendered as 122, and et~al. as etal. One would think that babel’s shorthands=off option (which we were using) would disable these, but it doesn’t. So we remove shorthands=off and add some code that redefines the shorthands macro. Eventually this will be fixed in babel, I hope, and we can revert to something simpler.
JATS template: allow array of persistent institute ids in pid (Albert Krewinkel).
Text.Pandoc.Parsing: minor code and efficiency improvements.
Text.Pandoc.Extension:
- Add Ext_sourcepos constructor for Extension [API change].
- Add Ext_xrefs_name and Ext_xrefs_number constructors for Extension (Nils Carlson) [API change].
Text.Pandoc.Citeproc:
- Fix truncation of [Citation] list in Cite inside footnotes (#6890). This affected author-in-text citations in footnotes. It didn’t cause problems for the printed output, but it did for filters that expected the citation id and other information.
- Allow the use of both inline and external references (#6951), as with pandoc-citeproc. References defined in the document’s metadata take priority over references with the same id defined in an external bibliography.
- Use fetchItem to get external bibliography (#6940).
- Ensure that BCP47 lang codes can be used. We ignore the variants and just use the base lang code and country code when passing off to citeproc.
- Citeproc BibTeX parser: revert change in getRawField which was made (for reasons forgotten) when transferring this code from pandoc-citeproc. The change led to -- in URLs being interpreted as en-dashes, which is unwanted (#6874).
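The priority rule for inline versus external references (#6951) amounts to a merge keyed on the reference id. The sketch below models that rule only; `merge_references` is a hypothetical name, references are shown as plain dictionaries, and the resulting order is an artifact of this sketch rather than documented behavior.

```python
from typing import Dict, List

Reference = Dict[str, object]


def merge_references(inline: List[Reference], external: List[Reference]) -> List[Reference]:
    """Combine metadata and external references; metadata wins when ids clash."""
    merged: Dict[str, Reference] = {}
    for ref in external:
        merged[str(ref["id"])] = ref
    for ref in inline:
        # An inline reference replaces any external entry with the same id.
        merged[str(ref["id"])] = ref
    return list(merged.values())
```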
Text.Pandoc.ImageSize:
- Default to DPI 72 if the format specifies DPI of 0 (#6880). This shouldn’t happen, in general, but it can happen with JPEGs that don’t conform to the spec. Having a DPI of 0 will blow up size calculations (division by 0).
- ImageSize: use JuicyPixels to determine size for png, jpeg, and gif, instead of doing our own binary parsing (#6936). This gives more reliable results.
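The DPI fallback from #6880 guards a division. The sketch below shows the idea in Python rather than in the module's own Haskell; both function names are hypothetical.

```python
def effective_dpi(dpi: int, default: int = 72) -> int:
    """Fall back to the default when an image reports a DPI of 0."""
    return default if dpi == 0 else dpi


def size_in_inches(pixels: int, dpi: int) -> float:
    """Convert a pixel dimension to inches without dividing by zero."""
    return pixels / effective_dpi(dpi)
```

For a 144-pixel-wide JPEG that reports a DPI of 0, `size_in_inches(144, 0)` gives 2.0 instead of raising a division error.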
Text.Pandoc.CSS:
- Remove foldOrElse (internal module) (Albert Krewinkel).
Use skylighting 0.10.2 (#6625).
Use citeproc 0.3. This fixes issues with references with duplicate ids (jgm/citeproc#36).
Use doctemplates 0.9. This fixes issues with boolean metadata values in the Markdown writer (#6388) and in meta-json (#6650). It also fixes issues with nested for loops in templates.
Add translations zh-Hans.yaml and zh-Hant.yaml (#6904, #6909, Kolen Cheung, taotieren).
Add tests: True to cabal.project. This fixes some CI failures for cabal.
Normalize test/tables/*.native (#6888, Kolen Cheung).
Move executable to app directory to avoid problems with cabal repl.
CONTRIBUTING: add section “How can I help?” (#6892, Albert Krewinkel). Also adds a paragraph aimed at highlighting the importance of feature maintenance.
MANUAL: Document that --number-sections works in ms (#6935).
