Added
- Export type annotations from the PyPI package per PEP 561 (#679)
- Support for identity cmaps (#626)
- Add support for PDF page labels (#680)
- Installation of Pillow as an optional extra dependency (#714)
Fixed
- Handle decompression error due to CRC checksum error (#637)
- Regression (since 20191107) in LTLayoutContainer.group_textboxes that returned some text lines out of order (#659)
- Add handling of JPXDecode filter to enable extraction of images for some PDFs (#645)
- Fix extraction of jbig2 files, which was producing invalid files (#652)
- Crash in pdf2txt.py --boxes-flow=disabled (#682)
- Only use xref fallback if PDFNoValidXRef is raised and fallback is True (#684)
- Ignore empty characters when analyzing layout (#499)
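
The xref fallback condition from #684 can be illustrated with a minimal, self-contained sketch. It is not pdfminer's actual code: `load_xrefs`, `read_xrefs` and `rebuild_xrefs` are hypothetical names, the exception class is a local stand-in, and re-raising when `fallback` is False is an assumption, since the entry does not say what happens in that case.

```python
class PDFNoValidXRef(Exception):
    """Local stand-in for pdfminer's exception of the same name."""


def load_xrefs(read_xrefs, rebuild_xrefs, fallback=True):
    """Return xref tables, rebuilding them only in the one permitted case.

    read_xrefs and rebuild_xrefs are hypothetical callables for this sketch.
    """
    try:
        return read_xrefs()
    except PDFNoValidXRef:
        if fallback:
            # Both conditions hold: the specific error was raised
            # and the caller allowed the fallback.
            return rebuild_xrefs()
        # Assumption: with fallback disabled the error is passed on.
        raise
    # Any other exception is not caught here and propagates unchanged.


def broken_reader():
    raise PDFNoValidXRef("no valid xref table found")


def corrupt_reader():
    raise ValueError("unrelated parsing failure")


def rebuild():
    return ["rebuilt xref"]
```

In this sketch, `load_xrefs(broken_reader, rebuild)` returns the rebuilt table, while `load_xrefs(corrupt_reader, rebuild)` lets the `ValueError` through without attempting a rebuild.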
Changed
- Replace warnings.warn with logging.Logger.warning in line with recommended use (#673)
- Switched from nose to pytest, from tox to nox and from Travis CI to GitHub Actions (#704)
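
One practical effect of moving from warnings.warn to logging (#673) is that callers control these messages through the standard logging configuration instead of warning filters. The sketch below uses only the standard library. It assumes that the library's loggers sit under a package-level logger named "pdfminer"; the child logger name `pdfminer.example_module` and the message text are hypothetical.

```python
import logging


class ListHandler(logging.Handler):
    """Collect log messages in a list so they can be inspected."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


# Assumption: library modules log under the "pdfminer" namespace.
package_logger = logging.getLogger("pdfminer")
handler = ListHandler()
package_logger.addHandler(handler)

# Hypothetical module-level logger standing in for library code.
library_logger = logging.getLogger("pdfminer.example_module")

# At WARNING level the message reaches the handler.
package_logger.setLevel(logging.WARNING)
library_logger.warning("unexpected token at offset %d", 42)

# Raising the threshold to ERROR silences later warnings.
package_logger.setLevel(logging.ERROR)
library_logger.warning("this warning is filtered out")
```

After this runs, `handler.messages` holds only the first message, because the second call is made while the package logger's level is ERROR.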
Removed
- Unnecessary return statements without argument at the end of functions (#707)