Added
- Add "matrix" property to char objects, representing the current transformation matrix. (ae6f99e)
- Add pdfplumber.ctm submodule with class CTM, to calculate scale, skew, and translation of a current transformation matrix obtained from a char's "matrix" property. (ae6f99e)
- Add page.search(...), an experimental feature that allows you to search a page's text via regular expressions and non-regex strings, returning the text, any regex matches, the bounding box coordinates, and the char objects themselves. (#201 + 58b1ab1)
- Add --include-attrs/--exclude-attrs to CLI (and corresponding params to .to_json(...), .to_csv(...), and Serializer). (4deac25)
- Add py.typed for PEP 561 compatibility and detection of typing hints by mypy. (ca795d1) [h/t @jhonatan-lopes]
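
To illustrate what "scale, skew, and translation" mean for a six-number transformation matrix (a, b, c, d, e, f), here is a minimal standalone sketch. It does not import pdfplumber and is not the library's CTM class; the names MatrixParts and decompose_matrix are hypothetical, and the decomposition shown is one common convention that may differ in detail from what pdfplumber.ctm computes.

```python
import math
from typing import NamedTuple


class MatrixParts(NamedTuple):
    """Hypothetical container for illustration; not pdfplumber's CTM class."""

    scale_x: float
    scale_y: float
    skew_x: float  # degrees
    skew_y: float  # degrees
    translation_x: float
    translation_y: float


def decompose_matrix(matrix):
    """Split a six-number matrix (a, b, c, d, e, f) into its parts.

    Assumes the usual PDF convention:
        x' = a*x + c*y + e
        y' = b*x + d*y + f
    """
    a, b, c, d, e, f = matrix
    return MatrixParts(
        scale_x=math.hypot(a, b),  # length of the transformed x unit vector
        scale_y=math.hypot(c, d),  # length of the transformed y unit vector
        skew_x=math.degrees(math.atan2(d, c)) - 90,  # tilt of the y axis
        skew_y=math.degrees(math.atan2(b, a)),  # tilt of the x axis
        translation_x=e,
        translation_y=f,
    )


# An upright 12-unit glyph placed at (72, 700), and a unit glyph rotated 90 degrees.
upright = decompose_matrix((12, 0, 0, 12, 72, 700))
rotated = decompose_matrix((0, 1, -1, 0, 10, 20))
```

Under this convention, an upright character has zero skew on both axes, while a character rotated by a quarter turn shows a skew of about 90 degrees on both.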
Changed
- Bump pinned pdfminer.six version to 20220524. (486cea8)
Removed
- Remove utils.collate_chars(...), the old name (and then alias) for utils.extract_text(...). (24f3532)
Fixed
- Fix IndexError bug for .extract_text(layout=True) on pages without text. (#658 + ad3df11) [h/t @ethanscorey]