Added
- Add `Table.columns`, analogous to `Table.rows` (h/t @Pk13055). (#1050 + d39302f)
- Add `Page.extract_words(return_chars=True)`, mirroring `Page.search(..., return_chars=True)`; if this argument is passed, each word dictionary will include an additional key-value pair: `"chars": [char_object, ...]` (h/t @cmdlineluser). (#1173 + 1496cbd)
- Add `pdfplumber.open(unicode_norm="NFC"/"NFD"/"NFKC"/"NFKD")`, where the values are the four options for Unicode normalization (h/t @petermr + @agusluques). (#905 + 03a477f)
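
The four `unicode_norm` values are the standard Unicode normalization forms. As a rough illustration of how they differ, the sketch below uses Python's standard-library `unicodedata` module rather than pdfplumber itself; the sample string is made up for the example, and how pdfplumber applies the chosen form internally is not shown here.

```python
import unicodedata

# Hypothetical sample text: a precomposed "é" (U+00E9) followed by the
# "ﬁ" ligature (U+FB01), two characters that the forms treat differently.
sample = "\u00e9\ufb01"

# Apply each of the four normalization forms to the same input.
forms = {
    form: unicodedata.normalize(form, sample)
    for form in ("NFC", "NFD", "NFKC", "NFKD")
}

# Print the resulting code points so the differences are visible.
for form, text in forms.items():
    print(form, [hex(ord(ch)) for ch in text])
```

The canonical forms (NFC, NFD) only compose or decompose accents, while the compatibility forms (NFKC, NFKD) also replace characters such as ligatures with their plain equivalents, which may matter when searching extracted text.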
Changed
- Change the default setting that `pdfplumber.repair(...)` passes to Ghostscript's `-dPDFSETTINGS` parameter, from `prepress` to `default`, and make that setting modifiable via `.repair(setting=...)`, where the value is one of `"default"`, `"prepress"`, `"printer"`, or `"ebook"` (h/t @Laubeee). (#874 + 48cab3f)
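
To make the mapping from the `setting` value to the Ghostscript parameter concrete, here is a minimal sketch. The helper name `pdfsettings_flag` and the validation step are assumptions made for illustration; this is not pdfplumber's internal code, only the `-dPDFSETTINGS=/<name>` syntax comes from Ghostscript.

```python
# The four values listed in the changelog entry above.
ALLOWED_SETTINGS = ("default", "prepress", "printer", "ebook")


def pdfsettings_flag(setting: str = "default") -> str:
    """Build the Ghostscript -dPDFSETTINGS argument for a setting name.

    Hypothetical helper: rejects names outside the four listed values,
    then formats the flag the way Ghostscript expects it.
    """
    if setting not in ALLOWED_SETTINGS:
        raise ValueError(
            f"setting must be one of {ALLOWED_SETTINGS}, got {setting!r}"
        )
    return f"-dPDFSETTINGS=/{setting}"


print(pdfsettings_flag())          # flag for the new default
print(pdfsettings_flag("ebook"))   # flag for an explicitly chosen setting
```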
Fixed
- Fix handling of object coordinates when `mediabox` does not begin at `(0,0)` (h/t @wodny). (#1181 + 9025c3f + 046bd87)
- Fix error on getting `.annots`/`.hyperlinks` from `CroppedPage` (due to missing `.rotation` and `.initial_doctop` attributes) (h/t @Safrone). (#1171 + e5737d2)
- Fix problem where `Page.crop(...)` was not cropping `.annots/.hyperlinks` (h/t @Safrone). (#1171 + 22494e8)
- Fix calculation of coordinates for `.annots` on `CroppedPage`s. (0bbb340 + b16acc3)
- Dereference structure element attributes (h/t @dhdaines). (#1169 + 3f16180)
- Fix `Page.get_attr(...)` so that it fully resolves references before determining whether the attribute's value is `None` (h/t @zzhangyun + @mkl-public). (#1176 + c20cd3b)
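
For the mediabox fix listed above, the underlying idea can be sketched as a coordinate shift. The function `to_page_coords` and the sample numbers are hypothetical and only illustrate why an origin other than `(0,0)` matters; this is not the code from the referenced commits.

```python
def to_page_coords(bbox, mediabox):
    """Convert a PDF-space bbox to top-left-origin page coordinates.

    Both arguments are (x0, y0, x1, y1) tuples in PDF space, where y
    grows upward. The result measures x from the mediabox's left edge
    and top/bottom downward from its upper edge.
    """
    x0, y0, x1, y1 = bbox
    mb_x0, _mb_y0, _mb_x1, mb_y1 = mediabox
    return {
        "x0": x0 - mb_x0,       # shift by the mediabox's left edge
        "x1": x1 - mb_x0,
        "top": mb_y1 - y1,      # distance down from the mediabox's top
        "bottom": mb_y1 - y0,
    }


# Assumed example: a mediabox whose lower-left corner is (100, 200).
mediabox = (100, 200, 712, 992)
obj_bbox = (110, 900, 160, 920)
coords = to_page_coords(obj_bbox, mediabox)
print(coords)
```

If the mediabox origin were ignored in this example, the object would appear 100 units further right than it actually sits on the page.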