Minor Changes
- #953 db6badb Thanks @CorentinTh! - Added content extraction support for scanned PDF images in 1-bit-per-pixel grayscale format.
- #948 725eaff Thanks @CorentinTh! - When extracting text from PDF documents, if neither text nor images suitable for OCR are found, the pages are now rendered as images and processed with OCR. This adds support for vectorized text.
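
The fallback described in #948 can be pictured with the minimal sketch below. It is an illustration only, not the library's implementation: `PageContent`, `extractPdfText`, `ocr`, and `renderPage` are hypothetical names, and the sketch assumes the "nothing found" check is made across the whole document rather than page by page.

```typescript
// Hypothetical shapes for illustration; these are not the library's real types.
type PageContent = {
  text: string;            // text found in the page's text layer
  ocrImages: Uint8Array[]; // embedded images judged suitable for OCR
};

type OcrFn = (image: Uint8Array) => string;
type RenderFn = (pageIndex: number) => Uint8Array;

function extractPdfText(
  pages: PageContent[],
  ocr: OcrFn,
  renderPage: RenderFn,
): string {
  const hasText = pages.some((p) => p.text.trim().length > 0);
  const hasImages = pages.some((p) => p.ocrImages.length > 0);

  // Fallback: neither text nor OCR-suitable images were found (for example,
  // text drawn as vector paths), so render each page to an image and OCR it.
  if (!hasText && !hasImages) {
    return pages.map((_, i) => ocr(renderPage(i))).join("\n");
  }

  // Regular path: keep the text layer and OCR any embedded images.
  return pages
    .map((p) =>
      [p.text.trim(), ...p.ocrImages.map((img) => ocr(img))]
        .filter((s) => s.length > 0)
        .join("\n"),
    )
    .join("\n");
}
```

Passing `ocr` and `renderPage` as parameters keeps the sketch independent of any particular OCR engine or PDF renderer.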
Patch Changes
- #949 ec740ed Thanks @CorentinTh! - Added document content extraction support for .xlsx and .ods files.