Feature
- AutoOCR model selecting the best OCR model available and deprecating the usage of EasyOCR (#2391) (f7244a4)
- Add Tesseract PSM options support (#2411) (f11f8c0)
Fix
- asr: Implement robust status check in AsrPipeline (#2442) (db985bb)
- Deal with chartsheets in workbooks (#2433) (cce18b2)
- Skip temporary docx files (#2413) (ee55013)
- AsrPipeline to handle absolute paths and BytesIO streams correctly (#2407) (b5f7fef)
- Enrichment of documents without pages metadata (pptx and xlsx) (#2401) (0610d01)
- Proper heading support in rich tables for HTML backend (#2394) (9705f40)