Feature
- CLI: Expose code and formula models in the CLI (#820) (
6882e6c
) - Add platform info to CLI version printout (#816) (
95b293a
) - ocr: Expose
rec_keys_path
in RapidOcrOptions to support custom dictionaries (#786) (5332755
) - Introduce automatic language detection in TesseractOcrCliModel (#800) (
3be2fb5
)
Fix
- Fix single newline handling in MD backend (#824) (
5aed9f8
) - Use file extension if filetype fails with PDF (#827) (
adf6353
) - Parse html with omitted body tag (#818) (
a112d7a
)
Documentation
- Document Docling JSON parsing (#819) (
6875913
) - Add SSL verification error mitigation (#821) (
5139b48
) - backend XML: Do not delete temp file in notebook (#817) (
4d41db3
) - Typo (#814) (
8a4ec77
) - Added markdown headings to enable TOC in github pages (#808) (
b885b2f
) - Description of supported formats and backends (#788) (
c2ae1cc
)