What's Changed
- 2025/12/30 2.7.0 Release
  - Simplified the installation process. The `vlm` acceleration engine dependencies no longer need to be installed separately; running `uv pip install mineru[all]` installs the dependencies for all optional backends.
  - Added a new `hybrid` backend that combines the advantages of the `pipeline` and `vlm` backends. Built on `vlm`, it integrates some capabilities of `pipeline`, adding extensibility on top of high accuracy:
    - Extracts text directly from text PDFs, natively supports multi-language recognition in text PDF scenarios, and greatly reduces parsing hallucinations;
    - Supports text recognition in 109 languages for scanned PDFs when the OCR language is specified;
    - Provides an independent switch for inline formula recognition, which can be turned off when inline formulas are not needed, improving the visual quality of the parsing results.
  - Simplified the engine selection logic for the `vlm`/`hybrid` backends. Users only need to specify the backend as `*-auto-engine`, and the system automatically selects an appropriate inference acceleration engine for the current environment.
  - Switched the default parsing backend from `pipeline` to `hybrid-auto-engine`, so new users get more consistent results out of the box and avoid confusion over differing parsing results.
  - Added i18n support to the gradio application, with switching between Chinese and English.
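The `*-auto-engine` behaviour described above can be pictured with a small sketch. This is not MinerU's actual selection code: the engine module names, their priority order, and the resolved backend names (for example `hybrid-transformers`) are assumptions made purely for illustration. The only detail taken from the release notes is that a backend ending in `-auto-engine` is resolved to a concrete engine based on the current environment.

```python
from importlib.util import find_spec
from typing import Callable

# Hypothetical preference order; the real engine names and their priority
# inside MinerU may differ.
ENGINE_PREFERENCE = ("vllm", "lmdeploy", "mlx_vlm", "transformers")


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module is importable in this environment."""
    return find_spec(module_name) is not None


def resolve_backend(
    backend: str, available: Callable[[str], bool] = is_installed
) -> str:
    """Map '<family>-auto-engine' to '<family>-<engine>'.

    Backends without the '-auto-engine' suffix are returned unchanged.
    """
    suffix = "-auto-engine"
    if not backend.endswith(suffix):
        return backend
    family = backend[: -len(suffix)]
    # Pick the first engine in the preference list that is available.
    for engine in ENGINE_PREFERENCE:
        if available(engine):
            return f"{family}-{engine}"
    raise RuntimeError(f"no inference engine found for {backend!r}")
```

Passing the availability check as a parameter keeps the sketch easy to exercise without installing any engine, for example `resolve_backend("hybrid-auto-engine", lambda name: name == "transformers")`.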
Full Changelog: mineru-2.6.8-released...mineru-2.7.0-released