What's Changed
- 2025/07/05 Version 2.1.0 Released
  - This is the first major update of MinerU 2. It adds new features and brings numerous performance optimizations, user experience improvements, and bug fixes. The detailed changes are as follows:
  - Performance Optimizations:
    - Significantly faster preprocessing for documents at certain resolutions (long side around 2000 pixels).
    - Significantly faster post-processing when the `pipeline` backend batch-processes a large number of short documents (fewer than 10 pages each).
    - Layout analysis in the `pipeline` backend is approximately 20% faster.
  - Experience Enhancements:
    - Built-in, ready-to-use fastapi service and gradio webui. For detailed usage instructions, please refer to the documentation.
    - Adapted to `sglang` version `0.4.8`, significantly reducing the GPU memory requirements of the `vlm-sglang` backend. It can now run on graphics cards with as little as 8GB of GPU memory (Turing architecture or newer).
    - Added transparent parameter passing for all `sglang`-related commands, so the `sglang-engine` backend accepts all `sglang` parameters in the same way as `sglang-server`.
    - Supports feature extensions based on configuration files, including custom formula delimiters, enabling heading classification, and customizing local model directories. For detailed usage instructions, please refer to the documentation.
  - New Features:
    - Updated the `pipeline` backend with the PP-OCRv5 multilingual text recognition model, which supports text recognition in 37 languages, including French, Spanish, Portuguese, Russian, and Korean, with an average accuracy improvement of over 30%.
    - Introduced limited support for vertical text layout in the `pipeline` backend.
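As a rough illustration of what "transparent parameter passing" means in general, the sketch below shows one common way a command-line wrapper can keep its own options and forward everything else untouched to an underlying engine. It is not MinerU's actual implementation: the function names `split_cli_args` and `to_engine_kwargs` are hypothetical, the `-p`/`-b` options only mirror the style of a typical wrapper CLI, and the forwarded flags are used purely as sample input.

```python
import argparse


def split_cli_args(argv):
    """Split argv into the wrapper's own options and pass-through tokens.

    Hypothetical helper for illustration; not part of MinerU.
    """
    # allow_abbrev=False keeps an unknown flag such as "--pa" from being
    # silently treated as an abbreviation of "--path".
    parser = argparse.ArgumentParser(prog="wrapper", allow_abbrev=False)
    parser.add_argument("-p", "--path", required=True)
    parser.add_argument("-b", "--backend", default="pipeline")
    # parse_known_args returns (namespace, list_of_unrecognized_tokens)
    known, passthrough = parser.parse_known_args(argv)
    return known, passthrough


def to_engine_kwargs(tokens):
    """Convert pass-through tokens into a keyword dict for an engine.

    Handles "--key value", "--key=value", and bare "--flag" forms.
    Values are kept as strings; the receiving engine decides how to parse them.
    """
    kwargs = {}
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if not token.startswith("--"):
            i += 1  # ignore stray positional tokens
            continue
        name, sep, value = token[2:].partition("=")
        key = name.replace("-", "_")
        if sep:  # "--key=value"
            kwargs[key] = value
            i += 1
        elif i + 1 < len(tokens) and not tokens[i + 1].startswith("--"):
            kwargs[key] = tokens[i + 1]  # "--key value"
            i += 2
        else:
            kwargs[key] = True  # bare "--flag"
            i += 1
    return kwargs


if __name__ == "__main__":
    known, extra = split_cli_args([
        "-p", "demo.pdf", "-b", "vlm-sglang-engine",
        "--mem-fraction-static", "0.5", "--enable-torch-compile",
    ])
    print(known.backend, to_engine_kwargs(extra))
```

The design choice here is that the wrapper never needs to know the engine's option list: anything it does not recognize is forwarded as-is, which is one way an in-process engine and a standalone server can end up accepting the same parameters.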
New Contributors
- @herryqg made their first contribution in #2763
- @QIN2DIM made their first contribution in #2758
- @zhanluxianshen made their first contribution in #2787
- @hzwzwzw made their first contribution in #2831
- @itswcg made their first contribution in #2887
- @yuanjua made their first contribution in #2727
- @Ar-Hyk made their first contribution in #2634
Full Changelog: mineru-2.0.6-released...mineru-2.1.0-released