opendatalab/MinerU mineru-2.1.0-released on GitHub

What's Changed

2025/07/05 Version 2.1.0 Released
- This is the first major update of MinerU 2. It adds new features and includes numerous performance optimizations, user experience improvements, and bug fixes. The details are as follows:
- Performance Optimizations:
  - Significantly improved preprocessing speed for documents with specific resolutions (around 2000 pixels on the long side).
  - Significantly improved post-processing speed when the pipeline backend batch-processes a large number of short documents (fewer than 10 pages each).
  - Layout analysis speed of the pipeline backend has been increased by approximately 20%.
- Experience Enhancements:
  - Built-in, ready-to-use FastAPI service and Gradio web UI. For detailed usage instructions, please refer to the documentation.
  - Adapted to sglang version 0.4.8, significantly reducing the GPU memory requirements for the vlm-sglang backend. It can now run on graphics cards with as little as 8GB GPU memory (Turing architecture or newer).
  - Added sglang parameter pass-through to all commands, so the sglang-engine backend accepts all sglang parameters, just as sglang-server does.
  - Added support for feature extensions based on configuration files, including custom formula delimiters, heading-level classification, and custom local model directories. For detailed usage instructions, please refer to the documentation.
- New Features:
  - Updated the pipeline backend with the PP-OCRv5 multilingual text recognition model, which supports text recognition in 37 languages, including French, Spanish, Portuguese, Russian, and Korean, with an average accuracy improvement of over 30%.
  - Introduced limited support for vertical text layout in the pipeline backend.
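
The parameter pass-through mentioned under Experience Enhancements can be pictured with a small, generic sketch. This is not MinerU's actual implementation: the `wrapper` program, the `-p`/`-b` options, and the helper names `split_cli_args` and `extras_to_kwargs` are hypothetical, and the forwarded flags are only example values. The idea shown is that a command parses the options it knows and hands everything else, untouched, to the underlying engine.

```python
import argparse


def split_cli_args(argv):
    """Separate options the wrapper understands from options to forward."""
    # Hypothetical wrapper CLI; these option names are assumptions for illustration.
    parser = argparse.ArgumentParser(prog="wrapper")
    parser.add_argument("-p", "--path", required=True)
    parser.add_argument("-b", "--backend", default="pipeline")
    # parse_known_args returns unrecognized arguments instead of raising an error.
    known, extra = parser.parse_known_args(argv)
    return known, extra


def extras_to_kwargs(extra):
    """Convert leftover '--key value', '--key=value', and '--flag' tokens to a dict."""
    kwargs = {}
    i = 0
    while i < len(extra):
        token = extra[i]
        if not token.startswith("--"):
            i += 1  # skip stray positional values
            continue
        name, sep, value = token[2:].partition("=")
        key = name.replace("-", "_")
        if sep:
            kwargs[key] = value  # '--key=value' form
            i += 1
        elif i + 1 < len(extra) and not extra[i + 1].startswith("--"):
            kwargs[key] = extra[i + 1]  # '--key value' form
            i += 2
        else:
            kwargs[key] = True  # bare '--flag' form
            i += 1
    return kwargs


if __name__ == "__main__":
    known, extra = split_cli_args(
        ["-p", "doc.pdf", "-b", "vlm-sglang-engine",
         "--mem-fraction-static", "0.5", "--enable-torch-compile"]
    )
    # The resulting dict could then be handed to the engine's constructor.
    print(known.backend, extras_to_kwargs(extra))
```

Values are forwarded as strings here; a real pass-through layer would typically leave type conversion to the receiving engine's own argument parser.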

New Contributors

@herryqg made their first contribution in #2763
@QIN2DIM made their first contribution in #2758
@zhanluxianshen made their first contribution in #2787
@hzwzwzw made their first contribution in #2831
@itswcg made their first contribution in #2887
@yuanjua made their first contribution in #2727
@Ar-Hyk made their first contribution in #2634

Full Changelog: mineru-2.0.6-released...mineru-2.1.0-released