github opendatalab/MinerU mineru-2.1.0-released

What's Changed

  • 2025/07/05 Version 2.1.0 Released

    • This is the first major update to MinerU 2. It adds many new features along with performance optimizations, usability improvements, and bug fixes. The details are as follows:
    • Performance Optimizations:
      • Significantly faster preprocessing for documents at certain resolutions (long side around 2000 pixels).
      • Significantly faster post-processing when the pipeline backend batch-processes a large number of short documents (fewer than 10 pages each).
      • Layout analysis in the pipeline backend is approximately 20% faster.
    • Experience Enhancements:
      • Built-in, ready-to-use FastAPI service and Gradio web UI. For usage instructions, see the documentation.
      • Adapted to sglang 0.4.8, which significantly reduces the GPU memory requirements of the vlm-sglang backend. It can now run on GPUs with as little as 8 GB of memory (Turing architecture or newer).
      • Added sglang parameter pass-through to all commands, so the sglang-engine backend accepts every sglang parameter, the same as sglang-server.
      • Added support for configuration-file-based feature extensions, including custom formula delimiters, heading level classification, and custom local model directories. For usage instructions, see the documentation.
    • New Features:
      • Updated the pipeline backend to the PP-OCRv5 multilingual text recognition model, which supports 37 languages, including French, Spanish, Portuguese, Russian, and Korean, with an average accuracy improvement of over 30%. Details
      • Added limited support for vertical text in the pipeline backend.
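
The parameter pass-through described above can be pictured with a small, generic sketch. This is not MinerU's actual implementation: the function names `split_cli_args` and `extras_to_kwargs`, the program name `wrapper-cli`, and the option set are hypothetical, and the sample flags in the usage note are only illustrative. The sketch assumes a wrapper CLI that parses its own options with Python's standard `argparse` and forwards anything it does not recognize to the underlying engine.

```python
import argparse


def split_cli_args(argv):
    """Separate the wrapper's own options from options meant for the engine."""
    # Hypothetical wrapper CLI; allow_abbrev=False keeps unknown long flags
    # from being mistaken for abbreviations of the wrapper's own options.
    parser = argparse.ArgumentParser(prog="wrapper-cli", allow_abbrev=False)
    parser.add_argument("-p", "--path", required=True)      # input document
    parser.add_argument("-o", "--output", required=True)    # output directory
    parser.add_argument("-b", "--backend", default="pipeline")
    # parse_known_args returns the parsed namespace plus the list of
    # tokens the parser did not recognize, in their original order.
    known, extra = parser.parse_known_args(argv)
    return known, extra


def extras_to_kwargs(extra):
    """Convert leftover tokens into keyword arguments for the engine."""
    kwargs = {}
    i = 0
    while i < len(extra):
        token = extra[i]
        i += 1
        if not token.startswith("--"):
            continue  # stray value without a flag; ignored in this sketch
        name, sep, value = token[2:].partition("=")
        key = name.replace("-", "_")
        if sep:
            kwargs[key] = value          # form: --key=value
        elif i < len(extra) and not extra[i].startswith("--"):
            kwargs[key] = extra[i]       # form: --key value
            i += 1
        else:
            kwargs[key] = True           # bare switch: --flag
    return kwargs
```

With an argument list such as `["-p", "doc.pdf", "-o", "out", "-b", "vlm-sglang-engine", "--mem-fraction-static", "0.5"]`, the wrapper keeps `-p`, `-o`, and `-b` for itself and hands the remaining tokens to the engine unchanged. Values stay as strings here; a real wrapper would leave type conversion to the engine's own argument parser.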

Full Changelog: mineru-2.0.6-released...mineru-2.1.0-released