github opendatalab/MinerU magic_pdf-1.3.12-released

What's Changed

  • 2025/05/24 1.3.12 Released

    • Added support for the ppocrv5 models: ch_server now maps to PP-OCRv5_rec_server and ch_lite to PP-OCRv5_rec_mobile (model update required)
      • In testing, ppocrv5 (server) improved results on handwritten documents but was slightly less accurate than v4_server_doc on other document types. The default ch model therefore remains PP-OCRv4_server_rec_doc.
      • Because ppocrv5 improves recognition of handwritten text and special characters, you can manually select a ppocrv5 model for mixed Japanese and Traditional Chinese text and for handwritten documents.
      • Select a model with the lang parameter, for example lang='ch_server' (Python API) or --lang ch_server (command line):
        • ch: PP-OCRv4_rec_server_doc (default) (Chinese, English, Japanese, Traditional Chinese mixed/15k dictionary)
        • ch_server: PP-OCRv5_rec_server (Chinese, English, Japanese, Traditional Chinese mixed + handwriting/18k dictionary)
        • ch_lite: PP-OCRv5_rec_mobile (Chinese, English, Japanese, Traditional Chinese mixed + handwriting/18k dictionary)
        • ch_server_v4: PP-OCRv4_rec_server (Chinese, English mixed/6k dictionary)
        • ch_lite_v4: PP-OCRv4_rec_mobile (Chinese, English mixed/6k dictionary)
    • Added support for handwritten documents by optimizing layout recognition of handwritten text areas
      • This feature is enabled by default; no additional configuration is needed.
      • For better handwritten document parsing, you can manually select a ppocrv5 model as described above.
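
The lang values listed above can be sketched as a small lookup table. This is an illustrative sketch only, not MinerU code: `OcrModel`, `LANG_TO_MODEL`, `pick_lang`, and `build_cli_args` are hypothetical names, the `lightweight` switch (preferring the mobile model) is an assumption, and the `magic-pdf -p ... -o ...` invocation is assumed from typical usage. Only the `--lang` flag, the lang values, and the model names come from these notes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OcrModel:
    """Hypothetical record describing one row of the release-note table."""
    name: str          # recognition model name as written in the notes
    handwriting: bool  # True if the notes list handwriting support
    dict_size: str     # dictionary size as written in the notes


# Mapping copied from the release notes (lang value -> model).
LANG_TO_MODEL = {
    "ch": OcrModel("PP-OCRv4_rec_server_doc", False, "15k"),  # default
    "ch_server": OcrModel("PP-OCRv5_rec_server", True, "18k"),
    "ch_lite": OcrModel("PP-OCRv5_rec_mobile", True, "18k"),
    "ch_server_v4": OcrModel("PP-OCRv4_rec_server", False, "6k"),
    "ch_lite_v4": OcrModel("PP-OCRv4_rec_mobile", False, "6k"),
}


def pick_lang(handwritten: bool, lightweight: bool = False) -> str:
    """Hypothetical helper following the guidance in the notes:
    keep the default 'ch' unless the document is handwritten.
    The 'lightweight' option is an assumption, not from the notes."""
    if handwritten:
        return "ch_lite" if lightweight else "ch_server"
    return "ch"


def build_cli_args(pdf_path: str, output_dir: str, lang: str) -> list[str]:
    """Build an argument list for the command line tool.
    '-p' and '-o' are assumed flags; '--lang' is from the notes."""
    if lang not in LANG_TO_MODEL:
        raise ValueError(f"unknown lang value: {lang!r}")
    return ["magic-pdf", "-p", pdf_path, "-o", output_dir, "--lang", lang]


if __name__ == "__main__":
    lang = pick_lang(handwritten=True)
    print(lang, "->", LANG_TO_MODEL[lang].name)
    print(" ".join(build_cli_args("notes.pdf", "out", lang)))
```

In the Python API the same value would be passed as the keyword argument lang='ch_server', as noted above.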

Full Changelog: magic_pdf-1.3.11-released...magic_pdf-1.3.12-released
