What's Changed
2026/03/29 3.0.0 Released
This release delivers a systematic upgrade centered on parsing capability, system architecture, and engineering usability. The main updates include:
- Native `DOCX` parsing
  - Official support for native `DOCX` parsing, delivering high-precision results without hallucinations.
  - Compared with the traditional workflow of first converting `DOCX` to `PDF` and then parsing it, end-to-end speed is improved by tens of times, making it better suited for scenarios with high requirements for both accuracy and throughput.
- `pipeline` backend upgrade
  - The `pipeline` backend achieves a score of `86.2` on OmniDocBench (v1.5), surpassing the accuracy of the previous-generation mainstream VLM `MinerU2.0-2505-0.9B`.
  - Added support for parsing images/formulas inside tables, seal text recognition, vertical text, and interline formula numbering recognition, continuously improving parsing quality for complex document scenarios.
  - While maintaining high accuracy, it keeps resource usage extremely low and continues to support inference in pure CPU environments.
- `API / CLI / Router` orchestration upgrade
  - `mineru` now runs as an orchestration client based on `mineru-api`; when `--api-url` is not provided, it will automatically start a local temporary service.
  - `mineru-api` adds a new asynchronous task endpoint `POST /tasks`, supporting task submission, status querying, and result retrieval; meanwhile, it retains the synchronous parsing endpoint `POST /file_parse` for compatibility with legacy plugins.
  - Added `mineru-router`, designed for unified entry deployment and task routing across multiple services and multiple GPUs; its interfaces are fully compatible with `mineru-api` and support automatic task load balancing.
- Deployment and usability improvements
  - Resolved compatibility issues with `torch >= 2.8`; the base image has been upgraded to `vllm0.11.2 + torch2.9.0`, unifying installation paths across different Compute Capabilities.
  - Optimized the parsing pipeline with a sliding-window mechanism, significantly reducing peak memory usage in long-document scenarios, so documents with tens of thousands of pages no longer need to be split manually.
  - Batch inference in `pipeline` now supports streaming writes to disk, allowing completed parsing results to be written out in time and further improving the experience for long-running tasks.
  - Completed thread-safety optimization and now fully supports multi-threaded concurrent inference; together with `mineru-router`, this enables one-click multi-GPU deployment and makes it easy to build high-concurrency, high-throughput parsing systems.
  - Completely removed the use of two AGPLv3 models (`doclayoutyolo` and `mfd_yolov8`) and one CC-BY-NC-SA 4.0 model (`layoutreader`).
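To illustrate why windowed processing combined with streaming writes keeps peak memory flat on long documents, here is a minimal Python sketch. It is not MinerU's implementation: `iter_windows`, `parse_streaming`, the `parse_page` callback, the non-overlapping window layout, and the JSON Lines output format are all assumptions made for illustration.

```python
import json
from typing import Any, Callable, Dict, Iterator


def iter_windows(total_pages: int, window_size: int) -> Iterator[range]:
    """Yield consecutive, non-overlapping page ranges of at most window_size pages."""
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    for start in range(0, total_pages, window_size):
        yield range(start, min(start + window_size, total_pages))


def parse_streaming(
    total_pages: int,
    window_size: int,
    parse_page: Callable[[int], Dict[str, Any]],  # hypothetical per-page parser
    out_path: str,
) -> int:
    """Parse pages window by window and append each result to a JSONL file.

    Only one window of results is held in memory at a time, so peak memory
    depends on window_size rather than on the document length.
    Returns the number of pages written.
    """
    written = 0
    with open(out_path, "w", encoding="utf-8") as out:
        for window in iter_windows(total_pages, window_size):
            results = [parse_page(page_index) for page_index in window]
            for result in results:
                out.write(json.dumps(result, ensure_ascii=False) + "\n")
            out.flush()  # push finished windows to the OS before continuing
            written += len(results)
    return written
```

In this sketch a caller would pass its own page-level parsing function as `parse_page`; results for completed windows are already in the output file while later windows are still being processed.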
This update is not just a set of feature enhancements, but a key leap forward in MinerU's overall system capabilities. We specifically addressed the peak memory usage issue in long-document parsing. Through optimizations such as sliding windows and streaming writes to disk, ultra-long document parsing has moved from “requiring manual splitting and careful handling” to being “stable, scalable, and ready for production workloads.” At the same time, we completed thread-safety optimization and fully enabled multi-threaded concurrent inference, further improving single-machine resource utilization and runtime stability under high-concurrency workloads. On top of this, with `mineru-router` and the new `API / CLI` orchestration framework, MinerU now supports one-click multi-GPU deployment, unified access across multiple services, and automatic task load balancing, significantly reducing the difficulty of large-scale deployment. As a result, MinerU is evolving from a standalone data production tool into a large-scale document parsing foundation for high-concurrency and high-throughput scenarios, providing enterprise-grade document data processing with infrastructure that is more stable, more efficient, and easier to scale.
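As a rough picture of what automatic task load balancing across several services can look like, the following Python sketch routes each task to the backend with the fewest in-flight tasks and drives it from multiple threads. It is a generic least-loaded strategy, not the actual `mineru-router` algorithm; the class `LeastLoadedRouter`, the function `route_all`, and the backend labels are hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


class LeastLoadedRouter:
    """Send each task to the backend that currently has the fewest in-flight tasks."""

    def __init__(self, backends: List[str]) -> None:
        if not backends:
            raise ValueError("at least one backend is required")
        self._in_flight: Dict[str, int] = {backend: 0 for backend in backends}
        self._lock = threading.Lock()  # protects the counters across threads

    def acquire(self) -> str:
        """Pick the least-loaded backend (ties go to the earliest one) and reserve a slot."""
        with self._lock:
            backend = min(self._in_flight, key=self._in_flight.get)
            self._in_flight[backend] += 1
            return backend

    def release(self, backend: str) -> None:
        """Free the slot reserved by acquire()."""
        with self._lock:
            self._in_flight[backend] -= 1

    def snapshot(self) -> Dict[str, int]:
        """Return a copy of the current in-flight counters."""
        with self._lock:
            return dict(self._in_flight)

    def run(self, task: Callable[[str, str], str], payload: str) -> str:
        """Run task(backend, payload) on the chosen backend, always releasing the slot."""
        backend = self.acquire()
        try:
            return task(backend, payload)
        finally:
            self.release(backend)


def route_all(
    router: LeastLoadedRouter,
    task: Callable[[str, str], str],
    payloads: List[str],
    max_workers: int = 4,
) -> List[str]:
    """Submit all payloads concurrently; results come back in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda payload: router.run(task, payload), payloads))
```

Here `task` stands in for whatever call forwards a document to one service instance; the lock around the counters is what makes the routing decision safe when many threads submit work at once.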
New Contributors
- @boshi91 made their first contribution in #4523
- @Niujunbo2002 made their first contribution in #4662
Full Changelog: mineru-2.7.6-released...mineru-3.0.0-released