github opendatalab/MinerU mineru-3.1.15-released

9 hours ago

What's Changed

  • Improved Gradio preview and upload experience, including Office source-file preview links, clipboard file upload, clearer processing status, better i18n rendering, and extracted Gradio CSS/JS/header resources.
  • Fixed Gradio Markdown/HTML image previews to use served file URLs instead of embedded base64, improving preview compatibility without changing exported artifacts.
  • Improved Office parsing robustness, including DOCX table alignment, safer XML tag-name handling, embedded Office member normalization, and better DOCX table matching.
  • Enhanced XLSX package normalization for shared strings, styles, worksheets, and row-only auto filters to improve compatibility with non-standard files.
  • Optimized OCR/formula processing and image handling, including async OCR/formula execution, updated OCR defaults, larger image width limits, cached vector placeholders, and single-write image reuse.
  • Added Router API docs for POST request parameters in /file_parse and /tasks.
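
The image-preview fix above can be pictured with a small sketch. This is not MinerU's actual code: the function name `to_preview_markdown` and the `/files/` prefix are hypothetical, and the real URL prefix depends on how the Gradio app serves files. The idea is that the exported Markdown keeps its relative image paths, and only the copy rendered in the preview is rewritten to point at served file URLs, with no base64 embedding.

```python
import re
from urllib.parse import quote

# Matches Markdown image syntax: ![alt](target), target without spaces.
_IMG = re.compile(r"!\[([^\]]*)\]\(([^)\s]+)\)")
# Targets that already carry a scheme (https:, data:, ...) or are rooted paths.
_ABSOLUTE = re.compile(r"^(?:[a-z][a-z0-9+.-]*:|/)", re.IGNORECASE)


def to_preview_markdown(markdown: str, url_prefix: str) -> str:
    """Return a preview copy whose relative image targets use served URLs.

    `url_prefix` is a hypothetical route under which the app serves the
    output directory; the input string itself is never modified.
    """
    def repl(match: re.Match) -> str:
        alt, target = match.group(1), match.group(2)
        if _ABSOLUTE.match(target):
            return match.group(0)  # leave absolute URLs and data URIs alone
        return f"![{alt}]({url_prefix}{quote(target)})"

    return _IMG.sub(repl, markdown)


exported = "Figure 1\n\n![chart](images/fig1.png)"
preview = to_preview_markdown(exported, "/files/")
print(preview)
```

Because the rewrite produces a new string, the Markdown written to disk stays as exported, which matches the note that exported artifacts are unchanged.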

Full Changelog: mineru-3.1.14-released...mineru-3.1.15-released
