Security
office_oxide 0.1.2 → 0.1.3 (clears RUSTSEC-2026-0194 / RUSTSEC-2026-0195): the optional Office-document export path depended on office_oxide 0.1.2, whose transitive quick-xml 0.40 has an unbounded per-xmlns heap allocation in NsReader::push that a crafted DOCX/XLSX/PPTX could use to exhaust memory (a denial-of-service on untrusted input). office_oxide 0.1.3 upgrades to quick-xml 0.41, which bounds the allocation. pdf_oxide's own quick-xml was already 0.41; this bump closes the remaining transitive path so the dependency tree is advisory-clean.
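To check whether a project's own lockfile still carries the affected versions, something like the following sketch may help. It is not part of pdf_oxide: the helper names (`parse_version`, `find_outdated`) and the sample lockfile are hypothetical, and it assumes the usual Cargo.lock layout in which each package's `name` line is immediately followed by its `version` line. The minimum versions come from the note above.

```python
import re

# Minimum versions taken from the release note above.
MIN_SAFE = {"office_oxide": (0, 1, 3), "quick-xml": (0, 41, 0)}

def parse_version(text):
    """Turn '0.40.2' into (0, 40, 2); pre-release and build suffixes are ignored."""
    core = text.split("-")[0].split("+")[0]
    parts = [int(p) for p in core.split(".")]
    return tuple(parts + [0] * (3 - len(parts)))

def find_outdated(lock_text):
    """Return (name, version) pairs in a Cargo.lock that fall below MIN_SAFE."""
    hits = []
    # Assumption: 'name' is directly followed by 'version' in each [[package]] entry.
    for name, version in re.findall(r'name = "([^"]+)"\nversion = "([^"]+)"', lock_text):
        floor = MIN_SAFE.get(name)
        if floor is not None and parse_version(version) < floor:
            hits.append((name, version))
    return hits

# Hypothetical lockfile excerpt, for illustration only.
SAMPLE_LOCK = '''
version = 3

[[package]]
name = "office_oxide"
version = "0.1.2"

[[package]]
name = "quick-xml"
version = "0.40.0"

[[package]]
name = "quick-xml"
version = "0.41.0"
'''

print(find_outdated(SAMPLE_LOCK))
```

An empty list means neither crate appears below the versions named in this release. A dedicated advisory scanner remains the more reliable check.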
Fixed
- extract_words / extract_spans / extract_text_lines garbled text on rotated pages (#804): on rotated pages the spatial extractors clustered along the wrong axis and fused unrelated cells into giant tokens (a whole column returned as a single 1000+ character "word", separate rows fused into one line). Two independent root causes were fixed:
  - Page /Rotate 90/270 (§7.7.3.3). Span bounding boxes were mapped into the page's displayed frame before word/line clustering, but a span decomposes into characters by laying glyphs horizontally along its bbox with their raw advance widths, a representation that cannot express a run whose visual direction has become vertical. Every raw text row therefore collapsed onto one displayed band and perpendicular columns fused. Because the horizontal clustering is already correct in raw user space (and extract_chars already reports raw coordinates), 90°/270° pages now keep their span geometry in raw space; all four spatial APIs agree. (180° pages, where text stays horizontal, keep their existing mirror.)
  - Rotated text matrices (rotation_degrees = ±90: vertical column headers, chart-axis labels). A run drawn with a rotated text matrix advances along a rotated axis, but the extractor stores a span bbox flattened onto the x-axis (width = Σ advances, height = font), so adjacent rotated columns overlap and the reading-order word merge and y-band line grouping fused them. Rotated runs are now excluded from both the cross-span word merge and the line grouping; each stays its own word(s) and its own line.

  Thanks @ankursri494 for the report and the public, PII-free reproducers.
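The /Rotate 90/270 failure can be illustrated with a toy model. The sketch below is not the library's code: `to_displayed_rotate90` and `group_by_y_band` are hypothetical helpers, the 2×2 table and page width are made up, and the line grouping is a deliberately naive y-band bucket. It shows only the geometric effect: horizontal rows in raw user space become vertical in the displayed frame, so a y-band grouping applied there collects columns instead of rows.

```python
from collections import defaultdict

def to_displayed_rotate90(bbox, page_width):
    """Map a raw-user-space bbox into the frame of a page shown with /Rotate 90 (clockwise)."""
    x0, y0, x1, y1 = bbox
    # A raw point (x, y) lands at (y, page_width - x) after a clockwise quarter turn.
    return (y0, page_width - x1, y1, page_width - x0)

def group_by_y_band(spans, band=5.0):
    """Toy line grouping: spans whose bbox bottoms fall in the same y band share a line."""
    lines = defaultdict(list)
    for text, (x0, y0, x1, y1) in spans:
        lines[round(y0 / band)].append((x0, text))
    # Highest band first, left to right within a line.
    return [[t for _, t in sorted(items)] for _, items in sorted(lines.items(), reverse=True)]

PAGE_WIDTH = 200  # hypothetical raw page width, in points

# A 2x2 table laid out as horizontal rows in raw user space.
raw = [
    ("A1", (10, 100, 40, 110)), ("B1", (110, 100, 140, 110)),
    ("A2", (10, 80, 40, 90)),   ("B2", (110, 80, 140, 90)),
]
displayed = [(t, to_displayed_rotate90(b, PAGE_WIDTH)) for t, b in raw]

print(group_by_y_band(raw))        # rows stay together
print(group_by_y_band(displayed))  # columns are grouped as if they were lines
```

In raw space the grouping yields the two table rows; in the displayed frame the same grouping returns each column as one "line", which mirrors the fused output described in #804.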
Thanks to @ankursri494 (#804) for reporting the issue that drove this release.
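For the rotated-text-matrix fix, the exclusion rule can be sketched as below. This is an assumption-laden illustration, not the extractor's implementation: it derives a rotation from the first two entries of a text matrix [a b c d e f] and sets ±90° runs aside so they never enter a merge step. `split_rotated` and the sample runs are hypothetical; only the name `rotation_degrees` comes from the note above.

```python
import math

def rotation_degrees(a, b):
    """Rotation of a text matrix [a b c d e f], taken from where the unit x-vector maps: (a, b)."""
    return round(math.degrees(math.atan2(b, a)))

def split_rotated(runs):
    """Separate upright runs (eligible for merging) from ±90° runs (each kept on its own)."""
    upright, isolated = [], []
    for text, (a, b) in runs:
        if rotation_degrees(a, b) in (90, -90):
            isolated.append(text)
        else:
            upright.append(text)
    return upright, isolated

# Hypothetical runs: (text, (a, b)) with a, b from the run's text matrix.
runs = [
    ("Total", (1, 0)),   # upright body text
    ("Q1", (0, 1)),      # vertical column header, +90°
    ("Q2", (0, 1)),      # adjacent vertical column header, +90°
    ("axis", (0, -1)),   # chart-axis label, -90°
]

upright, isolated = split_rotated(runs)
print(upright)
print(isolated)
```

Only the upright list would be passed to word merging and line grouping in this model; every isolated run is emitted as its own word and its own line.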
Installation
Rust (crates.io)
cargo add pdf_oxide
Python (PyPI)
pip install pdf_oxide
JavaScript/WASM (npm)
npm install pdf-oxide-wasm
CLI (Homebrew)
brew install yfedoseev/tap/pdf-oxide
CLI (Scoop, Windows)
scoop bucket add pdf-oxide https://github.com/yfedoseev/scoop-pdf-oxide
scoop install pdf-oxide
CLI (Shell installer)
curl -fsSL https://raw.githubusercontent.com/yfedoseev/pdf_oxide/main/install.sh | sh
CLI (cargo-binstall)
cargo binstall pdf_oxide_cli
MCP Server (for AI assistants)
cargo install pdf_oxide_mcp
Pre-built Binaries
Download archives for Linux, macOS, and Windows from the assets below. Each archive includes both pdf-oxide (CLI) and pdf-oxide-mcp (MCP server).
Platform Support
| Platform | Architecture | Archive |
|---|---|---|
| Linux | x86_64 (glibc) | pdf_oxide-linux-x86_64-*.tar.gz |
| Linux | x86_64 (musl) | pdf_oxide-linux-x86_64-musl-*.tar.gz |
| Linux | ARM64 | pdf_oxide-linux-aarch64-*.tar.gz |
| macOS | x86_64 (Intel) | pdf_oxide-macos-x86_64-*.tar.gz |
| macOS | ARM64 (Apple Silicon) | pdf_oxide-macos-aarch64-*.tar.gz |
| Windows | x86_64 | pdf_oxide-windows-x86_64-*.zip |
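When scripting a download, the table can be turned into a small lookup. The following is a sketch rather than an official helper: `archive_pattern` is a hypothetical name, the alias tables reflect the strings Python's `platform` module commonly reports, and the patterns are copied from the table with `*` left in place of the version.

```python
import platform

# Common platform.machine() / platform.system() spellings mapped to the names used in the table.
ARCH_ALIASES = {"x86_64": "x86_64", "amd64": "x86_64", "aarch64": "aarch64", "arm64": "aarch64"}
OS_NAMES = {"linux": "linux", "darwin": "macos", "windows": "windows"}

def archive_pattern(system, machine, musl=False):
    """Return the archive glob from the platform table, or None if no row matches."""
    os_name = OS_NAMES.get(system.lower())
    arch = ARCH_ALIASES.get(machine.lower())
    if os_name is None or arch is None:
        return None
    if os_name == "windows":
        # The table lists only an x86_64 build for Windows.
        return "pdf_oxide-windows-x86_64-*.zip" if arch == "x86_64" else None
    # The table lists a musl variant only for Linux x86_64.
    suffix = "-musl" if (musl and os_name == "linux" and arch == "x86_64") else ""
    return f"pdf_oxide-{os_name}-{arch}{suffix}-*.tar.gz"

# Pattern for the machine running this script (may be None on unlisted platforms).
print(archive_pattern(platform.system(), platform.machine()))
```

The `musl` flag is left to the caller because the C library in use is not something `platform.machine()` reveals.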
Changelog
See CHANGELOG.md for full details.