What's New
Bounding Box Support
- bounding_box on Table and ExtractedImage: Spatial position data (BoundingBox with x0, y0, x1, y1) is now available on both types across all 10 language bindings (Rust, Python, TypeScript, Ruby, PHP, Go, Java, C#, Elixir, WASM).
- Table bounding boxes computed from PDF character positions: During PDF extraction, table bounding boxes are calculated from constituent character positions for precise spatial layout.
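A minimal sketch of how a table box could be derived from character positions. These notes do not spell out the actual computation, so the min/max union below is an assumption. The `BoundingBox` dataclass is a hypothetical stand-in that only mirrors the four field names listed above, and `union_of_boxes` is an illustrative helper, not part of the library API.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class BoundingBox:
    """Hypothetical stand-in with the four fields named in the notes."""
    x0: float
    y0: float
    x1: float
    y1: float


def union_of_boxes(boxes: Iterable[BoundingBox]) -> Optional[BoundingBox]:
    """Return the smallest box enclosing every input box, or None if empty."""
    boxes = list(boxes)
    if not boxes:
        return None
    return BoundingBox(
        x0=min(b.x0 for b in boxes),
        y0=min(b.y0 for b in boxes),
        x1=max(b.x1 for b in boxes),
        y1=max(b.y1 for b in boxes),
    )


# Three character boxes belonging to one (made-up) table.
chars = [
    BoundingBox(10.0, 20.0, 15.0, 30.0),
    BoundingBox(16.0, 20.0, 22.0, 30.0),
    BoundingBox(10.0, 32.0, 14.0, 42.0),
]
table_box = union_of_boxes(chars)
```

Here `table_box` spans from the left-most and lowest-valued edges of any character to the right-most and highest-valued ones, which is why a box built this way tracks the rendered cell text closely.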
Inline Markdown Embedding
- Tables embedded inline in PDF markdown output: Tables appear at their correct vertical position instead of being appended at the end, with character deduplication to prevent text appearing both as paragraphs and inside tables.
- Image placeholders in PDF markdown output: Image references are injected, with OCR text as blockquotes when available.
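The inline placement described above can be illustrated with a short sketch. It is not the library's implementation: `Block`, `inside`, and `render_inline` are hypothetical names, the y axis is assumed to grow downward, and the character-level deduplication from the notes is simplified here to dropping whole paragraphs whose center falls inside a table box.

```python
from dataclasses import dataclass
from typing import List, Tuple

# (x0, y0, x1, y1); y is assumed to grow downward from the top of the page.
Box = Tuple[float, float, float, float]


@dataclass
class Block:
    kind: str      # "paragraph" or "table"
    box: Box
    markdown: str


def inside(inner: Box, outer: Box) -> bool:
    """True if the center of `inner` lies within `outer`."""
    cx = (inner[0] + inner[2]) / 2
    cy = (inner[1] + inner[3]) / 2
    return outer[0] <= cx <= outer[2] and outer[1] <= cy <= outer[3]


def render_inline(blocks: List[Block]) -> str:
    """Order blocks top to bottom, skipping paragraphs duplicated by a table."""
    tables = [b for b in blocks if b.kind == "table"]
    kept = [
        b for b in blocks
        if b.kind == "table" or not any(inside(b.box, t.box) for t in tables)
    ]
    kept.sort(key=lambda b: b.box[1])  # sort by top edge
    return "\n\n".join(b.markdown for b in kept)


blocks = [
    Block("paragraph", (0, 0, 100, 10), "Intro"),
    Block("paragraph", (0, 60, 100, 70), "Outro"),
    Block("table", (0, 20, 100, 50), "| a | b |\n|---|---|\n| 1 | 2 |"),
    Block("paragraph", (5, 30, 40, 40), "1 2"),  # same text as the table cells
]
output = render_inline(blocks)
```

In this example the table lands between "Intro" and "Outro" rather than at the end, and the stray "1 2" paragraph is dropped because it overlaps the table region.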
Bug Fixes
- PHP FFI bridge bounding_box passthrough: Fixed the Rust-PHP bridge to properly convert bounding boxes instead of always returning null.
- Pipeline test flakiness: Fixed test_pipeline_without_chunking and related tests that failed due to global processor cache poisoning in parallel execution.
See CHANGELOG.md for full details.