Added
- Comprehensive test suites for all language bindings:
- Python: 34 tests covering type verification, batch APIs, byte extraction, MIME detection, OCR, all file types
- Node.js: 79 tests for type verification, batch APIs, MIME detection, configuration, error handling
- Ruby: 55 RSpec tests for batch APIs, byte extraction, type verification, configuration handling
- Java: 85 JUnit 5 tests with FFM API memory management, concurrency, all file format support
- WASM: 79 tests with performance validation, large document handling, concurrent operations
- Go: 63 table-driven tests with context support, error wrapping with `errors.Is()`, all file types
- All test suites verify batch extraction APIs (sync/async), type safety, result structure validation
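As a rough illustration of what "result structure validation" for a batch API can look like, here is a minimal Python sketch. `FakeResult`, `fake_batch_extract`, and `validate_batch` are hypothetical stand-ins written for this example; they are not the library's classes or functions, and the real suites may check different fields.

```python
from dataclasses import dataclass, field


@dataclass
class FakeResult:
    """Hypothetical stand-in for an extraction result (not the library's class)."""
    content: str
    mime_type: str
    metadata: dict = field(default_factory=dict)


def fake_batch_extract(payloads: list[tuple[bytes, str]]) -> list[FakeResult]:
    """Hypothetical batch extractor: decodes each payload as UTF-8 text."""
    return [
        FakeResult(content=data.decode("utf-8"), mime_type=mime)
        for data, mime in payloads
    ]


def validate_batch(results: list[FakeResult], expected_count: int) -> None:
    """Assert one result per input and the expected type for each field."""
    assert len(results) == expected_count
    for result in results:
        assert isinstance(result.content, str)
        assert isinstance(result.mime_type, str)
        assert isinstance(result.metadata, dict)


if __name__ == "__main__":
    inputs = [(b"hello", "text/plain"), (b"# title", "text/markdown")]
    validate_batch(fake_batch_extract(inputs), expected_count=len(inputs))
```

Keeping the structural checks in one helper lets the same assertions run against both a sync and an async batch call.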
Fixed
- NuGet publish workflow reliability: Replaced NuGet/login OIDC-based authentication with direct API key approach
- Issue: NuGet/login action could fail with 401 errors due to OIDC token context limitations (see NuGet/login#6)
- Solution: Removed the NuGet/login step; the API key is now passed directly via the `NUGET_AUTH_TOKEN` environment variable and the `--api-key` parameter
- Impact: More reliable C# package publishing without dependency on OIDC token exchange
- Requires: `NUGET_API_KEY` secret to be configured in GitHub repository settings
- LibreOffice installation in Docker full image: Updated LibreOffice from 25.8.2 to 25.8.4
- Version 25.8.2 download URLs were no longer available on DocumentFoundation servers
- Updated to latest stable release 25.8.4 (released Dec 18, 2025)
- Verified working for Office document extraction (DOCX, XLSX, ODT)
- Tested on both x86_64 and aarch64 architectures
- Python IDE type completions: Fixed missing type hints in IDE autocomplete
- Root cause: `_internal_bindings.pyi` stub file was not being included in wheel distribution
- Solution: Added `ensure_stub_file()` function in `packages/python/build.py` to verify and include stub file in all build outputs
- Impact: Full autocomplete now works for all 67 public APIs, type checkers can find definitions, mypy strict mode compatible
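The following sketch shows one way to confirm that a built wheel actually ships the stub file. It is not the project's `ensure_stub_file()` implementation; `wheel_contains_stub`, `check_wheel`, `build_demo_wheel`, and the `demo_pkg` package name are hypothetical and exist only for this example. It relies on the fact that a wheel is a zip archive.

```python
import tempfile
import zipfile
from pathlib import Path

STUB_NAME = "_internal_bindings.pyi"  # stub file named in the entry above


def wheel_contains_stub(wheel_path: Path, stub_name: str = STUB_NAME) -> bool:
    """Return True if any archive member's file name equals stub_name."""
    with zipfile.ZipFile(wheel_path) as wheel:
        return any(Path(name).name == stub_name for name in wheel.namelist())


def check_wheel(wheel_path: Path) -> None:
    """Raise if the stub is missing, so a build script can fail early."""
    if not wheel_contains_stub(wheel_path):
        raise FileNotFoundError(f"{STUB_NAME} missing from {wheel_path.name}")


def build_demo_wheel(path: Path, include_stub: bool) -> Path:
    """Write a tiny wheel-like zip for demonstration (package name is made up)."""
    with zipfile.ZipFile(path, "w") as wheel:
        wheel.writestr("demo_pkg/__init__.py", "")
        if include_stub:
            wheel.writestr(f"demo_pkg/{STUB_NAME}", "def extract() -> str: ...\n")
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        good = build_demo_wheel(Path(tmp) / "good.whl", include_stub=True)
        check_wheel(good)  # passes silently because the stub is present
```

Running a check like this against every build output catches a missing stub before the package is published, rather than when a user notices that autocomplete is gone.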
- Ruby gem native extension compilation: Fixed vendoring of Rust crates during build
- Added automatic vendoring task to `packages/ruby/Rakefile` that runs before compilation
- Ensures `vendor/kreuzberg`, `vendor/kreuzberg-ffi`, and `vendor/kreuzberg-tesseract` are properly copied and version-updated before building native extension
- Python `ExtractionResult.pages` type hints: Fixed missing type definition in PyO3 stub file
- Root cause: `_internal_bindings.pyi` was missing `pages` field declaration in `ExtractionResult` class
- Added `pages: list[PageContent] | None` attribute and `PageContent` TypedDict definition
- Impact: IDEs now properly show autocomplete for `result.pages`, type checkers recognize the attribute
- Fixes `TypeError: 'NoneType' object is not iterable` confusion when users iterate without checking for None