Fixed
- PDFium bundling: Now correctly bundled in all language bindings (Node.js, Python, Java, Ruby, Go, C#)
- FFI library copies `libpdfium.dylib/.so/.dll` from Rust build output during packaging
- C# Kreuzberg.csproj now includes build target to copy native libraries to runtimes directories for all platforms
- Node.js package.json includes all native library extensions (`*.dylib`, `*.so`, `*.dll`)
- Fixes PDF extraction failures with "libpdfium not found" error
- C# bindings: Added native library bundling with PDFium support for all platforms
- Build target in Kreuzberg.csproj copies libkreuzberg_ffi and libpdfium to runtimes/{platform}/native directories
- Supports macOS (arm64/x64), Linux (x64/arm64), Windows (x64/arm64)
- Smoke test suite created in test_apps/csharp with 7 tests (PDF, DOCX, XLSX, JPG, PNG + OCR tests)
- All C# tests passing with bundled PDFium
- Rust core: Fixed missing Path import in pdf.rs causing compilation errors
- Added `use std::path::Path;` to support async file extraction in PDF extractor
- Node.js (NAPI-RS): PDFium always included in npm packages (no longer conditional)
- Ruby gems: Fixed gem publishing validation error caused by incorrect compression handling
- Root cause: Gems are POSIX tar archives with gzipped internal files (metadata.gz, data.tar.gz, checksums.yaml.gz) - this is the standard RubyGems format
- Removed broken manual gzip step in publish script that was double-compressing valid gems
- `gem spec` validation now passes directly on gems produced by `bundle exec rake build`
- Go bindings: Removed duplicate Windows CGO linker flags causing compilation failures
- Fixed `packages/go/v4/ffi.go` and `packages/go/v4/plugins_test_helpers.go` to use environment-set flags
- Smoke test suite created in test_apps/go with 7 tests (PDF, DOCX, XLSX, JPG, PNG + OCR tests)
- All Go tests passing with bundled PDFium via CGO/pkg-config
- WASM (Deno): Fixed type definition references from `.d.mts` to `.d.ts`
- Corrects Deno test helper type imports
- C# NuGet: Fixed artifact download path to preserve native runtime directory structure
- Java FFI: Added system library path fallback for ONNX Runtime when not bundled in JAR
- Enables users with system-installed ONNX Runtime (e.g., `brew install onnxruntime`) to use the library
- Gracefully handles missing ONNX Runtime for operations that don't require embeddings
- Smoke tests: All 7 tests now passing across all five language bindings (Java, Python, Node.js, C#, Go)
- PDF, DOCX, XLSX, JPG, PNG extraction + OCR tests all working
- Verified test suite created for each binding in test_apps/{java,python,node,csharp,go}
- WASM: Added PDF support to `wasm-target` feature for browser and Node.js WASM targets
- Fixed build.rs to use bundled-pdfium for WASM instead of system-pdfium
- Fixed PDF extractor to handle WASM synchronously (no tokio::spawn_blocking in WASM context)
- Fixed PDF bindings initialization for WASM using system library binding
- WASM test generator: Fixed hardcoded "application/pdf" MIME types in generated tests
- Now correctly uses actual fixture media_type for each document format (DOCX, XLSX, HTML, etc.)
- Regenerated all WASM e2e tests with correct MIME types
- LibreOffice tests: Skipped on Windows to prevent CI hangs due to missing soffice binary
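
For readers unfamiliar with the gem layout behind the Ruby gems fix above, the following is a minimal Python sketch, not the project's publish script. It builds an in-memory archive shaped like a gem (a plain tar holding individually gzipped members) and shows that wrapping the whole archive in another gzip layer stops it from opening as a tar. The helper names and the placeholder payloads are hypothetical and chosen only for illustration.

```python
import gzip
import io
import tarfile

# Member names taken from the changelog entry; contents here are placeholders.
GEM_MEMBERS = ("metadata.gz", "data.tar.gz", "checksums.yaml.gz")


def build_gem_like_archive() -> bytes:
    """Build an uncompressed tar whose members are each gzipped, like a .gem file."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:  # no outer compression
        for name in GEM_MEMBERS:
            payload = gzip.compress(b"placeholder")  # each member is compressed on its own
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            archive.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


def is_plain_tar(data: bytes) -> bool:
    """Return True if the bytes open as an uncompressed tar archive."""
    try:
        # Mode "r:" refuses compressed input, so a gzip wrapper is rejected.
        with tarfile.open(fileobj=io.BytesIO(data), mode="r:"):
            return True
    except tarfile.TarError:
        return False


def member_names(data: bytes) -> list:
    """List member names of an uncompressed tar archive."""
    with tarfile.open(fileobj=io.BytesIO(data), mode="r:") as archive:
        return archive.getnames()


gem_bytes = build_gem_like_archive()
# Simulates an extra manual gzip pass over an already valid archive.
double_compressed = gzip.compress(gem_bytes)

print("valid layout opens as tar:", is_plain_tar(gem_bytes))
print("extra gzip layer opens as tar:", is_plain_tar(double_compressed))
```

The check is deliberately strict about the outer container: tools that expect the standard layout read the tar header first, so an extra compression layer around an otherwise valid archive is enough to fail validation.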