github kreuzberg-dev/kreuzberg v4.0.0-rc.11

pre-release, 9 hours ago

Fixed

  • PDFium bundling: Now correctly bundled in all language bindings (Node.js, Python, Java, Ruby, Go, C#)
    • FFI library copies libpdfium.dylib/.so/.dll from Rust build output during packaging
    • C# Kreuzberg.csproj now includes build target to copy native libraries to runtimes directories for all platforms
    • Node.js package.json includes all native library extensions (*.dylib, *.so, *.dll)
    • Fixes PDF extraction failures with "libpdfium not found" error
  • C# bindings: Added native library bundling with PDFium support for all platforms
    • Build target in Kreuzberg.csproj copies libkreuzberg_ffi and libpdfium to runtimes/{platform}/native directories
    • Supports macOS (arm64/x64), Linux (x64/arm64), Windows (x64/arm64)
    • Smoke test suite created in test_apps/csharp with 7 tests (PDF, DOCX, XLSX, JPG, PNG + OCR tests)
    • All C# tests passing with bundled PDFium
  • Rust core: Fixed missing Path import in pdf.rs causing compilation errors
    • Added use std::path::Path; to support async file extraction in PDF extractor
  • Node.js (NAPI-RS): PDFium always included in npm packages (no longer conditional)
  • Ruby gems: Fixed gem publishing validation error caused by incorrect compression handling
    • Root cause: Gems are POSIX tar archives with gzipped internal files (metadata.gz, data.tar.gz, checksums.yaml.gz) - this is the standard RubyGems format
    • Removed broken manual gzip step in publish script that was double-compressing valid gems
    • gem spec validation now passes directly on gems produced by bundle exec rake build
  • Go bindings: Removed duplicate Windows CGO linker flags causing compilation failures
    • Fixed packages/go/v4/ffi.go and packages/go/v4/plugins_test_helpers.go to use environment-set flags
    • Smoke test suite created in test_apps/go with 7 tests (PDF, DOCX, XLSX, JPG, PNG + OCR tests)
    • All Go tests passing with bundled PDFium via CGO/pkg-config
  • WASM (Deno): Fixed type definition references from .d.mts to .d.ts
    • Corrects Deno test helper type imports
  • C# NuGet: Fixed artifact download path to preserve native runtime directory structure
  • Java FFI: Added system library path fallback for ONNX Runtime when not bundled in JAR
    • Enables users with system-installed ONNX Runtime (e.g., brew install onnxruntime) to use the library
    • Gracefully handles missing ONNX Runtime for operations that don't require embeddings
  • Smoke tests: All 7 tests now passing across all five language bindings (Java, Python, Node.js, C#, Go)
    • PDF, DOCX, XLSX, JPG, PNG extraction + OCR tests all working
    • Verified test suite created for each binding in test_apps/{java,python,node,csharp,go}
  • WASM: Added PDF support to wasm-target feature for browser and Node.js WASM targets
    • Fixed build.rs to use bundled-pdfium for WASM instead of system-pdfium
    • Fixed PDF extractor to handle WASM synchronously (no tokio::task::spawn_blocking in WASM context)
    • Fixed PDF bindings initialization for WASM using system library binding
  • WASM test generator: Fixed hardcoded "application/pdf" MIME types in generated tests
    • Now correctly uses actual fixture media_type for each document format (DOCX, XLSX, HTML, etc.)
    • Regenerated all WASM e2e tests with correct MIME types
  • LibreOffice tests: Skipped on Windows to prevent CI hangs due to missing soffice binary
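
The C# bundling entry above places native libraries under runtimes/{platform}/native. As a rough illustration of how such a directory could be derived for the six listed platforms, here is a minimal Python sketch. It assumes {platform} follows the standard .NET runtime identifiers (osx-arm64, linux-x64, win-x64, and so on); the release notes do not spell out the exact names, and the function name runtime_native_dir is hypothetical.

```python
from pathlib import PurePosixPath

# Assumed mapping from platform.system() values to .NET RID OS prefixes.
_OS_PREFIX = {"Darwin": "osx", "Linux": "linux", "Windows": "win"}
# Assumed mapping from platform.machine() values to .NET RID architectures.
_ARCH_SUFFIX = {"arm64": "arm64", "aarch64": "arm64", "x86_64": "x64", "amd64": "x64"}


def runtime_native_dir(system: str, machine: str) -> PurePosixPath:
    """Return the runtimes/<rid>/native directory for an OS/architecture pair."""
    os_prefix = _OS_PREFIX.get(system)
    arch_suffix = _ARCH_SUFFIX.get(machine.lower())
    if os_prefix is None or arch_suffix is None:
        raise ValueError(f"unsupported platform: {system}/{machine}")
    return PurePosixPath("runtimes") / f"{os_prefix}-{arch_suffix}" / "native"
```

On a live machine the arguments would typically come from platform.system() and platform.machine() in the standard library.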
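
The Ruby gems entry above says that a valid gem is a plain tar archive whose members are gzipped, and that the publish script broke gems by gzipping them a second time. The sketch below shows one way to tell the two cases apart in Python. It is not the project's publish script: classify_gem and build_fake_gem are hypothetical helpers, and the fake gem only imitates the member names listed in the entry.

```python
import gzip
import io
import tarfile

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of any gzip stream
EXPECTED_MEMBERS = {"metadata.gz", "data.tar.gz", "checksums.yaml.gz"}


def classify_gem(data: bytes) -> str:
    """Return 'gzip-wrapped', 'tar', or 'unknown' for the raw bytes of a .gem file."""
    if data[:2] == GZIP_MAGIC:
        # The outer container is gzip, so the gem was compressed a second time.
        return "gzip-wrapped"
    try:
        # Mode "r:" accepts only uncompressed tar archives.
        with tarfile.open(fileobj=io.BytesIO(data), mode="r:") as archive:
            names = set(archive.getnames())
    except tarfile.TarError:
        return "unknown"
    return "tar" if EXPECTED_MEMBERS <= names else "unknown"


def build_fake_gem() -> bytes:
    """Build an in-memory stand-in for a gem: a tar archive with gzipped members."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:") as archive:
        for name in sorted(EXPECTED_MEMBERS):
            payload = gzip.compress(b"placeholder")
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            archive.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()
```

A check like this could run before publishing: a result of "gzip-wrapped" indicates the extra compression step described in the entry.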
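
The WASM test generator entry above replaces a hardcoded "application/pdf" with each fixture's own media_type. A minimal Python sketch of that idea follows. The Fixture class, the render_extract_call function, and the runExtraction call it emits are all hypothetical names chosen for illustration; the real generator and the API of the generated tests are not shown in these notes.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Fixture:
    document_path: str
    media_type: str  # the MIME type recorded for this fixture


def render_extract_call(fixture: Fixture) -> str:
    """Render one line of a generated test, using the fixture's own MIME type."""
    # json.dumps yields a double-quoted, escaped string literal.
    path_literal = json.dumps(fixture.document_path)
    mime_literal = json.dumps(fixture.media_type)
    # runExtraction is a hypothetical helper in the generated test file.
    return f"runExtraction({path_literal}, {mime_literal});"
```

The point of the design is that the MIME type travels with the fixture data, so adding a new document format does not require touching the template.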
