Added
- CPU profiling infrastructure for performance analysis (#242):
- New profiling.yaml GitHub Actions workflow for automated profiling on PRs
- Signal-based sampling profiler with pprof at 1000 Hz (~1-3% overhead)
- SVG flamegraph generation with HTML gallery visualization
- Fixture filtering via PROFILING_FIXTURES env var (81% CI time reduction)
- Feature-gated compilation with zero overhead when disabled
- generate-flamegraph-index CLI subcommand for interactive flamegraph gallery
- 13 Kreuzberg binding jobs (native, Python, Node, WASM, Ruby, Go, Java, C#)
- Expected PR profiling runtime: 15-25 minutes (vs 120+ min for full benchmarks)
Performance
- FFI batch operations for 4-6x throughput gain (#242):
- Implemented batch streaming APIs in kreuzberg-ffi for amortized FFI overhead
- Ruby and Java batch extraction now processes multiple documents per FFI call
- Result pooling to reduce allocation overhead in high-throughput scenarios
- Zero-copy result views for read-only access to extraction results
- String interning for deduplicated metadata strings across batch results
- C# comprehensive optimizations (#242):
- Session 1: Quick win optimizations (method inlining, struct layout)
- Session 3: JSON serialization with source generation (100-200ms gain)
- Session 4: Batch operation tests for TypeScript and C#
- Session 7: Source generation validation and final optimizations
- GC handle pooling for reduced managed-native transitions
- Custom JSON serializer context for zero-reflection serialization
- Core performance improvements (#242):
- PDF text extraction optimizations (reduced allocations, better buffering)
- Token reduction benchmarks and SIMD text processing
- OCR language registry for faster language detection lookups
- UTF-8 validation optimizations for text quality processing
- String pooling for deduplicated text content across documents
- Object pooling utilities for allocation-heavy operations
- Batch pooling benchmarks demonstrating 2-3x throughput improvements
- TypeScript/Node.js batch APIs (#242):
- Config validation optimizations
- Type system improvements for batch operations
- Integration tests for concurrent batch processing
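
For readers unfamiliar with the string interning mentioned under the FFI batch operations above, the general idea can be sketched as follows. This is an illustrative sketch only, not the kreuzberg-ffi implementation: the `Interner` type and its methods are hypothetical names introduced for this example, and the real code may use a different data structure entirely.

```rust
use std::collections::HashSet;
use std::sync::Arc;

/// Hypothetical interner (not the actual kreuzberg-ffi type): each distinct
/// string is allocated once and shared by every result that refers to it.
#[derive(Default)]
pub struct Interner {
    seen: HashSet<Arc<str>>,
}

impl Interner {
    /// Returns a shared handle for `s`, allocating only the first time
    /// a given string value is seen.
    pub fn intern(&mut self, s: &str) -> Arc<str> {
        if let Some(existing) = self.seen.get(s) {
            // Already interned: hand out another reference to the same allocation.
            return Arc::clone(existing);
        }
        let shared: Arc<str> = Arc::from(s);
        self.seen.insert(Arc::clone(&shared));
        shared
    }

    /// Number of distinct strings currently stored.
    pub fn distinct(&self) -> usize {
        self.seen.len()
    }
}
```

In a batch where many documents carry the same metadata value (a MIME type, for instance), repeated calls to `intern` return handles to a single allocation, so the cost of duplicated metadata grows with the number of distinct values and not with the number of documents.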
Fixed
- Python type stub file packaging: Fixed .pyi stub files not being included in wheel distributions
- Java CI Maven version mismatch: Fixed CI workflow failing with Maven 3.9.11 when project requires Maven 4.0.0-rc-4+
- Go Windows CI linking failure: Fixed duplicate CGO_LDFLAGS causing linker errors on Windows
- Ruby gem Linux/Windows build linking failure: Fixed missing link search path in Magnus FFI bindings build.rs
- Rust LibreOffice tests timeout on Windows CI: Added ignore attribute to skip legacy Office tests on Windows
- Ruby gem publish Zlib corruption: Fixed gem file corruption during GitHub Actions artifact transfer
- WASM compilation errors: Fixed dead code warnings for large stack functions
- WASM Deno test failures: Resolved test failures for HTML table detection and XML metadata
- OpenSSL cache warnings: Eliminated CI warnings for missing OpenSSL cache paths
- FFI header type declarations: Corrected cbindgen configuration for ExtractionResult opaque typedef
- Ruby type signatures: Added missing RBS signatures for ErrorContext methods
- Cargo workspace profiles: Removed profile override from benchmark-harness
Full changelog: https://github.com/kreuzberg-dev/kreuzberg/blob/main/CHANGELOG.md#400-rc16---2025-12-21