LTS patch release. Four targeted bug fixes plus dependency pinning so the
branch builds against current crates.io releases.
Fixed
- #934: RTF hex byte escapes now honor `\ansicpgNNNN`, so CP1251 Cyrillic byte runs decode as readable text instead of Windows-1252 mojibake.
- #937: `ExtractionConfig(cancel_token=…)` raised `TypeError: unexpected keyword argument 'cancel_token'` from Python despite the type stub declaring the kwarg. The `#[pyo3(signature = …)]` on `ExtractionConfig::__new__` did not list `cancel_token` and the constructor body hard-coded it to `None`. The kwarg is now accepted and threaded through to the underlying `kreuzberg::CancellationToken`. Post-construct attribute assignment (`cfg.cancel_token = CancellationToken()`) continues to work as before.
- #965: C# `OcrConfig` was missing the `VlmConfig` property and the `LlmConfig` type was undefined anywhere in the assembly, despite both being documented and present in the Rust core. Added `LlmConfig` (`Model`, `ApiKey`, `BaseUrl`, `TimeoutSecs`, `MaxRetries`, `Temperature`, `MaxTokens`) and `OcrConfig.VlmConfig`; registered `LlmConfig` in `KreuzbergJsonContext` so source-generated serialization works.
- #991: The musl CLI tarball (`kreuzberg-cli-*-unknown-linux-musl.tar.gz`) bundled `libonnxruntime.so.1.24.4` but not its transitive deps (`libprotobuf-lite.so.31`, `libre2.so.11`, `libabsl_log_internal_check_op.so.2508.0.0`). The launcher invokes the musl loader with `--library-path lib/`, which replaces (not extends) the loader's search path, so the binary failed at startup on any host. `docker/Dockerfile.musl-build` now recursively `ldd`-walks every bundled `.so`, copies missing deps into `lib/`, and smoke-tests the loader against each; the build now fails if any unresolved dep remains.
- Build compatibility: pin `tokenizers = "=0.22.2"` (text-splitter 0.30 `ChunkSizer` impl + `add_special_tokens` signature broke at 0.23), pin `v_htmlescape = "=0.15.8"` (0.17 renamed `escape` fn to `Escape` struct), drop the removed `ProcessConfig.extractions` field, and migrate three `#[ctor::ctor]` sites to `#[ctor::ctor(unsafe)]` as required by `ctor` 0.5+.
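
To illustrate the #934 fix: an RTF `\'hh` escape carries one raw byte, and `\ansicpgNNNN` names the code page those bytes belong to. The following is a minimal Python sketch, not the library's RTF parser. `decode_hex_escapes` is a hypothetical helper, and it assumes the fragment consists only of `\'hh` escapes.

```python
import re

# Hypothetical helper for illustration only; not part of the kreuzberg API.
def decode_hex_escapes(rtf_fragment: str, ansicpg: int = 1252) -> str:
    """Decode a run of RTF \\'hh escapes using the given ANSI code page."""
    # Collect every \'hh escape as one raw byte.
    raw = bytes(
        int(h, 16) for h in re.findall(r"\\'([0-9a-fA-F]{2})", rtf_fragment)
    )
    # Python exposes Windows code pages under the codec name "cpNNNN".
    return raw.decode(f"cp{ansicpg}")

fragment = r"\'cf\'f0\'e8\'e2\'e5\'f2"

# Honoring \ansicpg1251 yields readable Cyrillic.
honoring_ansicpg = decode_hex_escapes(fragment, ansicpg=1251)  # "Привет"

# Falling back to Windows-1252 yields mojibake for the same bytes.
ignoring_ansicpg = decode_hex_escapes(fragment, ansicpg=1252)  # "Ïðèâåò"
```

The same six bytes produce either result; only the code page chosen for decoding differs.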
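
The recursive dependency walk described for #991 can be sketched as a transitive closure over each library's direct dependencies. This is a hedged Python illustration, not the contents of `docker/Dockerfile.musl-build`. `unresolved_deps` is a hypothetical name, the `direct_deps` mapping stands in for parsed `ldd` output, and the dependency edges between the libraries are assumed sample data.

```python
from collections import deque

# Hypothetical helper for illustration; the real build walks `ldd` output.
def unresolved_deps(bundled, direct_deps):
    """Return libraries reachable from `bundled` that are not bundled themselves."""
    seen = set(bundled)
    queue = deque(bundled)
    missing = set()
    while queue:
        lib = queue.popleft()
        for dep in direct_deps.get(lib, ()):
            if dep in seen:
                continue
            seen.add(dep)
            missing.add(dep)
            queue.append(dep)  # also walk the dependency's own dependencies
    return sorted(missing)

# Assumed sample edges; which library pulls in which is not stated in the notes.
direct_deps = {
    "libonnxruntime.so.1.24.4": ["libprotobuf-lite.so.31", "libre2.so.11"],
    "libre2.so.11": ["libabsl_log_internal_check_op.so.2508.0.0"],
}
bundled = ["libonnxruntime.so.1.24.4"]

# Libraries that would have to be copied into lib/.
missing = unresolved_deps(bundled, direct_deps)

# After copying them, a second walk should find nothing left to resolve.
remaining = unresolved_deps(bundled + missing, direct_deps)
```

A build step following this shape would fail whenever `remaining` is non-empty, which matches the "fails if any unresolved dep remains" behavior described above.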
Fixed (tooling)
- `task update` now runs `scripts/ci/ruby/vendor-kreuzberg-core.py` before upgrading the Ruby native crate, since that manifest's `kreuzberg` dep points at the on-demand-generated `packages/ruby/vendor/kreuzberg/`.
- `task update` no longer aborts on the informational `mix hex.outdated` step, which exits non-zero when any Elixir dep is outdated.
- `.gitignore`: ignore accidental Go build outputs under `packages/go/v4/`.