github kreuzberg-dev/kreuzberg v4.2.10

Fixes

Java Bindings

  • Fix ClassCastException when deserializing nested generic collections (#355)
    • Added @JsonDeserialize(contentAs = ...) annotations to PageStructure, FormattedBlock, Footnote, Attributes, PageHierarchy, PageContent, DjotContent
    • Added 17 JSON deserialization regression tests

Python Bindings

  • Fix Windows CLI binary missing from wheel (#349)
    • The CI workflow was copying the binary under the wrong filename (kreuzberg.exe instead of kreuzberg-cli.exe)

MIME Type Detection

  • Fix DOCX/XLSX/PPTX files being detected as ZIP by detect_mime_type_from_bytes (#350)
    • The function now inspects ZIP contents for Office format markers
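
The idea behind this fix can be illustrated with a minimal Python sketch. It is not kreuzberg's implementation: the function name sniff_zip_mime, the helper make_zip, and the marker table are hypothetical, and the sketch assumes that an Office Open XML package can be recognized by a [Content_Types].xml entry plus a word/, xl/, or ppt/ directory.

```python
import io
import zipfile

# Hypothetical marker table (an assumption for illustration): the directory
# that holds the main document part in each Office Open XML package type.
OOXML_MARKERS = {
    "word/": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    "xl/": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
    "ppt/": "application/vnd.openxmlformats-officedocument.presentationml.presentation",
}


def sniff_zip_mime(data: bytes) -> str:
    """Guess a MIME type from raw bytes, looking inside ZIP containers."""
    # Non-empty ZIP archives start with a local file header signature.
    if not data.startswith(b"PK\x03\x04"):
        return "application/octet-stream"
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            names = archive.namelist()
    except zipfile.BadZipFile:
        # ZIP signature but unreadable archive: fall back to plain ZIP.
        return "application/zip"
    # Office packages carry a content-types manifest at the archive root.
    if "[Content_Types].xml" in names:
        for prefix, mime in OOXML_MARKERS.items():
            if any(name.startswith(prefix) for name in names):
                return mime
    return "application/zip"


def make_zip(*names: str) -> bytes:
    """Build an in-memory ZIP with empty entries (helper for the demo)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name in names:
            archive.writestr(name, "")
    return buffer.getvalue()


if __name__ == "__main__":
    print(sniff_zip_mime(make_zip("[Content_Types].xml", "xl/workbook.xml")))
    print(sniff_zip_mime(make_zip("notes.txt")))
```

A signature-only check cannot tell these formats apart because DOCX, XLSX, and PPTX files are all ZIP archives and share the same leading bytes, so the archive listing has to be consulted.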

Java Bindings

  • Fix format-specific metadata missing in getMetadataMap()
    • ResultParser.buildMetadata() now properly propagates flattened format metadata to Metadata.additional

Full Changelog: v4.2.9...v4.2.10