Fixes
Java Bindings
- Fix ClassCastException when deserializing nested generic collections (#355)
  - Added @JsonDeserialize(contentAs = ...) annotations to PageStructure, FormattedBlock, Footnote, Attributes, PageHierarchy, PageContent, DjotContent
  - Added comprehensive JSON deserialization regression tests (17 new tests)
Python Bindings
- Fix Windows CLI binary missing from wheel (#349)
  - CI workflow was copying with wrong filename (kreuzberg.exe instead of kreuzberg-cli.exe)
MIME Type Detection
- Fix DOCX/XLSX/PPTX detected as ZIP via detect_mime_type_from_bytes (#350)
  - The function now inspects ZIP contents for Office format markers
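
For readers unfamiliar with why Office files can be mistaken for ZIP archives: DOCX, XLSX, and PPTX are ZIP containers, so a check based only on the leading magic bytes cannot tell them apart. The following is a minimal Python sketch of the general idea of inspecting archive contents. It is not the Kreuzberg implementation; the function name sniff_office_mime, the helper make_zip, and the marker table are hypothetical and chosen only for illustration.

```python
import io
import zipfile
from typing import List, Optional

# Hypothetical marker table: top-level directory inside the package -> MIME type.
OFFICE_MARKERS = {
    "word/": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    "xl/": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
    "ppt/": "application/vnd.openxmlformats-officedocument.presentationml.presentation",
}


def sniff_office_mime(data: bytes) -> Optional[str]:
    """Return an Office MIME type for ZIP-based documents, "application/zip"
    for other ZIP archives, and None when the bytes are not a readable ZIP."""
    # Local file header signature; empty archives use a different
    # signature and are deliberately ignored in this sketch.
    if not data.startswith(b"PK\x03\x04"):
        return None
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            names = archive.namelist()
    except zipfile.BadZipFile:
        return None
    # Office Open XML packages carry a [Content_Types].xml part at the root.
    if "[Content_Types].xml" in names:
        for prefix, mime in OFFICE_MARKERS.items():
            if any(name.startswith(prefix) for name in names):
                return mime
    return "application/zip"


def make_zip(names: List[str]) -> bytes:
    """Build an in-memory ZIP with empty entries, for demonstration only."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name in names:
            archive.writestr(name, "")
    return buffer.getvalue()


if __name__ == "__main__":
    docx_like = make_zip(["[Content_Types].xml", "word/document.xml"])
    print(sniff_office_mime(docx_like))
    print(sniff_office_mime(make_zip(["notes.txt"])))
```

The sketch falls back to application/zip whenever no marker matches, so ordinary archives are still reported as ZIP.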
Java Bindings
- Fix format-specific metadata missing in getMetadataMap()
  - ResultParser.buildMetadata() now properly propagates flattened format metadata to Metadata.additional
Full Changelog: v4.2.9...v4.2.10