github opendataloader-project/opendataloader-pdf v1.11.0
Release v1.11.0

one month ago

What's Changed

  • feat(hybrid): add Hancom Document AI backend support by @hnc-leebd in #181
  • Add sensitive data filter by @LonelyMidoriya in #152
  • Fix case when replacements could overlap each other by @LonelyMidoriya in #187
  • Add spaces when sorting text chunks in text line by @LonelyMidoriya in #190
  • fix: skip Claude Code Review workflow for fork PRs by @hnc-leebd in #186
  • fix: LangChain documentation link in README by @hnc-hyunheejo in #192
  • Update verapdf version by @MaximPlusov in #193
  • fix: add Unicode sanitization to hybrid server response by @hnc-leebd in #207
  • feat: add GPU detection logging to hybrid server startup by @hnc-leebd in #208
  • feat: support --replace-invalid-chars in hybrid-mode full by @hnc-leebd in #209
  • test: add regression tests for Korean CID font extraction by @hnc-leebd in #213
  • chore: update all npm and uv dependencies to latest by @hnc-leebd in #214
  • feat: publish hybrid server Docker image to GHCR by @hnc-leebd in #211
  • test: add regression tests for issue #150 text extraction bugs by @hnc-leebd in #219
  • fix: resolve minimatch ReDoS vulnerability by @hnc-leebd in #218
  • fix: prevent stack trace exposure in hybrid server by @hnc-leebd in #217
  • fix: handle Docling PARTIAL_SUCCESS and fallback failed pages to Java by @hnc-leebd in #216
  • fix: cap Markdown heading level to 1-6 per specification by @hnc-leebd in #223
  • fix: add upfront health check for hybrid server before processing by @hnc-leebd in #226
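
The heading-level fix in #223 can be illustrated with a minimal sketch. This is not the project's actual code: the function names `cap_heading_level` and `render_heading` are hypothetical, and the only fact assumed is that Markdown ATX headings allow between one and six `#` characters.

```python
def cap_heading_level(level: int) -> int:
    """Clamp a heading level into the 1-6 range allowed for Markdown headings.

    Hypothetical helper for illustration only; not taken from opendataloader-pdf.
    """
    return max(1, min(6, level))


def render_heading(level: int, text: str) -> str:
    """Render an ATX-style Markdown heading, capping the level at 6."""
    return f"{'#' * cap_heading_level(level)} {text}"


if __name__ == "__main__":
    # A deeply nested heading (level 8) is emitted as a level-6 heading.
    print(render_heading(8, "Deeply nested section"))
    # A level-2 heading is left unchanged.
    print(render_heading(2, "Overview"))
```

Clamping rather than dropping the heading keeps the text in the output while staying within what Markdown renderers recognize as a heading.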

Full Changelog: v1.10.1...v1.11.0
