github xberg-io/xberg v3.0.0

15 months ago

Enhancements:

  • added support for multiple OCR backends: PaddleOCR and EasyOCR (feature)
  • added support for having no OCR backend (feature)
  • changed Tesseract OCR to optional (enhancement)
  • added support for registering custom extractors (feature)
  • added support for overriding builtin extractors (feature)
  • added support for post-processing hooks (feature)
  • added support for validation hooks (feature)
  • added PDF metadata extraction using Playa-PDF (feature)
  • added optional chunking support (feature)
  • added documentation site (documentation)
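
The extractor and hook entries above can be pictured with a small registry. This is only a sketch of the general pattern, not xberg's API: every name here (Extractor, ExtractorRegistry, register, add_validator, add_post_hook) is hypothetical, and running validators before post-processing hooks is an assumption.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical types: a result is a plain dict, hooks are plain callables.
Result = Dict[str, str]
PostHook = Callable[[Result], Result]
Validator = Callable[[Result], None]


class Extractor:
    """Hypothetical base class; subclasses declare the MIME types they handle."""

    mime_types: Tuple[str, ...] = ()

    def extract(self, data: bytes) -> Result:
        raise NotImplementedError


class PlainTextExtractor(Extractor):
    """Stands in for a built-in extractor."""

    mime_types = ("text/plain",)

    def extract(self, data: bytes) -> Result:
        return {"content": data.decode("utf-8"), "mime_type": "text/plain"}


class ShoutingTextExtractor(Extractor):
    """Stands in for a user-supplied extractor that overrides the built-in one."""

    mime_types = ("text/plain",)

    def extract(self, data: bytes) -> Result:
        return {"content": data.decode("utf-8").upper(), "mime_type": "text/plain"}


class ExtractorRegistry:
    def __init__(self) -> None:
        self._extractors: Dict[str, Extractor] = {}
        self._validators: List[Validator] = []
        self._post_hooks: List[PostHook] = []

    def register(self, extractor: Extractor) -> None:
        # A later registration for the same MIME type replaces the earlier one,
        # which is how overriding works in this sketch.
        for mime in extractor.mime_types:
            self._extractors[mime] = extractor

    def add_validator(self, validator: Validator) -> None:
        self._validators.append(validator)

    def add_post_hook(self, hook: PostHook) -> None:
        self._post_hooks.append(hook)

    def extract(self, data: bytes, mime_type: str) -> Result:
        result = self._extractors[mime_type].extract(data)
        for validate in self._validators:
            validate(result)  # a validator raises to reject a result
        for hook in self._post_hooks:
            result = hook(result)  # each hook returns a (possibly new) result
        return result


def require_content(result: Result) -> None:
    if not result["content"].strip():
        raise ValueError("extraction produced no content")


def strip_whitespace(result: Result) -> Result:
    return {**result, "content": result["content"].strip()}


registry = ExtractorRegistry()
registry.register(PlainTextExtractor())
registry.add_validator(require_content)
registry.add_post_hook(strip_whitespace)

plain = registry.extract(b"  hello  ", "text/plain")

# Registering a second extractor for the same MIME type overrides the first.
registry.register(ShoutingTextExtractor())
shouted = registry.extract(b"  hello  ", "text/plain")
```

Keying the registry by MIME type keeps overriding trivial: whichever extractor was registered last for a type wins.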

Breaking Changes:

  • Changed ExtractionResults from NamedTuple to TypedDict (breaking change; api)
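
For callers, the practical effect of this change is that result fields are read by key instead of by attribute, and positional unpacking no longer yields the values. A minimal sketch of the difference follows; both class definitions and the field names content and mime_type are illustrative assumptions, not xberg's actual definitions.

```python
from typing import NamedTuple, TypedDict


# Hypothetical stand-ins for the old and new result types; field names are assumed.
class OldExtractionResult(NamedTuple):
    content: str
    mime_type: str


class NewExtractionResult(TypedDict):
    content: str
    mime_type: str


old = OldExtractionResult(content="hello", mime_type="text/plain")
new = NewExtractionResult(content="hello", mime_type="text/plain")

# NamedTuple: attribute access and positional unpacking both work.
old_content = old.content
text, mime = old

# TypedDict: a plain dict at runtime, so fields are read by key.
new_content = new["content"]

# Unpacking a dict iterates over its keys, so unpacking code written for
# the NamedTuple keeps running but now receives field names, not values.
first, second = new
```

Code that used result.content needs to switch to result["content"], and any tuple-style unpacking of results should be replaced with explicit key lookups.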

Internal:

  • Reworked internals into a class-based architecture to allow extensibility (internal; architecture)
