github opendataloader-project/opendataloader-pdf v2.1.1
Release v2.1.1

What's Changed

  • feat: add --detect-strikethrough option for strikethrough text detection (#295) by @hnc-jglee in #298
  • fix: filter narrow outlier elements in vertical gap detection by @bundolee in #300
  • Refactoring for StrikethroughProcessor and XYCutPlusPlusSorter by @MaximPlusov in #325
  • chore: remove Claude Code GitHub workflows by @bundolee in #334
  • fix: use asyncio event loop on Windows to avoid uvloop error by @bundolee in #328
  • docs: fix hybrid_timeout type and hybrid_fallback default by @bundolee in #299
  • feat: detect CID font extraction failure and route to OCR fallback by @bundolee in #291
  • fix: run converter.convert() in thread pool to prevent event loop blocking by @bundolee in #322
  • Update outdated contributing instructions by @JCZhang2025 in #306
  • docs: create whats-new-v2 article by @bdoubrov in #339
  • test: clean up stale TextProcessor regression by @JCZhang2025 in #308
  • fix: skip hybrid backend checks when no pages remain by @JCZhang2025 in #311
  • chore: remove LFS, move benchmark to opendataloader-bench by @bundolee in #340
  • fix: handle null textColor in HeadingProcessor for hybrid mode by @justperson94 in #320
  • fix(tables): normalize under-segmented spreadsheet tables by @sickn33 in #338
  • fix: change hybrid timeout default to unlimited (0) by @bundolee in #337
  • chore: upgrade GitHub Actions to Node 24-compatible versions by @bundolee in #346
  • fix: handle merged cells in Markdown table generation by @hnc-jglee in #342
  • Add double quotes to whats-new-v2.mdx by @MaximPlusov in #348
  • chore: update dependencies to fix security vulnerabilities by @bundolee in #347
  • fix: replace PR #320 defensive NPE catches with proper graceful degradation by @bundolee in #350
  • ci: add benchmark results to step summary by @bundolee in #355
  • feat: add MCP server for AI agent integration by @bejoyfuuul in #351
  • fix: remove fallback 0 for missing thresholds in step summary by @bundolee in #356
  • fix: add install instructions to hybrid server error and CLI help by @bundolee in #357
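
The event-loop fix in #322 (running `converter.convert()` in a thread pool) follows a common asyncio pattern. The sketch below illustrates that general pattern only; it is not the project's actual code. `blocking_convert`, `heartbeat`, `handle_request`, and `run_demo` are hypothetical names, and the sleep stands in for a slow synchronous conversion.

```python
import asyncio
import time


def blocking_convert(path: str) -> str:
    """Hypothetical stand-in for a synchronous converter.convert() call."""
    time.sleep(0.2)  # simulate slow work that would block the event loop if run inline
    return f"converted:{path}"


async def heartbeat(ticks: list) -> None:
    """Append a timestamp a few times to show the loop keeps servicing other tasks."""
    for _ in range(3):
        await asyncio.sleep(0.05)
        ticks.append(time.monotonic())


async def handle_request(path: str):
    ticks = []
    # asyncio.to_thread (Python 3.9+) runs the blocking call in the default
    # thread pool, so the heartbeat coroutine is scheduled concurrently.
    result, _ = await asyncio.gather(
        asyncio.to_thread(blocking_convert, path),
        heartbeat(ticks),
    )
    return result, len(ticks)


def run_demo():
    return asyncio.run(handle_request("example.pdf"))
```

Calling `blocking_convert` directly inside `handle_request` would stall every other coroutine for the duration of the conversion; offloading it to a worker thread keeps the loop free to serve other requests.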

Full Changelog: v2.0.2...v2.1.1
