Performance Improvements
- Lazy loading throughout: Startup time significantly reduced by deferring heavy imports until first use:
  - PDF parser now lazily loads pdfminer only when parsing PDFs
  - NES parser lazily loads PIL/Pillow only when rendering CHR graphics
  - Kaitai parsers load on demand per format instead of all at once
- Caching optimizations: descendants(), mimetypes(), and all_extensions() now return cached tuples instead of regenerating on each call
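
The two techniques above can be sketched roughly as follows. This is an illustration only, not the project's actual code: `lazy_module`, `parse_ratio`, `cached_extensions`, and the `Parser` classes are hypothetical names, and the standard-library `fractions` module stands in for a heavy optional dependency such as pdfminer or Pillow.

```python
import functools
import importlib


def lazy_module(name: str):
    """Return a zero-argument loader that imports `name` on first call only."""
    @functools.lru_cache(maxsize=None)
    def load():
        return importlib.import_module(name)
    return load


# Stand-in for a heavy dependency; nothing is imported at this point.
_fractions = lazy_module("fractions")


def parse_ratio(text: str):
    # The import cost is paid here, on first use, not at program startup.
    return _fractions().Fraction(text)


class Parser:
    """Hypothetical registry base class, used only for this illustration."""
    extensions: tuple = ()


class PdfParser(Parser):
    extensions = ("pdf",)


class NesParser(Parser):
    extensions = ("nes",)


@functools.lru_cache(maxsize=None)
def cached_extensions() -> tuple:
    # Built once; later calls return the same immutable tuple object.
    return tuple(sorted(
        ext for cls in Parser.__subclasses__() for ext in cls.extensions
    ))
```

Returning a tuple rather than a list keeps callers from mutating the shared cached value. One trade-off of this sketch is that subclasses registered after the first call are not picked up unless the cache is cleared with `cached_extensions.cache_clear()`.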
Bug Fixes
- PDF parser robustness: Fixed crashes on malformed PDF files:
  - Empty lists now return safe zero-length wrappers instead of raising ValueError
  - Malformed dictionary values are now logged and skipped rather than causing crashes
- Python 3.14 compatibility: Fixed forward reference handling for PEP 649 compliance
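
The "log and skip" approach described for malformed dictionary values might look something like the sketch below. It is a generic illustration under assumed names (`clean_dict`, the `convert` callback, and the `pdf_sketch` logger are hypothetical) and does not reproduce the PDF parser's real code.

```python
import logging

log = logging.getLogger("pdf_sketch")  # hypothetical logger name


def clean_dict(raw: dict, convert) -> dict:
    """Convert each value, logging and skipping entries that fail to convert."""
    cleaned = {}
    for key, value in raw.items():
        try:
            cleaned[key] = convert(value)
        except (TypeError, ValueError) as exc:
            # One bad entry no longer aborts parsing of the whole dictionary.
            log.warning("skipping malformed value for %r: %s", key, exc)
    return cleaned
```

For example, `clean_dict({"a": "1", "b": "x", "c": None}, int)` keeps only the entry for `"a"` and emits two warnings.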
New Features
- Magic test strength scoring: Implemented libmagic-compatible test strength calculation for better match prioritization
- UTF-16 string support: Extended lestring16/bestring16 to support byte-length modifiers
- Endianness flip infrastructure: Added foundation for flipped endianness matching (partial implementation)
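
As background for the UTF-16 item, the difference between a little-endian and a big-endian 16-bit string test can be sketched as below. The helper `matches_string16` is a hypothetical name for illustration; it does not model the new byte-length modifiers or the project's actual matcher.

```python
def matches_string16(data: bytes, offset: int, expected: str,
                     little_endian: bool) -> bool:
    """Check whether `expected`, encoded as UTF-16, occurs at `offset`."""
    codec = "utf-16-le" if little_endian else "utf-16-be"
    needle = expected.encode(codec)  # no byte-order mark is added
    return data[offset:offset + len(needle)] == needle


# "MZ" stored as little-endian UTF-16 after two padding bytes.
sample = b"\x00\x00M\x00Z\x00"
```

With this sample, the little-endian test at offset 2 matches, while the big-endian test at the same offset does not, because the byte pairs are in the opposite order.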
Magic Definitions
- Major update: Synced with upstream libmagic definitions
- New formats: Added detection for bgcode, creativeassembly, keyman, lauterbach, R language, sf3, syd, tapebackup, uxn, and more
- Expanded coverage: Significant additions to DOS/Windows (+1300 lines), archive (+500 lines), console (+500 lines), images (+680 lines), and Linux (+500 lines) format detection
Breaking Changes
- Python 3.9 no longer supported: Minimum Python version is now 3.10
- pdfminer.six version: Now requires version 20251230 or newer
Dependencies
- Added filelock>=3.20.3
- Added packaging>=21.0 (replacing deprecated pkg_resources)
- Updated pdfminer.six requirement to >=20251230