What's Changed
Bug Fixes
- CRITICAL: Fixed a regex crash in naturalizer.py on RU/UK text (~50 patterns combined non-capturing groups with backreferences). Because of the crash, the entire naturalization stage was silently skipped for these languages.
- Added thread-safety locks to _ai_cache and _AI_WORDS for multi-threaded usage.
- Added division-by-zero guards in detector metric calculations.
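For readers unfamiliar with the regex failure mode behind the critical fix, here is a minimal sketch of it. The patterns and helper names below are hypothetical illustrations, not the actual contents of naturalizer.py: a backreference such as \1 can only point at a capturing group, so a pattern that uses (?:...) and then \1 fails to compile with re.error.

```python
import re

# Hypothetical patterns for illustration only; the real ones in
# naturalizer.py are not shown in these notes.
BROKEN = r"(?:очень)\s+\1"  # \1 has no capturing group to refer to
FIXED = r"(очень)\s+\1"     # capturing group makes \1 valid


def compiles(pattern: str) -> bool:
    """Return True if the pattern compiles, False on re.error."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


def collapse_repeat(text: str) -> str:
    """Collapse an immediately repeated word matched by FIXED."""
    return re.sub(FIXED, r"\1", text)


if __name__ == "__main__":
    print(compiles(BROKEN))  # the non-capturing variant is rejected
    print(compiles(FIXED))
    print(collapse_repeat("очень очень хорошо"))
```

If a caller wraps such a stage in a broad try/except, the re.error is swallowed and the stage is skipped without any visible failure, which would match the symptom described above.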
Cleanup
- Removed dead module tokenizer.py (replaced by sentence_split.py).
- Removed 14 one-off diagnostic scripts, 4 outdated competitive analysis docs, and debug artifacts.
- Synced PHP and JS package versions to 0.25.0.
Documentation
- Corrected "17-stage" to "20-stage" across all 15+ documentation files.
- Corrected test counts, LOC claims, and speed benchmarks for consistency.
- Fixed CHANGELOG date chronology.
CI
- Raised the per-test timeout from 120s to 300s to prevent false failures on slow CI runners.