What's New
9 New Core Modules
- ai_backend: Three-tier AI backend, OpenAI API → OSS Gradio model (rate-limited) → built-in rules. New humanize_ai() function.
- pos_tagger: Rule-based POS tagger for EN (500+ exceptions), RU/UK (200+), DE (300+). Universal tagset.
- cjk_segmenter: Chinese BiMM (2,504 entries), Japanese character-type, Korean space+particle segmentation.
- syntax_rewriter: 8 sentence-level transforms (active↔passive, clause inversion, enumeration reorder, adverb migration). 150+ irregular verbs.
- statistical_detector: 35-feature ML classifier for AI text detection. Integrated into detect_ai() with 60/40 weighted merge.
- word_lm: Word-level unigram/bigram language model for 14 languages. Perplexity, burstiness, naturalness scoring.
- collocation_engine: PMI-based collocation scoring for context-aware synonym selection. EN ~130, RU ~30, DE ~20 collocations.
- fingerprint_randomizer: Anti-fingerprint diversification for output variety.
- benchmark_suite: 6-dimension automated quality benchmarking.
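The three-tier order described for ai_backend can be pictured as a simple fallback chain. The sketch below is illustrative only: first_available, api_tier, oss_tier, and rules_tier are hypothetical names, not the module's actual API, and the tiers are stand-ins that simulate an unavailable API, a rate-limited model, and a trivial rule.

```python
from typing import Callable, Optional, Sequence

# A backend takes text and returns rewritten text, or None if it declines.
Backend = Callable[[str], Optional[str]]


def first_available(text: str, backends: Sequence[Backend]) -> str:
    """Try each backend in order and return the first usable result."""
    for backend in backends:
        try:
            result = backend(text)
        except Exception:
            continue  # this tier failed (e.g. network error); try the next one
        if result is not None:
            return result
    return text  # no tier produced output: return the input unchanged


# Hypothetical stand-ins for the three tiers (assumptions, not the real code).
def api_tier(text: str) -> Optional[str]:
    raise ConnectionError("no API key configured")  # simulates an unavailable API


def oss_tier(text: str) -> Optional[str]:
    return None  # simulates a rate-limited model declining the request


def rules_tier(text: str) -> Optional[str]:
    return text.replace("utilize", "use")  # trivial stand-in for built-in rules


print(first_available("We utilize rules.", [api_tier, oss_tier, rules_tier]))
```

Keeping the rule-based tier last means the chain still returns something when neither remote tier is reachable.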
Pipeline & Detection
- Pipeline expanded to 17 stages (added syntax rewriting + anti-fingerprint diversification)
- detect_ai() now returns combined_score (statistical + heuristic)
- Fixed no-op _reduce_adjacent_repeats(): it now actually removes repetitions
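A 60/40 weighted merge of two scores can be sketched as follows. The notes do not say which component receives the 60% weight, so the default here (statistical = 0.6) is an assumption, as are the function name merge_scores and the 0 to 1 score range.

```python
def merge_scores(statistical: float, heuristic: float,
                 w_statistical: float = 0.6) -> float:
    """Blend two scores in [0, 1] into one combined score.

    Assumption: the statistical score gets the larger weight (0.6) and the
    heuristic score gets the remainder (0.4). The release notes only state
    that the merge is 60/40.
    """
    if not 0.0 <= w_statistical <= 1.0:
        raise ValueError("w_statistical must be between 0 and 1")
    combined = w_statistical * statistical + (1.0 - w_statistical) * heuristic
    # Clamp to [0, 1] to guard against out-of-range inputs.
    return min(1.0, max(0.0, combined))


# Example: a high statistical score and a moderate heuristic score.
print(round(merge_scores(0.9, 0.4), 3))
```

Exposing the weight as a parameter makes it easy to check how sensitive the combined score is to the split.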
Tests
- 1,696 tests — 92 new, all passing (100% pass rate)