github PiratesIRC/Dispatcharr-Channel-Maparr-Plugin 1.26.1650854
v1.26.1650854

Matcher hardening: ports three normalize_name input-cleaning fixes from Stream-Mapparr (the matcher template) so noisy provider stream names normalize to the same form as clean channel-database names. Purely additive to fuzzy_matcher.py (122 insertions, 0 deletions) — no existing matching logic changed. See docs/MATCHER-NORMALIZATION-PORT.md.

Matching

  • Stylized-Unicode decoration stripping (bug-048) — Drops whole tokens that are pure stylized decoration (superscript / small-capital tier markers, bullets) before the ASCII tag pipeline, detected by Unicode character name rather than code-point range. A superscript "RAW" suffix no longer blocks a match to WeatherNation. Real ASCII tier words (Gold/VIP) and non-Latin scripts (Arabic/Cyrillic/CJK) are preserved.
  • Emoji-as-letter normalization (bug-051) — Maps an emoji used as a letter inside a word (SP⚽RTS → SPoRTS, the beIN family) to its letter when flanked by ASCII letters, and strips emoji used purely as decoration plus zero-width selectors. beIN SP⚽RTS now matches beIN Sports.
  • Numeric resolution markers (bug-055) — Strips 720p / 1080p / 2160p / 3840P-style markers (a 3–4 digit run glued to p/i) that the keyword quality list misses, while keeping bare numbers (Channel 4, Studio 1080), 5-digit runs, and spaced standalone roman numerals (Volume 100 I) intact. Gated by the same tag-handling flag as the other quality tags.
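
To illustrate the emoji-as-letter and resolution-marker rules described above, here is a minimal sketch. It is not the code in fuzzy_matcher.py: the function names, the `EMOJI_AS_LETTER` table (which only maps the soccer ball), and the `ignore_quality_tags` flag are assumptions made for this example.

```python
import re

# Hypothetical table: emoji that stand in for a letter inside a word.
EMOJI_AS_LETTER = {"\u26bd": "o"}  # soccer ball used as "o"

# Variation selectors and the zero-width joiner that often trail an emoji.
ZERO_WIDTH = re.compile("[\ufe0e\ufe0f\u200d]")

# A table emoji counts as a letter only when flanked by ASCII letters.
FLANKED_EMOJI = re.compile(
    "(?<=[A-Za-z])([" + "".join(EMOJI_AS_LETTER) + "])(?=[A-Za-z])"
)

# A 3-4 digit run glued to p/i, not embedded in a longer alphanumeric run.
RESOLUTION_MARKER = re.compile(r"(?<![0-9A-Za-z])\d{3,4}[pPiI](?![0-9A-Za-z])")


def normalize_emoji_letters(name: str) -> str:
    """Turn emoji-as-letter into the letter; drop decorative table emoji."""
    name = ZERO_WIDTH.sub("", name)
    name = FLANKED_EMOJI.sub(lambda m: EMOJI_AS_LETTER[m.group(1)], name)
    for emoji in EMOJI_AS_LETTER:  # anything left over is decoration
        name = name.replace(emoji, " ")
    return " ".join(name.split())


def strip_resolution_markers(name: str, ignore_quality_tags: bool = True) -> str:
    """Remove 720p/1080i-style markers when the (assumed) tag flag is on."""
    if not ignore_quality_tags:
        return name
    return " ".join(RESOLUTION_MARKER.sub(" ", name).split())


print(normalize_emoji_letters("beIN SP\u26bdRTS"))   # beIN SPoRTS
print(strip_resolution_markers("ESPN 1080p"))        # ESPN
print(strip_resolution_markers("Studio 1080"))       # Studio 1080
```

The lookbehind and lookahead are what keep bare numbers, 5-digit runs, and a spaced roman numeral such as `Volume 100 I` untouched: the digit run must be exactly 3 or 4 digits long and immediately followed by `p` or `i`.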

Beneficial side effect: the NFKD canonicalization in the stylized-strip step unifies accented and ASCII spellings of the same channel, so UniMás/UniMas and TeleFórmula/TeleFormula now match where they previously did not. Verified: 0 changes to any ASCII channel name across all 42,246 database names; no different-channel false-merges.
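
The stylized-strip step and the accent unification can be sketched roughly as follows, using only Python's `unicodedata` module. The helper names and the `DECORATION_NAME_PARTS` list are hypothetical, and the plugin may order or combine these steps differently.

```python
import unicodedata

# Hypothetical list: substrings of official Unicode character names that
# mark a character as stylized decoration.
DECORATION_NAME_PARTS = ("MODIFIER LETTER", "SMALL CAPITAL", "SUPERSCRIPT", "BULLET")


def is_decoration_char(ch: str) -> bool:
    """Classify by Unicode character name instead of code-point range."""
    name = unicodedata.name(ch, "")
    return any(part in name for part in DECORATION_NAME_PARTS)


def strip_decoration_tokens(name: str) -> str:
    """Drop whitespace-separated tokens made only of decoration characters."""
    kept = [
        tok for tok in name.split()
        if not all(is_decoration_char(c) for c in tok)
    ]
    return " ".join(kept)


def fold_accents(name: str) -> str:
    """NFKD-decompose, then drop combining marks (accent folding)."""
    decomposed = unicodedata.normalize("NFKD", name)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Superscript "RAW" written as escaped constants.
print(strip_decoration_tokens("WeatherNation \u1d3f\u1d2c\u1d42"))  # WeatherNation
print(fold_accents("UniM\u00e1s"))                                  # UniMas
```

Tokens such as `Gold` or `VIP` survive because ordinary ASCII letters carry names like `LATIN CAPITAL LETTER G`, which contain none of the listed substrings.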

Data

  • Deduplicated channel databases — removed 651 fully-identical rows across 7 country files (UK 168, MX 206, DE 136, CA 62, BR 43, FR 19, ES 17); all *_channels.json normalized to LF.
  • Norwegian channel database (NO_channels.json) — 94 channels; registered NO → norway in COUNTRY_DIR_MAP so the per-channel logo action resolves them. Coverage is now 12 countries.
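
A cleanup like the deduplication above could look something like this sketch. The row schema (`channel_name`, `category`) and both function names are assumptions, and "fully-identical" is read here as "same keys and values, regardless of key order".

```python
import json


def dedupe_rows(rows):
    """Keep the first occurrence of each fully-identical row, in order."""
    seen = set()
    unique = []
    for row in rows:
        # Canonical JSON string as the identity key (key order ignored).
        key = json.dumps(row, sort_keys=True, ensure_ascii=False)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def save_channels(path, rows):
    """Write the database as UTF-8 with LF line endings on every platform."""
    with open(path, "w", encoding="utf-8", newline="\n") as fh:
        json.dump(rows, fh, ensure_ascii=False, indent=2)
        fh.write("\n")


rows = [
    {"channel_name": "BBC One", "category": "General"},
    {"category": "General", "channel_name": "BBC One"},  # duplicate row
    {"channel_name": "BBC Two", "category": "General"},
]
print(len(dedupe_rows(rows)))  # 2
```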

Tests

  • tests/test_normalization_port.py (48 cases) locks all three fixes at the helper, regex, and full-pipeline levels, with editor-proof escaped Unicode constants. New fuzzy_module conftest fixture exposes the module-level helpers. A corpus no-regression test asserts the fixes never alter any of the ~41.5K ASCII channel names (CI-enforced, baseline-free). Full suite: 149 passing.
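
The idea behind a baseline-free corpus check can be sketched as below: rather than comparing against a stored snapshot, it asserts that a new cleaning step is the identity on every ASCII name. The helper name, the sample step, and the sample corpus are hypothetical and not taken from the test suite.

```python
import unicodedata


def find_ascii_regressions(names, step):
    """Return ASCII names that the given cleaning step would alter."""
    return [n for n in names if n.isascii() and step(n) != n]


def sample_step(name: str) -> str:
    """Stand-in cleaning step for the example: NFKD accent folding."""
    decomposed = unicodedata.normalize("NFKD", name)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


corpus = ["Channel 4", "Studio 1080", "beIN Sports", "UniM\u00e1s"]
# Non-ASCII names may change; ASCII names must come back untouched.
assert find_ascii_regressions(corpus, sample_step) == []
```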
