We have arrived at a fairly stable state.
Changes:
- Addition: 🍱 Add support for Kazakh (Cyrillic) language detection #109
- Improvement: ❇️ Further improve inferring the language from a given code page (single-byte) #112
- Removed: 🔥 Remove redundant logging entry about detected language(s) #115
- Miscellaneous: 🔧 Trying to leverage PEP263 when PEP3120 is not supported #116
  - While I do not think that this (#116) will actually fix anything, it should at least raise a `SyntaxError` (rather than an ASCII decoding error) for those trying to install this package on an unsupported Python version.
- Improvement: ⚡ Refactoring for potential performance improvements in loops #113 @adbar
- Improvement: ✨ Various detection improvements (MD+CD) #117
- Bugfix: 🐛 Fix a minor inconsistency between Python 3.5 and other versions regarding language detection #117 #102
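
To illustrate the PEP 263 / PEP 3120 distinction behind #116: PEP 263 lets a source file declare its encoding with a "coding cookie" comment, while PEP 3120 makes UTF-8 the default when no cookie is present (Python 3). The sketch below is not taken from this package; it only uses the standard library's `tokenize.detect_encoding`, and the helper name `declared_encoding` is hypothetical.

```python
import io
import tokenize


def declared_encoding(source: bytes) -> str:
    """Return the encoding Python would use to decode this source (hypothetical helper)."""
    # detect_encoding reads at most two lines looking for a BOM or a PEP 263 cookie.
    encoding, _ = tokenize.detect_encoding(io.BytesIO(source).readline)
    return encoding


# Source with an explicit PEP 263 coding cookie on the first line.
with_cookie = b"# -*- coding: latin-1 -*-\nname = 'caf\xe9'\n"
# Source without any cookie: Python 3 falls back to UTF-8 (PEP 3120).
without_cookie = b"name = 'cafe'\n"

print(declared_encoding(with_cookie))
print(declared_encoding(without_cookie))
```

On interpreters that predate PEP 3120, a file containing non-ASCII bytes and no cookie fails to decode, which is why an explicit cookie can change the kind of error an installer sees.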
This version pushes detection coverage to 98%! https://github.com/Ousret/charset_normalizer/runs/3863881150
With the current dataset, the upper bound (it cannot do better than this) should be 99%, a target for future releases.