This release is long overdue, but still mostly serves as a placeholder for the impending 4.0.0 release, which will have retrained models for better accuracy. For now, this release will get the following improvements up on PyPI:
- Added support for Turkish ISO-8859-9 detection (PR #41, thanks @queeup)
- Commented out large unused sections of Big5 and EUC-KR tables to save memory (8bc4b89)
- Removed Python 3.2 from testing, but added 3.4 - 3.6
- Ensure that stdin is open with mode `'rb'` for `chardetect` CLI. (PR #38, thanks @lpsinger)
- Fixed `chardetect` crash with non-ascii file names (PR #39, thanks @nkanaev)
- Made naming conventions more Pythonic throughout (no more `mTypicalPositiveRatio`, and instead `typical_positive_ratio`)
- Modernized test scripts and infrastructure so we've got Travis testing and all that stuff
- Rename `filter_without_english_words` to `filter_international_words` and make it match current Mozilla implementation (PR #44, thanks @rsnair2)
- Updated `filter_english_letters` to match C implementation (c665459)
- Temporarily disabled Hungarian ISO-8859-2 and Windows-1250 detection because it is very inaccurate (da6c0a0)
- Allow CLI sub-package to be importable (PR #55)
- Add a `hypothesis`-based test (PR #66, thanks @DRMacIver)
- Strip endianness from UTF with BOM predictions so that the encoding can be passed directly to `bytes.decode()` (PR #73, thanks @snoack)
- Fixed broken links in docs (PR #90, thanks @roskakori)
- Added early exit to `chardetect` when encoding is detected instead of looping through entire file (PR #103, thanks @jpz)
- Use `bytearray` objects internally instead of `wrap_ord` calls, which provides a nice performance boost across the board (PR #106)
- Add `language` property to probers and `UniversalDetector` results (PR #180)
- Mark the 5 known test failures as such so we can have more useful Travis build results in the meantime (d588407)
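
To illustrate why stripping the endianness from BOM-prefixed UTF predictions matters, here is a minimal sketch that uses only the standard library and does not call chardet itself. The sample payload is made up for illustration. It shows how Python's endianness-specific codec leaves the BOM in the decoded text, while the generic codec consumes it:

```python
import codecs

# Hypothetical sample data: a UTF-16 little-endian payload preceded by its BOM.
raw = codecs.BOM_UTF16_LE + "chardet".encode("utf-16-le")

# Endianness-specific codec: the BOM is not consumed and survives as U+FEFF.
with_suffix = raw.decode("utf-16-le")

# Generic codec: the BOM is read to pick the byte order, then dropped.
without_suffix = raw.decode("utf-16")

print(repr(with_suffix))
print(repr(without_suffix))
```

With the endianness stripped from the reported name, the detected encoding can be handed to `bytes.decode()` without leaving a stray U+FEFF at the start of the text.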