Added
- added a C-API which can be used to extend RapidFuzz from different Python modules using any programming language that can work with C-APIs (C/C++/Rust)
- added new scorers in rapidfuzz.distance.*
  - port existing distances to this new API
  - add Indel distance along with the corresponding editops function
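As a rough illustration of what the Indel distance measures, here is a minimal pure-Python sketch. It assumes the usual definition (only insertions and deletions are allowed, so a substitution costs 2) and is not RapidFuzz's implementation. The helper names `lcs_length` and `indel_distance` are made up for this example.

```python
def lcs_length(s1, s2):
    """Length of the longest common subsequence, keeping one DP row at a time."""
    prev = [0] * (len(s2) + 1)
    for ch1 in s1:
        cur = [0]
        for j, ch2 in enumerate(s2, start=1):
            if ch1 == ch2:
                # Extend the common subsequence ending before both characters.
                cur.append(prev[j - 1] + 1)
            else:
                # Drop one character from either string, keep the better result.
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def indel_distance(s1, s2):
    """Minimum number of insertions and deletions needed to turn s1 into s2."""
    # Every character outside the longest common subsequence is either
    # deleted from s1 or inserted from s2.
    return len(s1) + len(s2) - 2 * lcs_length(s1, s2)


print(indel_distance("lewenstein", "levenshtein"))
```

Because substitutions are not available, replacing a single character counts as one deletion plus one insertion.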
Changed
- when the result of string_metric.levenshtein or string_metric.hamming exceeds max, they now return max + 1 instead of -1
- Build system moved from setuptools to scikit-build
- Stop importing all modules in __init__.py, since doing so significantly slowed down import time
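The new return convention for the max cutoff can be sketched in pure Python as follows. This assumes the cutoff applies when the distance is larger than max, and it uses a plain dynamic-programming Levenshtein rather than RapidFuzz's implementation. The names `levenshtein`, `capped_levenshtein` and `max_dist` are hypothetical and only stand in for the real function and its max parameter.

```python
def levenshtein(s1, s2):
    """Plain Levenshtein distance with unit costs, computed row by row."""
    prev = list(range(len(s2) + 1))
    for i, ch1 in enumerate(s1, start=1):
        cur = [i]
        for j, ch2 in enumerate(s2, start=1):
            cur.append(min(
                prev[j] + 1,                # deletion
                cur[j - 1] + 1,             # insertion
                prev[j - 1] + (ch1 != ch2)  # substitution or match
            ))
        prev = cur
    return prev[-1]


def capped_levenshtein(s1, s2, max_dist=None):
    """Return the distance, or max_dist + 1 when it is larger than max_dist."""
    dist = levenshtein(s1, s2)
    if max_dist is not None and dist > max_dist:
        # Old convention: return -1. New convention: return max_dist + 1.
        return max_dist + 1
    return dist


print(capped_levenshtein("kitten", "sitting"))
print(capped_levenshtein("kitten", "sitting", max_dist=1))
```

With this convention a result can be filtered with `result <= max_dist`, and results still sort correctly, which a sentinel of -1 does not allow.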
Removed
- remove the rapidfuzz.levenshtein module, which was deprecated in v1.0.0 and scheduled for removal in v2.0.0
- dropped support for Python 2.7 and Python 3.5
Deprecated
- deprecate specifying processor as a boolean (will be removed in v3.0.0)
  - new functions will not support this in the first place
- deprecate rapidfuzz.string_metric (will be removed in v3.0.0). Similar scorers are available in rapidfuzz.distance.*
Fixed
- process.cdist raised an exception when used with a pure Python scorer
Performance
- improve performance and memory usage of rapidfuzz.string_metric.levenshtein_editops
  - memory usage is reduced by 33%
  - performance is improved by around 10%-20%
- significantly improve performance of rapidfuzz.string_metric.levenshtein for max <= 31 using a banded implementation
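The general idea behind a banded Levenshtein computation can be sketched in pure Python as below. This is only an illustration of banding under a distance cutoff, not RapidFuzz's actual implementation, and the entry above does not say why the threshold is 31. The names `banded_levenshtein` and `max_dist` are hypothetical.

```python
def banded_levenshtein(s1, s2, max_dist):
    """Levenshtein distance capped at max_dist + 1, evaluating only cells
    whose row and column indices differ by at most max_dist."""
    len1, len2 = len(s1), len(s2)
    over = max_dist + 1  # stands for "any distance larger than max_dist"

    # The length difference is a lower bound on the distance.
    if abs(len1 - len2) > max_dist:
        return over

    # Row 0: turning the empty prefix of s1 into s2[:j] takes j insertions.
    prev = [j if j <= max_dist else over for j in range(len2 + 1)]

    for i in range(1, len1 + 1):
        lo = max(1, i - max_dist)      # first column inside the band
        hi = min(len2, i + max_dist)   # last column inside the band
        cur = [over] * (len2 + 1)      # cells outside the band stay "over"
        if i <= max_dist:
            cur[0] = i                 # i deletions reach the empty prefix
        for j in range(lo, hi + 1):
            cost = s1[i - 1] != s2[j - 1]
            cur[j] = min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + cost, over)
        prev = cur

    return prev[len2]


print(banded_levenshtein("kitten", "sitting", 31))
print(banded_levenshtein("kitten", "sitting", 2))
```

A cell at row i and column j can never hold a value below |i - j|, so cells outside the band cannot contribute to a result of at most `max_dist`. For simplicity this sketch still allocates a full-width row on each iteration; only the number of evaluated cells is reduced.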