github rapidfuzz/RapidFuzz v1.0.0
Release 1.0.0

4 years ago

Changed

  • all normalized string_metric functions can now be used as scorers for process.extract/extractOne
  • completely refactored the implementation of the C++ wrapper to make it easier to add more scorers, processors and string matching algorithms in the future
  • increased test coverage, which has already helped to fix some bugs and will help to prevent regressions in the future
  • improved docstrings of functions

Performance

  • Added bit-parallel implementation of the Levenshtein distance for the weights (1,1,1) and (1,1,2).
  • Added a specialized implementation of the Levenshtein distance for cases with a small maximum edit distance, which is even faster than the bit-parallel implementation.
  • Improved performance of fuzz.partial_ratio
    -> Since fuzz.ratio and fuzz.partial_ratio are used in most scorers, this improves the overall performance.
  • Improved performance of process.extract and process.extractOne
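
To illustrate what "bit-parallel" means for the uniform weights (1,1,1), here is a minimal pure-Python sketch in the style of the Myers/Hyyrö algorithm. It is not RapidFuzz's C++ implementation, and the function name levenshtein_bitparallel is hypothetical. Each column of the edit-distance matrix is encoded as bit vectors of vertical deltas, so one character of s2 is processed with a handful of integer operations instead of a loop over s1.

```python
def levenshtein_bitparallel(s1, s2):
    """Levenshtein distance with weights (1,1,1), one bit per character of s1."""
    m = len(s1)
    if m == 0:
        return len(s2)

    full = (1 << m) - 1   # mask that keeps results within m bits
    last = 1 << (m - 1)   # bit for the last row of the matrix

    # pm[ch] has bit i set when s1[i] == ch
    pm = {}
    for i, ch in enumerate(s1):
        pm[ch] = pm.get(ch, 0) | (1 << i)

    vp, vn = full, 0      # vertical +1 / -1 deltas of the current column
    dist = m              # distance between s1 and the empty prefix of s2

    for ch in s2:
        x = pm.get(ch, 0)
        d0 = ((((x & vp) + vp) ^ vp) | x | vn) & full
        hp = (vn | ~(d0 | vp)) & full   # horizontal +1 deltas
        hn = d0 & vp                    # horizontal -1 deltas
        if hp & last:
            dist += 1
        if hn & last:
            dist -= 1
        hp = ((hp << 1) | 1) & full
        hn = (hn << 1) & full
        vp = (hn | ~(d0 | hp)) & full
        vn = hp & d0
    return dist
```

Python integers are unbounded, so this sketch handles any length of s1 in a single "word"; a native implementation would typically work on 64-bit blocks instead.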

Deprecated

  • the rapidfuzz.levenshtein module is now deprecated and will be removed in v2.0.0
    Its functions now live in rapidfuzz.string_metric, where distance, normalized_distance, weighted_distance and weighted_normalized_distance are combined into levenshtein and normalized_levenshtein.
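
The following sketch shows how a single function with a weights parameter can cover both the plain and the weighted distance. It is a plain dynamic-programming illustration, not the library's code: the name weighted_levenshtein and the (insertion, deletion, substitution) ordering of the tuple are assumptions made for this example.

```python
def weighted_levenshtein(s1, s2, weights=(1, 1, 1)):
    """Edit distance from s1 to s2 with per-operation costs.

    weights is assumed to be (insertion, deletion, substitution).
    """
    ins, dele, sub = weights
    # Row for the empty prefix of s1: only insertions are possible.
    prev = [j * ins for j in range(len(s2) + 1)]
    for i, c1 in enumerate(s1, 1):
        curr = [i * dele]  # empty prefix of s2: only deletions
        for j, c2 in enumerate(s2, 1):
            cost = 0 if c1 == c2 else sub
            curr.append(min(prev[j] + dele,       # delete c1
                            curr[j - 1] + ins,    # insert c2
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


# Uniform weights give the classic Levenshtein distance.
uniform = weighted_levenshtein("kitten", "sitting")
# With (1,1,2) a substitution costs as much as a delete plus an insert.
indel_like = weighted_levenshtein("kitten", "sitting", weights=(1, 1, 2))
```

Under these assumptions, uniform is 3 and indel_like is 5 for the example strings.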

Added

  • added normalized version of the hamming distance in string_metric.normalized_hamming
  • added process.extract_iter, a generator that yields the similarity of all elements that have a similarity >= score_cutoff
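
A rough sketch of how these two additions fit together, written without importing RapidFuzz. Both functions below are hypothetical stand-ins: the 0 to 100 similarity scale, the ValueError for strings of different length, and the (choice, score, index) tuples are assumptions for illustration, not a statement of the library's exact behaviour.

```python
def normalized_hamming(s1, s2):
    """Hamming similarity on an assumed 0-100 scale (100 means identical)."""
    if len(s1) != len(s2):
        raise ValueError("strings must have the same length")
    if not s1:
        return 100.0
    mismatches = sum(a != b for a, b in zip(s1, s2))
    return 100.0 - 100.0 * mismatches / len(s1)


def extract_iter(query, choices, scorer, score_cutoff=0):
    """Lazily yield (choice, score, index) for every score >= score_cutoff."""
    for index, choice in enumerate(choices):
        score = scorer(query, choice)
        if score >= score_cutoff:
            yield choice, score, index


choices = ["kathrin", "kerstin", "karolin"]
matches = list(extract_iter("karolin", choices, normalized_hamming, score_cutoff=60))
```

Because the function is a generator, a caller can stop consuming results early without scoring the remaining choices.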

Fixed

  • fixed multiple bugs in extractOne when used with a scorer that is not from RapidFuzz
  • fixed bug in token_ratio
  • fixed bug in result normalization causing zero division
