github rapidfuzz/RapidFuzz v3.0.0
Release 3.0.0

2 years ago

Changed

  • allow Hamming to be used with strings of different lengths. Length differences are counted as
    insertions / deletions

  • remove support for boolean preprocessor functions in rapidfuzz.fuzz and rapidfuzz.process.
    The processor argument is now always a callable or None.

  • update defaults of the processor argument to be None everywhere. For affected functions this can change results, since strings are no longer preprocessed. To get back the old behaviour pass processor=utils.default_process to these functions. The following functions are affected by this:

    • process.extract, process.extract_iter, process.extractOne
    • fuzz.token_sort_ratio, fuzz.token_set_ratio, fuzz.token_ratio, fuzz.partial_token_sort_ratio, fuzz.partial_token_set_ratio, fuzz.partial_token_ratio, fuzz.WRatio, fuzz.QRatio

  • rapidfuzz.process no longer calls scorers with processor=None. For this reason, user-provided scorers no longer need to accept this argument.

  • remove the option to pass keyword arguments to the scorer via **kwargs in rapidfuzz.process. They can now be
    passed via the scorer_kwargs argument. This keeps existing calls working when new parameters are added to the
    rapidfuzz.process functions and prevents naming clashes.

  • remove rapidfuzz.string_metric module. Replacements for all functions are available in rapidfuzz.distance
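
The processor and scorer_kwargs changes above can be illustrated with a small sketch that does not use RapidFuzz itself. The names extract_one, similarity and simple_process are hypothetical and exist only for this example; simple_process is a rough stand-in for utils.default_process, and the toy scorer is built on difflib rather than on any RapidFuzz scorer. The sketch shows the shape of the new contract: the processor defaults to None, it is applied by the process-style function rather than handed to the scorer, and scorer options travel in a separate dict.

```python
from difflib import SequenceMatcher


def simple_process(s):
    """Hypothetical stand-in for utils.default_process: trim and lowercase."""
    return s.strip().lower()


def similarity(s1, s2, *, min_score=0.0):
    """Toy scorer: difflib ratio scaled to 0-100, zeroed below min_score."""
    score = 100.0 * SequenceMatcher(None, s1, s2).ratio()
    return score if score >= min_score else 0.0


def extract_one(query, choices, *, scorer=similarity, processor=None,
                scorer_kwargs=None):
    """Return (choice, score, index) for the best match, or None if empty.

    The processor is applied here, so the scorer is never called with a
    processor argument. Scorer options arrive in scorer_kwargs and cannot
    collide with this function's own parameters.
    """
    scorer_kwargs = scorer_kwargs or {}
    prepare = processor if processor is not None else (lambda s: s)
    prepared_query = prepare(query)
    best = None
    for index, choice in enumerate(choices):
        score = scorer(prepared_query, prepare(choice), **scorer_kwargs)
        if best is None or score > best[1]:
            best = (choice, score, index)
    return best


choices = ["  New York ", "NEWARK", "Boston"]

# Default processor=None: strings are compared exactly as given.
raw = extract_one("new york", choices)

# Opting back in to preprocessing, analogous to passing
# processor=utils.default_process in RapidFuzz.
cleaned = extract_one("new york", choices, processor=simple_process)

# Scorer options go through scorer_kwargs instead of **kwargs.
strict = extract_one("new york", choices, processor=simple_process,
                     scorer_kwargs={"min_score": 90.0})
```

Without a processor the best score stays below 100 because case and surrounding whitespace count as differences; with the processor the first choice matches exactly. In RapidFuzz the corresponding migration is to pass processor=utils.default_process and scorer_kwargs={...} explicitly.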

Added

  • added support for sequences of arbitrary hashable items in the pure Python fallback implementation of all functions in rapidfuzz.distance
  • added support for None and float("nan") in process.cdist as long as the underlying scorer supports it.
    This is the case for all scorers returning normalized results.
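
As an illustration of the first item above, together with the Hamming change listed under Changed, here is a minimal pure-Python sketch of a metric that accepts any sequence of hashable items and tolerates different lengths. The function name hamming_padded is hypothetical; this is not RapidFuzz's fallback code, only one way to express the rule that each item beyond the shorter input counts as one insertion or deletion.

```python
from typing import Hashable, Sequence


def hamming_padded(s1: Sequence[Hashable], s2: Sequence[Hashable]) -> int:
    """Count positions that differ, plus one edit per unmatched trailing item."""
    # zip stops at the shorter input, so only the overlapping part is compared.
    mismatches = sum(a != b for a, b in zip(s1, s2))
    # Every item beyond the shorter input counts as one insertion or deletion.
    return mismatches + abs(len(s1) - len(s2))


# Plain strings of equal length: classic Hamming distance.
same_length = hamming_padded("karolin", "kathrin")

# Strings of different lengths: the two extra characters add two edits.
different_length = hamming_padded("abc", "abcde")

# Lists of tokens and sequences of tuples work too, since only equality
# between items is needed.
token_lists = hamming_padded(["new", "york", "city"], ["new", "jersey"])
mixed_items = hamming_padded([(1, "a"), (2, "b")], ((1, "a"), (3, "c"), (4, "d")))
```

Because the sketch relies only on item equality and len(), it works unchanged for strings, lists of words, or tuples of arbitrary hashable objects.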

Fixed

  • fix division by zero in the SIMD implementation of normalized metrics, which led to incorrect results
