github J535D165/recordlinkage v0.8.0
Version 0.8.0 (22 Jan 2017)

  • Add additional arguments to the function that downloads and loads the
    krebsregister data. The argument missing_values is used to fill missing
    values. Default: nothing is done. The argument shuffle is used to
    shuffle the records. Default is True.
  • Remove the last traces of the old package name. The new package name is
    'Python Record Linkage Toolkit'.
  • Better error messages when only matches or only non-matches are passed
    to train the classifier.
  • Add AirSpeedVelocity tests to test the performance.
  • Fix Compare for deduplication, which was broken.
  • Add parameterized tests for the Compare class and its algorithms, using
    the nose-parameterized module.
  • Update documentation about contributing.
  • Bugfix/improvement when blocking on multiple columns with missing values.
  • Fix bug #29: the package was not working with pandas 0.18 and 0.17.
    Dropped support for pandas 0.17 and fixed support for 0.18. Also added
    multi-dependency tests for TravisCI.
  • Support for dedicated deduplication algorithms.
  • Special algorithm for the full index when finding duplicates. Performance
    is 100x better.
  • Function max_number_of_pairs to get the maximum number of pairs.
  • Add low_memory for the Compare class.
  • Improved performance in case of comparing a large number of record pairs.
  • New documentation about custom algorithms.
  • New documentation about the use of classifiers.
  • Possible to compare arrays and series directly without using labels.
  • Make a dataframe with random comparison vectors with the
    binary_comparisons in the recordlinkage.datasets.random module.
  • Set KMeans cluster centers by hand.
  • Various documentation updates and improvements.
  • Jellyfish is now a required dependency. Fixes bug #30.
  • Added tox.ini to test packaging and installation of the package.
  • Drop requirements.txt file.
  • Many small fixes and changes. Most of the changes cover the Compare
    module. Label handling in particular is improved.
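
To illustrate what "maximum number of pairs" and "full index" mean in the notes above, here is a minimal plain-Python sketch. The helper names max_pairs_sketch and full_index_sketch are hypothetical and are not the library's API. These notes do not give the signature of max_number_of_pairs, so the sketch only shows the underlying arithmetic: linking two datasets yields n_a * n_b candidate pairs, while deduplicating one dataset yields n * (n - 1) / 2 unordered pairs.

```python
from itertools import combinations, product


def max_pairs_sketch(n_a, n_b=None):
    """Upper bound on candidate record pairs (illustrative, not the library API)."""
    if n_b is None:
        # Deduplication: unordered pairs within one dataset, n * (n - 1) / 2.
        return n_a * (n_a - 1) // 2
    # Linking: every record in A is paired with every record in B.
    return n_a * n_b


def full_index_sketch(ids_a, ids_b=None):
    """Enumerate all candidate pairs of record identifiers."""
    if ids_b is None:
        # Deduplication: skip self-pairs and mirrored pairs.
        return list(combinations(ids_a, 2))
    return list(product(ids_a, ids_b))


# Four records to deduplicate; three and five records to link.
dedup_pairs = full_index_sketch(["r0", "r1", "r2", "r3"])
link_pairs = full_index_sketch(range(3), range(5))
```

Skipping self-pairs and mirrored pairs roughly halves the work for deduplication compared with a naive cross product of a dataset with itself. Whether that accounts for the speed-up reported above is not stated in these notes.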
