J535D165/recordlinkage v0.8.0 on GitHub

Add arguments to the function that downloads and loads the
krebsregister data. The argument missing_values is used to fill missing
values; by default, nothing is done. The argument shuffle is used to
shuffle the records; the default is True.
Remove the last traces of the old package name. The new package name is
'Python Record Linkage Toolkit'.
Better error messages when only matches or only non-matches are passed
to train the classifier.
Add AirSpeedVelocity (asv) tests to measure performance.
Fix Compare for deduplication, which was broken.
Parameterized tests for the Compare class and its algorithms, making use
of the nose-parameterized module.
Update documentation about contributing.
Bugfix/improvement when blocking on multiple columns with missing values.
Fix bug #29: the package was not working with pandas 0.17 and 0.18.
Dropped support for pandas 0.17 and fixed support for 0.18. Also added
multi-dependency tests for TravisCI.
Support for dedicated deduplication algorithms.
Special algorithm for full index in case of finding duplicates. Performance is
100x better.
Function max_number_of_pairs to get the maximum number of pairs.
low_memory option for the Compare class.
Improved performance in case of comparing a large number of record pairs.
New documentation about custom algorithms.
New documentation about the use of classifiers.
Possible to compare arrays and series directly without using labels.
Make a DataFrame of random comparison vectors with
binary_comparisons in the recordlinkage.datasets.random module.
Set KMeans cluster centers by hand.
Various documentation updates and improvements.
Jellyfish is now a required dependency. Fixes bug #30.
Added tox.ini to test packaging and installation of package.
Drop requirements.txt file.
Many small fixes and changes. Most of the changes cover the Compare
module. Label handling in particular is improved.
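
The pair counts behind the full index and max_number_of_pairs entries above can be illustrated with a minimal sketch. It uses only the Python standard library. The helper names max_pairs and full_index are hypothetical and chosen for illustration; this is not the toolkit's implementation, only the arithmetic such a function would have to respect. It assumes that deduplication compares each unordered pair within one dataset once, and that linking compares every record in one dataset with every record in the other.

```python
from itertools import combinations, product


def max_pairs(n_a, n_b=None):
    """Upper bound on the number of candidate record pairs.

    Hypothetical helper for illustration, not the library's
    max_number_of_pairs function.
    """
    if n_b is None:
        # Deduplication: unordered pairs within a single dataset.
        return n_a * (n_a - 1) // 2
    # Linking: every record in A against every record in B.
    return n_a * n_b


def full_index(ids_a, ids_b=None):
    """Return an iterator over every candidate pair of record identifiers."""
    if ids_b is None:
        # Each unordered pair once, never a record paired with itself.
        return combinations(ids_a, 2)
    return product(ids_a, ids_b)


records = ["r1", "r2", "r3", "r4"]
dedup_pairs = list(full_index(records))
link_pairs = list(full_index(records, ["s1", "s2"]))

print(len(dedup_pairs), max_pairs(len(records)))     # 6 6
print(len(link_pairs), max_pairs(len(records), 2))   # 8 8
```

For n records, the deduplication case yields n(n-1)/2 pairs rather than n², which is one reason a dedicated full index for duplicates can be much cheaper than linking a dataset against itself.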

Version 0.8.0 (22 Jan 2017)