- Full Python 3 support.
- The parameters of the Logistic Regression Classifier can now be set manually. In the literature, this is often called deterministic record linkage.
- Expectation/Conditional Maximisation algorithm completely rewritten. Performance is much improved, but the algorithm is still experimental.
- New string comparison metrics: Q-gram string comparing and Cosine string comparing.
- New indexing algorithm: Q-gram indexing.
- Several internal tests.
- Updated documentation.
- BernoulliNBClassifier is now named NaiveBayesClassifier. No changes to the algorithm.
- Argument order in compare functions corrected.
- Function to clean phone numbers.
- Return the result of the classifier as an index, NumPy array or pandas Series.
- Many bug fixes.
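
For readers unfamiliar with the two new string comparison metrics, the idea behind them can be sketched in plain Python. This is only an illustration of q-gram and cosine similarity on character q-grams, not the toolkit's own implementation; the function names `qgrams`, `qgram_similarity` and `cosine_similarity`, the default `q=2`, and the choice of normalisation are all assumptions made for this example.

```python
from collections import Counter
from math import sqrt


def qgrams(s, q=2):
    """Return a multiset (Counter) of the overlapping q-grams in s."""
    return Counter(s[i:i + q] for i in range(len(s) - q + 1))


def qgram_similarity(a, b, q=2):
    """Shared q-grams divided by the q-gram count of the longer string.

    Hypothetical normalisation chosen for illustration only.
    """
    ga, gb = qgrams(a, q), qgrams(b, q)
    total = max(sum(ga.values()), sum(gb.values()))
    if total == 0:
        # Both strings are shorter than q: fall back to exact equality.
        return 1.0 if a == b else 0.0
    shared = sum((ga & gb).values())  # multiset intersection
    return shared / total


def cosine_similarity(a, b, q=2):
    """Cosine of the angle between the q-gram count vectors of a and b."""
    ga, gb = qgrams(a, q), qgrams(b, q)
    dot = sum(count * gb[gram] for gram, count in ga.items())
    norm = (sqrt(sum(v * v for v in ga.values()))
            * sqrt(sum(v * v for v in gb.values())))
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    print(qgram_similarity("night", "nacht"))
    print(cosine_similarity("night", "nacht"))
```

Both functions return a value between 0 and 1, where 1 means the two strings have identical q-gram profiles. Q-gram indexing builds on the same decomposition: records that share enough q-grams in a key field become candidate pairs.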