🚨 Breaking Changes
- Remove deprecated target_weights in UMAP (#4081) @lowener
- Upgrade Treelite to 2.0.0 (#4072) @hcho3
- RF/DT cleanup (#4005) @venkywonka
- RF: memset and batch size optimization for computing splits (#4001) @venkywonka
- Remove old RF backend (#3868) @RAMitchell
- Enable warp-per-tree inference in FIL for regression and binary classification (#3760) @levsnv
🐛 Bug Fixes
- Disabling umap reproducibility tests for cuda 11.4 (#4128) @cjnolet
- Fix for crash in RF when max_leaves parameter is specified (#4126) @vinaydes
- Running umap mnmg test twice (#4112) @cjnolet
- Minimal fix for SparseRandomProjection (#4100) @viclafargue
- Creating copy of components in PCA transform and inverse transform (#4099) @divyegala
- Fix SVM model parameter handling in case n_support=0 (#4097) @tfeher
- Fix set_params for linear models (#4096) @lowener
- Fix train test split pytest comparison (#4062) @dantegd
- Fix fit_transform on KMeans (#4055) @lowener
- Fixing -1 key access in 1nn reduce op in HDBSCAN (#4052) @divyegala
- Disable installing gbench to avoid container permission issues (#4049) @dantegd
- Fix double fit crash in preprocessing models (#4040) @viclafargue
- Always add faiss library alias if it's missing (#4028) @trxcllnt
- Fixing intermittent HDBSCAN pytest failure in CI (#4025) @divyegala
- HDBSCAN bug on A100 (#4024) @divyegala
- Add treelite include paths to treelite targets (#4023) @trxcllnt
- Add Treelite_BINARY_DIR include to cuml++ build interface include paths (#4018) @trxcllnt
- Small ARIMA-related bug fixes in Hessenberg reduction and make_arima (#4017) @Nyrio
- Update setup.py (#4015) @ajschmidt8
- Update treelite version in get_treelite.cmake (#4014) @ajschmidt8
- Fix build with latest RAFT branch-21.08 (#4012) @trxcllnt
- Skipping hdbscan pytests when gpu is a100 (#4007) @cjnolet
- Using 64-bit array lengths to increase scale of pca & tsvd (#3983) @cjnolet
- Fix MNMG test in Dask RF (#3964) @hcho3
- Use nested include in destination of install headers to avoid docker permission issues (#3962) @dantegd
- Fix automerge #3939 (#3952) @dantegd
- Update UCX-Py version to 0.21 (#3950) @pentschev
- Fix kernel and line info in cmake (#3941) @dantegd
- Fix for multi GPU PCA compute failing bug after transform and added error handling when n_components is not passed (#3912) @akaanirban
- Tolerate QN linesearch failures when it's harmless (#3791) @achirkin
📖 Documentation
- Improve docstrings for silhouette score metrics (#4026) @bdice
- Update CHANGELOG.md link (#3956) @Salonijain27
- Update documentation build examples to be generator agnostic (#3909) @robertmaynard
- Improve FIL code readability and documentation (#3056) @levsnv
🚀 New Features
- Add Multinomial and Bernoulli Naive Bayes variants (#4053) @lowener
- Add weighted K-Means sampling for SHAP (#4051) @Nanthini10
- Use chebyshev, canberra, hellinger and minkowski distance metrics (#3990) @mdoijade
- Implement vector leaf prediction for FIL (#3917) @RAMitchell
- Change TargetEncoder's smooth argument from ratio to count (#3876) @daxiongshu
- Enable warp-per-tree inference in FIL for regression and binary classification (#3760) @levsnv
🛠️ Improvements
- Remove clang/clang-tools from conda recipe (#4109) @dantegd
- Pin dask version (#4108) @galipremsagar
- ANN warnings/tests updates (#4101) @viclafargue
- Removing local memory operations from computeSplitKernel and other optimizations (#4083) @vinaydes
- Fix libfaiss dependency to not expressly depend on conda-forge (#4082) @Ethyling
- Remove deprecated target_weights in UMAP (#4081) @lowener
- Upgrade Treelite to 2.0.0 (#4072) @hcho3
- Optimize dtype conversion for FIL (#4070) @dantegd
- Adding quick notes to HDBSCAN public API docs as to why discrepancies may occur between cpu and gpu impls. (#4061) @cjnolet
- Update conda environment name for CI (#4039) @ajschmidt8
- Rewrite random forest gtests (#4038) @RAMitchell
- Updating Clang Version to 11.0.0 (#4029) @codereport
- Raise ARIMA parameter limits from 4 to 8 (#4022) @Nyrio
- Testing extract clusters in HDBSCAN (#4009) @divyegala
- ARIMA - Kalman loop rewrite: single megakernel instead of host loop (#4006) @Nyrio
- RF/DT cleanup (#4005) @venkywonka
- Exposing condensed hierarchy through cython for easier unit-level testing (#4004) @cjnolet
- Use the 21.08 branch of rapids-cmake as rmm requires it (#4002) @robertmaynard
- RF: memset and batch size optimization for computing splits (#4001) @venkywonka
- Reducing cluster size to number of selected clusters. Returning stability scores (#3987) @cjnolet
- HDBSCAN: Lazy-loading (and caching) condensed & single-linkage tree objects (#3986) @cjnolet
- Fix 21.08 forward-merge conflicts (#3982) @ajschmidt8
- Update Dask/Distributed version (#3978) @pentschev
- Use clang-tools on x86 only (#3969) @jakirkham
- Promote trustworthiness_score to public header, add missing includes, update dependencies (#3968) @trxcllnt
- Moving FAISS ANN wrapper to raft (#3963) @cjnolet
- Add MG weighted k-means (#3959) @lowener
- Remove unused code in UMAP (#3931) @trivialfis
- Fix automerge #3900 and correct package versions in meta packages (#3918) @dantegd
- Adaptive stress tests when GPU memory capacity is insufficient (#3916) @lowener
- Fix merge conflicts (#3892) @ajschmidt8
- Remove old RF backend (#3868) @RAMitchell
- Refactor to extract random forest objectives (#3854) @RAMitchell