🚨 Breaking Changes
- RF: python api behaviour refactor (#4207) @venkywonka
- Implement vector leaf for random forest (#4191) @RAMitchell
- Random forest refactoring (#4166) @RAMitchell
- RF: Add Poisson deviance impurity criterion (#4156) @venkywonka
- avoid paramsSolver::{n_rows,n_cols} shadowing their base class counterparts (#4130) @yitao-li
- Apply modifications to account for RAFT changes (#4077) @viclafargue
🐛 Bug Fixes
- Update scikit-learn version in conda dev envs to 0.24 (#4241) @dantegd
- Using pinned host memory for Random Forest and DBSCAN (#4215) @divyegala
- Make sure we keep the rapids-cmake and cuml cal version in sync (#4213) @robertmaynard
- Add thrust_create_target to install export in CMakeLists (#4209) @dantegd
- Change the error type to match sklearn. (#4198) @achirkin
- Fixing remaining hdbscan bug (#4179) @cjnolet
- Fix for cuDF changes to cudf.core (#4168) @dantegd
- Fixing UMAP reproducibility pytest failures in 11.4 by using random init for now (#4152) @cjnolet
- avoid paramsSolver::{n_rows,n_cols} shadowing their base class counterparts (#4130) @yitao-li
- Use the new RAPIDS.cmake to fetch rapids-cmake (#4102) @robertmaynard
📖 Documentation
- Expose train_test_split in API doc (#4234) @hcho3
- Adding docs for `.get_feature_names()` inside `TfidfVectorizer` (#4226) @mayankanand007
- Removing experimental flag from hdbscan description in docs (#4211) @cjnolet
- updated build instructions (#4200) @shaneding
- Forward-merge branch-21.08 to branch-21.10 (#4171) @jakirkham
🚀 New Features
- Experimental option to build libcuml++ only with FIL (#4225) @dantegd
- FIL to import categorical models from treelite (#4173) @levsnv
- Add hamming, jensen-shannon, kl-divergence, correlation and russellrao distance metrics (#4155) @mdoijade
- Add Categorical Naive Bayes (#4150) @lowener
- FIL to infer categorical forests and generate them in C++ tests (#4092) @levsnv
- Add Gaussian Naive Bayes (#4079) @lowener
- ARIMA - Add support for missing observations and padding (#4058) @Nyrio
🛠️ Improvements
- Pin max `dask` and `distributed` versions to 2021.09.1 (#4229) @galipremsagar
- Fea/umap refine (#4228) @AjayThorve
- Upgrade Treelite to 2.1.0 (#4220) @hcho3
- Add option to clone RAFT even if it is in the environment (#4217) @dantegd
- RF: python api behaviour refactor (#4207) @venkywonka
- Pytest updates for Scikit-learn 0.24 (#4205) @dantegd
- Faster glm ols-via-eigendecomposition algorithm (#4201) @achirkin
- Implement vector leaf for random forest (#4191) @RAMitchell
- Refactor kmeans sampling code (#4190) @Nanthini10
- Gracefully accept 'n_jobs', a common sklearn parameter, in NearestNeighbors Estimator (#4178) @NV-jpt
- Update with rapids cmake new features (#4175) @robertmaynard
- Update to UCX-Py 0.22 (#4174) @pentschev
- Random forest refactoring (#4166) @RAMitchell
- Fix log level for dask tree_reduce (#4163) @lowener
- Add CUDA 11.4 development environment (#4160) @dantegd
- RF: Add Poisson deviance impurity criterion (#4156) @venkywonka
- Split FIL infer_k into phases to speed up compilation (when a patch is applied) (#4148) @levsnv
- RF node queue rewrite (#4125) @RAMitchell
- Remove max version pin for `dask` & `distributed` on development branch (#4118) @galipremsagar
- Correct name of a cmake function in get_spdlog.cmake (#4106) @robertmaynard
- Apply modifications to account for RAFT changes (#4077) @viclafargue
- Warnings are errors (#4075) @harrism
- ENH Replace gpuci_conda_retry with gpuci_mamba_retry (#4065) @dillon-cullinan
- Changes to NearestNeighbors to call 2d random ball cover (#4003) @cjnolet
- support space in workspace (#3752) @jolorunyomi