Release 0.12.0
Breaking changes:
- Class weights are now taken into account by eval_metrics(), get_feature_importance(), and get_object_importance().
  In previous versions the weights were ignored.
- Parameter random-strength for pairwise training (PairLogitPairwise, QueryCrossEntropy, YetiRankPairwise) is not supported anymore.
- Simultaneous use of MultiClass and MultiClassOneVsAll metrics is now deprecated.
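To illustrate why the class-weight change can shift reported metric values, here is a minimal pure-Python sketch of a weighted versus unweighted binary log loss. It is not CatBoost's implementation; the logloss helper, the sample data, and the class_weights values are all hypothetical and chosen only for illustration.

```python
import math

def logloss(labels, probs, weights=None):
    """Weighted mean binary log loss; every object has weight 1 by default."""
    if weights is None:
        weights = [1.0] * len(labels)
    total = sum(
        -w * (y * math.log(p) + (1 - y) * math.log(1 - p))
        for y, p, w in zip(labels, probs, weights)
    )
    return total / sum(weights)

# Hypothetical data: three negatives and one poorly predicted positive.
labels = [0, 0, 0, 1]
probs = [0.1, 0.2, 0.3, 0.4]

# Hypothetical class weights: the rare positive class counts three times.
class_weights = {0: 1.0, 1: 3.0}
weights = [class_weights[y] for y in labels]

unweighted = logloss(labels, probs)        # weights ignored
weighted = logloss(labels, probs, weights)  # weights taken into account

print(f"unweighted: {unweighted:.4f}")
print(f"weighted:   {weighted:.4f}")
```

Because the up-weighted class is the one predicted worst in this toy data, the weighted value is larger. Metric values computed on a model trained with class weights may therefore differ from those reported by earlier versions.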
New functionality:
- cv method is now supported on GPU.
- String labels for classes are supported in Python.
  In multiclassification the string class names are inferred from the data.
  In binary classification, to use string labels you should pass the class_names parameter and specify which class is negative (0) and which is positive (1).
  You can also use class_names in multiclassification mode to pass all possible class names to the fit function.
- Borders can now be saved and reused.
  To save the feature quantization information obtained during training data preprocessing into a text file, use the CLI option --output-borders-file.
  To use the borders for training, use the CLI option --input-borders-file.
  This functionality is now supported on CPU and GPU (it was GPU-only in previous versions).
  The file format for the borders is described in the documentation.
- CLI option --eval-file is now supported on GPU.
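The class_names convention for string labels can be sketched in plain Python: the position of a name in the list gives its class index, so in binary classification the first name is the negative class (0) and the second is the positive class (1). This is only an illustration of the mapping, not CatBoost code. The encode_labels and infer_class_names helpers are hypothetical, and the sorted order used when inferring names is an assumption, not something these notes specify.

```python
def encode_labels(labels, class_names):
    """Map string labels to integer class indices by position in class_names."""
    index = {name: i for i, name in enumerate(class_names)}
    unknown = sorted(set(labels) - set(index))
    if unknown:
        raise ValueError(f"labels not listed in class_names: {unknown}")
    return [index[label] for label in labels]

def infer_class_names(labels):
    """Collect the distinct labels seen in the data (sorted order is an assumption)."""
    return sorted(set(labels))

# Binary case: "ham" is declared negative (0), "spam" positive (1).
binary = encode_labels(["spam", "ham", "spam"], class_names=["ham", "spam"])
print(binary)

# Multiclass case: names inferred from the data itself.
animals = ["cat", "dog", "bird", "cat"]
inferred = infer_class_names(animals)
print(inferred, encode_labels(animals, inferred))
```

Passing an explicit list in multiclass mode is useful when the training data may not contain every class, since inference can only see the labels that actually occur.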
Quality improvement:
- Fixed some cases in binary classification where training could diverge.
Optimizations:
- The Python applier is now about 10x faster.
- Reduced memory consumption in the Python cv function (by a factor equal to the fold count).
Benchmarks and tutorials:
- Added speed benchmarks for CPU and GPU on a variety of different datasets.
- Added benchmarks of different ranking modes. In this tutorial we compare different ranking modes in CatBoost, XGBoost and LightGBM.
- Added tutorial for applying model in Java.
- Added benchmarks of SHAP values calculation for CatBoost, XGBoost and LightGBM.
The benchmarks also contain explanation of complexity of this calculation in all the libraries.
We also made a number of stability improvements and added stricter checks of input data and parameters.
And we are so grateful to our community members @canorbal and @neer201 for their contribution to this release. Thank you.