- Supported uncertainty prediction for classification models.
- Fixed RMSEWithUncertainty data uncertainty prediction - now it predicts variance, not standard deviation.
- Allow categorical feature counters for
- Added the `group_weight` parameter to the `catboost.utils.eval_metric` method to allow passing weights for object groups. This makes it possible to correctly reproduce weighted ranking metric computation when group weights are present.
- Faster non-owning deserialization from memory with less memory overhead: some dynamically computed data was moved to the model file, and other data is computed lazily, only when needed.
- Supported embedding features as input and linear discriminant analysis for embeddings preprocessing. Try adding your embeddings as new columns containing arrays of embedding values in a pandas.DataFrame and passing the corresponding column names to the `embedding_features=['EmbeddingFeaturesColumnName1', ...]` parameter. Another way to add your embedding vectors is to use the new `NumVector` column type in the Column Description file and add a column of semicolon-separated embedding values to your XSV file.
- Published new tutorial on uncertainty prediction.
- Reduced GPU memory usage in multi-GPU training when there is no need to compute categorical feature counters.
- CatBoost now allows specifying `use_weights` for metrics when the `auto_class_weights` parameter is set.
- Correctly handle NaN values in
- Fixed bugs related to floating-point precision loss during Multiclass training with lots of objects. In our case, the bug was triggered while training on 25 million objects on a single GPU card.
- The `average` parameter is now passed to the TotalF1 metric while training on GPU.
- Added class labels checks
- Disallow feature remapping in model predict when there are empty feature names in the model.
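
The file-based route for embedding features described in the notes above can be sketched as follows. This is only an illustration of the layout the note describes, a `NumVector` column whose values are separated by semicolons. The sample rows, the helper names (`format_embedding`, `build_dataset_tsv`, `build_column_description`), and the choice of a two-column tab-separated layout with the label first are assumptions made for the example, not part of the release.

```python
import csv
import io

# Hypothetical sample data: (label, embedding vector) pairs.
rows = [
    (1, [0.1, 0.2, 0.3]),
    (0, [0.4, 0.5, 0.6]),
]


def format_embedding(vector):
    """Join embedding values with semicolons for a NumVector column."""
    return ";".join(str(value) for value in vector)


def build_dataset_tsv(samples):
    """Render samples as tab-separated text: label, then the embedding column."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    for label, vector in samples:
        writer.writerow([label, format_embedding(vector)])
    return buffer.getvalue()


def build_column_description():
    """Column Description text: zero-based column index and column type."""
    # Assumed layout for this sketch: column 0 is the label,
    # column 1 holds the semicolon-separated embedding.
    return "0\tLabel\n1\tNumVector\n"


dataset_text = build_dataset_tsv(rows)
cd_text = build_column_description()

print(dataset_text, end="")
print(cd_text, end="")
```

If the two strings are written to disk, the resulting paths would presumably be passed to `catboost.Pool` through its `data` and `column_description` arguments; check the documentation for your installed version before relying on this.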