sentence-transformers 3.2.1 on Python PyPI

This patch release fixes several small bugs, including issues with loading CLIP models and with automatic model card generation, and improves compatibility with third-party libraries.

Install this version with:

# Training + Inference
pip install sentence-transformers[train]==3.2.1

# Inference only, use one of:
pip install sentence-transformers==3.2.1
pip install sentence-transformers[onnx-gpu]==3.2.1
pip install sentence-transformers[onnx]==3.2.1
pip install sentence-transformers[openvino]==3.2.1
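
After installing, it can be useful to confirm which version actually ended up in the environment. The snippet below is a minimal sketch using only the standard library; the helper name `installed_version` is made up for illustration and is not part of sentence-transformers.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the installed version string (ideally "3.2.1"), or None if the package is missing.
print(installed_version("sentence-transformers"))
```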

Fixing Loading non-Transformer models

In v3.2.0, a non-Transformer based model (e.g. CLIP) would not load correctly if the model was saved in the root of the model repository/directory. This has been resolved in #3007.

Throw error if `StaticEmbedding`-based model is finetuned with incompatible losses

The following losses are not compatible with StaticEmbedding-based models:

CachedGISTEmbedLoss
CachedMultipleNegativesRankingLoss
CachedMultipleNegativesSymmetricRankingLoss
DenoisingAutoEncoderLoss
GISTEmbedLoss

An error is now thrown when one of these is used with a StaticEmbedding-based model. I recommend using MultipleNegativesRankingLoss to finetune these models, e.g. as in https://huggingface.co/tomaarsen/static-bert-uncased-gooaq.
Note: to get good performance, you must use a much higher learning rate than usual. In my experiments, 2e-1 worked well.
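
To illustrate the kind of check described above, here is a rough sketch in plain Python. It is not the library's actual implementation: the helper `check_loss_for_static_embedding` and the constant name are hypothetical, and only the loss names come from the list above.

```python
# Loss class names copied from the list of incompatible losses above.
INCOMPATIBLE_WITH_STATIC_EMBEDDING = frozenset(
    {
        "CachedGISTEmbedLoss",
        "CachedMultipleNegativesRankingLoss",
        "CachedMultipleNegativesSymmetricRankingLoss",
        "DenoisingAutoEncoderLoss",
        "GISTEmbedLoss",
    }
)


def check_loss_for_static_embedding(loss_name: str) -> None:
    """Hypothetical helper: raise ValueError if the loss is on the incompatible list."""
    if loss_name in INCOMPATIBLE_WITH_STATIC_EMBEDDING:
        raise ValueError(
            f"{loss_name} is not compatible with StaticEmbedding-based models; "
            "consider MultipleNegativesRankingLoss instead."
        )


# A compatible loss passes silently.
check_loss_for_static_embedding("MultipleNegativesRankingLoss")

# An incompatible loss raises an error before any training would start.
try:
    check_loss_for_static_embedding("GISTEmbedLoss")
except ValueError as err:
    print(err)
```

Failing early like this means the incompatibility surfaces as a clear message rather than as an obscure error partway through training.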

Patch ONNX model when the model uses `output_hidden_states`

For example, this script used to fail, but passes now:

from sentence_transformers import SentenceTransformer

model = SentenceTransformer(
    "distiluse-base-multilingual-cased",
    backend="onnx",
    model_kwargs={"provider": "CPUExecutionProvider"},
)

sentences = ["This is an example sentence", "Each sentence is converted"]
embeddings = model.encode(sentences)
print(embeddings.shape)

All changes

Bump optimum version by @echarlaix in #2984
[docs] Update the training snippets for some losses that should use the v3 Trainer by @tomaarsen in #2987
[enh] Throw error if StaticEmbedding-based model is trained with incompatible loss by @tomaarsen in #2990
[fix] Fix semantic_search_usearch with 'binary' by @tomaarsen in #2989
[enh] Add support for large_string in model card create by @yaohwang in #2999
[model cards] Prevent crash on generating widgets if dataset column is empty by @tomaarsen in #2997
[fix] Added model2vec import compatible with current and newer version by @Pringled in #2992
Fix cache_dir issue with loading CLIPModel by @BoPeng in #3007
[warn] Throw a warning if compute_metrics is set, as it's not used by @tomaarsen in #3002
[fix] Prevent IndexError if output_hidden_states & ONNX by @tomaarsen in #3008

New Contributors

@echarlaix made their first contribution in #2984
@yaohwang made their first contribution in #2999
@Pringled made their first contribution in #2992
@BoPeng made their first contribution in #3007

Full Changelog: v3.2.0...v3.2.1
