UKPLab/sentence-transformers v3.4.1 - Model2Vec compatibility & offline model fix

This release introduces convenient compatibility with Model2Vec models and fixes a bug that caused an outgoing request even when loading a local model.

Install this version with:

# Training + Inference
pip install sentence-transformers[train]==3.4.1

# Inference only, use one of:
pip install sentence-transformers==3.4.1
pip install sentence-transformers[onnx-gpu]==3.4.1
pip install sentence-transformers[onnx]==3.4.1
pip install sentence-transformers[openvino]==3.4.1

Full Model2Vec integration

This release adds support for loading an efficient Model2Vec embedding model directly in Sentence Transformers:

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer(
    "minishlab/potion-base-8M",
    device="cpu",
)

# Run inference
sentences = [
    'Gadofosveset-enhanced MR angiography of carotid arteries: does steady-state imaging improve accuracy of first-pass imaging?',
    'To evaluate the diagnostic accuracy of gadofosveset-enhanced magnetic resonance (MR) angiography in the assessment of carotid artery stenosis, with digital subtraction angiography (DSA) as the reference standard, and to determine the value of reading first-pass, steady-state, and "combined" (first-pass plus steady-state) MR angiograms.',
    'In a longitudinal study we investigated in vivo alterations of CVO during neuroinflammation, applying Gadofluorine M- (Gf) enhanced magnetic resonance imaging (MRI) in experimental autoimmune encephalomyelitis, an animal model of multiple sclerosis. SJL/J mice were monitored by Gadopentate dimeglumine- (Gd-DTPA) and Gf-enhanced MRI after adoptive transfer of proteolipid-protein-specific T cells. Mean Gf intensity ratios were calculated individually for different CVO and correlated to the clinical disease course. Subsequently, the tissue distribution of fluorescence-labeled Gf as well as the extent of cellular inflammation was assessed in corresponding histological slices.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 256)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings[0], embeddings[1:])
print(similarities)
# tensor([[0.8085, 0.4884]])

Previously, loading a Model2Vec model required you to construct a `StaticEmbedding` module first and pass it to `SentenceTransformer`:

from sentence_transformers import SentenceTransformer
from sentence_transformers.models import StaticEmbedding

# Download from the 🤗 Hub
module = StaticEmbedding.from_model2vec("minishlab/potion-base-8M")
model = SentenceTransformer(modules=[module], device="cpu")

# Run inference
sentences = [
    'Gadofosveset-enhanced MR angiography of carotid arteries: does steady-state imaging improve accuracy of first-pass imaging?',
    'To evaluate the diagnostic accuracy of gadofosveset-enhanced magnetic resonance (MR) angiography in the assessment of carotid artery stenosis, with digital subtraction angiography (DSA) as the reference standard, and to determine the value of reading first-pass, steady-state, and "combined" (first-pass plus steady-state) MR angiograms.',
    'In a longitudinal study we investigated in vivo alterations of CVO during neuroinflammation, applying Gadofluorine M- (Gf) enhanced magnetic resonance imaging (MRI) in experimental autoimmune encephalomyelitis, an animal model of multiple sclerosis. SJL/J mice were monitored by Gadopentate dimeglumine- (Gd-DTPA) and Gf-enhanced MRI after adoptive transfer of proteolipid-protein-specific T cells. Mean Gf intensity ratios were calculated individually for different CVO and correlated to the clinical disease course. Subsequently, the tissue distribution of fluorescence-labeled Gf as well as the extent of cellular inflammation was assessed in corresponding histological slices.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 256)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings[0], embeddings[1:])
print(similarities)
# tensor([[0.8085, 0.4884]])

Model2Vec was the inspiration for the recent Static Embedding work. Both kinds of model approach the performance of regular transformer-based embedding models at a fraction of the latency: Model2Vec and Static Embedding models are ~25x faster than tiny embedding models on a GPU and ~400x faster than those models on a CPU.
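To give a rough intuition for where the speedup comes from: a static embedding model replaces the transformer forward pass with a per-token table lookup followed by pooling. The sketch below is a toy illustration only, not the library's implementation. The vocabulary, the 3-dimensional vectors, and the helper names `embed` and `cosine` are made up for this example, and real models use a subword tokenizer rather than whitespace splitting.

```python
import math

# Hypothetical lookup table: one fixed vector per known token.
TOKEN_VECTORS = {
    "mr": [0.9, 0.1, 0.0],
    "angiography": [0.8, 0.2, 0.1],
    "imaging": [0.7, 0.3, 0.0],
    "mice": [0.0, 0.2, 0.9],
}
UNKNOWN = [0.0, 0.0, 0.0]  # fallback vector for out-of-vocabulary tokens


def embed(sentence):
    """Embed a sentence as the mean of its token vectors (lookup + average)."""
    tokens = sentence.lower().split()
    vectors = [TOKEN_VECTORS.get(token, UNKNOWN) for token in tokens]
    if not vectors:
        return list(UNKNOWN)
    dim = len(UNKNOWN)
    return [sum(vec[i] for vec in vectors) / len(vectors) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


query = embed("MR angiography")
related = embed("MR imaging")
unrelated = embed("mice")

# The related sentence shares a token with the query and scores higher.
print(cosine(query, related) > cosine(query, unrelated))  # True
```

Because no attention or feed-forward layers are evaluated, the cost per sentence is a handful of lookups and one average, which is why such models remain practical on a CPU.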

Bug Fix

Loading a model with local_files_only=True still triggered a request to Hugging Face for the model card metadata; this has been resolved in #3202.
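As an illustration of the offline use case this fix targets, here is a minimal sketch of a loader that only accepts a model already saved on disk. The helper name `load_local_model` and the early directory check are assumptions made for this example; only the `local_files_only=True` argument comes from the release notes.

```python
from pathlib import Path


def load_local_model(model_dir):
    """Load a saved model from disk with local_files_only=True."""
    path = Path(model_dir)
    if not path.is_dir():
        # Fail early with a clear message instead of attempting any lookup.
        raise FileNotFoundError(f"No local model directory at {path}")

    # Imported here so the path check above works without loading the library.
    from sentence_transformers import SentenceTransformer

    return SentenceTransformer(str(path), local_files_only=True)
```

A call such as `load_local_model("./models/potion-base-8M")` assumes that directory (a hypothetical path) was created beforehand, for example with `model.save(...)` on a machine that had network access.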

All Changes

fix loss name in documentation of CachedMultipleNegativesRankingLoss by @JINO-ROHIT in #3191
Bump jinja2 from 3.1.4 to 3.1.5 in /docs by @dependabot in #3192
minor typo in MegaBatchMarginLoss by @JINO-ROHIT in #3193
Fix type hint of StaticEmbedding.__init__ by @altescy in #3196
[integration] Work towards full model2vec integration by @tomaarsen in #3182
Don't call set_base_model when local_files_only=True by @Davidyz in #3202

New Contributors

@dependabot made their first contribution in #3192
@altescy made their first contribution in #3196
@Davidyz made their first contribution in #3202

Full Changelog: v3.4.0...v3.4.1
