v5.2.0 - CrossEncoder multi-processing, multilingual NanoBEIR evaluators, similarity score in `mine_hard_negatives`, Transformers v5 support

This minor release introduces multi-processing for CrossEncoder (rerankers), multilingual NanoBEIR evaluators, similarity score outputs in mine_hard_negatives, Transformers v5 support, Python 3.9 deprecations, and more.

Install this version with

# Training + Inference
pip install sentence-transformers[train]==5.2.0

# Inference only, use one of:
pip install sentence-transformers==5.2.0
pip install sentence-transformers[onnx-gpu]==5.2.0
pip install sentence-transformers[onnx]==5.2.0
pip install sentence-transformers[openvino]==5.2.0
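
You can verify the installed version with:

python -c "import sentence_transformers; print(sentence_transformers.__version__)"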

CrossEncoder Multi-processing

The CrossEncoder class now supports multiprocessing for faster inference on CPU and multi-GPU setups. This brings CrossEncoder functionality in line with the existing multiprocessing capabilities of SentenceTransformer models, allowing you to use multiple CPU cores or GPUs to speed up both the predict and rank methods when processing large batches of sentence pairs.

The implementation introduces these new methods, mirroring the SentenceTransformer approach:

  • start_multi_process_pool() - Initialize a pool of worker processes
  • stop_multi_process_pool() - Clean up the worker pool

Usage is straightforward with the new pool parameter:

from sentence_transformers.cross_encoder import CrossEncoder

def main():
    model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L6-v2")

    # Example inputs to score and rank
    query = "How many people live in Berlin?"
    documents = [
        "Berlin has a population of around 3.5 million people.",
        "Berlin is well known for its museums.",
    ]
    sentence_pairs = [(query, document) for document in documents]

    # Start a pool of workers
    pool = model.start_multi_process_pool()

    # Use the pool for faster inference
    scores = model.predict(sentence_pairs, pool=pool)
    rankings = model.rank(query, documents, pool=pool)

    # Clean up when done
    model.stop_multi_process_pool(pool)

if __name__ == "__main__":
    main()

Or simply pass a list of devices to the device parameter of predict and rank to have a pool created automatically behind the scenes.

from sentence_transformers.cross_encoder import CrossEncoder

def main():
    model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L6-v2", device="cpu")

    # Example inputs to score and rank
    query = "How many people live in Berlin?"
    documents = [
        "Berlin has a population of around 3.5 million people.",
        "Berlin is well known for its museums.",
    ]
    sentence_pairs = [(query, document) for document in documents]

    # Use 4 processes
    scores = model.predict(sentence_pairs, device=["cpu"] * 4)
    rankings = model.rank(query, documents, device=["cpu"] * 4)

if __name__ == "__main__":
    main()

This enhancement is particularly beneficial for CPU-based deployments and enables multi-GPU reranking in the mine_hard_negatives function, making hard negative mining faster for large datasets.
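
As a minimal sketch of what multi-device reranked mining could look like: the cross_encoder parameter of mine_hard_negatives (used to rescore candidate negatives) is pre-existing, but whether the device list is forwarded via use_multi_process to the reranking step as shown here is an assumption.

from datasets import load_dataset

from sentence_transformers import SentenceTransformer
from sentence_transformers.cross_encoder import CrossEncoder
from sentence_transformers.util import mine_hard_negatives

def main():
    # Load a retrieval dataset plus the models used for mining and rescoring
    dataset = load_dataset("sentence-transformers/natural-questions", split="train")
    model = SentenceTransformer("sentence-transformers/static-retrieval-mrl-en-v1")
    cross_encoder = CrossEncoder("cross-encoder/ms-marco-MiniLM-L6-v2")

    dataset = mine_hard_negatives(
        dataset=dataset,
        model=model,
        cross_encoder=cross_encoder,  # rescore candidate negatives with the reranker
        num_negatives=5,
        relative_margin=0.05,
        use_multi_process=["cuda:0", "cuda:1"],  # assumption: also spreads the reranking over both GPUs
        output_format="triplet",
    )

if __name__ == "__main__":
    main()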

Multilingual NanoBEIR Support

The NanoBEIR evaluators now support custom dataset IDs, allowing for evaluation on non-English NanoBEIR collections. All three NanoBEIR evaluators (dense, sparse, and cross-encoder) support this functionality with a simple dataset_id parameter.

For example:

import logging
from pprint import pprint

from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import NanoBEIREvaluator

logging.basicConfig(format="%(asctime)s - %(message)s", datefmt="%Y-%m-%d %H:%M:%S", level=logging.INFO)

# Load a model to evaluate
model = SentenceTransformer("google/embeddinggemma-300m")
# Use a Serbian translation of NanoBEIR
evaluator = NanoBEIREvaluator(
    ["msmarco", "nq"],
    dataset_id="Serbian-AI-Society/NanoBEIR-sr"
)
results = evaluator(model)
print(results[evaluator.primary_metric])
pprint({key: value for key, value in results.items() if "ndcg@10" in key})
"""
{'NanoBEIR_mean_cosine_ndcg@10': 0.44754032737278326,
 'NanoMSMARCO_cosine_ndcg@10': 0.4424192627754922,
 'NanoNQ_cosine_ndcg@10': 0.45266139197007427}
"""

The NanoBEIR collection already includes supported translations for French, Arabic, German, Spanish, Italian, Portuguese, Norwegian, Swedish, Serbian, Korean, Japanese, and 22 Bharat languages. Contact me (@tomaarsen) if you have found or created another translation and would like to get it added to the collection!

Similarity Scores in Hard Negatives Mining

The mine_hard_negatives function now includes an output_scores parameter that lets you export similarity scores alongside the mined negatives. When output_scores=False (the default), these are the dataset layouts for each output_format:

  • "triplet": (anchor, positive, negative)
  • "n-tuple": (anchor, positive, negative_1, ..., negative_n)
  • "labeled-pair": (anchor, passage, label)
  • "labeled-list": (anchor, [passages], [labels])

And when output_scores=True, the format becomes:

  • "triplet": (anchor, positive, negative, [scores])
  • "n-tuple": (anchor, positive, negative_1, ..., negative_n, [scores])
  • "labeled-pair": (anchor, passage, score)
  • "labeled-list": (anchor, [passages], [scores])

For context, labels are binary values denoting whether a pair was labeled as a positive or not, whereas scores are similarity scores computed by the SentenceTransformer or CrossEncoder model.
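
Concretely, a single "labeled-pair" row changes roughly as follows (illustrative values; column names follow the input dataset, as in the Natural Questions example further below):

# output_scores=False: binary label (1 = positive, 0 = mined negative)
{"query": "...", "answer": "...", "label": 1}

# output_scores=True: similarity score from the embedding model or reranker
{"query": "...", "answer": "...", "score": 0.87}
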
Additionally:

  • The deprecated n-tuple-scores format has been replaced with the cleaner output_format="n-tuple" combined with output_scores=True.
  • Several issues with datasets containing multiple positives have been resolved.

For example:

from sentence_transformers.util import mine_hard_negatives
from sentence_transformers import SentenceTransformer
from datasets import load_dataset

# Load a Sentence Transformer model
model = SentenceTransformer("sentence-transformers/static-retrieval-mrl-en-v1")

# Load a dataset to mine hard negatives from
dataset = load_dataset("sentence-transformers/natural-questions", split="train").select(range(10000))
print(dataset)
"""
Dataset({
    features: ['query', 'answer'],
    num_rows: 10000
})
"""

# Mine hard negatives into the "labeled-list" format: 'query', 'answer' (a list of
# the positive plus each mined negative), and 'scores' (a list of similarity scores
# for the query paired with each of those passages).
dataset = mine_hard_negatives(
    dataset=dataset,
    model=model,
    num_negatives=5,
    sampling_strategy="top",
    relative_margin=0.05,
    batch_size=128,
    use_faiss=True,
    output_format="labeled-list",
    output_scores=True,
)
"""
Negative candidates mined, preparing dataset...
Metric       Positive       Negative     Difference
Count          10,000         49,241
Mean           0.5884         0.3909         0.2033
Median         0.6005         0.3766         0.1837
Std            0.1467         0.1050         0.1337
Min            0.0272         0.1595         0.0088
25%            0.4918         0.3127         0.0903
50%            0.6005         0.3766         0.1837
75%            0.6974         0.4558         0.2924
Max            0.9679         0.8505         0.7281
Skipped 25,451 potential negatives (4.89%) due to the relative_margin of 0.05.
Could not find enough negatives for 148 samples (1.48%). Consider adjusting the range_max and relative_margin parameters if you'd like to find more valid negatives.
"""
print(dataset)
"""
Dataset({
    features: ['query', 'answer', 'scores'],
    num_rows: 9852
})
"""
print(dataset[0])
"""
{
    "query": "when did richmond last play in a preliminary final",
    "answer": [
        "Richmond Football Club Richmond began 2017 with 5 straight wins, a feat it had not achieved since 1995. A series of close losses hampered the Tigers throughout the middle of the season, including a 5-point loss to the Western Bulldogs, 2-point loss to Fremantle, and a 3-point loss to the Giants. Richmond ended the season strongly with convincing victories over Fremantle and St Kilda in the final two rounds, elevating the club to 3rd on the ladder. Richmond's first final of the season against the Cats at the MCG attracted a record qualifying final crowd of 95,028; the Tigers won by 51 points. Having advanced to the first preliminary finals for the first time since 2001, Richmond defeated Greater Western Sydney by 36 points in front of a crowd of 94,258 to progress to the Grand Final against Adelaide, their first Grand Final appearance since 1982. The attendance was 100,021, the largest crowd to a grand final since 1986. The Crows led at quarter time and led by as many as 13, but the Tigers took over the game as it progressed and scored seven straight goals at one point. They eventually would win by 48 points – 16.12 (108) to Adelaide's 8.12 (60) – to end their 37-year flag drought.[22] Dustin Martin also became the first player to win a Premiership medal, the Brownlow Medal and the Norm Smith Medal in the same season, while Damien Hardwick was named AFL Coaches Association Coach of the Year. Richmond's jump from 13th to premiers also marked the biggest jump from one AFL season to the next.",
        "2017 AFL Grand Final The 2017 AFL Grand Final was an Australian rules football game contested between the Adelaide Crows and the Richmond Tigers, held at the Melbourne Cricket Ground on 30 September 2017. It was the 121st annual grand final of the Australian Football League (formerly the Victorian Football League), staged to determine the premiers for the 2017 AFL season.[1]. Richmond defeated Adelaide by 48 points, marking the club's eleventh premiership and first since 1980. Richmond's Dustin Martin won the Norm Smith Medal as the best player on the ground. The match was attended by 100,021 people, the largest crowd since the 1986 Grand Final.",
        "Raid of Richmond The Richmond Campaign was a group of British military actions against the capital of Virginia, Richmond, and the surrounding area, during the American Revolutionary War. Led by American turncoat Benedict Arnold, the Richmond Campaign is considered one of his greatest successes while serving under the British Army, and one of the most notorious actions that Arnold ever performed.",
        "2001 AFL Grand Final The 2001 AFL Grand Final was an Australian rules football game contested between the Essendon Football Club and the Brisbane Lions, held at the Melbourne Cricket Ground in Melbourne on 29 September 2001. It was the 105th annual Grand Final of the Australian Football League (formerly the Victorian Football League),[1] staged to determine the premiers for the 2001 AFL season. The match, attended by 91,482 spectators, was won by Brisbane by a margin of 26 points, marking that club's first premiership victory.",
        "1964 VFL Grand Final The 1964 VFL Grand Final was an Australian rules football game contested between the  Collingwood Football Club and Melbourne Football Club, held at the Melbourne Cricket Ground in Melbourne on 19 September 1964. It was the 68th annual Grand Final of the Victorian Football League, staged to determine the premiers for the 1964 VFL season. The match, attended by 102,471 spectators, was won by Melbourne by a margin of 4 points, marking that club's 12th (and to date, most recent) premiership victory.",
        "1998 AFL Grand Final The 1998 AFL Grand Final was an Australian rules football game contested between the Adelaide Crows and the North Melbourne Kangaroos, held at the Melbourne Cricket Ground in Melbourne on 26 September 1998. It was the 102nd annual Grand Final of the Australian Football League (formerly the Victorian Football League), staged to determine the premiers for the 1998 AFL season. The match, attended by 94,431 spectators, was won by Adelaide by a margin of 35 points marking that club's second consecutive premiership victory, and second premiership overall.",
    ],
    "scores": [
        0.5460646748542786,
        0.5105829238891602,
        0.4460095167160034,
        0.3221113085746765,
        0.3161606788635254,
        0.31184709072113037,
    ],
}
"""
# dataset.push_to_hub("natural-questions-hard-negatives", "labeled-list-scores")

Transformers v5 Support

Sentence Transformers now supports the latest Transformers v5.0 release while maintaining backward compatibility with v4.x. The library's CI currently tests against both major versions, allowing users to upgrade to the newest Transformers features when ready. Future versions of Sentence Transformers may start requiring Transformers v5.0 or higher.
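
If you want to control the major version explicitly during the transition, you can pin it when installing (illustrative version specifiers):

# Stay on Transformers v4.x for now
pip install sentence-transformers==5.2.0 "transformers<5"

# Or adopt Transformers v5
pip install sentence-transformers==5.2.0 "transformers>=5"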

Pillow now Optional

The Pillow library is now an optional dependency rather than a required one, reducing installation size for users who don't work with image-based models. Users who need image functionality can install it via pip install sentence-transformers[image] or directly with pip install pillow.

Python 3.9 Deprecation

Following Python's deprecation schedule, Sentence Transformers v5.2.0 has deprecated support for Python 3.9. Users are encouraged to upgrade to Python 3.10 or newer to continue receiving updates and new features.

Minor Changes

  • Training dataset columns named "scores" and "labels" are now also treated as special label columns; their values are passed to the labels argument of the loss used during training (#3506). See the sketch after this list.
  • The sentence-transformers[onnx] and sentence-transformers[onnx-gpu] extras now rely on the new optimum-onnx package together with optimum >= 2.0.0.
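
As a sketch of the first item, a toy dataset whose "scores" column is picked up as the label column (hypothetical data; ListNetLoss is just one example of a loss whose labels argument accepts a list of per-document scores):

from datasets import Dataset
from sentence_transformers.cross_encoder import CrossEncoder, CrossEncoderTrainer
from sentence_transformers.cross_encoder.losses import ListNetLoss

# Toy listwise dataset: "scores" is now recognized as a special label column
train_dataset = Dataset.from_dict({
    "query": ["what is python"],
    "docs": [["Python is a programming language.", "Pythons are large snakes."]],
    "scores": [[1.0, 0.1]],  # forwarded to the loss's `labels` argument
})

model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L6-v2")
loss = ListNetLoss(model)
trainer = CrossEncoderTrainer(model=model, train_dataset=train_dataset, loss=loss)
trainer.train()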

All Changes

  • [tests] Loosen safetensors test rtol/atol by @tomaarsen in #3572
  • [deprecation] Deprecate Python 3.9, upgrade ruff by @tomaarsen in #3573
  • ArXiv -> HF Papers by @qgallouedec in #3565
  • Document broken LexRank pip implementation by @stevenae in #3567
  • Add documentation analytics by @tomaarsen in #3577
  • [fix]: correct condition for restoring layer embeddings in TransformerDecorator/AdaptiveLayerLoss by @emapco in #3560
  • [chore] Rename master to main, update outdated URLs by @tomaarsen in #3579
  • [tests] Increase atol/rtol from 1e-6 to 1e-5 for higher test consistency by @tomaarsen in #3578
  • [feat] Allow transformers v5.0, add CI for transformers >= v5 by @tomaarsen in #3586
  • add multiprocessing support for Cross Encoder by @omkar-334 in #3580
  • [deps] Use optimum-onnx now that both optimum-onnx and optimum-intel can use optimum==2.0.0 by @tomaarsen in #3587
  • Skip test_train_stsb tests; triggers rate limit too often by @tomaarsen in #3590
  • feat/deps: make Pillow an optional dependency by @akx in #3589
  • Extend NanoBEIR evaluators to support custom NanoBEIR datasets by @milistu in #3583
  • Mine hard negatives: optionally output similarity scores by @tsbalzhanov in #3506
  • docs: update NanoBEIR collection links and descriptions for evaluators by @tomaarsen in #3591
  • docs: add release notes summary for v5.2 on main page by @tomaarsen in #3592

New Contributors

An extra thanks to @Samoed, @NohTow, and @raphaelsty for engaging in valuable discussions in the pull requests, @omkar-334 for tackling all kinds of open issues, and @marquesafonso for working on a solid PR for multilingual NanoBEIR that we didn't end up going for.

Additionally, a big thanks to @milistu from Serbian-AI-Society, @NohTow & @raphaelsty from LightOn, @mlabonne and Fernando Fernandes Neto from LiquidAI, @lbourdois from CATIE-AQ and Arun Arumugam for creating the NanoBEIR translations that are supported out of the gate.

Full Changelog: v5.1.2...v5.2.0
