This patch release introduces some improvements for the SentenceTransformerTrainer, as well as some updates for the automatic model card generation. It also patches some minor evaluator bugs and a bug with MatryoshkaLoss. Lastly, every single Sentence Transformer model can now be saved and loaded with the safer model.safetensors files.
Install this version with
# Full installation:
pip install sentence-transformers[train]==3.0.1
# Inference only:
pip install sentence-transformers==3.0.1
SentenceTransformerTrainer improvements
- Implement gradient checkpointing for lower memory usage during training (#2717)
- Implement support for the push_to_hub=True Training Argument, also implement trainer.push_to_hub(...) (#2718)
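As a rough sketch of how the two trainer features above might be switched on: both are set through the training arguments. The output directory and Hub repository name are placeholders, and the snippet assumes that SentenceTransformerTrainingArguments accepts the standard transformers TrainingArguments fields gradient_checkpointing, push_to_hub and hub_model_id.

```python
# Hypothetical settings; the directory and repository names are placeholders.
TRAINING_KWARGS = {
    "output_dir": "models/my-embedding-model",
    # Recompute activations in the backward pass to lower peak memory (#2717).
    "gradient_checkpointing": True,
    # Ask the trainer to upload the model to the Hugging Face Hub (#2718).
    "push_to_hub": True,
    "hub_model_id": "your-username/my-embedding-model",
}


def build_training_args():
    """Create the training arguments; requires sentence-transformers[train]."""
    # Imported here so the settings above can be inspected without the
    # training dependencies installed.
    from sentence_transformers import SentenceTransformerTrainingArguments

    return SentenceTransformerTrainingArguments(**TRAINING_KWARGS)
```

The object returned by build_training_args() would then be passed as args=... to SentenceTransformerTrainer. Calling trainer.push_to_hub(...) after training is the manual alternative to setting push_to_hub=True.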
Model Cards
This patch release improves on the automatically generated model cards in several ways:
- Your training datasets are now automatically linked if they're on Hugging Face (#2711)
- A new generated_from_trainer tag is now also added (#2710)
- The automatically included widget examples are now improved, especially for question-answering. Previously, the widget could give examples of comparing two questions with each other (#2713)
- If you saved a model locally, then loaded it again and uploaded it, the model card would previously still show
  ...
  # Download from the 🤗 Hub
  model = SentenceTransformer("sentence_transformers_model_id")
  ...
  This now gets replaced with your new model ID on Hugging Face (#2714)
- The exact training dataset size is now included in the model metadata, rather than as a bucket of e.g. 1K<n<10K (#2728)
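To make the last point concrete, here is a minimal sketch of the kind of bucketing that coarse size labels imply. The function name and thresholds are illustrative assumptions, not the library's or the Hub's code; the labels follow the 1K<n<10K pattern quoted above.

```python
# Illustrative upper bounds, each paired with the label used below that bound.
_BUCKETS = [
    (1_000, "n<1K"),
    (10_000, "1K<n<10K"),
    (100_000, "10K<n<100K"),
    (1_000_000, "100K<n<1M"),
    (10_000_000, "1M<n<10M"),
    (100_000_000, "10M<n<100M"),
    (1_000_000_000, "100M<n<1B"),
]


def size_bucket(num_examples: int) -> str:
    """Return a coarse size label for a dataset with num_examples rows."""
    for upper_bound, label in _BUCKETS:
        if num_examples < upper_bound:
            return label
    # Hypothetical catch-all for anything larger than the last bound.
    return "n>1B"


print(size_bucket(5_000))  # 1K<n<10K
```

A dataset of 5,000 and one of 9,500 examples end up with the same label here, which is the information the exact size in the metadata preserves.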
Evaluators fixes
- Fix bug where the primary metric of evaluators in SequentialEvaluator was ignored in the scores calculation (#2700)
- Fix confusing print statement in TranslationEvaluator when using print_wrong_matches=True (#1894)
- Fix bug that prevented you from customizing the primary_metric in InformationRetrievalEvaluator (#2701)
- Allow passing a list of evaluators to the SentenceTransformerTrainer rather than a SequentialEvaluator (#2716)
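The SequentialEvaluator fix can be pictured with a small stand-alone sketch. The classes and function below are hypothetical stand-ins written for this example, not the library's code, and the default of taking the last evaluator's score is an assumption of the sketch.

```python
class FakeEvaluator:
    """Hypothetical stand-in for an evaluator that returns a dict of metrics."""

    def __init__(self, metrics, primary_metric=None):
        self.metrics = metrics
        if primary_metric is not None:
            self.primary_metric = primary_metric

    def __call__(self, model=None):
        return dict(self.metrics)


def sequential_score(evaluators, main_score_function=lambda scores: scores[-1]):
    """Combine one score per evaluator, preferring each primary_metric."""
    scores = []
    for evaluator in evaluators:
        results = evaluator()
        # Use the evaluator's own primary metric when it defines one;
        # otherwise fall back to the first metric it reported.
        key = getattr(evaluator, "primary_metric", None) or next(iter(results))
        scores.append(results[key])
    return main_score_function(scores)


evaluators = [
    FakeEvaluator({"accuracy": 0.71, "f1": 0.64}),
    FakeEvaluator({"ndcg@10": 0.52, "map@100": 0.47}, primary_metric="map@100"),
]
print(sequential_score(evaluators))  # 0.47, taken from map@100
```

If the primary metric were ignored, the second evaluator would contribute its first metric (0.52) instead of the one it declared as primary.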
Losses fixes
- Fix MatryoshkaLoss crash if the first dimension is not the biggest (#2719)
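For illustration, one way to make the order of the dimensions irrelevant is to sort them, largest first, together with their weights before the loss uses them. This is a sketch of that idea with a hypothetical helper name, not necessarily how #2719 implements the fix.

```python
def sort_dims_and_weights(matryoshka_dims, matryoshka_weights=None):
    """Return the dimensions (largest first) with their weights kept aligned."""
    if not matryoshka_dims:
        raise ValueError("At least one dimension is required.")
    if matryoshka_weights is None:
        # Assumption for this sketch: every dimension is weighted equally.
        matryoshka_weights = [1] * len(matryoshka_dims)
    if len(matryoshka_dims) != len(matryoshka_weights):
        raise ValueError("Each dimension needs exactly one weight.")
    # Sort the (dim, weight) pairs together so each weight follows its dimension.
    pairs = sorted(
        zip(matryoshka_dims, matryoshka_weights),
        key=lambda pair: pair[0],
        reverse=True,
    )
    dims, weights = zip(*pairs)
    return list(dims), list(weights)


dims, weights = sort_dims_and_weights([256, 768, 512], [0.5, 1.0, 0.75])
print(dims)     # [768, 512, 256]
print(weights)  # [1.0, 0.75, 0.5]
```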
Security
- Integrate safetensors with all modules, including Dense, LSTM, CNN, etc., so that pickled pytorch_model.bin files are no longer needed (#2722)
All changes
- updating to evaluation_strategy by @higorsilvaa in #2686
- fix loss link by @Samoed in #2690
- Fix bug that restricts users from specifying custom primary_function in InformationRetrievalEvaluator by @hetulvp in #2701
- Fix a bug in SequentialEvaluator to use primary_metric if defined in evaluator. by @hetulvp in #2700
- [fix] Always override the originally saved version in the ST config by @tomaarsen in #2709
- [model cards] Also include HF datasets in the model card metadata by @tomaarsen in #2711
- Add "generated_from_trainer" tag to auto-generated model cards by @tomaarsen in #2710
- Fix confusing print statement in TranslationEvaluator by @NathanS-Git in #1894
- [model cards] Improve the widget example selection: not based on embeddings, better for QA by @tomaarsen in #2713
- [model cards] Replace 'sentence_transformers_model_id' from reused model if possible by @tomaarsen in #2714
- [feat] Allow passing a list of evaluators to the Trainer by @tomaarsen in #2716
- [fix] Fix gradient checkpointing to allow for much lower memory usage by @tomaarsen in #2717
- [fix] Implement create_model_card on the Trainer, allowing args.push_to_hub=True by @tomaarsen in #2718
- [fix] Fix MatryoshkaLoss crash if the first dimension is not the biggest by @tomaarsen in #2719
- Update models_en_sentence_embeddings.html by @saikartheekb in #2720
- [typing] Improve typing for many functions & add py.typed to satisfy mypy by @tomaarsen in #2724
- [fix] Fix edge case with evaluator being None by @tomaarsen in #2726
- [simplify] Set can_return_loss=True globally, instead of via the data collator by @tomaarsen in #2727
- [feat] Integrate safetensors with Dense, etc. modules too. by @tomaarsen in #2722
- [model cards] Specify the exact dataset size as a tag, will be bucketized by HF by @tomaarsen in #2728
New Contributors
- @higorsilvaa made their first contribution in #2686
- @hetulvp made their first contribution in #2701
- @NathanS-Git made their first contribution in #1894
- @saikartheekb made their first contribution in #2720
Full Changelog: v3.0.0...v3.0.1