This patch release introduces some improvements for the SentenceTransformerTrainer, as well as some updates for the automatic model card generation. It also patches some minor evaluator bugs and a bug with MatryoshkaLoss. Lastly, every single Sentence Transformer model can now be saved and loaded with the safer model.safetensors files.
Install this version with
# Full installation:
pip install sentence-transformers[train]==3.0.1
# Inference only:
pip install sentence-transformers==3.0.1
SentenceTransformerTrainer improvements
- Implement gradient checkpointing for lower memory usage during training (#2717)
- Implement support for the push_to_hub=True Training Argument, also implement trainer.push_to_hub(...) (#2718)
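As a configuration sketch, the two trainer options above could be enabled as follows. The output directory and Hub repository name are hypothetical placeholders; gradient_checkpointing, push_to_hub and hub_model_id are assumed to behave as the standard Hugging Face training arguments that SentenceTransformerTrainingArguments inherits.

```python
from sentence_transformers import SentenceTransformerTrainingArguments

# Hypothetical paths and repository name; replace them with your own.
args = SentenceTransformerTrainingArguments(
    output_dir="models/my-embedding-model",
    # Recompute activations in the backward pass to lower memory usage,
    # at the cost of some extra compute.
    gradient_checkpointing=True,
    # Ask the trainer to upload the model to the Hugging Face Hub.
    push_to_hub=True,
    hub_model_id="your-username/my-embedding-model",
)

# The arguments are then passed to SentenceTransformerTrainer(args=args, ...).
# After training, trainer.push_to_hub(...) can also be called manually.
```

Uploading requires being logged in to Hugging Face, for example via huggingface-cli login.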
Model Cards
This patch release improves on the automatically generated model cards in several ways:
- Your training datasets are now automatically linked if they're on Hugging Face (#2711)
- A new generated_from_trainer tag is now also added (#2710)
- The automatically included widget examples are now improved, especially for question-answering. Previously, the widget could give examples of comparing two questions with each other (#2713)
- If you save a model locally, then load it again and upload it, it would previously still show
...
# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
...
This now gets replaced with your new model ID on Hugging Face (#2714)
- The exact training dataset size is now included in the model metadata, rather than as a bucket of e.g. 1K<n<10K (#2728)
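To illustrate the difference between an exact dataset size and a size bucket such as 1K<n<10K, here is a small plain-Python sketch. The function name, the cut-offs and the catch-all label are assumptions made for illustration; this is not the code used by Sentence Transformers or by Hugging Face.

```python
def size_bucket(num_examples: int) -> str:
    """Map an exact dataset size to a coarse label such as '1K<n<10K'.

    Illustrative only: the thresholds and labels are assumptions.
    """
    if num_examples < 0:
        raise ValueError("num_examples must be non-negative")
    if num_examples < 1_000:
        return "n<1K"
    # (exclusive upper bound, label) pairs, checked in ascending order.
    bounds = [
        (10_000, "1K<n<10K"),
        (100_000, "10K<n<100K"),
        (1_000_000, "100K<n<1M"),
        (10_000_000, "1M<n<10M"),
    ]
    for upper, label in bounds:
        if num_examples < upper:
            return label
    return "n>10M"  # simplified catch-all for this sketch


exact_size = 5_749
print(exact_size, "falls in bucket", size_bucket(exact_size))
```

The exact number carries strictly more information: the bucket can always be derived from it, but not the other way around.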
Evaluators fixes
- The primary metric of evaluators in SequentialEvaluator would be ignored in the scores calculation (#2700)
- Fix confusing print statement in TranslationEvaluator when using print_wrong_matches=True (#1894)
- Fix bug that prevents you from customizing the primary_metric in InformationRetrievalEvaluator (#2701)
- Allow passing a list of evaluators to the STTrainer rather than a SequentialEvaluator (#2716)
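The primary-metric fix can be pictured with a toy model of a sequential evaluator. ToyEvaluator and sequential_score are hypothetical names, and the choice of returning the last evaluator's score by default is an assumption of this sketch, not a statement about the library's implementation.

```python
from typing import Callable, Dict, List, Optional


class ToyEvaluator:
    """Hypothetical stand-in for an evaluator that returns a dict of metrics."""

    def __init__(self, metrics: Dict[str, float], primary_metric: Optional[str] = None):
        self.metrics = metrics
        self.primary_metric = primary_metric

    def __call__(self) -> Dict[str, float]:
        return dict(self.metrics)


def sequential_score(
    evaluators: List[ToyEvaluator],
    main_score_function: Callable[[List[float]], float] = lambda scores: scores[-1],
) -> float:
    """Run evaluators in order and combine one score per evaluator."""
    scores: List[float] = []
    for evaluator in evaluators:
        results = evaluator()
        # Use the evaluator's declared primary metric when there is one;
        # otherwise fall back to the first metric it reports.
        if evaluator.primary_metric is not None:
            key = evaluator.primary_metric
        else:
            key = next(iter(results))
        scores.append(results[key])
    return main_score_function(scores)


with_primary = ToyEvaluator({"accuracy": 0.9, "f1": 0.5}, primary_metric="f1")
without_primary = ToyEvaluator({"ndcg": 0.7, "map": 0.6})
print(sequential_score([without_primary, with_primary]))
```

Ignoring primary_metric would amount to always taking the fallback branch, so the score for with_primary would be read from accuracy instead of f1.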
Losses fixes
- Fix MatryoshkaLoss crash if the first dimension is not the biggest (#2719)
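As background on why the order of the dimensions need not matter conceptually: each Matryoshka dimension is a prefix of the same full embedding. The plain-Python sketch below slices every requested dimension from the full vector and re-normalizes it, which works for any ordering. The helper names are hypothetical, and this is not a description of how the library computes the loss or of what the fix changed internally.

```python
import math
from typing import Dict, List, Sequence


def truncate_embedding(full: Sequence[float], dim: int) -> List[float]:
    """Keep the first `dim` components of `full` and L2-normalize them."""
    if not 0 < dim <= len(full):
        raise ValueError(f"dim must be between 1 and {len(full)}")
    prefix = list(full[:dim])
    norm = math.sqrt(sum(x * x for x in prefix))
    # Leave an all-zero prefix untouched to avoid dividing by zero.
    return [x / norm for x in prefix] if norm > 0 else prefix


def truncate_all(full: Sequence[float], dims: Sequence[int]) -> Dict[int, List[float]]:
    """Truncate the same full embedding to every requested dimension."""
    # Always slice from the full vector, never from an already
    # truncated one, so the order of `dims` is irrelevant.
    return {dim: truncate_embedding(full, dim) for dim in dims}


full_embedding = [3.0, 4.0, 12.0, 0.0]
# The first dimension is deliberately not the biggest.
for dim, vector in truncate_all(full_embedding, [2, 4, 3]).items():
    print(dim, vector)
```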
Security
- Integrate safetensors with all modules, including Dense, LSTM, CNN, etc. to prevent needing pickled pytorch_model.bin anymore (#2722)
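To show why a safetensors file is considered safer than a pickled checkpoint, here is a simplified standard-library sketch of the file layout as I understand it from the safetensors format description: an 8-byte little-endian header length, a JSON header, then raw tensor bytes. The two function names are hypothetical, only 1-D float32 tensors are handled, and the real safetensors library should be used in practice.

```python
import json
import struct
from typing import Dict, List


def write_minimal_safetensors(tensors: Dict[str, List[float]]) -> bytes:
    """Serialize 1-D float32 tensors into a safetensors-style byte string."""
    header = {}
    payload = b""
    for name, values in tensors.items():
        data = struct.pack(f"<{len(values)}f", *values)
        header[name] = {
            "dtype": "F32",
            "shape": [len(values)],
            # Offsets are relative to the start of the data section.
            "data_offsets": [len(payload), len(payload) + len(data)],
        }
        payload += data
    header_bytes = json.dumps(header).encode("utf-8")
    return struct.pack("<Q", len(header_bytes)) + header_bytes + payload


def read_minimal_safetensors(blob: bytes) -> Dict[str, List[float]]:
    """Parse the sketch format: only JSON and raw bytes, no code execution."""
    (header_len,) = struct.unpack("<Q", blob[:8])
    header = json.loads(blob[8 : 8 + header_len].decode("utf-8"))
    payload = blob[8 + header_len :]
    tensors = {}
    for name, info in header.items():
        if name == "__metadata__":  # optional free-form metadata entry
            continue
        begin, end = info["data_offsets"]
        count = (end - begin) // 4  # four bytes per float32 value
        tensors[name] = list(struct.unpack(f"<{count}f", payload[begin:end]))
    return tensors


blob = write_minimal_safetensors({"linear.bias": [1.0, 2.5, -3.0]})
print(read_minimal_safetensors(blob))
```

Loading such a file never evaluates code from the file, whereas unpickling a pytorch_model.bin can execute arbitrary Python.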
All changes
- updating to evaluation_strategy by @higorsilvaa in #2686
- fix loss link by @Samoed in #2690
- Fix bug that restricts users from specifying custom primary_function in InformationRetrievalEvaluator by @hetulvp in #2701
- Fix a bug in SequentialEvaluator to use primary_metric if defined in evaluator. by @hetulvp in #2700
- [fix] Always override the originally saved version in the ST config by @tomaarsen in #2709
- [model cards] Also include HF datasets in the model card metadata by @tomaarsen in #2711
- Add "generated_from_trainer" tag to auto-generated model cards by @tomaarsen in #2710
- Fix confusing print statement in TranslationEvaluator by @NathanS-Git in #1894
- [model cards] Improve the widget example selection: not based on embeddings, better for QA by @tomaarsen in #2713
- [model cards] Replace 'sentence_transformers_model_id' from reused model if possible by @tomaarsen in #2714
- [feat] Allow passing a list of evaluators to the Trainer by @tomaarsen in #2716
- [fix] Fix gradient checkpointing to allow for much lower memory usage by @tomaarsen in #2717
- [fix] Implement create_model_card on the Trainer, allowing args.push_to_hub=True by @tomaarsen in #2718
- [fix] Fix MatryoshkaLoss crash if the first dimension is not the biggest by @tomaarsen in #2719
- Update models_en_sentence_embeddings.html by @saikartheekb in #2720
- [typing] Improve typing for many functions & add py.typed to satisfy mypy by @tomaarsen in #2724
- [fix] Fix edge case with evaluator being None by @tomaarsen in #2726
- [simplify] Set can_return_loss=True globally, instead of via the data collator by @tomaarsen in #2727
- [feat] Integrate safetensors with Dense, etc. modules too. by @tomaarsen in #2722
- [model cards] Specify the exact dataset size as a tag, will be bucketized by HF by @tomaarsen in #2728
New Contributors
- @higorsilvaa made their first contribution in #2686
- @hetulvp made their first contribution in #2701
- @NathanS-Git made their first contribution in #1894
- @saikartheekb made their first contribution in #2720
Full Changelog: v3.0.0...v3.0.1