github UKPLab/sentence-transformers v1.2.0
v1.2.0 - Unsupervised Learning, New Training Examples, Improved Models


Unsupervised Sentence Embedding Learning

New methods have been integrated to train sentence embedding models without labeled data. See Unsupervised Learning for an overview of all existing methods.

New methods:

Pre-Training Methods

  • MLM: An example script to run Masked Language Modeling (MLM). Running MLM on your custom data before supervised training can significantly improve performance. MLM also works well for domain transfer: first train on your custom data, then train with e.g. NLI or STS data.
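The core idea behind the MLM objective can be sketched in plain Python. This is a hedged, self-contained illustration (not the actual example script from the repository): a fraction of tokens is replaced with a `[MASK]` token, and the model's job is to predict the original token at exactly those positions. The function name `mask_tokens` and the string-token representation are assumptions for illustration; real MLM pipelines operate on token IDs.

```python
import random

MASK_TOKEN = "[MASK]"

def mask_tokens(tokens, mask_prob=0.15, seed=None):
    """Randomly replace a fraction of tokens with [MASK] for MLM training.

    Returns the masked sequence and the labels: the original token at
    masked positions, None elsewhere (positions ignored by the loss).
    """
    rng = random.Random(seed)
    masked, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            masked.append(MASK_TOKEN)
            labels.append(tok)   # model must predict the original token here
        else:
            masked.append(tok)
            labels.append(None)  # not masked: excluded from the MLM loss
    return masked, labels

tokens = "the quick brown fox jumps over the lazy dog".split()
masked, labels = mask_tokens(tokens, mask_prob=0.3, seed=0)
```

Because MLM needs no labels beyond the text itself, this is what makes it usable on unlabeled in-domain data before the supervised NLI/STS stage.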

Training Examples

New models

New Functions

  • SentenceTransformer.fit() Checkpoints: The fit() method can now save checkpoints during training at a fixed number of steps. More info
  • Pooling-mode as string: You can now pass the pooling-mode to models.Pooling() as string:
    pooling_model = models.Pooling(word_embedding_model.get_word_embedding_dimension(), pooling_mode='mean')
    Valid values are mean/max/cls.
  • NoDuplicatesDataLoader: When using the MultipleNegativesRankingLoss, you should avoid having duplicate sentences in the same batch. This data loader simplifies this task and ensures that no duplicate entries are in the same batch.
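The deduplication idea behind NoDuplicatesDataLoader can be sketched as follows. This is a simplified, assumed re-implementation for illustration, not the library's actual class: with MultipleNegativesRankingLoss, every other sentence in a batch serves as a negative, so a duplicate inside one batch would act as a false negative; the sketch therefore defers duplicates to a later batch.

```python
def no_duplicates_batches(examples, batch_size):
    """Yield batches in which no sentence appears twice.

    examples: iterable of (sentence_a, sentence_b) pairs (simplified).
    A pair that would repeat a sentence already in the current batch
    is deferred to a later batch instead of being dropped.
    """
    pending = list(examples)
    while pending:
        batch, seen, leftover = [], set(), []
        for pair in pending:
            if len(batch) == batch_size:
                leftover.append(pair)            # batch full: keep for later
            elif any(s in seen for s in pair):
                leftover.append(pair)            # defer: duplicate sentence
            else:
                batch.append(pair)
                seen.update(pair)
        yield batch
        pending = leftover

# Example: ("a", "c") repeats "a", so it moves to the second batch.
batches = list(no_duplicates_batches([("a", "b"), ("a", "c"), ("d", "e")], 2))
# → [[("a", "b"), ("d", "e")], [("a", "c")]]
```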
