github UKPLab/sentence-transformers v0.3.8
v0.3.8 - CrossEncoder, Data Augmentation, new Models

  • Added support for training and using CrossEncoder models
  • Added the AugSBERT data augmentation method
  • New models trained on large-scale paraphrase data: distilroberta-base-paraphrase-v1 and xlm-r-distilroberta-base-paraphrase-v1. They perform much better on our internal benchmarks than the previous models
  • New model for information retrieval trained on MS MARCO: distilroberta-base-msmarco-v1
  • Improved MultipleNegativesRankingLoss: the similarity function is now configurable and defaults to cosine similarity (previously dot product), and similarity scores can be multiplied by a scaling factor. This allows the loss to be used as NTXentLoss / InfoNCE loss
  • New MegaBatchMarginLoss, inspired by the ParaNMT paper
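The improved MultipleNegativesRankingLoss can be sketched in plain Python: for each anchor, its paired positive is the target class and the other in-batch positives act as negatives; with cosine similarity and a scale factor this reduces to the NT-Xent / InfoNCE objective. This is an illustrative sketch (the `scale=20.0` default and the toy list-based vectors are assumptions, not the library's tensor implementation):

```python
import math

def cosine(u, v):
    # cosine similarity of two plain-Python vectors
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def mnr_loss(anchors, positives, scale=20.0):
    """Sketch of MultipleNegativesRankingLoss with cosine similarity:
    for anchor i, positives[i] is the correct match and all other
    positives are in-batch negatives. Scaled logits + cross-entropy
    makes this the NT-Xent / InfoNCE loss."""
    losses = []
    for i, a in enumerate(anchors):
        logits = [scale * cosine(a, p) for p in positives]
        # cross-entropy with target class i (negative log-softmax at i)
        log_z = math.log(sum(math.exp(l) for l in logits))
        losses.append(log_z - logits[i])
    return sum(losses) / len(losses)
```

With well-separated pairs (e.g. orthogonal anchors matching their own positives) the loss is close to zero; shrinking the scale factor softens the logits and raises the loss.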

Smaller changes:

  • Updated InformationRetrievalEvaluator so that it works with large corpora (millions of entries); removed the query_chunk_size parameter from the evaluator
  • The SentenceTransformer.encode method now detaches tensors from the compute graph
  • SentenceTransformer.fit(): the output_path_ignore_not_empty parameter is deprecated; the target folder is no longer required to be empty
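The key idea behind making the evaluator scale to millions of corpus entries is to score the corpus in fixed-size chunks and only keep the global top-k hits per query, so memory stays bounded. A minimal sketch of that pattern (the `chunk_size` value and the toy dot-product scorer are illustrative, not the evaluator's actual internals):

```python
import heapq

def top_k_hits(query, corpus, k=3, chunk_size=1000):
    """Chunked retrieval sketch: score the corpus chunk by chunk and
    maintain a min-heap of the k best (score, index) pairs, so only
    one chunk of scores is held in memory at a time."""
    def score(q, d):
        # toy dot-product similarity between plain-Python vectors
        return sum(a * b for a, b in zip(q, d))

    heap = []  # min-heap of (score, corpus_index)
    for start in range(0, len(corpus), chunk_size):
        for idx, doc in enumerate(corpus[start:start + chunk_size], start):
            s = score(query, doc)
            if len(heap) < k:
                heapq.heappush(heap, (s, idx))
            elif s > heap[0][0]:
                heapq.heapreplace(heap, (s, idx))
    # return hits sorted best-first
    return sorted(heap, key=lambda t: -t[0])
```

The heap keeps memory at O(k) per query regardless of corpus size, which is what lets the evaluator handle corpora with millions of entries.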
