github UKPLab/sentence-transformers v0.2.0
v0.2.0 - New Architecture & Models


v0.2.0 completely changes the architecture of sentence transformers.

The new architecture is sequential: you define individual models that transform a sentence step by step into a fixed-size sentence embedding.
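The idea can be sketched in plain Python (a conceptual illustration only, not the library's actual classes): each module consumes the output of the previous one until a fixed-size vector remains.

```python
# Conceptual sketch of the sequential architecture. The real library chains
# PyTorch modules (e.g. a BERT encoder followed by a pooling layer); the toy
# classes and vocabulary below are hypothetical stand-ins.

class WordEmbedding:
    """Maps each known token to a toy 3-dimensional vector."""
    def __init__(self, vocab):
        self.vectors = {w: [float(i), float(i) * 2, 1.0]
                        for i, w in enumerate(vocab)}

    def __call__(self, tokens):
        return [self.vectors[t] for t in tokens if t in self.vectors]

class MeanPooling:
    """Averages the token vectors into one fixed-size sentence embedding."""
    def __call__(self, vectors):
        n = len(vectors)
        return [sum(v[d] for v in vectors) / n
                for d in range(len(vectors[0]))]

def encode(sentence, modules):
    out = sentence.lower().split()
    for module in modules:  # step-by-step transformation of the sentence
        out = module(out)
    return out

pipeline = [WordEmbedding(["a", "nice", "sentence"]), MeanPooling()]
embedding = encode("A nice sentence", pipeline)
# The result is a fixed-size (here 3-dimensional) vector,
# regardless of the input sentence length.
```

Swapping a component means replacing one entry in the module list, which is what makes the architecture modular.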

The modular architecture makes it easy to swap individual components. You can choose among different embedding methods (BERT, XLNet, word embeddings), transformations (LSTM, CNN), and weighting & pooling methods, as well as add deep averaging networks.

New models in this release:

  • Word Embeddings (like GloVe) for computation of average word embeddings
  • Word weighting, for example, with tf-idf values
  • BiLSTM and CNN encoder, for example, to re-create the InferSent model
  • Bag-of-Words (BoW) sentence representations, optionally with tf-idf weighting
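To illustrate the first two items, tf-idf-weighted average word embeddings can be sketched as follows (the embeddings, corpus, and function names are hypothetical examples; the release's actual modules operate on PyTorch tensors):

```python
import math

# Sketch: average word embeddings where each word is weighted by its idf,
# so frequent words like "the" contribute less than rarer content words.

corpus = [["the", "cat", "sat"], ["the", "dog", "ran"], ["cat", "and", "dog"]]

def idf(word, docs):
    df = sum(1 for d in docs if word in d)
    return math.log(len(docs) / (1 + df)) + 1.0  # smoothed idf

# Toy 2-dimensional embeddings for demonstration (hypothetical values;
# in practice these would come from GloVe or similar pretrained vectors).
emb = {"the": [0.1, 0.1], "cat": [1.0, 0.0], "dog": [0.0, 1.0],
       "sat": [0.5, 0.5], "ran": [0.4, 0.6], "and": [0.2, 0.2]}

def weighted_avg_embedding(tokens, docs):
    weights = [idf(t, docs) for t in tokens]
    total = sum(weights)
    dim = len(next(iter(emb.values())))
    return [sum(w * emb[t][d] for w, t in zip(weights, tokens)) / total
            for d in range(dim)]

vec = weighted_avg_embedding(["the", "cat", "sat"], corpus)
```

With uniform weights this reduces to the plain average word embedding; the weighting module simply scales each token vector before pooling.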

This release introduces many breaking changes compared to the previous release. If you need help with the migration, open a new issue.

New model storing procedure: each sub-module is stored in its own subfolder. If you need to migrate old models, it is best to let the system create the subfolder structure (via model.save()) and then copy the pytorch_model.bin into the correct subfolder.
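A migration along those lines might look like the following sketch. The folder names ("0_BERT", "1_Pooling") and paths are hypothetical examples; use whatever subfolders model.save() actually creates for your model. Temporary stand-in directories are created here so the sketch runs self-contained:

```python
import os
import shutil
import tempfile

root = tempfile.mkdtemp()

# Stand-in for the old flat model directory holding the single weights file.
old_dir = os.path.join(root, "old_model")
os.makedirs(old_dir)
with open(os.path.join(old_dir, "pytorch_model.bin"), "wb") as f:
    f.write(b"fake-weights")  # placeholder for the real checkpoint

# Stand-in for the per-module structure that model.save() would create
# for a freshly constructed model (example subfolder names).
new_dir = os.path.join(root, "new_model")
os.makedirs(os.path.join(new_dir, "0_BERT"))
os.makedirs(os.path.join(new_dir, "1_Pooling"))

# Copy the old weights into the sub-module folder that expects them.
dest = os.path.join(new_dir, "0_BERT", "pytorch_model.bin")
shutil.copy(os.path.join(old_dir, "pytorch_model.bin"), dest)
```

The key point is that only the weights file moves; the surrounding configuration files are regenerated by model.save().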
