github UKPLab/sentence-transformers v0.2.0
v0.2.0 - New Architecture & Models


v0.2.0 completely changes the architecture of sentence transformers.

The new architecture is sequential: you define individual models that transform a sentence step by step into a fixed-size sentence embedding.
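The idea can be sketched in plain Python (a conceptual illustration only, not the library's actual classes): each module consumes the output of the previous one until a fixed-size vector remains.

```python
# Conceptual sketch of the sequential architecture. The real library chains
# PyTorch modules (e.g. a BERT encoder followed by a pooling layer); the toy
# classes and vocabulary below are hypothetical stand-ins.

class WordEmbedding:
    """Maps each known token to a toy 3-dimensional vector."""
    def __init__(self, vocab):
        self.vectors = {w: [float(i), float(i) * 2, 1.0]
                        for i, w in enumerate(vocab)}

    def __call__(self, tokens):
        return [self.vectors[t] for t in tokens if t in self.vectors]

class MeanPooling:
    """Averages the token vectors into one fixed-size sentence embedding."""
    def __call__(self, vectors):
        n = len(vectors)
        return [sum(v[d] for v in vectors) / n
                for d in range(len(vectors[0]))]

def encode(sentence, modules):
    out = sentence.lower().split()
    for module in modules:  # step-by-step transformation of the sentence
        out = module(out)
    return out

pipeline = [WordEmbedding(["a", "nice", "sentence"]), MeanPooling()]
embedding = encode("A nice sentence", pipeline)
# The result is a fixed-size (here 3-dimensional) vector,
# regardless of the input sentence length.
```

Swapping a component means replacing one entry in the module list, which is what makes the architecture modular.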

The modular architecture makes it easy to swap individual components. You can choose among different embedding methods (BERT, XLNet, word embeddings), transformations (LSTM, CNN), and weighting & pooling methods, as well as add deep averaging networks.

New models in this release:

  • Word Embeddings (like GloVe) for computation of average word embeddings
  • Word weighting, for example, with tf-idf values
  • BiLSTM and CNN encoder, for example, to re-create the InferSent model
  • Bag-of-Words (BoW) sentence representations, optionally with tf-idf weighting
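To illustrate the first two items, tf-idf-weighted average word embeddings can be sketched as follows (the embeddings, corpus, and function names are hypothetical examples; the release's actual modules operate on PyTorch tensors):

```python
import math

# Sketch: average word embeddings where each word is weighted by its idf,
# so frequent words like "the" contribute less than rarer content words.

corpus = [["the", "cat", "sat"], ["the", "dog", "ran"], ["cat", "and", "dog"]]

def idf(word, docs):
    df = sum(1 for d in docs if word in d)
    return math.log(len(docs) / (1 + df)) + 1.0  # smoothed idf

# Toy 2-dimensional embeddings for demonstration (hypothetical values;
# in practice these would come from GloVe or similar pretrained vectors).
emb = {"the": [0.1, 0.1], "cat": [1.0, 0.0], "dog": [0.0, 1.0],
       "sat": [0.5, 0.5], "ran": [0.4, 0.6], "and": [0.2, 0.2]}

def weighted_avg_embedding(tokens, docs):
    weights = [idf(t, docs) for t in tokens]
    total = sum(weights)
    dim = len(next(iter(emb.values())))
    return [sum(w * emb[t][d] for w, t in zip(weights, tokens)) / total
            for d in range(dim)]

vec = weighted_avg_embedding(["the", "cat", "sat"], corpus)
```

With uniform weights this reduces to the plain average word embedding; the weighting module simply scales each token vector before pooling.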

This release introduces many breaking changes compared to the previous release. If you need help with the migration, open a new issue.

New model storing procedure: each sub-module is stored in its own subfolder. If you need to migrate old models, it is best to let the system create the subfolder structure (via model.save()) and then copy the pytorch_model.bin into the correct subfolder.
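A migration along those lines might look like the following sketch. The folder names ("0_BERT", "1_Pooling") and paths are hypothetical examples; use whatever subfolders model.save() actually creates for your model. Temporary stand-in directories are created here so the sketch runs self-contained:

```python
import os
import shutil
import tempfile

root = tempfile.mkdtemp()

# Stand-in for the old flat model directory holding the single weights file.
old_dir = os.path.join(root, "old_model")
os.makedirs(old_dir)
with open(os.path.join(old_dir, "pytorch_model.bin"), "wb") as f:
    f.write(b"fake-weights")  # placeholder for the real checkpoint

# Stand-in for the per-module structure that model.save() would create
# for a freshly constructed model (example subfolder names).
new_dir = os.path.join(root, "new_model")
os.makedirs(os.path.join(new_dir, "0_BERT"))
os.makedirs(os.path.join(new_dir, "1_Pooling"))

# Copy the old weights into the sub-module folder that expects them.
dest = os.path.join(new_dir, "0_BERT", "pytorch_model.bin")
shutil.copy(os.path.join(old_dir, "pytorch_model.bin"), dest)
```

The key point is that only the weights file moves; the surrounding configuration files are regenerated by model.save().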
