Docker image/tag: semitechnologies/weaviate:1.2.0
Breaking Changes
none
New Features
Use Transformer NLP Models with Weaviate out of the box
With Weaviate release v1.0.0 we introduced a module API and the ability to import any vector. This allowed the usage of any machine learning model with Weaviate. With this release we are making it even easier to use Weaviate with some of the most popular ML models out there: transformers. This means models like BERT, DistilBERT, RoBERTa, DistilRoBERTa, etc. can be used out of the box with Weaviate.

To use transformers with Weaviate, the text2vec-transformers module needs to be enabled. The models are encapsulated in Docker containers. This allows for efficient scaling and resource planning. Neural-network-based models run most efficiently on GPU-enabled servers, yet Weaviate is CPU-optimized. This separate-container microservice setup allows you to very easily host (and scale) the model independently on GPU-enabled hardware while keeping Weaviate on cheap CPU-only hardware.

To choose your specific model, you simply need to select the correct Docker container. There is a selection of pre-built Docker images available, but you can also build your own with a simple two-line Dockerfile.
How to get started with transformers
Option 1: With an example docker-compose file
You can find an example docker-compose file here, which will spin up Weaviate with the transformers module. In this example we have selected the sentence-transformers/msmarco-distilroberta-base-v2 model, which works great for asymmetric semantic search. See below for how to select an alternative model.

Option 2: Configure your custom setup
Step 1: Enable the text2vec-transformers module
Make sure you set the ENABLE_MODULES=text2vec-transformers environment variable. Additionally, make this module the default vectorizer, so you don't have to specify it on each schema class: DEFAULT_VECTORIZER_MODULE=text2vec-transformers
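In a shell, the two settings from this step are plain environment variables. A minimal sketch (in a real deployment you would pass these into the Weaviate container's environment, e.g. via docker-compose or docker run -e):

```shell
# Enable the transformers module and make it the default vectorizer.
# Sketch only: in practice these belong in the Weaviate container's environment.
export ENABLE_MODULES=text2vec-transformers
export DEFAULT_VECTORIZER_MODULE=text2vec-transformers
```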
Important: This setting is now a requirement if you plan on using any module. So, when using the text2vec-contextionary module, you need to have ENABLE_MODULES=text2vec-contextionary set. All our configuration generators / Helm charts will be updated as part of the Weaviate v1.2.0 support.

Step 2: Run your favorite model
Choose any of our pre-built transformers models (for building your own model container, see below) and spin it up, for example using docker run -itp "8000:8080" semitechnologies/transformers-inference:sentence-transformers-msmarco-distilroberta-base-v2. Use a CUDA-enabled machine for optimal performance. Read more about CUDA support on the inference container here.

Step 3: Tell Weaviate where to find the inference container
Set the Weaviate environment variable TRANSFORMERS_INFERENCE_API to where your inference container is running, for example TRANSFORMERS_INFERENCE_API="http://localhost:8000".
You can now use Weaviate normally and all vectorization during import and search time will be done with the selected transformers model.
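For illustration, here is a hedged sketch of what a semantic search against such a setup could look like. The class name Article and the search concept are hypothetical, and the final curl (commented out so the snippet stands alone) assumes Weaviate is reachable on localhost:8080:

```shell
# Build a nearText GraphQL query; "Article" and the concept are hypothetical names.
cat > query.json <<'EOF'
{"query": "{ Get { Article(nearText: {concepts: [\"semantic search\"]}) { title } } }"}
EOF

# Sanity-check that the payload is valid JSON.
python3 -m json.tool query.json > /dev/null

# Send it to a running Weaviate instance (assumes the setup above):
# curl -s -X POST -H 'Content-Type: application/json' \
#   -d @query.json http://localhost:8080/v1/graphql
```

The nearText operator is provided by the enabled text2vec-transformers module, so the query text is vectorized with the same model used at import time.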
Run with any transformers model
You have three options to select your desired model:
- Use any of our pre-built transformers model containers. The models selected in this list have proven to work well with semantic search in the past. (If you think we should support another model out of the box, please open an issue or pull request here.)
- Use any model from Hugging Face Model Hub. Click here to learn how.
- Use any PyTorch or TensorFlow model from your local disk. Click here to learn how.
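As mentioned earlier, building your own model container can be as small as a two-line Dockerfile. A hypothetical sketch of that pattern (the :custom base image tag and the MODEL_NAME/download.py interface are assumptions; check the module documentation for the exact names):

```shell
# Write a minimal Dockerfile for a custom inference container.
# NOTE: the ":custom" base image and the MODEL_NAME/download.py interface
# are assumptions, not confirmed by this release note.
cat > Dockerfile <<'EOF'
FROM semitechnologies/transformers-inference:custom
RUN MODEL_NAME="distilroberta-base" ./download.py
EOF

# Build and run it like any other inference container (requires Docker):
# docker build -t my-inference .
# docker run -itp "8000:8080" my-inference
```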
Transformers-specific module configuration (on classes and properties)
You can use the same module configuration on your classes and properties which you already know from the text2vec-contextionary module. This includes vectorizeClassName, vectorizePropertyName, and skip. In addition, you can use a class-level module config to select the pooling strategy with poolingStrategy. Allowed values are masked_mean or cls. They refer to different techniques to obtain a sentence vector from individual word vectors, as outlined in the Sentence-BERT paper.

Limitations
- The Weaviate module system currently does not support running two different text2vec-... modules in the same setup. This is due to a limitation in the Explore { } search, where both modules would try to provide an inter-class nearText searcher, which would be incompatible.
- The configuration generator tool on our website is not yet capable of selecting between a contextionary-focused or a transformer-focused setup. This will be added shortly. In the meantime, the easiest way is to use an example Docker Compose file.
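To make the class- and property-level module configuration described above concrete, here is a hedged sketch of a class definition using those settings. The class name Article and property title are hypothetical; registering the class would be a POST to a running instance's /v1/schema endpoint (commented out so the snippet stands alone):

```shell
# A class definition using the transformers module config described above.
# "Article" and "title" are hypothetical names chosen for illustration.
cat > article_class.json <<'EOF'
{
  "class": "Article",
  "vectorizer": "text2vec-transformers",
  "moduleConfig": {
    "text2vec-transformers": {
      "vectorizeClassName": false,
      "poolingStrategy": "masked_mean"
    }
  },
  "properties": [
    {
      "name": "title",
      "dataType": ["text"],
      "moduleConfig": {
        "text2vec-transformers": {
          "skip": false,
          "vectorizePropertyName": false
        }
      }
    }
  ]
}
EOF

# Sanity-check the JSON, then register it against a running instance:
python3 -m json.tool article_class.json > /dev/null
# curl -s -X POST -H 'Content-Type: application/json' \
#   -d @article_class.json http://localhost:8080/v1/schema
```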
Fixes
none