Docker image/tag: semitechnologies/weaviate:0.22.12
See also: example docker compose files in English, German, Dutch, Italian and Czech.
Breaking Changes
none
New Features
- New Underscore Prop for visualization: _featureProjection (#1178, #1139)
This release adds a new optional property ("underscore prop") to list results (REST GET /v1/{kinds}, GraphQL Get {}). The feature projection is intended to reduce the dimensionality of the object's vector into something easily suitable for visualizing, such as 2d or 3d. The underlying algorithm is exchangeable; the first algorithm to be provided is t-SNE.

The feature can be used without any params and tries to pick reasonable defaults. To do so, use the include parameter on REST (GET /v1/{kinds}/?include=_featureProjection) or the _featureProjection { vector } parameter in GraphQL, which appears alongside the schema-defined properties.

Optional Parameters
To tweak the feature projection, optional parameters (currently GraphQL-only) can be provided. The values and their defaults are:

| Parameter | Type | Default | Implication |
| --- | --- | --- | --- |
| dimensions | int | 2 | Target dimensionality, usually 2 or 3 |
| algorithm | string | tsne | Algorithm to be used, currently supported: tsne |
| perplexity | int | min(5, len(results)-1) | The t-SNE perplexity value, must be smaller than n-1, where n is the number of results to be visualized |
| learningRate | int | 25 | The t-SNE learning rate |
| iterations | int | 100 | The number of iterations the t-SNE algorithm runs. Higher values lead to more stable results at the cost of a larger response time |

Limitations and Restrictions
- There is no request size limit (other than the global 10,000 items request limit) for a _featureProjection query. However, due to the O(n^2) complexity of the t-SNE algorithm, large request sizes have a disproportionate effect on the response time: the cost grows quadratically with the number of items. We recommend keeping the request size at or below 100 items, as we have noticed drastic increases in response time thereafter.
- Feature Projection happens in real-time, per query. The dimensions returned have no meaning across queries.
- Currently only root elements (not resolved cross-references) are taken into consideration for the _featureProjection.
- Due to the relatively high cost of the underlying algorithm, we recommend limiting requests that include a _featureProjection in high-load situations where response time matters. Avoid parallel requests that include a _featureProjection, so that some threads stay available to serve other, time-critical requests.
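As a rough illustration of the recommendations above, the following Python sketch builds a GraphQL query string that requests a _featureProjection and caps the request size at 100 items. The helper name build_projection_query, the class name Article, the property title, the Things kind, the limit argument and the exact argument syntax on _featureProjection are assumptions made for this example; they are not taken from these release notes and may need adjusting to your schema.

```python
MAX_RECOMMENDED_ITEMS = 100  # recommended upper bound from the notes above


def build_projection_query(class_name, props, limit, dimensions=2, perplexity=None):
    """Return a GraphQL query string that requests _featureProjection.

    Hypothetical helper: the `Things` kind, the `limit` argument and the
    argument syntax on `_featureProjection` are assumptions for illustration.
    """
    # Cap the request size, since t-SNE cost grows quadratically with n.
    limit = min(limit, MAX_RECOMMENDED_ITEMS)

    args = [f"dimensions: {dimensions}"]
    if perplexity is not None:
        # The notes state that perplexity must be smaller than n-1.
        if perplexity >= limit - 1:
            raise ValueError("perplexity must be smaller than n-1")
        args.append(f"perplexity: {perplexity}")
    # When perplexity is omitted, the server-side default is used.

    fields = " ".join(props)
    return (
        "{ Get { Things { "
        f"{class_name}(limit: {limit}) {{ {fields} "
        f"_featureProjection({', '.join(args)}) {{ vector }} "
        "} } } }"
    )


if __name__ == "__main__":
    print(build_projection_query("Article", ["title"], limit=500, perplexity=5))
```

Sending such queries one at a time, rather than in parallel, is in line with the last point in the list above.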
Example
The screenshot below shows a visualization done on a subset of the 20 newsgroup dataset with the article's main category used as label. The chart was created in Python using matplotlib.pyplot's scatter feature.
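A chart of this kind could be produced along the following lines. This is a minimal sketch, not the script used for the screenshot: it assumes each result object is a dict that carries its 2d projection under _featureProjection.vector and a label under a hypothetical category property, and that all objects come from a single query (the dimensions have no meaning across queries).

```python
from collections import defaultdict


def group_points_by_label(objects, label_prop="category"):
    """Group 2d projection vectors by label as {label: ([xs], [ys])}.

    The object shape and the `category` property are assumptions.
    """
    groups = defaultdict(lambda: ([], []))
    for obj in objects:
        x, y = obj["_featureProjection"]["vector"][:2]
        xs, ys = groups[obj[label_prop]]
        xs.append(x)
        ys.append(y)
    return dict(groups)


def plot_projection(objects, label_prop="category"):
    """Draw one scatter series per label and return the figure."""
    # Imported here so the grouping helper works without matplotlib installed.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    grouped = group_points_by_label(objects, label_prop)
    for label, (xs, ys) in sorted(grouped.items()):
        ax.scatter(xs, ys, label=label, s=12)
    ax.legend()
    ax.set_title("_featureProjection (t-SNE)")
    return fig
```

Calling plot_projection(results).savefig("projection.png") would then write the chart to disk, where results is the list of objects returned by one query.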
Fixes
none