Docker image/tag: semitechnologies/weaviate:0.22.0
See also: example Docker Compose files in English and Dutch.

Contains Breaking Change!

Note: While this release contains no API-level breaking changes, the internals have changed so much that we recommend not simply replacing your existing Weaviate container with the new one. Instead, you should create a new cluster and reimport your things and actions. See the changelog below for more detailed reasons why.

Breaking Changes

Improve cross-reference storing strategy (#1069)
Prior to this release, Weaviate would build an automated cache of referenced objects. This led to very fast response times for nested queries, at the cost of large disk usage. We have since learned that disk usage can be so excessive in heavily connected graphs that the benefits don't outweigh the costs. In addition, configuring cache boundaries led to unnecessary complexity.
The major goal of 0.22.0 was to replace automated denormalization caching with a smarter strategy without losing the snappiness of cached results and the overall low latencies of queries our users have come to appreciate.
We believe we have found a good strategy with this release: smarter query strategies keep inter-container traffic to a minimum and use our backing storage in the way it performs best.
This boils down to the following advantages that 0.22.0 provides over 0.21.x:
- Feature parity: No feature got lost through the rewrite. If it worked with 0.21.x, it works with 0.22.x. If you think otherwise, please open an issue.
- Much smaller disk footprint: Since we no longer excessively denormalize references, the disk footprint is much smaller. Essentially, the size on disk is now (object size + vector size + index overheads) * desiredReplication. The number of cross-references no longer has a direct impact on disk space (other than storing the link itself, which is effectively the size in bytes of a weaviate://... beacon).
- No depth limit on nested filters: Prior to this release, a filter on a cross-ref prop, such as path: ["inCity", "City", "inCountry", "Country", "name"], had a depth limit: it would only work within a cache boundary. This limitation is now gone and you can filter as deep as you like. Please note that an excessively deep query will have a performance impact.
- Smaller CPU impact during imports: Prior to this release, we spent a share of the available resources on building a denormalized cache asynchronously after importing a connected object. Without having to build such a cache, more performance on imports is available for storing, vectorizing and indexing objects.
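As a rough illustration of the disk footprint formula in the list above, here is a minimal Python sketch. The function name, the object_count parameter and all byte values are hypothetical and chosen only to make the arithmetic concrete; they are not measurements of a real Weaviate cluster.

```python
def estimated_disk_bytes(object_bytes, vector_bytes, index_overhead_bytes,
                         desired_replication, object_count=1):
    """Estimate disk usage following the formula from the release notes:
    (object size + vector size + index overheads) * desiredReplication.

    object_count is an added assumption: it simply scales the per-object
    estimate to a whole data set.
    """
    per_object = object_bytes + vector_bytes + index_overhead_bytes
    return per_object * desired_replication * object_count


# Hypothetical example values, not taken from a real deployment.
total = estimated_disk_bytes(
    object_bytes=2_000,
    vector_bytes=2_400,
    index_overhead_bytes=600,
    desired_replication=3,
    object_count=1_000_000,
)
print(f"{total / 1e9:.1f} GB")  # prints "15.0 GB"
```

Note that the number of cross-references does not appear as an input, which mirrors the point that references no longer drive disk usage beyond the size of the stored beacon itself.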
Please note that caching was previously done at import time. We recommend not upgrading a 0.21.x cluster in place, but instead creating a new cluster and reimporting. This is the only way to guarantee your cluster won't have cache leftovers, which can impact performance.
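To make the nested filter path from the list above more concrete, the following Python sketch counts how many cross-reference hops a path contains. The helper reference_hops is hypothetical, and it assumes that a path alternates between a reference property and a class name and ends in a primitive property, as in the example path. The operator and valueString keys, as well as the value "Netherlands", are assumptions about the surrounding where filter and only serve as illustration.

```python
def reference_hops(path):
    """Count cross-reference hops in a nested filter path.

    Assumed shape: [refProp, ClassName, refProp, ClassName, ..., primitiveProp],
    so a valid path always has an odd number of elements.
    """
    if len(path) % 2 == 0:
        raise ValueError("expected (refProp, ClassName) pairs plus a final property")
    return (len(path) - 1) // 2


# Path taken from the release notes.
path = ["inCity", "City", "inCountry", "Country", "name"]

# Hypothetical filter around that path; operator and value are assumptions.
where_filter = {
    "path": path,
    "operator": "Equal",
    "valueString": "Netherlands",
}

print(reference_hops(where_filter["path"]))  # prints 2
```

A helper like this could be used on the client side to spot unusually deep filters before sending them, since very deep queries are expected to be slower.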

New Features

none

Fixes

#967 became obsolete through this change

weaviate/weaviate 0.22.0 - Updated Cross-Reference Storage Strategy (on GitHub)
