Docker image/tag: semitechnologies/weaviate:1.1.0
See also: example docker-compose files in English, Dutch, German, Czech, Italian. If you need to configure additional settings, you can also generate a custom docker-compose.yml file using the documentation.
Breaking Changes
none
New Features
- nearObject search to get most similar objects (#1427)
Prior to the introduction of this feature, the only way to find the objects closest to a given object was to retrieve that object's vector and then run a nearVector search with it. Now this can be done in a single GraphQL step:
    Get { ClassName(nearObject: {...}) {...} }
You can simply specify an object's id or beacon, such as:
    {
      Get {
        Publication(
          nearObject: {
            id: "27b5213d-e152-4fea-bd63-2063d529024d",  # alternatively `beacon`
            certainty: 0.7
          }
        ) {
          name
          _additional {
            certainty
          }
        }
      }
    }
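The nearObject query shown above can also be assembled and sent programmatically. The following is a minimal sketch using only the Python standard library, not an official client. The helper names `build_near_object_query` and `run_query` are hypothetical, and the endpoint URL `http://localhost:8080/v1/graphql` is an assumption about a local default setup that you should adjust to your deployment.

```python
import json
import urllib.request


def build_near_object_query(class_name, properties, object_id=None,
                            beacon=None, certainty=None):
    """Return a GraphQL Get query string that uses the nearObject filter.

    Hypothetical helper for illustration; exactly one of object_id or
    beacon must be given.
    """
    if (object_id is None) == (beacon is None):
        raise ValueError("pass exactly one of object_id or beacon")
    # json.dumps yields a double-quoted string, which is also a valid
    # GraphQL string literal for simple values such as UUIDs and beacons.
    if object_id is not None:
        args = [f"id: {json.dumps(object_id)}"]
    else:
        args = [f"beacon: {json.dumps(beacon)}"]
    if certainty is not None:
        args.append(f"certainty: {certainty}")
    fields = " ".join(properties)
    return (
        "{ Get { %s(nearObject: { %s }) { %s _additional { certainty } } } }"
        % (class_name, ", ".join(args), fields)
    )


def run_query(query, url="http://localhost:8080/v1/graphql"):
    """POST the query as JSON; the default URL is an assumption."""
    request = urllib.request.Request(
        url,
        data=json.dumps({"query": query}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


query = build_near_object_query(
    "Publication",
    ["name"],
    object_id="27b5213d-e152-4fea-bd63-2063d529024d",
    certainty=0.7,
)
print(query)
```

Calling `run_query(query)` against a running instance should return the parsed JSON response. Building the query and sending it are kept separate so the query string can be inspected or logged without network access.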
Combining nearObject with movements in the text2vec-contextionary module
You can even add the nearObject search into an existing movement; for an example, see the second code block here.
- Cross-reference batch import speed improvements (#1334, #1259)
Prior to this release, importing cross-references in batches was no faster than importing objects. Internally, adding a reference was treated as an update, which led to the deletion of the old version and the creation of the updated version in all indices. However, since reference updates do not alter the vector position of an object, a full reimport is not necessary. This release adds special logic that recognizes such updates and handles them in an optimized fashion. A single reference batch is now considerably faster, and overall import speeds can improve by 30 to 50%, depending on how cross-reference-heavy your dataset is. Additionally, writing into the inverted index in batches has been improved, leading to slightly better import times for object batches on large imports.
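To make the reasoning concrete, here is a toy model of the two update paths described above. It is not Weaviate's implementation: the class `ToyStore`, its methods, and the `vector_writes` counter are all hypothetical and exist only to show why skipping the vector index for reference-only updates saves work.

```python
class ToyStore:
    """Toy object store for illustration; all names are hypothetical."""

    def __init__(self):
        self.objects = {}       # object id -> {"vector": [...], "refs": [...]}
        self.vector_writes = 0  # how often the (expensive) vector index is touched

    def put(self, object_id, vector, refs=()):
        # A full write (re)inserts the object everywhere, including the vector index.
        self.objects[object_id] = {"vector": list(vector), "refs": list(refs)}
        self.vector_writes += 1

    def add_reference_as_full_update(self, object_id, target_id):
        # Old behavior as described above: delete the object, then recreate it.
        old = self.objects.pop(object_id)
        self.put(object_id, old["vector"], old["refs"] + [target_id])

    def add_reference_optimized(self, object_id, target_id):
        # The vector is unchanged, so only the stored references are updated.
        self.objects[object_id]["refs"].append(target_id)


store = ToyStore()
store.put("article-1", [0.1, 0.2])
store.add_reference_as_full_update("article-1", "author-1")
store.add_reference_optimized("article-1", "author-2")
print(store.vector_writes)  # 2: the initial put plus the one full update
print(store.objects["article-1"]["refs"])
```

In this model, both paths leave the object with the same references, but only the full-update path pays for another vector index write. The more references a dataset contains, the more such writes the optimized path avoids.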
Fixes
- Fix Search Inconsistencies during heavy write loads (#1362)
Prior to this release there was a bug in the HNSW implementation which could lead to searches returning zero (or too few) results if the search was performed while an import was running. Such a situation is very common in classification scenarios where already classified objects are being written while other objects are being classified (which requires a similarity search).
On large classifications, this could lead to some to-be-classified items never finding any training data and thus remaining unclassified. As a further symptom, these unclassified items could then be picked up in subsequent classifications. In such a scenario, you might have seen two subsequent classifications with count: 5000 and count: 20, suggesting a total of 5020 objects, even when only 5000 objects were present.
This fix addresses the root cause of index inconsistencies while importing, which, among other things, fixes the classification miscount issue.