Release Notes
- Locally generated embeddings are now the default. The first run will download an ONNX model and recreate all embeddings, but in most cases it is well worth it: it removes the network-bound call previously needed to create embeddings. Even on my N100 device, creating embeddings is extremely fast, and a typical search response arrives in under 50ms. To keep using API embeddings, set EMBEDDING_BACKEND to openai (a sketch of the backend switch appears after this list).
- Added a benchmarks crate for evaluating the retrieval process (a recall@k sketch appears after this list).
- Added fastembed embedding support, enabling locally generated CPU embeddings and greatly reducing latency on machines that can handle it. Quick search is both noticeably more accurate and much faster: about 50ms in testing, compared to a minimum of 300ms before (see the fastembed sketch after this list).
- Embeddings are now stored in their own table (a possible schema is sketched after this list).
- Refactored the retrieval pipeline to use the new, faster and more accurate strategy; see the blog post for more details.
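A minimal sketch of what the backend switch could look like, assuming the backend is chosen from the EMBEDDING_BACKEND environment variable described above; the `EmbeddingBackend` enum and `embedding_backend` function are hypothetical names, not the project's actual code:

```rust
use std::env;

// Hypothetical names for illustration; only the EMBEDDING_BACKEND
// variable and the "openai" value come from the release notes.
#[derive(Debug, PartialEq)]
enum EmbeddingBackend {
    Local,  // ONNX model on the local CPU (the new default)
    OpenAi, // remote API embeddings
}

fn embedding_backend() -> EmbeddingBackend {
    // Anything other than "openai", including an unset variable,
    // falls back to the new local default.
    match env::var("EMBEDDING_BACKEND").as_deref() {
        Ok("openai") => EmbeddingBackend::OpenAi,
        _ => EmbeddingBackend::Local,
    }
}

fn main() {
    println!("backend = {:?}", embedding_backend());
}
```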
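As an illustration of the kind of metric such a retrieval benchmark might compute, here is a recall@k helper; the `recall_at_k` function and its types are assumptions for illustration, not taken from the benchmarks crate:

```rust
/// Recall@k: the fraction of queries whose expected document ID
/// appears among the top-k retrieved results.
fn recall_at_k(results: &[Vec<u64>], relevant: &[u64], k: usize) -> f64 {
    if results.is_empty() {
        return 0.0;
    }
    let hits = results
        .iter()
        .zip(relevant)
        .filter(|(ranked, want)| ranked.iter().take(k).any(|id| id == *want))
        .count();
    hits as f64 / results.len() as f64
}

fn main() {
    // Two queries: the first finds its document at rank 2, the second misses.
    let results = vec![vec![7, 3, 9], vec![1, 2, 4]];
    let relevant = vec![3, 8];
    println!("recall@3 = {}", recall_at_k(&results, &relevant, 3)); // 0.5
}
```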
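A minimal sketch of local embedding generation with the fastembed crate, assuming its default model and default options; the surrounding program is illustrative, not the project's integration code:

```rust
use fastembed::TextEmbedding;

fn main() -> anyhow::Result<()> {
    // First use downloads the default ONNX model; after that, embeddings
    // are generated entirely on the local CPU with no network round trip.
    let model = TextEmbedding::try_new(Default::default())?;

    let documents = vec!["how do I rotate my api keys?"];
    // None = default batch size; returns one Vec<f32> per input document.
    let embeddings = model.embed(documents, None)?;
    println!("embedded {} docs, dim {}", embeddings.len(), embeddings[0].len());
    Ok(())
}
```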
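One possible shape for a dedicated embeddings table, assuming a SQLite-backed store accessed through rusqlite; the table and column names are hypothetical:

```rust
use rusqlite::Connection;

/// Hypothetical schema sketch: one row per (document, model) pair,
/// with the vector serialized into a BLOB. Names are assumptions.
fn create_embeddings_table(conn: &Connection) -> rusqlite::Result<()> {
    conn.execute_batch(
        "CREATE TABLE IF NOT EXISTS embeddings (
            doc_id  INTEGER NOT NULL,
            model   TEXT    NOT NULL,
            vector  BLOB    NOT NULL,
            PRIMARY KEY (doc_id, model)
        );",
    )
}
```

Keying rows on (doc_id, model) would let a backend or model switch recreate all vectors without touching the source documents.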
Download main 1.0.0
| File | Platform | Checksum |
|---|---|---|
| main-aarch64-apple-darwin.tar.xz | Apple Silicon macOS | checksum |
| main-x86_64-apple-darwin.tar.xz | Intel macOS | checksum |
| main-x86_64-pc-windows-msvc.zip | x64 Windows | checksum |
| main-x86_64-unknown-linux-gnu.tar.xz | x64 Linux | checksum |