What's Changed
You can now run a model under an alias. This makes it easier to communicate with the API.
infinity_emb --served-model-name "your_nickname"
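A minimal sketch of how the alias could appear in a request body, assuming the server exposes an OpenAI-compatible embeddings route (the endpoint path and payload shape are assumptions for illustration, not taken from this release):

```python
import json

# The "model" field matches the alias passed via --served-model-name,
# so clients never need the full model path such as BAAI/bge-large-en-v1.5.
payload = {
    "model": "your_nickname",
    "input": ["Hello, world!"],
}

# Serialize the request body that would be POSTed to the embeddings endpoint.
body = json.dumps(payload)
print(body)
```

Here the alias decouples client configuration from the underlying model checkpoint.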
You can now preload models. This acts as a "download and load into RAM" test. Upon execution, all files are cached, which speeds up consecutive loads. For additional speedups, use --no-model-warmup to skip model warmup after loading.
infinity_emb --preload-only --model-name-or-path BAAI/bge-large-en-v1.5
PRs
- feat: add served_model_name argument for the infinity_server by @bufferoverflow in #180
- FIX: import crossencoder without torch installed and git push of creds by @michaelfeil in #181
- update default model_name to be unified name across routes by @michaelfeil in #179
- python39 type hints by @michaelfeil in #182
- pydantic cli / args validation by @michaelfeil in #183
- update deferred moving to cpu & type hints improvement by @michaelfeil in #187
- Update README.md - add Contributors by @michaelfeil in #189
- update infinity offline solution by @michaelfeil in #195
- update offline-mode: deployment docs v2 by @michaelfeil in #196
New Contributors
- @bufferoverflow made their first contribution in #180 Thanks!
Full Changelog: 0.0.31...0.0.32