What's Changed
You can now run a model under an alias. This makes it easier to communicate with the API.
infinity_emb --served-model-name "your_nickname"
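A minimal sketch of how the alias could appear in a request body, assuming the server exposes an OpenAI-compatible embeddings route (the endpoint path and payload shape are assumptions for illustration, not taken from this release):

```python
import json

# The "model" field matches the alias passed via --served-model-name,
# so clients never need the full model path such as BAAI/bge-large-en-v1.5.
payload = {
    "model": "your_nickname",
    "input": ["Hello, world!"],
}

# Serialize the request body that would be POSTed to the embeddings endpoint.
body = json.dumps(payload)
print(body)
```

Here the alias decouples client configuration from the underlying model checkpoint.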
You can now preload models. This acts as a "download and load into RAM" test. Upon execution, all files are cached, which speeds up consecutive loads. For additional speedups, use --no-model-warmup to skip model warmup after loading.
infinity_emb --preload-only --model-name-or-path BAAI/bge-large-en-v1.5
PRs
- feat: add served_model_name argument for the infinity_server by @bufferoverflow in #180
- FIX: import crossencoder without torch installed and git push of creds by @michaelfeil in #181
- update default model_name to be unified name across routes by @michaelfeil in #179
- python39 type hints by @michaelfeil in #182
- pydantic cli / args validation by @michaelfeil in #183
- update deferred moving to cpu & type hints improvement by @michaelfeil in #187
- Update README.md - add Contributors by @michaelfeil in #189
- update infinity offline solution by @michaelfeil in #195
- update offline-mode: deployment docs v2 by @michaelfeil in #196
New Contributors
- @bufferoverflow made their first contribution in #180 Thanks!
Full Changelog: 0.0.31...0.0.32