bentoml/OpenLLM v0.4.6


Installation

pip install openllm==0.4.6

To upgrade from a previous version, use the following command:

pip install --upgrade openllm==0.4.6

Usage

All available models: openllm models

To start an LLM: python -m openllm start opt
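
Once the server is up (it listens on port 3000 by default), you can query it from Python. A minimal sketch, assuming the openllm.client.HTTPClient API from the 0.4.x series; the prompt is illustrative:

import openllm

# Connect to the server started above; 3000 is the default port.
client = openllm.client.HTTPClient("http://localhost:3000")

# Generate a completion from the running model.
result = client.generate("Explain the difference between CPUs and GPUs.")
print(result)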

To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P ghcr.io/bentoml/openllm:0.4.6 start opt

To run OpenLLM Clojure UI (community-maintained): docker run -p 8420:80 ghcr.io/bentoml/openllm-ui-clojure:0.4.6

Find more information about this release in CHANGELOG.md.

What's Changed

  • chore: cleanup unused code path by @aarnphm in #633
  • perf(model): update mistral inference parameters and prompt format by @larme in #632
  • infra: remove unused postprocess_generate by @aarnphm in #634
  • docs: update README.md by @aarnphm in #635
  • fix(client): correctly destruct the httpx object for both sync and async by @aarnphm in #636 (see the async client sketch after this list)
  • doc: update adding new model guide by @larme in #637
  • fix(generation): compatibility dtype with CPU by @aarnphm in #638
  • fix(cpu): more verbose definition for dtype casting by @aarnphm in #639
  • fix(service): to yield out correct JSON objects by @aarnphm in #640
  • fix(cli): set default dtype to auto infer by @aarnphm in #642
  • fix(dependencies): lock build < 1 for now by @aarnphm in #643
  • chore(openapi): unify inject param by @aarnphm in #645
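
The #636 fix above corrects teardown of the underlying httpx clients for both the sync and async flavors. A minimal sketch of the async side, assuming an openllm.client.AsyncHTTPClient that mirrors the sync client in this series:

import asyncio
import openllm

async def main():
    # Assumed async counterpart of HTTPClient; same default port 3000.
    client = openllm.client.AsyncHTTPClient("http://localhost:3000")
    result = await client.generate("Summarize the v0.4.6 release in one line.")
    print(result)

asyncio.run(main())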

Full Changelog: v0.4.5...v0.4.6
