github bentoml/OpenLLM v0.1.6


Features

Quantization now can be enabled during serving time:

openllm start stablelm --quantize int8

This loads the model in 8-bit mode using bitsandbytes.
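For intuition, 8-bit loading stores each weight as a signed integer plus a shared scale factor. Here is a minimal, self-contained sketch of the absmax scheme that bitsandbytes applies to most weight values; it is conceptual, not OpenLLM or bitsandbytes code:

```python
# Conceptual sketch of absmax int8 quantization (not OpenLLM code).

def quantize_int8(weights):
    """Scale floats so the largest magnitude maps to 127, then round."""
    scale = max(abs(w) for w in weights) / 127.0
    return [round(w / scale) for w in weights], scale

def dequantize_int8(quantized, scale):
    """Recover approximate float weights from int8 values and the scale."""
    return [q * scale for q in quantized]

weights = [0.5, -1.2, 3.4, -0.05]
quantized, scale = quantize_int8(weights)
restored = dequantize_int8(quantized, scale)
# Each quantized value fits in the int8 range [-128, 127], and each
# restored weight is within half a quantization step (scale / 2) of
# the original, at a quarter of the memory of float32.
```

This halves or quarters the memory footprint of the weights, which is why quantized serving fits larger models on the same GPU.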

For CPU machines, you can use --bettertransformer instead:

openllm start stablelm --bettertransformer

Roadmap

  • GPTQ support is under development and will be added soon

Installation

pip install openllm==0.1.6

To upgrade from a previous version, use the following command:

pip install --upgrade openllm==0.1.6

Usage

To list all available models: python -m openllm.models

To start an LLM: python -m openllm start dolly-v2

Find more information about this release in the CHANGELOG.md

What's Changed

Full Changelog: v0.1.5...v0.1.6
