- Import Hugging Face Transformers Model: the `bentoml.transformers.import_model` API imports pretrained Transformers models directly from Hugging Face into the BentoML model store, without loading the model into memory. The API takes the model name in the BentoML store as its first argument and the `model_id` on the Hugging Face Hub as its second argument.
import bentoml
bentomodel = bentoml.transformers.import_model("zephyr-7b-beta", "HuggingFaceH4/zephyr-7b-beta")
- Standardize with `nvidia-ml-py`: BentoML now uses the official `nvidia-ml-py` package instead of `pynvml` to avoid conflicts with other packages.
- Define Environment Variable in Configuration: Within `bentoml_configuration.yaml`, values in the form of `${ENV_VAR}` are expanded at runtime to the value of the corresponding environment variable. Note that this only supports string types.
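
To illustrate the `${ENV_VAR}` expansion described above, here is a minimal Python sketch of the general technique. It is not BentoML's implementation: the `expand_env_vars` helper, the regular expression, the choice to leave unset variables untouched, and the `example_section`/`EXAMPLE_ENDPOINT` names are all assumptions made for illustration.

```python
import os
import re

# Matches placeholders of the form ${NAME}.
# Illustrative pattern only; BentoML's actual parsing rules may differ.
_ENV_PATTERN = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env_vars(text, env=None):
    """Replace each ${NAME} in text with the value of NAME from env.

    Hypothetical helper. Placeholders whose variable is not set are left
    unchanged (an assumption, not documented BentoML behavior).
    """
    env = os.environ if env is None else env

    def _substitute(match):
        name = match.group(1)
        return env.get(name, match.group(0))

    return _ENV_PATTERN.sub(_substitute, text)


# Hypothetical configuration fragment; the keys are not real BentoML options.
config_text = 'example_section:\n  endpoint: "${EXAMPLE_ENDPOINT}"\n'

expanded = expand_env_vars(
    config_text, {"EXAMPLE_ENDPOINT": "http://localhost:4317"}
)
print(expanded)
```

Because the substitution is purely textual, the expanded value is always a string, which is consistent with the note that only string types are supported.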
What's Changed
- docs: Update the deployment docs by @Sherlock113 in #4260
- ci: pre-commit autoupdate [skip ci] by @pre-commit-ci in #4264
- feat: import model for transformers framework by @MingLiangDai in #4247
- build: Use official nvidia-ml-py package instead of fork by @ecederstrand in #4208
New Contributors
- @MingLiangDai made their first contribution in #4247
- @ecederstrand made their first contribution in #4208
Full Changelog: v1.1.7...v1.1.9