🍱 BentoML v1.0.4 is here!
- Added support for explicit GPU mapping for runners. In addition to specifying the number of GPU devices allocated to a runner, a list of device IDs can now be mapped directly to a runner through configuration.
```yaml
runners:
  iris_clf_1:
    resources:
      nvidia.com/gpu: [2, 4]  # Map devices 2 and 4 to the iris_clf_1 runner
  iris_clf_2:
    resources:
      nvidia.com/gpu: [1, 3]  # Map devices 1 and 3 to the iris_clf_2 runner
```
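A runner configuration file like the one above is typically supplied to the server through the `BENTOML_CONFIG` environment variable (the file path and bento tag below are illustrative):

```shell
# Illustrative path and bento tag: point BentoML at the configuration file above
BENTOML_CONFIG=./bentoml_configuration.yaml bentoml serve iris_classifier:latest
```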
- Added SSL support for the API server through both the CLI and configuration.
```
--ssl-certfile TEXT          SSL certificate file
--ssl-keyfile TEXT           SSL key file
--ssl-keyfile-password TEXT  SSL keyfile password
--ssl-version INTEGER        SSL version to use (see stdlib 'ssl' module)
--ssl-cert-reqs INTEGER      Whether client certificate is required (see stdlib 'ssl' module)
--ssl-ca-certs TEXT          CA certificates file
--ssl-ciphers TEXT           Ciphers to use (see stdlib 'ssl' module)
```
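As a sketch, serving with TLS enabled might look like the following, assuming certificate files already exist at these illustrative paths:

```shell
# Illustrative certificate paths; --ssl-cert-reqs takes the integer values of
# Python's ssl.VerifyMode (CERT_NONE=0, CERT_OPTIONAL=1, CERT_REQUIRED=2)
bentoml serve iris_classifier:latest \
  --ssl-certfile ./cert.pem \
  --ssl-keyfile ./key.pem
```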
- Added adaptive batching size histogram metrics, `BENTOML_{runner}_{method}_adaptive_batch_size_bucket`, for observability of batching mechanism details.
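The metric name is templated on the runner and method names; a quick sketch of how a concrete metric name is derived (runner and method names are illustrative):

```python
# Derive the concrete histogram metric name for a given runner and method,
# following the BENTOML_{runner}_{method}_adaptive_batch_size_bucket template.
def adaptive_batch_metric(runner: str, method: str) -> str:
    return f"BENTOML_{runner}_{method}_adaptive_batch_size_bucket"

print(adaptive_batch_metric("iris_clf_1", "predict"))
# → BENTOML_iris_clf_1_predict_adaptive_batch_size_bucket
```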
- Added support for the OpenTelemetry OTLP exporter for tracing. The OpenTelemetry resource is now configured automatically if the user has not explicitly configured it through environment variables. Upgraded OpenTelemetry Python packages to version `0.33b0`.
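Because the resource is only auto-configured when the user has not set it, the standard OpenTelemetry SDK environment variables still take precedence; an illustrative config fragment (endpoint and attribute values are assumptions):

```shell
# Standard OpenTelemetry SDK environment variables (values are illustrative)
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317
export OTEL_SERVICE_NAME=iris_classifier
export OTEL_RESOURCE_ATTRIBUTES=deployment.environment=staging
```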
- Added support for saving `external_modules` alongside models in the `save_model` API. Saving external Python modules is useful for models with external dependencies, such as tokenizers, preprocessors, and configurations.
- Enhanced Swagger UI to include additional documentation and helper links.
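The `external_modules` option in `save_model` might be used along these lines (a minimal sketch, not a definitive API reference; the framework module, model object, and local `preprocessing` module are all illustrative assumptions):

```python
import bentoml

# Hypothetical local module holding tokenizer/preprocessor code the model depends on
import preprocessing

# Save the model together with the external module so the saved model
# can be loaded later in environments where that module is not importable
bentoml.sklearn.save_model(
    "iris_clf",
    model,
    external_modules=[preprocessing],
)
```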
💡 We continue to update the documentation on every release to help our users unlock the full power of BentoML.
- Check out the adaptive batching documentation to learn how to leverage batching to improve inference latency and efficiency.
- Check out the runner configuration documentation to learn how to customize resource allocation for runners at run time.
🙌 We continue to receive great engagement and support from the BentoML community.
- Shout out to @sptowey for their contribution on adding SSL support.
- Shout out to @dbuades for their contribution on adding the OTLP exporter.
- Shout out to @tweeklab for their contribution on fixing a bug on `import_model` in the MLflow framework.
What's Changed
- refactor: cli to `bentoml_cli` by @sauyon in #2880
- chore: remove typing-extensions dependency by @sauyon in #2879
- fix: remove chmod install scripts by @aarnphm in #2830
- fix: relative imports to lazy by @aarnphm in #2882
- fix(cli): click utilities imports by @aarnphm in #2883
- docs: add custom model runner example by @parano in #2885
- qa: analytics unit tests by @aarnphm in #2878
- chore: script for releasing quickstart bento by @parano in #2892
- fix: pushing models from Bento instead of local modelstore by @parano in #2887
- fix(containerize): supports passing multiple tags by @aarnphm in #2872
- feat: explicit GPU runner mappings by @jjmachan in #2862
- fix: setuptools doesn't include `bentoml_cli` by @bojiang in #2898
- feat: Add SSL support for http api servers via bentoml serve by @sptowey in #2886
- patch: ssl styling and default value check by @aarnphm in #2899
- fix(scheduling): raise an error for invalid resources by @bojiang in #2894
- chore(templates): cleanup debian dependency logic by @aarnphm in #2904
- fix(ci): unittest failed by @bojiang in #2908
- chore(cli): add figlet for CLI by @aarnphm in #2909
- feat: codespace by @aarnphm in #2907
- feat: use yatai proxy to upload/download bentos/models by @yetone in #2832
- fix(scheduling): numpy worker environs are not taking effect by @bojiang in #2893
- feat: Adaptive batching size histogram metrics by @ssheng in #2902
- chore(swagger): include help links by @parano in #2927
- feat(tracing): add support for otlp exporter by @dbuades in #2918
- chore: Lock OpenTelemetry versions and add tracing metadata by @ssheng in #2928
- revert: unminify CSS by @aarnphm in #2931
- fix: importing mlflow:/ urls with no extra path info by @tweeklab in #2930
- fix(yatai): make presigned_urls_deprecated optional by @bojiang in #2933
- feat: add timeout option for bentoml runner config by @jjmachan in #2890
- perf(cli): speed up by @aarnphm in #2934
- chore: remove multipart IO descriptor warning by @ssheng in #2936
- fix(json): revert eager check by @aarnphm in #2926
- chore: remove `--config` flag to load the bentoml runtime config by @jjmachan in #2939
- chore: update README messaging by @ssheng in #2937
- fix: use a temporary file for file uploads by @sauyon in #2929
- feat(cli): add CLI command to serve a runner by @bojiang in #2920
- docs: Runner configuration for batching and resource allocation by @ssheng in #2941
- bug: handle bad image file by @parano in #2942
- chore(docs): earlier check for buildx by @aarnphm in #2940
- fix(cli): helper message default values by @ssheng in #2943
- feat(sdk): add external_modules option to save_model by @bojiang in #2895
- fix(cli): component name regression by @ssheng in #2944
New Contributors
- @sptowey made their first contribution in #2886
- @dbuades made their first contribution in #2918
- @tweeklab made their first contribution in #2930
Full Changelog: v1.0.3...v1.0.4