🍱 BentoML v1.0.4 is here!
- Added support for explicit GPU mapping for runners. In addition to specifying the number of GPU devices allocated to a runner, a list of device IDs can now be mapped directly to a runner through configuration.
```yaml
runners:
  iris_clf_1:
    resources:
      nvidia.com/gpu: [2, 4]  # Map devices 2 and 4 to the iris_clf_1 runner
  iris_clf_2:
    resources:
      nvidia.com/gpu: [1, 3]  # Map devices 1 and 3 to the iris_clf_2 runner
```
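A runner configuration file like the one above is typically supplied to the server through the `BENTOML_CONFIG` environment variable (the file path and bento tag below are illustrative):

```shell
# Illustrative path and bento tag: point BentoML at the configuration file above
BENTOML_CONFIG=./bentoml_configuration.yaml bentoml serve iris_classifier:latest
```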
- Added SSL support for the API server through both the CLI and configuration.
```
--ssl-certfile TEXT          SSL certificate file
--ssl-keyfile TEXT           SSL key file
--ssl-keyfile-password TEXT  SSL keyfile password
--ssl-version INTEGER        SSL version to use (see stdlib 'ssl' module)
--ssl-cert-reqs INTEGER      Whether client certificate is required (see stdlib 'ssl' module)
--ssl-ca-certs TEXT          CA certificates file
--ssl-ciphers TEXT           Ciphers to use (see stdlib 'ssl' module)
```
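As a sketch, serving with TLS enabled might look like the following, assuming certificate files already exist at these illustrative paths:

```shell
# Illustrative certificate paths; --ssl-cert-reqs takes the integer values of
# Python's ssl.VerifyMode (CERT_NONE=0, CERT_OPTIONAL=1, CERT_REQUIRED=2)
bentoml serve iris_classifier:latest \
  --ssl-certfile ./cert.pem \
  --ssl-keyfile ./key.pem
```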
- Added adaptive batching size histogram metrics, `BENTOML_{runner}_{method}_adaptive_batch_size_bucket`, for observability of batching mechanism details.
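The metric name is templated on the runner and method names; a quick sketch of how a concrete metric name is derived (runner and method names are illustrative):

```python
# Derive the concrete histogram metric name for a given runner and method,
# following the BENTOML_{runner}_{method}_adaptive_batch_size_bucket template.
def adaptive_batch_metric(runner: str, method: str) -> str:
    return f"BENTOML_{runner}_{method}_adaptive_batch_size_bucket"

print(adaptive_batch_metric("iris_clf_1", "predict"))
# → BENTOML_iris_clf_1_predict_adaptive_batch_size_bucket
```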
- Added support for the OpenTelemetry OTLP exporter for tracing. The OpenTelemetry resource is now configured automatically if the user has not explicitly configured it through environment variables. Upgraded OpenTelemetry Python packages to version `0.33b0`.
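Because the resource is only auto-configured when the user has not set it, the standard OpenTelemetry SDK environment variables still take precedence; an illustrative config fragment (endpoint and attribute values are assumptions):

```shell
# Standard OpenTelemetry SDK environment variables (values are illustrative)
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317
export OTEL_SERVICE_NAME=iris_classifier
export OTEL_RESOURCE_ATTRIBUTES=deployment.environment=staging
```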
- Added support for saving `external_modules` alongside models in the `save_model` API. Saving external Python modules is useful for models with external dependencies, such as tokenizers, preprocessors, and configurations.
- Enhanced Swagger UI to include additional documentation and helper links.
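The `external_modules` option in `save_model` might be used along these lines (a minimal sketch, not a definitive API reference; the framework module, model object, and local `preprocessing` module are all illustrative assumptions):

```python
import bentoml

# Hypothetical local module holding tokenizer/preprocessor code the model depends on
import preprocessing

# Save the model together with the external module so the saved model
# can be loaded later in environments where that module is not importable
bentoml.sklearn.save_model(
    "iris_clf",
    model,
    external_modules=[preprocessing],
)
```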
💡 We continue to update the documentation on every release to help our users unlock the full power of BentoML.
- Check out the adaptive batching documentation to learn how to leverage batching to improve inference latency and efficiency.
- Check out the runner configuration documentation to learn how to customize resource allocation for runners at run time.
🙌 We continue to receive great engagement and support from the BentoML community.
- Shout out to @sptowey for their contribution on adding SSL support.
- Shout out to @dbuades for their contribution on adding the OTLP exporter.
- Shout out to @tweeklab for their contribution on fixing a bug on `import_model` in the MLflow framework.
What's Changed
- refactor: cli to `bentoml_cli` by @sauyon in #2880
- chore: remove typing-extensions dependency by @sauyon in #2879
- fix: remove chmod install scripts by @aarnphm in #2830
- fix: relative imports to lazy by @aarnphm in #2882
- fix(cli): click utilities imports by @aarnphm in #2883
- docs: add custom model runner example by @parano in #2885
- qa: analytics unit tests by @aarnphm in #2878
- chore: script for releasing quickstart bento by @parano in #2892
- fix: pushing models from Bento instead of local modelstore by @parano in #2887
- fix(containerize): supports passing multiple tags by @aarnphm in #2872
- feat: explicit GPU runner mappings by @jjmachan in #2862
- fix: setuptools doesn't include `bentoml_cli` by @bojiang in #2898
- feat: Add SSL support for http api servers via bentoml serve by @sptowey in #2886
- patch: ssl styling and default value check by @aarnphm in #2899
- fix(scheduling): raise an error for invalid resources by @bojiang in #2894
- chore(templates): cleanup debian dependency logic by @aarnphm in #2904
- fix(ci): unittest failed by @bojiang in #2908
- chore(cli): add figlet for CLI by @aarnphm in #2909
- feat: codespace by @aarnphm in #2907
- feat: use yatai proxy to upload/download bentos/models by @yetone in #2832
- fix(scheduling): numpy worker environs are not taking effect by @bojiang in #2893
- feat: Adaptive batching size histogram metrics by @ssheng in #2902
- chore(swagger): include help links by @parano in #2927
- feat(tracing): add support for otlp exporter by @dbuades in #2918
- chore: Lock OpenTelemetry versions and add tracing metadata by @ssheng in #2928
- revert: unminify CSS by @aarnphm in #2931
- fix: importing mlflow:/ urls with no extra path info by @tweeklab in #2930
- fix(yatai): make presigned_urls_deprecated optional by @bojiang in #2933
- feat: add timeout option for bentoml runner config by @jjmachan in #2890
- perf(cli): speed up by @aarnphm in #2934
- chore: remove multipart IO descriptor warning by @ssheng in #2936
- fix(json): revert eager check by @aarnphm in #2926
- chore: remove `--config` flag to load the bentoml runtime config by @jjmachan in #2939
- chore: update README messaging by @ssheng in #2937
- fix: use a temporary file for file uploads by @sauyon in #2929
- feat(cli): add CLI command to serve a runner by @bojiang in #2920
- docs: Runner configuration for batching and resource allocation by @ssheng in #2941
- bug: handle bad image file by @parano in #2942
- chore(docs): earlier check for buildx by @aarnphm in #2940
- fix(cli): helper message default values by @ssheng in #2943
- feat(sdk): add external_modules option to save_model by @bojiang in #2895
- fix(cli): component name regression by @ssheng in #2944
New Contributors
- @sptowey made their first contribution in #2886
- @dbuades made their first contribution in #2918
- @tweeklab made their first contribution in #2930
Full Changelog: v1.0.3...v1.0.4