🍱 BentoML v1.0.19 is released with enhanced GPU utilization and expanded ML framework support.
- Optimized GPU resource utilization: Enabled scheduling of multiple instances of the same runner using the `workers_per_resource` scheduling strategy configuration, which is 1 by default. The following configuration schedules 2 instances of the "iris" runner per GPU:

  ```yaml
  runners:
    iris:
      resources:
        nvidia.com/gpu: 1
      workers_per_resource: 2
  ```
- New ML framework support: We've added support for EasyOCR and Detectron2 to our growing list of supported ML frameworks (see the sketch after this list).
- Enhanced runner communication: Implemented PEP 574 out-of-band pickling to improve runner communication by eliminating memory copying, resulting in better performance and efficiency (a short sketch follows this list).
- Backward compatibility for Hugging Face Transformers: Resolved compatibility issues with Hugging Face Transformers versions prior to `v4.18`, ensuring a seamless experience for users on older versions.
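To try one of the new frameworks, here is a minimal sketch of saving an EasyOCR model and loading it back as a runner. It assumes `bentoml.easyocr` follows the same `save_model`/`get`/`to_runner` pattern as BentoML's other framework modules; the model name `en_reader` is illustrative.

```python
import bentoml
import easyocr

# Save an EasyOCR reader to the BentoML model store, then load it back
# as a runner. Assumes bentoml.easyocr mirrors the standard framework API.
reader = easyocr.Reader(["en"], gpu=False)
bentoml.easyocr.save_model("en_reader", reader)

runner = bentoml.easyocr.get("en_reader:latest").to_runner()
```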
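For context on the runner communication change, this is what PEP 574 out-of-band pickling (pickle protocol 5) looks like in plain Python. BentoML applies the same mechanism to runner payloads, so large arrays can cross process boundaries without being copied into the pickle stream:

```python
import pickle
import numpy as np

# With protocol 5, large buffers are handed to buffer_callback instead
# of being serialized inline, enabling zero-copy transfer when sender
# and receiver share memory.
arr = np.ones((1024, 1024), dtype="float32")

buffers = []
payload = pickle.dumps(arr, protocol=5, buffer_callback=buffers.append)

# The receiver reattaches the out-of-band buffers on load.
restored = pickle.loads(payload, buffers=buffers)
assert (restored == arr).all()
```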
⚙️ With the release of Kubeflow 1.7, BentoML now has native integration with Kubeflow, allowing developers to leverage BentoML's cloud-native components. Previously, developers were limited to exporting and deploying a Bento as a single container. With this integration, models trained in Kubeflow can easily be packaged, containerized, and deployed to a Kubernetes cluster as microservices. This architecture enables the individual models to run in their own pods, utilizing the optimal hardware for their respective tasks and enabling independent scaling.
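As a rough illustration of that workflow, the sequence below builds and containerizes a Bento, then deploys the image to a cluster with plain kubectl. The Bento tag, registry, and deployment names are illustrative, and the Kubeflow integration can manage these steps for you:

```bash
# A minimal sketch: build a Bento, containerize it, and run it on
# Kubernetes as its own deployment. Names and tags are illustrative.
bentoml build
bentoml containerize iris_classifier:latest -t my-registry/iris-classifier:latest
docker push my-registry/iris-classifier:latest
kubectl create deployment iris-classifier --image=my-registry/iris-classifier:latest
kubectl expose deployment iris-classifier --port=3000  # BentoML's default HTTP port
```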
💡 With each release, we consistently update our blog, documentation and examples to empower the community in harnessing the full potential of BentoML.
- Learn more about the scheduling strategy to achieve better resource utilization.
- Learn more about model monitoring and drift detection in BentoML and its integration with various monitoring frameworks (a short example follows this list).
- Learn more about using NVIDIA Triton Inference Server as a runner to improve your application's performance and throughput (a sketch follows).
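As a taste of the monitoring API, the sketch below logs features and predictions from inside an API function with `bentoml.monitor`; the service, field names, and placeholder prediction are illustrative, and the logged records can be shipped to a drift-detection backend via BentoML's monitoring exporters:

```python
import bentoml
import numpy as np
from bentoml.io import NumpyNdarray

svc = bentoml.Service("iris_classifier")

@svc.api(input=NumpyNdarray(), output=NumpyNdarray())
def classify(features: np.ndarray) -> np.ndarray:
    with bentoml.monitor("iris_classifier_prediction") as mon:
        # Record an input feature and the prediction for later drift analysis.
        mon.log(float(features[0][0]), name="sepal_length", role="feature", data_type="numerical")
        pred = 0  # placeholder for a real model call in this sketch
        mon.log(pred, name="pred", role="prediction", data_type="categorical")
    return np.array([pred])
```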
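And a minimal sketch of the Triton integration, assuming `bentoml.triton.Runner` takes a runner name and a path to a Triton model repository; the repository path and service name here are placeholders:

```python
import bentoml

# Wrap a Triton Inference Server model repository as a BentoML runner.
# The repository path is a placeholder for this sketch; models in the
# repository are exposed as methods on the runner.
triton_runner = bentoml.triton.Runner(
    "triton_runner",
    "/path/to/model_repository",
)

svc = bentoml.Service("triton_demo", runners=[triton_runner])
```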
What's Changed
- fix(env): using `python -m` to run pip commands by @frostming in #3762
- chore(deps): bump pytest from 7.3.0 to 7.3.1 by @dependabot in #3766
- feat: lazy load `bentoml.server` by @aarnphm in #3763
- fix(client): service route prefix by @aarnphm in #3765
- chore: add test with many requests by @sauyon in #3768
- fix: using http config for grpc server by @aarnphm in #3771
- feat: apply pep574 out-of-band pickling to DefaultContainer by @larme in #3736
- fix: passing serve_cmd and passthrough kwargs by @aarnphm in #3764
- feat: Detectron by @aarnphm in #3711
- chore(dispatcher): (re-)factor out training code by @sauyon in #3767
- feat: EasyOCR by @aarnphm in #3712
- feat(build): support 3.11 by @aarnphm in #3774
- patch: backports module availability for transformers<4.18 by @aarnphm in #3775
- fix(dispatcher): set wait to 0 while training by @sauyon in #3664
- chore(deps): bump ruff from 0.0.261 to 0.0.262 by @dependabot in #3778
- feat: add `model#load_model` method by @parano in #3780
- feat: Allow spawning more than 1 worker on each resource by @frostming in #3776
- docs: Fix TensorFlow `save_model` parameter order by @ssheng in #3781
- chore(deps): bump yamllint from 1.30.0 to 1.31.0 by @dependabot in #3782
- chore(deps): bump imageio from 2.27.0 to 2.28.0 by @dependabot in #3783
- chore(deps): bump ruff from 0.0.262 to 0.0.263 by @dependabot in #3790
- fix: allow import service defined under a Python package by @parano in #3794
New Contributors
- @frostming made their first contribution in #3762
Full Changelog: v1.0.18...v1.0.19