bentoml/BentoML v1.3.0 on GitHub

We are excited to announce the release of BentoML 1.3! Following the feedback received since the launch of 1.2 earlier this year, we are introducing a host of new features and enhancements in 1.3. The key highlights are below. Stay tuned for an upcoming blog post, where we'll explore the new features in detail and explain the driving forces behind their development.

Here are some of the important points to note about 1.3:

1.3 ensures full backward compatibility, meaning that Bentos built with 1.2 will continue to work seamlessly with this release.
We remain committed to supporting 1.2. Critical bug fixes and security updates will be backported to the 1.2 branch.
The BentoML documentation has been updated with examples and guides for 1.3. More guides will be added in the coming weeks.
BentoCloud supports Bento Deployments from both 1.2 and 1.3 releases of BentoML.

Now, let’s take a look at the major features and enhancements:

🕙 Implemented BentoML task execution

Introduced the @bentoml.task decorator to set a task endpoint for executing a long-running workload (such as batch processing or video generation).
Added the .submit() method to both the sync and async clients to submit task inputs via the task endpoint. Dedicated worker processes constantly monitor the task queues for new work to perform.
Full compatibility with BentoCloud to run Bentos defined with task endpoints.
See the Services and Clients docs for examples of a Service API that initializes a long-running task in the Service constructor, creates clients to call the endpoint, and retrieves the task status.
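The submit-then-poll pattern described above can be illustrated with a small standard-library sketch. This is not BentoML's implementation or API: the `TaskQueue` class, its method names, the status strings, and `generate_video` are all hypothetical, and a single worker thread stands in for the dedicated worker processes. It only shows the general idea that submitting returns immediately while a worker picks the job up from a queue.

```python
import queue
import threading
import uuid


class TaskQueue:
    """Toy in-process task queue (hypothetical, not the BentoML API)."""

    def __init__(self, handler):
        self._handler = handler
        self._jobs = queue.Queue()
        self._status = {}
        self._results = {}
        self._done = {}
        # A dedicated worker thread watches the queue for new work.
        threading.Thread(target=self._worker, daemon=True).start()

    def submit(self, **inputs):
        """Enqueue the inputs and return a task ID without waiting."""
        task_id = uuid.uuid4().hex
        self._status[task_id] = "in_progress"
        self._done[task_id] = threading.Event()
        self._jobs.put((task_id, inputs))
        return task_id

    def get_status(self, task_id):
        """Return "in_progress", "success", or "failure"."""
        return self._status[task_id]

    def get(self, task_id, timeout=None):
        """Block until the task finishes, then return its result."""
        if not self._done[task_id].wait(timeout):
            raise TimeoutError(f"task {task_id} is still running")
        if self._status[task_id] == "failure":
            raise RuntimeError("task failed") from self._results[task_id]
        return self._results[task_id]

    def _worker(self):
        while True:
            task_id, inputs = self._jobs.get()
            try:
                self._results[task_id] = self._handler(**inputs)
                self._status[task_id] = "success"
            except Exception as exc:  # record the failure for the caller
                self._results[task_id] = exc
                self._status[task_id] = "failure"
            finally:
                self._done[task_id].set()


def generate_video(prompt):
    # Stand-in for a long-running workload.
    return f"video for {prompt!r}"


tasks = TaskQueue(generate_video)
task_id = tasks.submit(prompt="a cat")   # returns immediately
result = tasks.get(task_id, timeout=5)   # waits for the worker to finish
```

In a real deployment the queue and workers live on the server side, so the client only holds a task handle and checks back later. See the Services and Clients docs for the actual calls.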

🚀 Optimized the build cache to accelerate the build process

Sped up bentoml build and bentoml containerize by pre-installing large packages such as torch.
Switched to uv as the installer and resolver, replacing pip.

🔨 Supported concurrency-based autoscaling on BentoCloud

Added the concurrency configuration to the @bentoml.service decorator to set the ideal number of simultaneous requests a Service is designed to handle.
Added the external_queue configuration to the @bentoml.service decorator to queue excess requests until they can be processed within the defined concurrency limits.
See the documentation to configure concurrency and external queue.
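As a rough illustration of what "queue excess requests until they can be processed within the defined concurrency limits" means, the sketch below caps in-flight requests with an `asyncio.Semaphore`. It is an in-process analogy only, not how BentoCloud implements the external queue. The function names and the `echo` handler are hypothetical, and the example does not show the decorator configuration itself.

```python
import asyncio


async def run_with_concurrency_limit(handler, requests, concurrency):
    """Run handler over all requests with at most `concurrency` in flight."""
    slots = asyncio.Semaphore(concurrency)
    in_flight = 0
    peak = 0

    async def guarded(request):
        nonlocal in_flight, peak
        async with slots:  # excess requests wait here until a slot frees up
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                return await handler(request)
            finally:
                in_flight -= 1

    results = await asyncio.gather(*(guarded(r) for r in requests))
    return results, peak


async def echo(request):
    # Stand-in for real request handling.
    await asyncio.sleep(0.01)
    return request * 2


results, peak = asyncio.run(
    run_with_concurrency_limit(echo, range(10), concurrency=3)
)
```

All ten requests complete, but no more than three are processed at once. The rest wait their turn instead of being rejected.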

🔒 Secure data handling with secrets in BentoCloud

You can now create and manage credentials, such as HuggingFace tokens and AWS secrets, securely on BentoCloud and easily apply them across multiple Deployments.
Added secret subcommands to the BentoML CLI for secret management. Run bentoml secret -h to learn more.

🗒️ Added streamed logs for Bento image deployment

Streamed logs make it easier to troubleshoot build issues and enable faster development iterations.

🙏 Thank you for your continued support!

What's Changed

fix: change forbid extra keys to false for bentocloud by @FogDong in #4866
feat(dev): 1.3 by @frostming in #4849
fix: delete cluster and ns if it is first cluster by @FogDong in #4869
fix: auto login confirm ask logic by @xianml in #4864
fix: secret default value by @xianml in #4870
fix: fix typo in error msg by @FogDong in #4871

Full Changelog: v1.2.20...v1.3.0