We are excited to announce the release of BentoML 1.3! Following the feedback received since the launch of 1.2 earlier this year, we are introducing a host of new features and enhancements in 1.3. Below are the key highlights of 1.3. Stay tuned for an upcoming blog post, where we'll explore the new features in detail and explain the driving forces behind their development.
Here are some of the important points to note about 1.3:
- `1.3` ensures full backward compatibility, meaning that Bentos built with `1.2` will continue to work seamlessly with this release.
- We remain committed to supporting `1.2`. Critical bug fixes and security updates will be backported to the `1.2` branch.
- The BentoML documentation has been updated with examples and guides for `1.3`. More guides will be added in the coming weeks.
- BentoCloud supports Bento Deployments from both `1.2` and `1.3` releases of BentoML.
Now, let’s take a look at the major features and enhancements:
🕙 Implemented BentoML task execution
- Introduced the `@bentoml.task` decorator to set a task endpoint for executing a long-running workload (such as batch processing or video generation).
- Added the `.submit()` method to both the sync and async clients. It submits task inputs via the task endpoint, and dedicated worker processes constantly monitor the task queues for new work to perform.
- Full compatibility with BentoCloud to run Bentos defined with task endpoints.
- See the Services and Clients docs for examples of defining a Service API that initializes a long-running task in the Service constructor, creating clients to call the endpoint, and retrieving task status.
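To make the submit-then-poll flow above concrete, here is a minimal, standard-library-only sketch of the pattern. It is a conceptual model, not BentoML's implementation: `TaskQueue`, `TaskHandle`, `get_status()`, `get()`, and `generate_report` are hypothetical names chosen for illustration, and a thread pool stands in for the dedicated worker processes.

```python
import time
from concurrent.futures import ThreadPoolExecutor


class TaskHandle:
    """Hypothetical handle returned by submit(); wraps a Future."""

    def __init__(self, future):
        self._future = future

    def get_status(self):
        # Report a coarse status without blocking the caller.
        if not self._future.done():
            return "in_progress"
        return "failure" if self._future.exception() else "success"

    def get(self):
        # Block until the task finishes, then return its result
        # (or re-raise the exception the task failed with).
        return self._future.result()


class TaskQueue:
    """Hypothetical stand-in for a task endpoint backed by workers."""

    def __init__(self, workers=1):
        self._pool = ThreadPoolExecutor(max_workers=workers)

    def submit(self, func, *args, **kwargs):
        # Enqueue the work and return immediately with a handle.
        return TaskHandle(self._pool.submit(func, *args, **kwargs))

    def shutdown(self):
        # Wait for queued and running tasks to finish.
        self._pool.shutdown(wait=True)


def generate_report(n):
    # Placeholder for a long-running workload such as batch processing.
    time.sleep(0.1)
    return sum(range(n))


queue = TaskQueue()
task = queue.submit(generate_report, 1000)  # returns right away
print(task.get_status())                    # poll without blocking
print(task.get())                           # blocks, then prints 499500
queue.shutdown()
```

The point of the pattern is that the caller is never tied to an open request while the work runs: it keeps only a handle and decides when to check status or fetch the result.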
🚀 Optimized the build cache to accelerate the build process
- Enhanced build speed for `bentoml build` & `containerize` through pre-installed large packages like `torch`
- Switched to `uv` as the installer and resolver, replacing `pip`
🔨 Supported concurrency-based autoscaling on BentoCloud
- Added the `concurrency` configuration to the `@bentoml.service` decorator to set the ideal number of simultaneous requests a Service is designed to handle.
- Added the `external_queue` configuration to the `@bentoml.service` decorator to queue excess requests until they can be processed within the defined `concurrency` limits.
- See the documentation to configure concurrency and external queue.
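As a rough illustration of how the two settings interact, the sketch below simulates a Service receiving a burst of requests. It uses only the standard library and is a conceptual model, not BentoML's implementation: `simulate` is a hypothetical function, a semaphore stands in for the external queue, and the behaviour without a queue (excess requests reach the Service directly) is an assumption made for the comparison.

```python
import asyncio


async def simulate(num_requests, concurrency, external_queue):
    """Return the peak number of requests processed at the same time."""
    slots = asyncio.Semaphore(concurrency)
    in_flight = 0
    peak = 0

    async def work():
        nonlocal in_flight, peak
        in_flight += 1
        peak = max(peak, in_flight)
        await asyncio.sleep(0.01)  # placeholder for real request handling
        in_flight -= 1

    async def handle():
        if external_queue:
            # Excess requests wait here until a slot frees up.
            async with slots:
                await work()
        else:
            # Assumed behaviour: nothing holds excess requests back.
            await work()

    await asyncio.gather(*(handle() for _ in range(num_requests)))
    return peak


print(asyncio.run(simulate(10, concurrency=3, external_queue=True)))   # 3
print(asyncio.run(simulate(10, concurrency=3, external_queue=False)))  # 10
```

With the queue enabled, the number of requests in flight never exceeds the configured `concurrency`; without it, the whole burst of ten requests is in flight at once.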
🔒 Secure data handling with secrets in BentoCloud
- You can now create and manage credentials, such as HuggingFace tokens and AWS secrets, securely on BentoCloud and easily apply them across multiple Deployments.
- Added secret subcommands to the BentoML CLI for secret management. Run `bentoml secret -h` to learn more.
🗒️ Added streamed logs for Bento image deployment
- Streamed logs make it easier to troubleshoot build issues and enable faster development iterations.
🙏 Thank you for your continued support!
What's Changed
- fix: change forbid extra keys to false for bentocloud by @FogDong in #4866
- feat(dev): 1.3 by @frostming in #4849
- fix: delete cluster and ns if it is first cluster by @FogDong in #4869
- fix: auto login confirm ask logic by @xianml in #4864
- fix: secret default value by @xianml in #4870
- fix: fix typo in error msg by @FogDong in #4871
Full Changelog: v1.2.20...v1.3.0