We are thrilled to announce the release of BentoML 1.4! This version introduces several new features and improvements to accelerate your iteration cycle and enhance the overall developer experience.
Below are the key highlights of 1.4, and you can find more details in the release blog post.
🚀 20x faster iteration with Codespaces
- Introduced BentoML Codespaces, a development platform built on BentoCloud
- Added the `bentoml code` command for creating a Codespace
- Auto-sync of local changes to the cloud environment
- Access to a variety of powerful cloud GPUs
- Real-time logs and debugging through the cloud dashboard
- Eliminate dependency headaches and ensure consistency between dev and prod environments
🐍 New Python SDK for runtime configurations
- Added `bentoml.images.PythonImage` for defining the Bento runtime environment in Python instead of using `bentofile.yaml` or `pyproject.toml`
- Support for customizing runtime configurations (e.g., Python version, system packages, and dependencies) directly in the `service.py` file
- Introduced a context-sensitive `run()` method for running custom build commands
- Backward compatible with existing `bentofile.yaml` and `pyproject.toml` configurations
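As a minimal sketch of the new SDK (the Python version, package names, and requirements file below are illustrative placeholders, and exact method names may differ slightly from your BentoML version), a runtime environment might be declared directly in `service.py`:

```python
import bentoml

# Define the Bento runtime environment in Python instead of bentofile.yaml.
# All values here are placeholders for illustration.
my_image = (
    bentoml.images.PythonImage(python_version="3.11")
    .system_packages("curl", "git")
    .requirements_file("requirements.txt")
)

@bentoml.service(image=my_image)
class Summarization:
    @bentoml.api
    def summarize(self, text: str) -> str:
        # Placeholder logic; a real service would invoke a model here.
        return text[:100]
```

Because the image is an ordinary Python object, it can be built up conditionally or shared across services, which is harder to express in a static YAML file.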
⚡ Accelerated model loading with safetensors
- Implemented build-time model downloads and parallel loading of model weights using safetensors to reduce cold start time and improve scaling performance. See the documentation to learn more.
- Added `bentoml.models.HuggingFaceModel` for loading models from Hugging Face. It supports private model repositories and custom endpoints
- Added `bentoml.models.BentoModel` for loading models from BentoCloud and the Model Store
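A rough sketch of how `HuggingFaceModel` can be used in a service (the repository ID is a placeholder, and the `transformers` dependency is assumed to be installed):

```python
import bentoml
from bentoml.models import HuggingFaceModel

@bentoml.service
class LLMService:
    # Declared at the class level so weights can be downloaded at build
    # time; at runtime this resolves to a local path. The repository ID
    # below is a placeholder.
    model_path = HuggingFaceModel("org-name/model-id")

    def __init__(self):
        from transformers import AutoModel  # assumed available
        self.model = AutoModel.from_pretrained(self.model_path)
```

Declaring the model on the class (rather than downloading it in `__init__`) is what lets BentoML pre-fetch weights during the build and load them in parallel with safetensors.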
🌍 External deployment dependencies
- Extended `bentoml.depends()` to support external deployments
- Added support for calling BentoCloud Deployments via name or URL
- Added support for calling self-hosted HTTP AI services outside BentoCloud
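A hedged sketch of both styles of external dependency (the deployment name, URL, and endpoint names are placeholders, not real services):

```python
import bentoml

@bentoml.service
class Gateway:
    # Call a BentoCloud Deployment by name (placeholder name) ...
    summarizer = bentoml.depends(deployment="my-summarizer")
    # ... or a self-hosted HTTP AI service by URL (placeholder URL).
    translator = bentoml.depends(url="http://localhost:3001")

    @bentoml.api
    def run(self, text: str) -> str:
        # Dependencies are called like local methods; BentoML handles
        # the HTTP transport underneath.
        return self.summarizer.summarize(text=text)
```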
⚠️ Legacy Service API deprecation
- The legacy `bentoml.Service` API (with runners) is now officially deprecated and is scheduled for removal in a future release. We recommend using the `@bentoml.service` decorator instead.
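For reference, a minimal service in the recommended decorator style (the class and endpoint names are illustrative):

```python
import bentoml

@bentoml.service
class Echo:
    @bentoml.api
    def echo(self, text: str) -> str:
        # Each @bentoml.api method becomes an HTTP endpoint.
        return text
```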
Note that:
- `1.4` remains fully compatible with Bentos created by `1.3`.
- The BentoML documentation has been updated with examples and guides for `1.4`.
🙏 As always, we appreciate your continued support!
What's Changed
- feat: support bentoml serve without service name by @frostming in #5208
- feat(service): expose service-level labels definition by @aarnphm in #5211
- fix: restore path after import by @frostming in #5214
- fix: compile bytecode when installing python packages by @frostming in #5212
- fix: IO descriptor honor validators by @frostming in #5213
- feat(image): add support for chaining `.pyproject.toml` by @aarnphm in #5218
- feat: support root input spec using positional-only argument by @frostming in #5217
- fix: gradio error when uploading file by @frostming in #5220
- fix: input data validation for root input by @frostming in #5221
- fix: don't restore model store after importing service by @frostming in #5223
- feat(metrics): extend histogram buckets to support LLM latencies by @devin-ai-integration in #5222
- fix: always add bentoml req unless it is specified as a url dependency by @frostming in #5225
- docs: update links to examples by @aarnphm in #5224
- docs: add environment variable authentication documentation by @devin-ai-integration in #5231
- docs: Update docs to use new runtime API by @Sherlock113 in #5177
- fix: add files under env/docker by @frostming in #5234
Full Changelog: v1.3.22...v1.4.0