We are thrilled to announce the release of BentoML 1.4! This version introduces several new features and improvements to accelerate your iteration cycle and enhance the overall developer experience.
Below are the key highlights of 1.4, and you can find more details in the release blog post.
🚀 20x faster iteration with Codespaces
- Introduced BentoML Codespaces, a development platform built on BentoCloud
- Added the `bentoml code` command for creating a Codespace
- Auto-sync of local changes to the cloud environment
- Access to a variety of powerful cloud GPUs
- Real-time logs and debugging through the cloud dashboard
- Eliminate dependency headaches and ensure consistency between dev and prod environments
🐍 New Python SDK for runtime configurations
- Added `bentoml.images.PythonImage` for defining the Bento runtime environment in Python instead of using `bentofile.yaml` or `pyproject.toml`
- Support for customizing runtime configurations (e.g., Python version, system packages, and dependencies) directly in the `service.py` file
- Introduced a context-sensitive `run()` method for running custom build commands
- Backward compatible with existing `bentofile.yaml` and `pyproject.toml` configurations
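As a minimal sketch of the new SDK (the Python version, package names, and requirements file below are illustrative placeholders, and exact method names may differ slightly from your BentoML version), a runtime environment might be declared directly in `service.py`:

```python
import bentoml

# Define the Bento runtime environment in Python instead of bentofile.yaml.
# All values here are placeholders for illustration.
my_image = (
    bentoml.images.PythonImage(python_version="3.11")
    .system_packages("curl", "git")
    .requirements_file("requirements.txt")
)

@bentoml.service(image=my_image)
class Summarization:
    @bentoml.api
    def summarize(self, text: str) -> str:
        # Placeholder logic; a real service would invoke a model here.
        return text[:100]
```

Because the image is an ordinary Python object, it can be built up conditionally or shared across services, which is harder to express in a static YAML file.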
⚡ Accelerated model loading with safetensors
- Implemented build-time model downloads and parallel loading of model weights using safetensors to reduce cold start time and improve scaling performance. See the documentation to learn more.
- Added `bentoml.models.HuggingFaceModel` for loading models from Hugging Face. It supports private model repositories and custom endpoints
- Added `bentoml.models.BentoModel` for loading models from BentoCloud and the Model Store
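A rough sketch of how `HuggingFaceModel` can be used in a service (the repository ID is a placeholder, and the `transformers` dependency is assumed to be installed):

```python
import bentoml
from bentoml.models import HuggingFaceModel

@bentoml.service
class LLMService:
    # Declared at the class level so weights can be downloaded at build
    # time; at runtime this resolves to a local path. The repository ID
    # below is a placeholder.
    model_path = HuggingFaceModel("org-name/model-id")

    def __init__(self):
        from transformers import AutoModel  # assumed available
        self.model = AutoModel.from_pretrained(self.model_path)
```

Declaring the model on the class (rather than downloading it in `__init__`) is what lets BentoML pre-fetch weights during the build and load them in parallel with safetensors.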
🌍 External deployment dependencies
- Extended `bentoml.depends()` to support external deployments
- Added support for calling BentoCloud Deployments via name or URL
- Added support for calling self-hosted HTTP AI services outside BentoCloud
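A hedged sketch of both styles of external dependency (the deployment name, URL, and endpoint names are placeholders, not real services):

```python
import bentoml

@bentoml.service
class Gateway:
    # Call a BentoCloud Deployment by name (placeholder name) ...
    summarizer = bentoml.depends(deployment="my-summarizer")
    # ... or a self-hosted HTTP AI service by URL (placeholder URL).
    translator = bentoml.depends(url="http://localhost:3001")

    @bentoml.api
    def run(self, text: str) -> str:
        # Dependencies are called like local methods; BentoML handles
        # the HTTP transport underneath.
        return self.summarizer.summarize(text=text)
```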
⚠️ Legacy Service API deprecation
- The legacy `bentoml.Service` API (with runners) is now officially deprecated and is scheduled for removal in a future release. We recommend using the `@bentoml.service` decorator instead.
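For reference, a minimal service in the recommended decorator style (the class and endpoint names are illustrative):

```python
import bentoml

@bentoml.service
class Echo:
    @bentoml.api
    def echo(self, text: str) -> str:
        # Each @bentoml.api method becomes an HTTP endpoint.
        return text
```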
Note that:
- `1.4` remains fully compatible with Bentos created by `1.3`.
- The BentoML documentation has been updated with examples and guides for `1.4`.
🙏 As always, we appreciate your continued support!
What's Changed
- feat: support bentoml serve without service name by @frostming in #5208
- feat(service): expose service-level labels definition by @aarnphm in #5211
- fix: restore path after import by @frostming in #5214
- fix: compile bytecode when installing python packages by @frostming in #5212
- fix: IO descriptor honor validators by @frostming in #5213
- feat(image): add support for chaining `.pyproject.toml` by @aarnphm in #5218
- feat: support root input spec using positional-only argument by @frostming in #5217
- fix: gradio error when uploading file by @frostming in #5220
- fix: input data validation for root input by @frostming in #5221
- fix: don't restore model store after importing service by @frostming in #5223
- feat(metrics): extend histogram buckets to support LLM latencies by @devin-ai-integration in #5222
- fix: always add bentoml req unless it is specified as a url dependency by @frostming in #5225
- docs: update links to examples by @aarnphm in #5224
- docs: add environment variable authentication documentation by @devin-ai-integration in #5231
- docs: Update docs to use new runtime API by @Sherlock113 in #5177
- fix: add files under env/docker by @frostming in #5234
Full Changelog: v1.3.22...v1.4.0