BentoML - v1.0.8

🍱 BentoML v1.0.8 is released with a list of improvements we hope you’ll find useful.

  • Introduced Bento Client for easy access to the BentoML service over HTTP. Both sync and async calls are supported. See the Bento Client Guide for more details.

    import numpy as np

    from bentoml.client import Client

    # Create a client from the address of a running BentoML HTTP server
    client = Client.from_url("http://localhost:3000")

    # Sync call
    response = client.classify(np.array([[4.9, 3.0, 1.4, 0.2]]))

    # Async call
    response = await client.async_classify(np.array([[4.9, 3.0, 1.4, 0.2]]))
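
    The classify call above maps to an API on the serving side; a minimal sketch of such a service, assuming the iris quickstart model tag and endpoint name (not part of this release), could look like this:

    # service.py -- a minimal sketch of the server side the client example calls;
    # the "iris_clf" model tag and the classify endpoint are assumptions for illustration.
    import numpy as np

    import bentoml
    from bentoml.io import NumpyNdarray

    iris_clf_runner = bentoml.sklearn.get("iris_clf:latest").to_runner()

    svc = bentoml.Service("iris_classifier", runners=[iris_clf_runner])

    @svc.api(input=NumpyNdarray(), output=NumpyNdarray())
    def classify(input_series: np.ndarray) -> np.ndarray:
        return iris_clf_runner.predict.run(input_series)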
  • Introduced custom metrics support for easy instrumentation of Prometheus metrics from your services. See the Metrics Guide for more details.

    # Histogram metric
    inference_duration = bentoml.metrics.Histogram(
        name="inference_duration",
        documentation="Duration of inference",
        labelnames=["nltk_version", "sentiment_cls"],
    )
    
    # Counter metric
    polarity_counter = bentoml.metrics.Counter(
        name="polarity_total",
        documentation="Count total number of analysis by polarity scores",
        labelnames=["polarity"],
    )

    Full Prometheus style syntax is supported for instrumenting custom metrics inside API and Runner definitions.

    # Histogram
    inference_duration.labels(
        nltk_version=nltk.__version__, sentiment_cls=self.sia.__class__.__name__
    ).observe(time.perf_counter() - start)
    
    # Counter
    polarity_counter.labels(polarity=is_positive).inc()
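
    For context, these metrics can be observed from inside a custom runner; the sketch below assumes the NLTK sentiment analysis example (the runnable, analyzer, and scoring logic are illustrative, not part of this release):

    # A sketch of a custom runnable instrumented with the inference_duration
    # Histogram defined above; the NLTK analyzer and scoring logic are assumptions.
    import time

    import nltk
    from nltk.sentiment import SentimentIntensityAnalyzer

    import bentoml

    class NLTKSentimentRunnable(bentoml.Runnable):
        SUPPORTED_RESOURCES = ("cpu",)
        SUPPORTS_CPU_MULTI_THREADING = False

        def __init__(self):
            self.sia = SentimentIntensityAnalyzer()

        @bentoml.Runnable.method(batchable=False)
        def is_positive(self, input_text: str) -> bool:
            start = time.perf_counter()
            scores = [
                self.sia.polarity_scores(sentence)["compound"]
                for sentence in nltk.sent_tokenize(input_text)
            ]
            inference_duration.labels(
                nltk_version=nltk.__version__,
                sentiment_cls=self.sia.__class__.__name__,
            ).observe(time.perf_counter() - start)
            return sum(scores) / len(scores) > 0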
  • Improved health checking to also cover the status of runners, so that a healthy status is no longer returned before the runners are ready.
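
    For example, a readiness probe now succeeds only once the runners are up; a small sketch, assuming the HTTP server's /readyz endpoint and default port:

    # Poll the readiness endpoint, which now also reflects runner status.
    import requests

    response = requests.get("http://localhost:3000/readyz")
    assert response.status_code == 200  # returns 200 only once the runners are ready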

  • Added SSL/TLS support to gRPC serving.

    bentoml serve-grpc --ssl-certfile=credentials/cert.pem --ssl-keyfile=credentials/key.pem --production --enable-reflection
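
    On the client side, a TLS connection can then be opened with grpcio; a minimal sketch, assuming the same certificate and the default gRPC address:

    # Open a secure channel to the TLS-enabled gRPC server started above;
    # the certificate path and target address are assumptions for illustration.
    import grpc

    with open("credentials/cert.pem", "rb") as f:
        creds = grpc.ssl_channel_credentials(root_certificates=f.read())

    channel = grpc.secure_channel("localhost:3000", creds)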
  • Added channelz support for easier debugging of gRPC serving.

  • Allowed nested requirements with the -r syntax.

    # requirements.txt
    -r nested/requirements.txt
    
    pydantic
    Pillow
    fastapi
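
    The top-level requirements.txt, nested includes and all, is resolved at build time; a hedged sketch of a programmatic build, where the "service:svc" import string is an assumption:

    # Build a Bento whose Python dependencies come from requirements.txt,
    # including the files it pulls in via -r; "service:svc" is an assumed import string.
    import bentoml

    bento = bentoml.bentos.build(
        service="service:svc",
        python={"requirements_txt": "./requirements.txt"},
    )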
  • Improved the adaptive batching dispatcher's auto-tuning to avoid sporadic request failures caused by batching at the beginning of the runner lifecycle.
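
    Adaptive batching applies to runner methods declared as batchable; a sketch of such a method, where the runnable class and its computation are illustrative:

    # A batchable runner method scheduled by the adaptive batching dispatcher;
    # the runnable class and computation are assumptions for illustration.
    import numpy as np

    import bentoml

    class MyRunnable(bentoml.Runnable):
        SUPPORTED_RESOURCES = ("cpu",)
        SUPPORTS_CPU_MULTI_THREADING = True

        @bentoml.Runnable.method(batchable=True, batch_dim=0)
        def predict(self, input_arr: np.ndarray) -> np.ndarray:
            # Individual requests are combined along dim 0 before reaching this method.
            return input_arr.sum(axis=1, keepdims=True)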

  • Fixed a bug where runners would raise a TypeError when overloaded. An HTTP 503 Service Unavailable is now returned when a runner is overloaded.

    File "python3.9/site-packages/bentoml/_internal/runner/runner_handle/remote.py", line 188, in async_run_method
        return tuple(AutoContainer.from_payload(payload) for payload in payloads)
    TypeError: 'Response' object is not iterable

💡 We continue to update the documentation and examples on every release to help the community unlock the full power of BentoML.

🥂 We’d like to thank the community for your continued support and engagement.
