github sgl-project/sglang gateway-v0.2.1
Release Gateway-v0.2.1

one month ago

🚀 SGLang Model Gateway v0.2.1 Released!

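This release focuses on stability and cleanup, and adds three new features: two-level tokenizer caching, an OpenAI-style classification API, and a worker management workflow engine.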

🧾 Docs & CI

  • Updated router documentation to reflect recent feature additions

🧹 Code Cleanup

  • Refactored StopSequenceDecoder for cleaner incremental decoding
  • Added spec.rs test harness under spec/ for structured unit tests
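
To illustrate what incremental stop-sequence decoding involves, here is a minimal Python sketch. It is not the gateway's StopSequenceDecoder: the class name `StopHoldback` and its `feed`/`flush` methods are hypothetical, and it works on text and a single stop string rather than token IDs. The idea shown is holding back any trailing text that could still turn into the stop sequence, so nothing is emitted that later has to be retracted.

```python
class StopHoldback:
    """Hypothetical sketch: stream text while withholding a possible partial stop match."""

    def __init__(self, stop: str) -> None:
        self.stop = stop
        self.buf = ""      # text received but not yet emitted
        self.done = False  # True once the stop sequence has been seen

    def feed(self, text: str) -> str:
        """Add newly decoded text and return the part that is safe to emit."""
        if self.done:
            return ""
        self.buf += text
        idx = self.buf.find(self.stop)
        if idx != -1:
            # Full stop sequence found: emit what precedes it and finish.
            out = self.buf[:idx]
            self.buf = ""
            self.done = True
            return out
        # Hold back the longest suffix of buf that is a proper prefix of stop.
        keep = 0
        for k in range(min(len(self.stop) - 1, len(self.buf)), 0, -1):
            if self.buf.endswith(self.stop[:k]):
                keep = k
                break
        cut = len(self.buf) - keep
        out, self.buf = self.buf[:cut], self.buf[cut:]
        return out

    def flush(self) -> str:
        """At end of stream, release any held-back text that never completed a match."""
        out, self.buf = self.buf, ""
        return out


if __name__ == "__main__":
    dec = StopHoldback("END")
    pieces = [dec.feed(p) for p in ["Hello E", "N", "D ignored"]]
    print("".join(pieces) + dec.flush())
```

In this sketch the held-back buffer never grows beyond the stop string's length minus one, which keeps added latency small.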

๐Ÿž Bug Fixes

  • Fixed UTF-8 boundary handling in stop-sequence decoding
  • Fixed gRPC timeout configuration
  • Fixed worker filtering, tool-choice normalization, and bootstrap-port handling
  • Additional gRPC server warm-up and concurrency fixes
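
For readers unfamiliar with the UTF-8 boundary problem mentioned above, here is a small Python sketch of the general issue. It does not reproduce the gateway's fix, and the function name `decode_stream` is hypothetical. When a multi-byte character is split across two chunks, decoding each chunk on its own fails, whereas an incremental decoder keeps the incomplete bytes until the rest arrives.

```python
import codecs
from typing import Iterable, Iterator


def decode_stream(chunks: Iterable[bytes]) -> Iterator[str]:
    """Yield decoded text per chunk, holding back bytes of an incomplete character."""
    # errors="replace" so a truncated character at end of stream becomes U+FFFD
    # instead of raising.
    decoder = codecs.getincrementaldecoder("utf-8")(errors="replace")
    for chunk in chunks:
        text = decoder.decode(chunk)
        if text:
            yield text
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail


if __name__ == "__main__":
    # "é" is the two bytes 0xC3 0xA9; here it is split across two chunks.
    chunks = [b"caf\xc3", b"\xa9!"]
    print(list(decode_stream(chunks)))
    try:
        chunks[0].decode("utf-8")  # naive per-chunk decoding
    except UnicodeDecodeError as exc:
        print("naive decode failed:", exc.reason)
```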

🌟 New Features

  • Two-Level Tokenizer Caching (L0 + L1)
      • L0: exact-match cache for repeated prompts
      • L1: prefix-aware cache at special-token boundaries
  • OpenAI-Style Classification API → new /v1/classifications endpoint, shout out to yanbo for the contribution
  • Worker Management Workflow Engine → improved async registration, worker self-discovery, and health orchestration
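
The two-level tokenizer cache can be pictured with the following Python sketch. It is an illustration under stated assumptions, not the gateway's implementation: `toy_tokenize`, `TwoLevelCache`, and the `<|sep|>` special token are all hypothetical stand-ins. It assumes a special token is always tokenized atomically, so the tokens of a prefix ending at a special token can be reused and only the remainder needs tokenizing.

```python
from typing import Callable, Dict, List

SPECIAL = "<|sep|>"  # hypothetical special token marking a safe split point


def toy_tokenize(text: str) -> List[str]:
    """Stand-in tokenizer: special tokens are atomic, other text splits on whitespace."""
    tokens: List[str] = []
    for i, part in enumerate(text.split(SPECIAL)):
        if i > 0:
            tokens.append(SPECIAL)
        tokens.extend(part.split())
    return tokens


class TwoLevelCache:
    def __init__(self, tokenize: Callable[[str], List[str]]) -> None:
        self.tokenize = tokenize
        self.l0: Dict[str, List[str]] = {}  # L0: full prompt -> tokens
        self.l1: Dict[str, List[str]] = {}  # L1: prefix ending at a special token -> tokens
        self.stats = {"l0_hits": 0, "l1_hits": 0, "misses": 0}

    def _boundaries(self, text: str) -> List[int]:
        """End offsets of each special token in text, longest prefix first."""
        ends: List[int] = []
        start = 0
        while True:
            idx = text.find(SPECIAL, start)
            if idx == -1:
                break
            start = idx + len(SPECIAL)
            ends.append(start)
        return ends[::-1]

    def encode(self, text: str) -> List[str]:
        # L0: exact match on the whole prompt.
        if text in self.l0:
            self.stats["l0_hits"] += 1
            return list(self.l0[text])
        bounds = self._boundaries(text)
        # L1: reuse the longest cached prefix that ends at a special token.
        for end in bounds:
            prefix = text[:end]
            if prefix in self.l1:
                self.stats["l1_hits"] += 1
                tokens = self.l1[prefix] + self.tokenize(text[end:])
                break
        else:
            self.stats["misses"] += 1
            if bounds:
                # Tokenize prefix and remainder separately so the prefix can be cached.
                end = bounds[0]
                prefix_tokens = self.tokenize(text[:end])
                self.l1[text[:end]] = prefix_tokens
                tokens = prefix_tokens + self.tokenize(text[end:])
            else:
                tokens = self.tokenize(text)
        self.l0[text] = tokens
        return list(tokens)


if __name__ == "__main__":
    cache = TwoLevelCache(toy_tokenize)
    cache.encode("sys prompt<|sep|>hello world")     # miss, fills L0 and L1
    cache.encode("sys prompt<|sep|>hello world")     # L0 hit
    cache.encode("sys prompt<|sep|>other question")  # L1 hit on the shared prefix
    print(cache.stats)
```

A real cache would also need bounded size and eviction, which this sketch leaves out.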

What's Changed in Gateway

Gateway Changes (26 commits)

Paths Included

  • sgl-router
  • python/sglang/srt/grpc
  • python/sglang/srt/entrypoints/grpc_server.py

Full Changelog: gateway-v0.2.0...gateway-v0.2.1
