sgl-project/sglang: Release v0.3.2

11 months ago

Highlights

  • Support torch.compile and CUDA graphs for the Triton attention backend and DeepSeek MLA #1442 #1422
  • Initial support for multi-LoRA serving #1307
  • Integrate torchao for quantization #1341
  • Reduce CPU scheduler overhead
  • Multiple critical bug fixes for Llama and LLaVA (tokenizer, modality)
  • Support AMD backend #1420
  • New models: MiniCPM3, OLMoE

Full Changelog: v0.3.0...v0.3.2