pypi vllm 0.3.3
v0.3.3

8 months ago

Major changes

  • StarCoder2 support
  • Performance optimization and LoRA support for Gemma
  • 2/3/8-bit GPTQ support
  • Performance optimization for MoE kernel
  • [Experimental] AWS Inferentia2 support
  • [Experimental] Structured output (JSON, Regex) in OpenAI Server
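As a rough illustration of the experimental structured-output item, the sketch below builds a request body for the OpenAI-compatible completions endpoint with a JSON-schema or regex constraint attached. The field names `guided_json` and `guided_regex`, the server address, and the model name `my-model` are assumptions for illustration and are not stated in these release notes; check them against the documentation for the version you run.

```python
import json
import urllib.request

# Assumption: a vLLM OpenAI-compatible server is listening at this address.
BASE_URL = "http://localhost:8000/v1"


def build_guided_request(model, prompt, guided_json=None, guided_regex=None,
                         max_tokens=64):
    """Build a /completions body with at most one output constraint.

    `guided_json` and `guided_regex` are assumed extra request fields.
    """
    if guided_json is not None and guided_regex is not None:
        raise ValueError("pass only one of guided_json or guided_regex")
    body = {"model": model, "prompt": prompt, "max_tokens": max_tokens}
    if guided_json is not None:
        body["guided_json"] = guided_json
    if guided_regex is not None:
        body["guided_regex"] = guided_regex
    return body


def send(body, base_url=BASE_URL):
    """POST the body to the server and return the decoded JSON response."""
    req = urllib.request.Request(
        f"{base_url}/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


# Hypothetical schema: an object with a name and an integer age.
schema = {
    "type": "object",
    "properties": {"name": {"type": "string"}, "age": {"type": "integer"}},
    "required": ["name", "age"],
}

# "my-model" is a placeholder; use whatever model the server was started with.
request_body = build_guided_request(
    "my-model", "Describe a person as JSON:", guided_json=schema
)
```

Calling `send(request_body)` against a running server would submit the request; the builder is kept separate so the payload can be inspected without a server.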

Full Changelog: v0.3.2...v0.3.3
