github vllm-project/vllm v0.1.3
vLLM v0.1.3

15 months ago

What's Changed

Major changes

  • More model support: LLaMA 2, Falcon, GPT-J, Baichuan, etc.
  • Efficient support for MQA and GQA.
  • Changes in the scheduling algorithm: vLLM now uses TGI-style continuous batching.
  • And many bug fixes.
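
To illustrate the scheduling change listed above, here is a minimal sketch of continuous (iteration-level) batching, the approach popularized by Hugging Face's Text Generation Inference (TGI). It is a toy simulation, not vLLM's scheduler: the `Request` class, the `continuous_batching` function, and the `max_batch_size` parameter are hypothetical names chosen for this example, and each request is assumed to need a fixed, known number of decode steps.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical request that still needs `remaining` decode steps."""
    name: str
    remaining: int


def continuous_batching(requests, max_batch_size):
    """Simulate iteration-level scheduling; return the batch used at each step.

    Assumption for this sketch: every request generates exactly one token
    per step and finishes when `remaining` reaches zero.
    """
    waiting = deque(requests)
    running = []
    schedule = []
    while waiting or running:
        # Refill free slots before every step, not only when the batch drains.
        while waiting and len(running) < max_batch_size:
            running.append(waiting.popleft())
        schedule.append([r.name for r in running])
        # One decode step: each running request produces one token.
        for r in running:
            r.remaining -= 1
        # Finished requests leave right away, freeing their slots.
        running = [r for r in running if r.remaining > 0]
    return schedule


if __name__ == "__main__":
    reqs = [Request("A", 3), Request("B", 1), Request("C", 2)]
    for step, batch in enumerate(continuous_batching(reqs, max_batch_size=2)):
        print(step, batch)
```

In this sketch, request C takes over B's slot as soon as B finishes, so the three requests complete in three steps. A scheduler that waits for the whole first batch (A and B) to finish before admitting C would need five steps for the same workload.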

Full Changelog: v0.1.2...v0.1.3
