vllm v0.2.1


Major Changes

  • PagedAttention V2 kernel: up to 20% end-to-end latency reduction
  • Log probabilities for prompt tokens are now supported (see the sketch after this list)
  • AWQ quantization support for Mistral 7B

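The two user-facing features above can be exercised through the Python API. The following is a minimal sketch, not taken from the release notes: the AWQ checkpoint name and the printed fields are assumptions, and any AWQ-quantized Mistral 7B weights could stand in for the example repo.

```python
# Hedged sketch of using prompt-token log probabilities together with an
# AWQ-quantized Mistral 7B model in vLLM. Model name and output fields are
# assumptions for illustration, not taken from the release notes.
from vllm import LLM, SamplingParams

# quantization="awq" requires an AWQ-quantized checkpoint; the repo below is
# an assumed example.
llm = LLM(model="TheBloke/Mistral-7B-v0.1-AWQ", quantization="awq")

# prompt_logprobs=1 asks vLLM to also return the log probability of each
# prompt token alongside the generated text.
params = SamplingParams(max_tokens=32, prompt_logprobs=1)

outputs = llm.generate(["vLLM is a fast inference engine for"], params)
for out in outputs:
    print(out.outputs[0].text)
    # Per-prompt-token logprob entries (the first prompt token has no logprob).
    print(out.prompt_logprobs)
```
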
What's Changed

New Contributors

Full Changelog: v0.2.0...v0.2.1
