vllm v0.2.7


Major Changes

  • Up to 70% throughput improvement for distributed inference by removing serialization/deserialization overheads
  • Fix tensor parallelism support for Mixtral with GPTQ/AWQ quantization (see the usage sketch below)
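
To illustrate the fixed configuration, here is a minimal sketch of running an AWQ-quantized Mixtral checkpoint with tensor parallelism through vLLM's offline LLM API. The checkpoint name and GPU count are assumptions; substitute the model and tensor_parallel_size that match your setup.

```python
from vllm import LLM, SamplingParams

# Assumed AWQ-quantized Mixtral checkpoint; swap in the one you actually use.
MODEL = "TheBloke/Mixtral-8x7B-Instruct-v0.1-AWQ"

# tensor_parallel_size=2 assumes two GPUs on the node; the Mixtral +
# GPTQ/AWQ tensor-parallel combination is what this release fixes.
llm = LLM(model=MODEL, quantization="awq", tensor_parallel_size=2)

params = SamplingParams(temperature=0.8, max_tokens=64)
outputs = llm.generate(["Explain tensor parallelism in one sentence."], params)
for out in outputs:
    print(out.outputs[0].text)
```

For a GPTQ checkpoint, pass quantization="gptq" instead; the rest of the call is unchanged.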

What's Changed

Full Changelog: v0.2.6...v0.2.7
