vllm-project/vllm v0.2.7


Major Changes

  • Up to 70% throughput improvement for distributed inference by removing serialization/deserialization overheads
  • Fixed tensor parallelism support for Mixtral models quantized with GPTQ or AWQ (see the sketch after this list)
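
The release notes don't include a usage example; the following is a minimal sketch of how these two changes would typically be exercised together: serving a quantized Mixtral checkpoint across multiple GPUs with vLLM's tensor parallelism. The model ID, GPU count, and sampling settings are illustrative assumptions, not part of the release.

```python
# Sketch only: serve an AWQ-quantized Mixtral model with tensor parallelism.
# The checkpoint name and tensor_parallel_size=2 are assumptions for illustration.
from vllm import LLM, SamplingParams

llm = LLM(
    model="TheBloke/Mixtral-8x7B-Instruct-v0.1-AWQ",  # example AWQ checkpoint (assumption)
    quantization="awq",          # use quantization="gptq" for GPTQ checkpoints
    tensor_parallel_size=2,      # shard the model across 2 GPUs (distributed inference)
)

outputs = llm.generate(
    ["Summarize the benefits of tensor parallelism in one sentence."],
    SamplingParams(temperature=0.7, max_tokens=64),
)
for out in outputs:
    print(out.outputs[0].text)
```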

What's Changed

New Contributors

Full Changelog: v0.2.6...v0.2.7
