github vllm-project/vllm-omni v0.15.0rc1

Pre-release · one month ago

This pre-release aligns vLLM-Omni with upstream vLLM v0.15.0.

Highlights

  • Rebase to Upstream vLLM v0.15.0: vLLM-Omni is now fully aligned with the latest vLLM v0.15.0 core, bringing in all the latest upstream features, bug fixes, and performance improvements (#1159).
  • Tensor Parallelism for LongCat-Image: We have added Tensor Parallelism (TP) support for LongCat-Image and LongCat-Image-Edit models, significantly improving the inference speed and scalability of these vision-language models (#926).
  • TeaCache Optimization: Introduced coefficient estimation for TeaCache, further improving the efficiency of its caching mechanism during generation (#940).
  • Alignment & Stability:
    • Enhanced error handling logic to maintain consistency with upstream vLLM v0.14.0/v0.15.0 standards (#1122).
    • Integrated "Bagel" E2E Smoke Tests and refactored sequence parallel tests to ensure robust CI/CD and accurate performance benchmarking (#1074, #1165).
  • Paper link: An initial arXiv paper introducing our design and presenting some performance test results (#1169).

What's Changed

Features & Optimizations

Alignment & Integration

Infrastructure (CI/CD) & Documentation

New Contributors

Full Changelog: v0.14.0...v0.15.0rc1
