vllm-project/vllm v0.2.4 on GitHub

Major changes

Mixtral model support (officially from @mistralai)
AMD GPU support (collaboration with @EmbeddedLLM)

What's Changed

add custom server params by @esmeetu in #1868
support ChatGLMForConditionalGeneration by @dancingpipi in #1932
Save pytorch profiler output for latency benchmark by @Yard1 in #1871
Fix typo in adding_model.rst by @petergtz in #1947
Make InternLM follow rope_scaling in config.json by @theFool32 in #1956
Fix quickstart.rst example by @gottlike in #1964
Adding number of nvcc_threads during build as envar by @AguirreNicolas in #1893
fix typo in getenv call by @dskhudia in #1972
[Continuation] Merge EmbeddedLLM/vllm-rocm into vLLM main by @tjtanaa in #1836
Fix Baichuan2-7B-Chat by @firebook in #1987
[Docker] Add cuda arch list as build option by @simon-mo in #1950
Fix for KeyError on Loading LLaMA by @imgaojun in #1978
[Minor] Fix code style for baichuan by @WoosukKwon in #2003
Fix OpenAI server completion_tokens referenced before assignment by @js8544 in #1996
[Minor] Add comment on skipping rope caches by @WoosukKwon in #2004
Replace head_mapping params with num_kv_heads to attention kernel. by @wbn03 in #1997
Fix completion API echo and logprob combo by @simon-mo in #1992
Mixtral 8x7B support by @pierrestock in #2011
Minor fixes for Mixtral by @WoosukKwon in #2015
Change load format for Mixtral by @WoosukKwon in #2028
Update run_on_sky.rst by @eltociear in #2025
Update requirements.txt for mixtral by @0-hero in #2029
Revert #2029 by @WoosukKwon in #2030
[Minor] Fix latency benchmark script by @WoosukKwon in #2035
[Minor] Fix type annotation in Mixtral by @WoosukKwon in #2036
Update README.md to add megablocks requirement for mixtral by @0-hero in #2033
[Minor] Fix import error msg for megablocks by @WoosukKwon in #2038
Bump up to v0.2.4 by @WoosukKwon in #2034

New Contributors

@dancingpipi made their first contribution in #1932
@petergtz made their first contribution in #1947
@theFool32 made their first contribution in #1956
@gottlike made their first contribution in #1964
@AguirreNicolas made their first contribution in #1893
@dskhudia made their first contribution in #1972
@tjtanaa made their first contribution in #1836
@firebook made their first contribution in #1987
@imgaojun made their first contribution in #1978
@js8544 made their first contribution in #1996
@wbn03 made their first contribution in #1997
@pierrestock made their first contribution in #2011
@0-hero made their first contribution in #2029

Full Changelog: v0.2.3...v0.2.4