xorbitsai/inference v0.16.0 on GitHub

What's new in 0.16.0 (2024-10-18)

These are the changes in inference v0.16.0.

New features

FEAT: Add support for AWQ/GPTQ vLLM inference to vision models such as Qwen2-VL by @cyhasuka in #2445
FEAT: Dynamic batching for the state-of-the-art FLUX.1 text_to_image interface by @ChengjieLi28 in #2380
FEAT: Add MLX support for qwen2.5-instruct by @qinxuye in #2444

Enhancements

ENH: Speed up cli interaction by @frostyplanet in #2443
REF: Enable continuous batching for LLM with transformers engine by default by @ChengjieLi28 in #2437

Documentation

DOC: update readme & docs by @qinxuye in #2435

New Contributors

@cyhasuka made their first contribution in #2445

Full Changelog: v0.15.4...v0.16.0
