What's new in 0.12.0 (2024-06-07)
These are the changes in inference v0.12.0.
New features
- FEAT: new model: mini-cpm-llama3-v-2.5 by @Minamiyama in #1577
- FEAT: support glm4-chat & glm4-chat-1m by @qinxuye in #1584
- FEAT: add mistral-instruct-v0.3 by @qinxuye in #1576
- FEAT: add codestral-v0.1 by @qinxuye in #1575
- FEAT: Support ChatTTS by @codingl2k1 in #1578
- FEAT: Continuous batching for chat model on transformers backend by @ChengjieLi28 in #1548
- FEAT: support qwen2 by @qinxuye in #1597
- FEAT: support glm-4v-9b by @Minamiyama in #1591
Enhancements
- ENH: make CogVLM2 support stream output by @Minamiyama in #1572
- BLD: Clean up all Docker images after building on the self-hosted machine by @ChengjieLi28 in #1595
- BLD: Fix pip checking multiple versions of some packages during installation by @ChengjieLi28 in #1603
Bug fixes
- BUG: Fix typo for cogvlm2 by @Minamiyama in #1573
Documentation
- DOC: Add new models to README by @qinxuye in #1585
- DOC: Fix audio doc by @codingl2k1 in #1593
- DOC: Usage of cal-model-memory by @wxiwnd in #1589
- DOC: Fix audio doc by @codingl2k1 in #1599
- DOC: Continuous batching by @ChengjieLi28 in #1602
- DOC: Add new models to README by @qinxuye in #1604
Full Changelog: v0.11.3...v0.12.0