What's new in 0.1.0 (2023-07-28)
These are the changes in Xinference v0.1.0.
New features
- FEAT: support fp4 and int8 quantization for PyTorch models by @pangyoki in #238
- FEAT: support llama-2-chat-70b ggml by @UranusSeven in #257
Enhancements
- ENH: skip 4-bit quantization for non-Linux or non-CUDA local deployment by @UranusSeven in #264
- ENH: handle legacy cache by @UranusSeven in #266
- REF: model family by @UranusSeven in #251
Bug fixes
- BUG: fix RESTful stop parameters by @RayJi01 in #241
- BUG: hot fix for download integrity by @RayJi01 in #242
- BUG: disable baichuan-chat and baichuan-base on macOS by @pangyoki in #250
- BUG: delete tqdm_class in snapshot_download by @pangyoki in #258
- BUG: fix ChatGLM parameter switch by @Bojun-Feng in #262
- BUG: refresh related fields when format changes by @UranusSeven in #265
- BUG: show download progress in Gradio by @aresnow1 in #267
- BUG: fix LLM JSON not being included by @UranusSeven in #268
Tests
- TST: update ChatGLM tests by @Bojun-Feng in #259
Documentation
- DOC: update installation section in README by @aresnow1 in #253
- DOC: update README for PyTorch models by @pangyoki in #207
Full Changelog: v0.0.6...v0.1.0