github xorbitsai/inference v0.9.2

What's new in 0.9.2 (2024-03-08)

These are the changes in inference v0.9.2.

New features

Enhancements

  • ENH: Support the n_gpu_layers parameter for llama-cpp-python by @ChengjieLi28 in #1070
  • ENH: Add a dropdown to the web UI to support adjusting GPU offload layers for llama.cpp loader by @notsyncing in #1073
  • ENH: [UI] Show replica on running model page by @ChengjieLi28 in #1093
  • ENH: Add "[DONE]" to the end of stream generation for better OpenAI SDK compatibility by @ZhangTianrong in #1062
  • ENH: [UI] Support setting CPU when selecting n_gpu by @ChengjieLi28 in #1096
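
To illustrate the "[DONE]" change in #1062: OpenAI-style streaming clients typically read server-sent-event lines and stop when they see a `data: [DONE]` line. The sketch below shows that client-side pattern only. The helper name `iter_stream_chunks` and the sample lines are hypothetical, and the chunk layout is assumed to follow the OpenAI chat-completion streaming shape; none of this is taken from the inference codebase.

```python
import json
from typing import Iterable, Iterator


def iter_stream_chunks(lines: Iterable[str]) -> Iterator[dict]:
    """Yield decoded JSON chunks from SSE lines until the "[DONE]" sentinel.

    Hypothetical helper for illustration; not part of any library.
    """
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and non-data fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return  # end-of-stream sentinel: stop reading
        yield json.loads(payload)


# Hypothetical sample stream (assumed OpenAI-style chunk shape).
sample_lines = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "",
    "data: [DONE]",
    'data: {"choices": [{"delta": {"content": "ignored"}}]}',
]

# Concatenate the streamed text fragments up to the sentinel.
text = "".join(
    chunk["choices"][0]["delta"].get("content", "")
    for chunk in iter_stream_chunks(sample_lines)
)
print(text)  # prints "Hello"
```

Without the sentinel, a client written this way has to rely on the connection closing to know that generation has finished, which is why emitting it improves compatibility.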

Documentation

Others

  • Update llm_family.json to correct the context length of glaive coder by @mikeshi80 in #1083

New Contributors

Full Changelog: v0.9.1...v0.9.2
