xorbitsai/inference v1.2.2
What's new in 1.2.2 (2025-02-08)

These are the changes in inference v1.2.2.

New features

Bug fixes

  • BUG: Fix llama-cpp when some quantizations have multiple parts by @qinxuye in #2786 (see the multi-part GGUF sketch after this list)
  • BUG: Use Cache class instead of raw tuple for transformers continuous batching, compatible with latest transformers by @ChengjieLi28 in #2820 (see the Cache sketch after this list)
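
A minimal sketch of the situation behind #2786, assuming llama-cpp-python as the backend: when a quantization is split into several GGUF shards, llama.cpp discovers the remaining parts once it is pointed at the first shard, so only that path needs to be passed. The file name and parameters below are illustrative examples, not Xinference internals.

```python
# Hypothetical example: loading a GGUF quantization that ships in multiple parts.
from llama_cpp import Llama

llm = Llama(
    # Point at the first shard; llama.cpp loads the remaining "-of-0000N" parts itself.
    model_path="qwen2.5-72b-instruct-q4_k_m-00001-of-00003.gguf",
    n_ctx=4096,        # context window, example value
    n_gpu_layers=-1,   # offload all layers to GPU if one is available
)

print(llm("Hello, my name is", max_tokens=16)["choices"][0]["text"])
```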
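
And a minimal sketch of the change behind #2820, assuming a recent transformers release: newer versions expect a Cache object (for example DynamicCache) as past_key_values instead of the legacy tuple of tensors. The model name and decoding loop below are only an example, not the continuous-batching code in Xinference.

```python
# Hypothetical example: using transformers' Cache class instead of a raw tuple.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, DynamicCache

model_id = "gpt2"  # example model; any causal LM works
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Continuous batching keeps the KV", return_tensors="pt")
cache = DynamicCache()  # Cache object rather than a tuple of (key, value) tensors

with torch.no_grad():
    out = model(**inputs, past_key_values=cache, use_cache=True)

# The cache is updated in place and can be reused for the next decoding step.
next_token = out.logits[:, -1:].argmax(dim=-1)
with torch.no_grad():
    out = model(next_token, past_key_values=cache, use_cache=True)

print(cache.get_seq_length())  # number of cached tokens so far
```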

Documentation

New Contributors

Full Changelog: v1.2.1...v1.2.2