kvcache-ai/ktransformers v0.3.1
on GitHub


7 months ago

🚀 New Features

Intel Arc GPU support, contributed by @aubreyli and @rnwang04

⚡ Performance Improvements

DeepSeek-R1 Q4 decoding at 7.5 tokens/s
Measured on a single-socket Xeon platform with DDR5 at 4800 MT/s and an Intel Arc A770 GPU; enabling dual-NUMA delivers additional speedups.
Easy benchmarking
Run the local_chat script to reproduce these numbers on your own hardware.
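
When comparing your own runs against the 7.5 tokens/s figure, it helps to be explicit about the arithmetic: decode throughput is the number of generated tokens divided by the wall-clock time spent decoding, excluding prompt prefill. The sketch below is a standalone helper, not part of ktransformers; the function names and the sample numbers (150 tokens, 20 seconds) are hypothetical and chosen only to illustrate the calculation.

```python
def decode_throughput(generated_tokens: int, decode_seconds: float) -> float:
    """Return decode speed in tokens per second.

    Hypothetical helper for illustration; not a ktransformers API.
    """
    if generated_tokens < 0:
        raise ValueError("generated_tokens must be non-negative")
    if decode_seconds <= 0:
        raise ValueError("decode_seconds must be positive")
    return generated_tokens / decode_seconds


def seconds_to_generate(tokens: int, tokens_per_second: float) -> float:
    """Estimate how long a response of `tokens` takes at a given decode speed."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return tokens / tokens_per_second


if __name__ == "__main__":
    # Illustrative numbers only, not a measurement.
    speed = decode_throughput(generated_tokens=150, decode_seconds=20.0)
    print(f"{speed:.1f} tokens/s")
    print(f"{seconds_to_generate(300, speed):.0f} s for a 300-token reply")
```

Take the token count and decode time from whatever your benchmark run reports, and time only the decode phase so that prompt length does not skew the result.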

🔜 What’s Next

Balance_serve integration
We’re working to seamlessly merge Intel GPU operators into the balance_serve backend for end-to-end support and streamlined maintenance.
