rjmalagon/ollama-linux-amd-apu v0.12.4-rc6

Pre-release

What's Changed (this repo branch)

  • Sync to v0.12.4
  • AMD GTT patches discontinued: mainline Ollama now supports AMD APUs from RDNA1-class parts to current ones, and Vega-class APU support is discontinued.
  • New main branch of the repo: we now host container images with additional AMD ROCm optimizations on top of current mainline Ollama.
    We also offer a container image with the most recent Ollama that remains compatible with the old GTT patches.

What's Changed (from Ollama)

  • Flash attention is now enabled by default for Qwen 3 and Qwen 3 Coder
  • Fixed minor memory estimation issues when scheduling models on NVIDIA GPUs
  • Fixed an issue where keep_alive in the API would accept different values for the /api/chat and /api/generate endpoints (see the API sketch after this list)
  • Fixed tool calling rendering with qwen3-coder
  • More reliable and accurate VRAM detection
  • OLLAMA_FLASH_ATTENTION can now be overridden to 0 for models that have flash attention enabled by default (see the environment-variable sketch after this list)
  • macOS 12 Monterey and macOS 13 Ventura are no longer supported
  • Fixed crash where templates were not correctly defined
  • Fixed memory calculations on NVIDIA iGPUs
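
Below is a minimal sketch of how the keep_alive fix can be exercised: the same value is now interpreted consistently by /api/chat and /api/generate. It assumes a local Ollama server on the default port 11434 and an already-pulled model; the model name "qwen3" and the "10m" value are illustrative.

```python
import requests

KEEP_ALIVE = "10m"  # how long the model stays loaded after the request

# /api/generate with keep_alive
requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "qwen3", "prompt": "Hello", "stream": False,
          "keep_alive": KEEP_ALIVE},
).raise_for_status()

# /api/chat with the same keep_alive value
requests.post(
    "http://localhost:11434/api/chat",
    json={"model": "qwen3",
          "messages": [{"role": "user", "content": "Hello"}],
          "stream": False,
          "keep_alive": KEEP_ALIVE},
).raise_for_status()
```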
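
And an environment-variable sketch for disabling flash attention on models that now enable it by default (such as Qwen 3). It assumes the ollama binary is on PATH; wrapping `ollama serve` in Python is purely illustrative, and exporting the variable in your shell or service unit works the same way.

```python
import os
import subprocess

# Copy the current environment and force flash attention off, overriding the
# per-model default.
env = os.environ.copy()
env["OLLAMA_FLASH_ATTENTION"] = "0"

# Start the Ollama server with the override applied.
subprocess.run(["ollama", "serve"], env=env, check=True)
```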

Full Changelog: https://github.com/rjmalagon/ollama-linux-amd-apu/commits/v0.12.4
