github unslothai/unsloth v0.1.35-beta
Google - Gemma 4 now in Unsloth!

latest releases: v0.1.37-beta, v0.1.36-beta
21 days ago

Google releases Gemma 4 with four new models: E2B, E4B, 26B-A4B, and 31B.

[Image: Gemma 4 banner]

Updates

  • Tool calls for smaller models are now more stable and no longer get cut off
  • Pre-compiled binaries for llama.cpp for 2 Gemma 4 fixes:
  • Pre-compiled binaries for Windows, Linux, Mac, WSL devices - CPU and GPU
  • 90% fewer HF API calls, so rate limits are hit less often
  • Intel Mac works
  • All Gemma 4 models are re-converted.
  • Tool Calling more robust
  • Speculative decoding added for non-vision models (sadly, Gemma 4 and Qwen3.5 are vision models, so it does not apply to them)
  • Context length is now properly applied.
  • Tool calls for all models are now 30% to 80% more accurate.
  • Web search now actually gets web content and not just summaries
  • The number of tool calls allowed is increased from 10 to 25
  • Tool calls now terminate much better, so looping / repetitions will be reduced
  • More tool-call healing and de-duplication logic, which also stops tool calls from leaking XML into responses
  • Tested with unsloth/Qwen3.5-4B-GGUF (UD-Q4_K_XL), web search + code execution + thinking enabled.
Metric                          Before    After
XML leaks in response           10/10     0/10
URL fetches used                0         4/10 runs
Runs with correct song names    0/10      2/10
Avg tool calls                  5.5       3.8
Avg response time               12.3s     9.8s
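
To make the tool-call items above more concrete, here is a minimal sketch of what a call limit, de-duplication, and XML-leak cleanup could look like. It is not Unsloth's implementation: the function names, the `<tool_call>` tag format, and the dictionary layout of a call are all assumptions chosen for illustration. Only the limit of 25 comes from the notes above.

```python
import json
import re

# Limit taken from the release notes (raised from 10 to 25).
MAX_TOOL_CALLS = 25

# Assumed tag format, for illustration only; real chat templates may differ.
TOOL_CALL_RE = re.compile(r"<tool_call>.*?</tool_call>", re.DOTALL)


def strip_leaked_xml(text: str) -> str:
    """Remove <tool_call>...</tool_call> blocks left in user-facing text."""
    return TOOL_CALL_RE.sub("", text).strip()


def dedupe_tool_calls(calls: list[dict], limit: int = MAX_TOOL_CALLS) -> list[dict]:
    """Drop repeated (name, arguments) pairs and stop once the limit is reached."""
    seen = set()
    unique = []
    for call in calls:
        # Canonical JSON makes argument order irrelevant when comparing calls.
        key = (call["name"], json.dumps(call.get("arguments", {}), sort_keys=True))
        if key in seen:
            continue
        seen.add(key)
        unique.append(call)
        if len(unique) >= limit:
            break
    return unique


if __name__ == "__main__":
    # Hypothetical tool names and arguments.
    calls = [
        {"name": "web_search", "arguments": {"query": "top songs", "n": 3}},
        {"name": "web_search", "arguments": {"n": 3, "query": "top songs"}},
        {"name": "fetch_url", "arguments": {"url": "https://example.com"}},
    ]
    print(len(dedupe_tool_calls(calls)))  # 2: the second search is a duplicate
    print(strip_leaked_xml('Here you go.<tool_call>{"name": "x"}</tool_call>'))
```

Comparing calls by name plus canonicalized arguments is one simple way to catch a model repeating the same request in a loop. A real implementation would likely also need to repair malformed calls, which this sketch does not attempt.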

Run Gemma 4 in Unsloth Studio:

[Image: Gemma 4 in Unsloth Studio]

What's Changed

New Contributors

Full Changelog: v0.1.3-beta...v0.1.35-beta
