Hey everyone, we’ve updated Gemma 4 training and quants with many fixes. The bugs were universal, affected all packages and implementations, and did not originate from Unsloth. We identified and fixed them, and currently Gemma 4 training works properly only in Unsloth.
You only need 8GB of VRAM to train Gemma-4-E2B locally. Unsloth trains Gemma 4 ~1.5x faster with ~60% less VRAM than FA2 setups.
You can also train the 26B-A4B and 31B models, or train via Unsloth Studio. Studio and the notebooks work for Vision, Text, Audio and inference.
For more details, plus the guide and notebooks for training Gemma 4, see our blog: https://unsloth.ai/docs/models/gemma-4/train
Gemma 4 Training Fixes:
- Gradient accumulation no longer causes losses to explode. Previously you might see losses of 300 to 400 when they should be 10 to 15; Unsloth has this fixed.
- Fixed an IndexError that broke inference for the 26B and 31B models when using transformers.
- `use_cache=False` produced gibberish for E2B and E4B; see huggingface/transformers#45242.
- float16 audio: the -1e9 mask fill value overflows on float16.
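The float16 audio issue comes from hard-coding -1e9 as a mask fill value: float16 can only represent magnitudes up to about 65504, so -1e9 does not fit. A minimal stdlib sketch (not Unsloth's or transformers' actual code) showing why the constant overflows half precision:

```python
import struct

def fits_in_float16(x: float) -> bool:
    """Return True if x can be packed into IEEE 754 half precision."""
    try:
        struct.pack("e", x)  # 'e' = half-precision float format
        return True
    except OverflowError:
        return False

print(fits_in_float16(-65504.0))  # largest finite negative float16: True
print(fits_in_float16(-1e9))      # the old mask fill value: False
```

The usual fix is to derive the fill value from the dtype itself (e.g. `torch.finfo(dtype).min` in PyTorch) instead of a hard-coded -1e9.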
If you see losses higher than 13-15 (for example 100 or 300), gradient accumulation is most likely not being accounted for properly; we have fixed this in both Unsloth and Unsloth Studio.
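The symptom can be reproduced with plain arithmetic. This is an illustrative sketch with made-up numbers, not Unsloth's trainer internals: cross-entropy should be normalized once over all tokens in the accumulated batch, and skipping that normalization inflates the reported loss in proportion to the number of accumulation steps.

```python
# (sum of token losses, token count) for each accumulated microbatch;
# values are hypothetical, chosen to give a healthy per-token loss of ~11.5.
microbatch_losses = [
    (1200.0, 100),
    (950.0, 80),
    (1400.0, 120),
    (1050.0, 100),
]

# Correct: one normalization over the whole accumulated batch.
correct = sum(s for s, _ in microbatch_losses) / sum(n for _, n in microbatch_losses)

# Buggy: sum the per-microbatch mean losses without dividing by the number
# of accumulation steps, so the reported loss grows with the step count.
buggy = sum(s / n for s, n in microbatch_losses)

print(f"correct loss: {correct:.2f}")  # 11.50, in the healthy 10-15 range
print(f"buggy loss:   {buggy:.2f}")    # 46.04, ~4x too high with 4 steps
```

With a larger accumulation count (say 32 steps) the same bug turns a true loss of ~11 into the 300-400 range described above.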
Gemma 4 Quant Re-uploads
We also updated our Gemma 4 GGUFs, so you will need to re-download them. Once again, the quant issues are not related to and did not originate from Unsloth:
- CUDA: check for buffer overlap before fusing - CRITICAL fixes
- `<unused24>` tokens ggml-org/llama.cpp#21566
- kv-cache : support attention rotation for heterogeneous iSWA ggml-org/llama.cpp#21513
- vocab : add byte token handling to BPE detokenizer for Gemma4 ggml-org/llama.cpp#21488
- convert : set "add bos" == True for Gemma 4 ggml-org/llama.cpp#21500
- common : add gemma 4 specialized parser ggml-org/llama.cpp#21418
- llama-model: read final_logit_softcapping for Gemma 4 ggml-org/llama.cpp#21390
- llama: add custom newline split for Gemma 4 ggml-org/llama.cpp#21406
Unsloth Studio Updates
- Add speculative decoding support (ngram-mod, on by default)
- Updated llama.cpp binaries to the latest version, which includes all Gemma 4 fixes
- Fix Qwen3.5 and Gemma 4 training issues
- Enable exporting and saving of Gemma 4 models
- Harden sandbox security for terminal and python tools
- Let recipes use the model loaded in Chat
- Fix empty chat threads on navigation (and whenever switching tabs) and stabilize new chat flow
- Allow non-LLM recipes to run and move Data tab first in executions
- Reuse HF cached repo casing to prevent duplicate downloads
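On the speculative decoding item above: ngram-style drafting proposes candidate tokens by matching the most recent n-gram of context against earlier occurrences in the sequence, then lets the main model verify the draft. A minimal, self-contained sketch of the lookup idea (a hypothetical simplification, not Studio's ngram-mod implementation):

```python
def ngram_draft(tokens, n=2, max_draft=4):
    """Propose draft tokens by matching the last n tokens against the
    most recent earlier occurrence of the same n-gram in the sequence."""
    if len(tokens) < n:
        return []
    key = tuple(tokens[-n:])
    # Scan backwards so the most recent prior match wins.
    for i in range(len(tokens) - n - 1, -1, -1):
        if tuple(tokens[i:i + n]) == key:
            # Draft the tokens that followed that occurrence.
            return tokens[i + n:i + n + max_draft]
    return []

# The suffix (5, 9) appeared earlier, followed by 7, 3, 5, 9,
# so those tokens become the draft for the main model to verify.
seq = [5, 9, 7, 3, 5, 9]
print(ngram_draft(seq, n=2))  # -> [7, 3, 5, 9]
```

Drafting from repeated n-grams is cheap because it needs no extra model, which is why it is a reasonable default for repetitive text such as code or structured output.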
What's Changed
- fix(studio): lazy-import transformers in model_config to fix 5.x version switch by @rolandtannous in #4806
- fix: patch PEFT for Gemma4ClippableLinear in loader checkpoint path (fixes export) by @rolandtannous in #4807
- Fix/gemma4 install script by @Manan17 in #4815
- Fix/llama.cpp building by @mmathew23 in #4804
- Add tests for simplified llama.cpp install policy (from PR #4804) by @danielhanchen in #4817
- Differentiate web search and URL fetch in chat tool UI by @Shine1i in #4802
- Allow non-LLM recipes to run and move Data tab first in executions by @Shine1i in #4805
- studio: reuse HF cached repo casing to prevent duplicate downloads by @Imagineer99 in #4822
- fix(studio): ensure first chat tool call starts in session sandbox by @neodon in #4810
- fix(studio): harden sandbox security for terminal and python tools by @danielhanchen in #4827
- studio: add speculative decoding support (ngram-mod, on by default) by @danielhanchen in #4836
- Add Gemma 4 model sampling defaults by @danielhanchen in #4838
- Add tests for cache case resolution (from PR #4822) by @danielhanchen in #4823
- Bump minimum unsloth version to 2026.4.2 in install scripts by @danielhanchen in #4842
- Fix/studio colab button message: Add fallback message for Colab Studio button when proxy URL fails by @LeoBorcherding in #4866
- [Studio][Optimization]Add vision detection cache to is_vision_model() by @rolandtannous in #4853
- Add tests for is_vision_model() caching behaviour by @danielhanchen in #4855
- Remove Gemma-4 from FORCE_FLOAT32 by @danielhanchen in #4875
- fix: skip redundant HfFileSystem().glob() calls in loader.py by @rolandtannous in #4852
- fix(studio): custom folder scan fails to find GGUF variants when pointing directly at a model directory by @JYYYYYT in #4860
- Add unit tests for loader glob skip guard (from PR #4852) by @danielhanchen in #4854
- Studio: Fix empty chat threads on navigation and stabilize new chat flow by @Imagineer99 in #4872
- Bump minimum unsloth version to 2026.4.4 in install scripts by @danielhanchen in #4876
- split venv_t5 into tiered 5.3.0/5.5.0 and fix trust_remote_code by @rolandtannous in #4878
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #4879
- build(deps): bump oxc-parser from 0.121.0 to 0.123.0 in /studio/backend/core/data_recipe/oxc-validator in the npm-oxc-validator group by @dependabot[bot] in #4776
- Update dependabot.yml by @danielhanchen in #4915
- Let recipes use the model loaded in Chat by @Shine1i in #4840
- build(deps): bump the bun-frontend group across 1 directory with 16 updates by @dependabot[bot] in #4586
New Contributors
Full Changelog: v0.1.35-beta...v0.1.36-beta