Google releases Gemma 4 with four new models: E2B, E4B, 26B-A4B, 31B.
- You can now run and train the Gemma 4 models in Unsloth. Guide / Blog: https://unsloth.ai/docs/models/gemma-4
- The multimodal reasoning models are licensed under Apache 2.0.
- Run E2B and E4B on 6GB RAM (including on phones); run 26B-A4B and 31B on ~18GB.
- GGUFs: https://huggingface.co/collections/unsloth/gemma-4
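As a rough sanity check on the memory figures above, here is a back-of-envelope sketch (not an official sizing formula) of how a quantized GGUF's file size relates to parameter count. The 4.5 bits/weight figure is an assumption for Q4_K-class quants:

```python
def quant_size_gb(n_params_billion: float, bits_per_weight: float = 4.5) -> float:
    """Rough GGUF file size in GB for a quantized model.

    Assumes ~4.5 bits/weight, a ballpark for Q4_K-class quants;
    actual sizes vary by quant mix and embedding precision.
    """
    return n_params_billion * bits_per_weight / 8

# E4B (~4B params) at ~4.5 bpw -> ~2.25 GB, leaving headroom for
# KV cache and runtime overhead within a 6GB RAM budget.
print(quant_size_gb(4))   # ~2.25
# 26B-A4B (26B total params) -> ~14.6 GB, consistent with the ~18GB figure.
print(quant_size_gb(26))  # ~14.625
```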
Updates
- Tool calls for smaller models are now more stable and no longer get cut off.
- Context length is now properly applied.
- Tool calls for all models are now +30% to +80% more accurate.
- Web search now retrieves full page content instead of just summaries.
- The maximum number of tool calls has been increased from 10 to 25.
- Tool calls now terminate more reliably, reducing looping and repetition.
- Added more tool-call healing and de-duplication logic to stop tool calls from leaking XML.
- Tested with `unsloth/Qwen3.5-4B-GGUF` (UD-Q4_K_XL), with web search, code execution, and thinking enabled:
| Metric | Before | After |
|---|---|---|
| XML leaks in response | 10/10 runs | 0/10 runs |
| URL fetches used | 0/10 runs | 4/10 runs |
| Runs with correct song names | 0/10 | 2/10 |
| Avg tool calls | 5.5 | 3.8 |
| Avg response time (s) | 12.3 | 9.8 |
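To illustrate the tool-call fixes above, here is a minimal sketch of what XML leak stripping and de-duplication could look like. The tag name, function names, and logic are hypothetical, not Unsloth's actual implementation:

```python
import re

MAX_TOOL_CALLS = 25  # raised from 10 in this release

def heal_response(text: str) -> str:
    """Strip leaked tool-call XML tags from a model response.

    The <tool_call> tag name is a hypothetical example of the kind of
    markup that could leak into user-visible output.
    """
    return re.sub(r"</?tool_call>", "", text).strip()

def dedupe_calls(calls: list[str]) -> list[str]:
    """Drop consecutive duplicate tool calls to curb looping,
    then cap the total at MAX_TOOL_CALLS."""
    out: list[str] = []
    for call in calls:
        if not out or call != out[-1]:
            out.append(call)
    return out[:MAX_TOOL_CALLS]
```

For example, `heal_response("<tool_call>search('lyrics')</tool_call> done")` yields `"search('lyrics') done"`, and `dedupe_calls(["search", "search", "fetch"])` collapses the repeated call to `["search", "fetch"]`.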
Run Gemma 4 in Unsloth Studio:
What's Changed
- studio: Polish Windows installer/setup logs by @Imagineer99 in #4736
- feat: move folder management into model selector dropdown by @Shine1i in #4731
- fix: clear tool status badge immediately after tool execution by @Shine1i in #4733
- refactor flex attn to prefer flash if possible by @Datta0 in #4734
- Fix Windows local GGUF model loading crash by @danielhanchen in #4730
- Fix OOM model styling in Studio model selectors by @LeoBorcherding in #4738
- feat(studio): strip org prefix in model search to surface unsloth variants by @rolandtannous in #4749
- Fix forward compatibility with transformers 5.x by @danielhanchen in #4752
- Architecture-aware KV cache VRAM estimation (5-path) by @danielhanchen in #4757
- Fix save_pretrained_merged for full-finetuned models by @danielhanchen in #4755
- Feat/prebuiltllamacpp by @mmathew23 in #4741
- Add installer test coverage for prebuilt llama.cpp changes by @danielhanchen in #4756
- fix: studio web search SSL failures and empty page content by @danielhanchen in #4754
- fix: add tokenizers to no-torch deps and TORCH_CONSTRAINT for arm64 macOS py313+ by @danielhanchen in #4748
- fix(studio): allow context length slider to reach model's native limit by @danielhanchen in #4746
- Tests for architecture-aware KV cache estimation by @danielhanchen in #4760
- Fix custom llama.cpp source builds and macos metal source builds by @mmathew23 in #4762
- studio: align composer/code, unify fonts, and remove tool collapse jitter by @Imagineer99 in #4763
- fix(chat): correct loading text for cached models during inference by @AdamPlatin123 in #4764
- fix(security): shell injection in GGML export conversion by @mateeaaaaaaa in #4768
- Add regression test for shell injection fix in GGML conversion by @danielhanchen in #4773
- fix(studio): prevent small models from stalling on tool-calling tasks by @danielhanchen in #4769
- Add regression tests for custom llama prebuilt installer by @danielhanchen in #4772
- Feat/custom llama prebuilt by @mmathew23 in #4771
- studio: fix chat font changes leaking outside chat page by @Imagineer99 in #4775
- feat(studio): display images from Python tool execution in chat UI by @danielhanchen in #4778
- ui improvement by @rolandtannous in #4781
- UI Changes by @danielhanchen in #4782
- fix(studio): improve tool-calling re-prompt for small models by @danielhanchen in #4783
- Pin Gemma-4 transformers requirement to 5.5.0 stable by @danielhanchen in #4784
- Switch llama.cpp default to mainline ggml-org by @danielhanchen in #4785
- Use transformers v5.5-release branch, pin to 5.5.0 by @danielhanchen in #4786
- Fix: pin transformers==4.57.6 in main Studio venv by @danielhanchen in #4788
- fix(studio): build llama.cpp from master for Gemma 4 support by @danielhanchen in #4790
- fix name fixed name by @rolandtannous in #4791
- fix(studio): prioritize curated defaults in Recommended model list by @danielhanchen in #4792
- fix windows llama.cpp compile from source issue by @mmathew23 in #4793
- fix(studio): pin llama.cpp to b8637 (Gemma 4 support) by @danielhanchen in #4796
- fix(studio): don't set trust_remote_code for Gemma 4 training by @danielhanchen in #4795
- fix(studio): revert llama.cpp default tag to latest by @danielhanchen in #4797
- fix(studio): suppress fatal error when ggml-org has no prebuilt manifest by @danielhanchen in #4799
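The architecture-aware KV cache VRAM estimation mentioned in #4757 can be sketched with the standard dense-transformer formula below. This is a simplified illustration, not the 5-path implementation from the PR; all parameter names are assumptions:

```python
def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   context_len: int, dtype_bytes: int = 2) -> int:
    """Estimate KV cache size for a dense transformer.

    Per token, each layer stores one K and one V vector per KV head:
    2 * n_layers * n_kv_heads * head_dim * dtype_bytes bytes.
    dtype_bytes=2 assumes fp16/bf16 cache; quantized caches use less.
    """
    return 2 * n_layers * n_kv_heads * head_dim * context_len * dtype_bytes

# Example: a Llama-style config with GQA (32 layers, 8 KV heads,
# head_dim 128) at 8192 context in fp16 needs exactly 1 GiB.
print(kv_cache_bytes(32, 8, 128, 8192) / 2**30)  # 1.0
```

Architecture awareness matters because grouped-query attention (fewer KV heads than query heads), sliding-window layers, and MoE variants each change which terms apply, which is presumably why the estimator needs multiple paths.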
New Contributors
- @AdamPlatin123 made their first contribution in #4764
- @mateeaaaaaaa made their first contribution in #4768
Full Changelog: v0.1.3-beta...v0.1.35-beta
