We've merged over 150 PRs this week so lots of new updates, a new model Hub and look! Ensure you install the latest v0.1.463-beta or 2026.6.6 - if DiffusionGemma still doesn't work, we'll be releasing another update in a few hours.
DiffusionGemma + Gemma 4 MTP + Audio
- Run and train DiffusionGemma via Unsloth Studio. Install the latest
v0.1.462-betaif DiffusionGemma wasn't previously working. - Gemma 4 MTP is here! Run Gemma 4 around 2x faster with MTP - MTP is auto enabled in Unsloth Studio.
- Audio chat is now supported for Gemma 4 (
wav,mp3,m4a,flac,webm). - Preserve Thinking added to Gemma 4.
Hub + Download Manager (Experimental)
- Added a new Hub page for browsing, downloading, and managing Hugging Face models and datasets.
- Unsloth can now detect models and datasets already on your machine and show them alongside downloaded assets.
- Downloaded GGUF models now have direct Run / New Chat actions.
Chat with Files / RAG (Experimental)
- Added Chat with Files in Studio, letting you ask questions over your own documents and knowledge bases.
- Supports hybrid search, citations, PDF previews, per-thread documents, and a built-in
search_knowledge_basetool.
New Update Button + Hardware Support
- Unsloth now uses fresh, up-to-date llama.cpp prebuilts across CUDA, ROCm, Windows, Linux, and macOS.
- Added an in-app Update llama.cpp button so users can update the local backend without reinstalling Studio.
- Improved Windows / WSL AMD support, Strix Halo ROCm support, Blackwell CUDA selection, and clearer installer messages.
Local Chat, Tools & API Compatibility
- Local tool calling is more reliable, with better ordering of tool cards, fewer duplicate tool loops, and support for tool use with GGUF vision models.
- Improved OpenAI-compatible API and Anthropic-compatible API behavior for local Studio servers, including better errors, token usage, stop reasons, and Claude Code compatibility.
Tool Calling, MCP, Encrypted Cloudflare Tunnels
- Bypass Permissions, Tool Call Permissions (Approve, Always Approve, Deny)
- 50% to 90% less tool call nudging issues without any accuracy loss
- MCP, Artifacts are now select-able
- Tensor parallelism is now enabled for GGUFs - get +30% throughput!
- Cloudflare HTTPS free tunnels is now added allowing for end to end encrypted studios!
Training & General Fixes
- Improved MLX support with better model labels, generation speed stats, and fixes for VLM training.
- Fixed several training and dataset edge cases, including non-writable Hugging Face caches and custom dataset mappings.
- Added many UI polish fixes across chat, menus, model picker, dark mode, import/export, and settings.
To update Unsloth or install a new Unsloth Studio, you must use:
macOS, Linux, WSL:
curl -fsSL https://unsloth.ai/install.sh | sh
Windows:
irm https://unsloth.ai/install.ps1 | iex
Warning
DO NOT USE unsloth studio update since packaging will not get the latest updates
What's Changed
- Studio: llama.cpp update banner redesign, About tab license info, UI polish by @shimmyshimmer in #6196
- Bump install.sh / install.ps1 pin to unsloth>=2026.6.3 by @danielhanchen in #6212
- Expose runtime context length for hub models by @alkinun in #6154
- Studio: fix llama.cpp update banner offering a downgrade / sticking on mix releases by @oobabooga in #6219
- Fix kwarg spacing in training files to satisfy pre-commit by @shimmyshimmer in #6209
- Studio: reword the Cloudflare line when the public probe fails by @danielhanchen in #6217
- fix: deduplicate lemonade ROCm prebuilt selection log by @LeoBorcherding in #6021
- Stop false RoPE 'default' warning and fix rope drift gate on transformers 5 by @danielhanchen in #6223
- fix(studio): load run.py by path for editable installs by @jimdawdy-hub in #5909
- fix(studio): inherit llama_extra_args and honor --no-mmproj by @jimdawdy-hub in #5902
- fix(studio): adopt server-loaded model before chat auto-load by @jimdawdy-hub in #5900
- Fix stale sidebar regression test to match the gap-px markup by @danielhanchen in #6232
- Studio: gate the staged prebuilt runtime validation behind a flag (off by default) by @danielhanchen in #6216
- Fix FastModel config passthrough for sequence classification by @alkinun in #6203
- fix: decode subprocess output as UTF-8 in save.py on Windows by @dylanschroers in #6218
- patch: fix EmptyLogits gathering in nested payloads and Accelerate recursively_apply by @MdHussain121 in #6092
- Studio: show Apple GPU temperature and power in the GPU monitor (macOS) by @Ban921 in #6187
- Studio: Add inline confirmation (Allow/Always allow/Deny) for tool calls by @oobabooga in #5869
- Studio: guard Apple GPU power against negative counter-reset readings by @danielhanchen in #6235
- Fix step count mismatch when sequence packing is enabled by @IrakliXYZ in #5967
- fix/uv-bytecode-timeout by @alkinun in #6166
- Studio: tune llama.cpp env for data-center GPUs by @danielhanchen in #6098
- Studio: drop the on-disk freshness cache after a llama.cpp update by @danielhanchen in #6234
- Add missing RAG deps to no-torch Studio runtime requirements by @danielhanchen in #6236
- Studio: rounded rectangle hover states for menu items instead of pills by @shimmyshimmer in #6210
- docs: repository cleanup by @Agnibha007 in #5617
- Run cross-platform parity test on Windows and macOS in CI by @danielhanchen in #6241
- chore(studio/frontend): normalize line endings to LF by @danielhanchen in #6012
- fix: respect absolute export paths to prevent cross-drive copy failures (WinError 112) by @anmolxlight in #6088
- Studio: Add Tensor-Parallel llama.cpp support by @oobabooga in #6040
- Studio: Add custom provider option to Connections by @Imagineer99 in #6112
- Studio: model selector and settings polish by @shimmyshimmer in #6240
- Studio: login card polish and sidebar label alignment by @shimmyshimmer in #6242
- Studio: pinnable plus menu items and saved prompt pins by @shimmyshimmer in #6237
- Studio: bottom update banners, smooth llama.cpp progress, re-prompt after copy by @shimmyshimmer in #6233
- fix(studio/responses): forward chat_template_kwargs enable_thinking to chat request by @Anai-Guo in #6202
- Studio: fix WSL Strix Halo GPU on reinstall (ROCDXG drop-in + system HIP before bundle) by @danielhanchen in #6227
- Studio: fully rounded Hub pills and refreshed menu icons by @shimmyshimmer in #6248
- Studio: use px-2.5 for Hub option menu padding by @shimmyshimmer in #6249
- Studio: fix Downloaded model list disappearing and order it by last download by @danielhanchen in #6247
- Studio: new-chat shortcut, composer draft autosave, archive threads by @NilayYadav in #5771
- Studio: persist speculative decoding preference across restart and model switch by @oobabooga in #6169
- Studio: refine menu chevron, tick icon, and one-line plus-menu shape by @shimmyshimmer in #6251
- Studio: serve DiffusionGemma with live in-place denoising and honest stats by @danielhanchen in #6250
- Studio: bundle Gemma 4 chat templates (E2B/E4B + larger) and auto-apply to unsloth/gemma-4-*-GGUF by @danielhanchen in #6245
- feat(studio): implement S3 dataset loading (completes #5951) by @ashzak in #6222
- Studio: cache MCP tool discovery instead of re-probing every chat send by @oobabooga in #5828
- Fix Studio S3 dataset panel layout by @wasimysaid in #6252
- Attach DiffusionGemma visual-server from the prebuilt bundle by @danielhanchen in #6254
New Contributors
- @jimdawdy-hub made their first contribution in #5909
- @dylanschroers made their first contribution in #6218
- @MdHussain121 made their first contribution in #6092
- @Ban921 made their first contribution in #6187
- @IrakliXYZ made their first contribution in #5967
- @Agnibha007 made their first contribution in #5617
Full Changelog: v0.1.451-beta...v0.1.46-beta