unslothai/unsloth v0.1.463-beta on GitHub

We've merged over 150 PRs this week so lots of new updates, a new model Hub and look! Ensure you install the latest v0.1.463-beta or 2026.6.6. DiffusionGemma, Gemma 4 MTP and MiniMax-M3 are all now supported.

DiffusionGemma + Gemma 4 MTP + Audio

Run and train DiffusionGemma via Unsloth Studio. Install the latest v0.1.462-beta if DiffusionGemma wasn't previously working.
Gemma 4 MTP is here! Run Gemma 4 around 2x faster with MTP - MTP is auto enabled in Unsloth Studio.
Audio chat is now supported for Gemma 4 (wav, mp3, m4a,flac, webm).
Preserve Thinking added to Gemma 4.

Hub + Download Manager (Experimental)

Added a new Hub page for browsing, downloading, and managing Hugging Face models and datasets.
Unsloth can now detect models and datasets already on your machine and show them alongside downloaded assets.
Downloaded GGUF models now have direct Run / New Chat actions.

Chat with Files / RAG (Experimental)

Added Chat with Files in Studio, letting you ask questions over your own documents and knowledge bases.
Supports hybrid search, citations, PDF previews, per-thread documents, and a built-in search_knowledge_base tool.

New Update Button + Hardware Support

Unsloth now uses fresh, up-to-date llama.cpp prebuilts across CUDA, ROCm, Windows, Linux, and macOS.
Added an in-app Update llama.cpp button so users can update the local backend without reinstalling Studio.
Improved Windows / WSL AMD support, Strix Halo ROCm support, Blackwell CUDA selection, and clearer installer messages.

Local Chat, Tools & API Compatibility

Local tool calling is more reliable, with better ordering of tool cards, fewer duplicate tool loops, and support for tool use with GGUF vision models.
Improved OpenAI-compatible API and Anthropic-compatible API behavior for local Studio servers, including better errors, token usage, stop reasons, and Claude Code compatibility.

Tool Calling, MCP, Encrypted Cloudflare Tunnels

Bypass Permissions, Tool Call Permissions (Approve, Always Approve, Deny)
50% to 90% less tool call nudging issues without any accuracy loss
MCP, Artifacts are now select-able
Tensor parallelism is now enabled for GGUFs - get +30% throughput!
Cloudflare HTTPS free tunnels is now added allowing for end to end encrypted studios!

Training & General Fixes

Improved MLX support with better model labels, generation speed stats, and fixes for VLM training.
Fixed several training and dataset edge cases, including non-writable Hugging Face caches and custom dataset mappings.
Added many UI polish fixes across chat, menus, model picker, dark mode, import/export, and settings.

To update Unsloth or install a new Unsloth Studio, you must use:

macOS, Linux, WSL:

curl -fsSL https://unsloth.ai/install.sh | sh

Windows:

irm https://unsloth.ai/install.ps1 | iex

Warning

DO NOT USE unsloth studio update since packaging will not get the latest updates

What's Changed

Studio: llama.cpp update banner redesign, About tab license info, UI polish by @shimmyshimmer in #6196
Bump install.sh / install.ps1 pin to unsloth>=2026.6.3 by @danielhanchen in #6212
Expose runtime context length for hub models by @alkinun in #6154
Studio: fix llama.cpp update banner offering a downgrade / sticking on mix releases by @oobabooga in #6219
Fix kwarg spacing in training files to satisfy pre-commit by @shimmyshimmer in #6209
Studio: reword the Cloudflare line when the public probe fails by @danielhanchen in #6217
fix: deduplicate lemonade ROCm prebuilt selection log by @LeoBorcherding in #6021
Stop false RoPE 'default' warning and fix rope drift gate on transformers 5 by @danielhanchen in #6223
fix(studio): load run.py by path for editable installs by @jimdawdy-hub in #5909
fix(studio): inherit llama_extra_args and honor --no-mmproj by @jimdawdy-hub in #5902
fix(studio): adopt server-loaded model before chat auto-load by @jimdawdy-hub in #5900
Fix stale sidebar regression test to match the gap-px markup by @danielhanchen in #6232
Studio: gate the staged prebuilt runtime validation behind a flag (off by default) by @danielhanchen in #6216
Fix FastModel config passthrough for sequence classification by @alkinun in #6203
fix: decode subprocess output as UTF-8 in save.py on Windows by @dylanschroers in #6218
patch: fix EmptyLogits gathering in nested payloads and Accelerate recursively_apply by @MdHussain121 in #6092
Studio: show Apple GPU temperature and power in the GPU monitor (macOS) by @Ban921 in #6187
Studio: Add inline confirmation (Allow/Always allow/Deny) for tool calls by @oobabooga in #5869
Studio: guard Apple GPU power against negative counter-reset readings by @danielhanchen in #6235
Fix step count mismatch when sequence packing is enabled by @IrakliXYZ in #5967
fix/uv-bytecode-timeout by @alkinun in #6166
Studio: tune llama.cpp env for data-center GPUs by @danielhanchen in #6098
Studio: drop the on-disk freshness cache after a llama.cpp update by @danielhanchen in #6234
Add missing RAG deps to no-torch Studio runtime requirements by @danielhanchen in #6236
Studio: rounded rectangle hover states for menu items instead of pills by @shimmyshimmer in #6210
docs: repository cleanup by @Agnibha007 in #5617
Run cross-platform parity test on Windows and macOS in CI by @danielhanchen in #6241
chore(studio/frontend): normalize line endings to LF by @danielhanchen in #6012
fix: respect absolute export paths to prevent cross-drive copy failures (WinError 112) by @anmolxlight in #6088
Studio: Add Tensor-Parallel llama.cpp support by @oobabooga in #6040
Studio: Add custom provider option to Connections by @Imagineer99 in #6112
Studio: model selector and settings polish by @shimmyshimmer in #6240
Studio: login card polish and sidebar label alignment by @shimmyshimmer in #6242
Studio: pinnable plus menu items and saved prompt pins by @shimmyshimmer in #6237
Studio: bottom update banners, smooth llama.cpp progress, re-prompt after copy by @shimmyshimmer in #6233
fix(studio/responses): forward chat_template_kwargs enable_thinking to chat request by @Anai-Guo in #6202
Studio: fix WSL Strix Halo GPU on reinstall (ROCDXG drop-in + system HIP before bundle) by @danielhanchen in #6227
Studio: fully rounded Hub pills and refreshed menu icons by @shimmyshimmer in #6248
Studio: use px-2.5 for Hub option menu padding by @shimmyshimmer in #6249
Studio: fix Downloaded model list disappearing and order it by last download by @danielhanchen in #6247
Studio: new-chat shortcut, composer draft autosave, archive threads by @NilayYadav in #5771
Studio: persist speculative decoding preference across restart and model switch by @oobabooga in #6169
Studio: refine menu chevron, tick icon, and one-line plus-menu shape by @shimmyshimmer in #6251
Studio: serve DiffusionGemma with live in-place denoising and honest stats by @danielhanchen in #6250
Studio: bundle Gemma 4 chat templates (E2B/E4B + larger) and auto-apply to unsloth/gemma-4-*-GGUF by @danielhanchen in #6245
feat(studio): implement S3 dataset loading (completes #5951) by @ashzak in #6222
Studio: cache MCP tool discovery instead of re-probing every chat send by @oobabooga in #5828
Fix Studio S3 dataset panel layout by @wasimysaid in #6252
Attach DiffusionGemma visual-server from the prebuilt bundle by @danielhanchen in #6254

New Contributors

@jimdawdy-hub made their first contribution in #5909
@dylanschroers made their first contribution in #6218
@MdHussain121 made their first contribution in #6092
@Ban921 made their first contribution in #6187
@IrakliXYZ made their first contribution in #5967
@Agnibha007 made their first contribution in #5617

Full Changelog: v0.1.451-beta...v0.1.46-beta

unslothai/unsloth v0.1.463-beta DiffusionGemma + Gemma 4 MTP on GitHub