Hey guys, we revamped the entire Unsloth Studio UI and UX to put the emphasis on chat and training:
- Added a collapsible sidebar based on community feedback
- You can now delete chats and search past conversations
- New Preserve Thinking toggle for models that support it, like Qwen3.6
- Cleaner, more consistent design with easier navigation
- Expanded Settings page with options to change your profile picture, name, and more
- No more entering your Hugging Face token twice
- gpt-oss now has low, medium and high thinking toggles
- Now uses the latest llama.cpp prebuilt binaries, even on Linux CUDA
- Lots of bug, consistency and stability fixes
- Kimi-K2.6 can now be run!
- We also added experimental API support. Guides, announcements, etc. will come next week.
Qwen3.6 was already supported in Unsloth Studio for running and training. You can train and run Qwen3.6-27B right now!
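Since the new experimental API exposes OpenAI-compatible endpoints, a standard chat-completions request should work against it. Below is a minimal sketch of building such a request; the base URL, port, and model id are placeholder assumptions for illustration, not documented Studio defaults:

```python
import json
import urllib.request

# Assumed local address of the Studio API server (placeholder, not a documented default).
BASE_URL = "http://localhost:8000/v1"

# A standard OpenAI-style chat-completions payload; the model id is a placeholder.
payload = {
    "model": "unsloth/Qwen3.6-27B",
    "messages": [
        {"role": "user", "content": "Summarize LoRA in one sentence."}
    ],
    "stream": False,
}

body = json.dumps(payload).encode("utf-8")
req = urllib.request.Request(
    f"{BASE_URL}/chat/completions",
    data=body,
    headers={"Content-Type": "application/json"},
    method="POST",
)
# With the server running, urllib.request.urlopen(req) would send the request
# and return a JSON response in the usual OpenAI chat-completions shape.
print(req.full_url)
```

The same payload shape should also work with any OpenAI-compatible client pointed at the local base URL.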
What's Changed
- Only run ldconfig CUDA-linking recovery when we have permission by @danielhanchen in #4930
- Fix Mistral DPO/preference training crash on non-xformers platforms (e.g. Intel XPU) by @cheehook in #4889
- Fix raw text paragraph break normalization by @kiankyars in #4884
- Studio: keep chat input visible and fix compare pane clipping by @Imagineer99 in #4924
- fix: check find() return value before adding offset in try_fix_tokenizer by @Ricardo-M-L in #4923
- updated models template mappers. added lfm2.5vl450m to transformers 5… by @rolandtannous in #4939
- Revert "updated models template mappers. added lfm2.5vl450m to transformers 5…" by @rolandtannous in #4945
- Add AMD ROCm/HIP support across installer and hardware detection by @danielhanchen in #4720
- Pin bitsandbytes to continuous-release_main on ROCm (4-bit decode fix) by @danielhanchen in #4954
- Fix Gemma-4 GRPO catastrophic KL divergence with TRL 1.0.0+ by @danielhanchen in #4934
- Add ROCm test suite (companion to #4720) by @danielhanchen in #4824
- updating gemma4 script by @Manan17 in #4992
- Move gemma4 script by @Manan17 in #4994
- studio: fix route transition DOM duplication via AnimatePresence mode="wait" by @AdamPlatin123 in #4987
- Studio: Prompt manager, message deletion, and chat UI improvements by @Imagineer99 in #4938
- Pin kernels==0.12.1 to fix training import failure by @rolandtannous in #5000
- Studio: Expose openai and anthropic compatible external API end points by @danielhanchen in #4956
- studio: skip training status/metrics polling when idle by @AdamPlatin123 in #4988
- studio: fix api-keys access + refresh by @wasimysaid in #5005
- Studio: Polish API key copy button and harden async clipboard fallback by @Imagineer99 in #5006
- fix(studio): default chart view to full training history by @Barath19 in #5007
- [Studio] Show non exported models in chat UI by @Datta0 in #4892
- [Studio] Install flash attn at setup time for linux by @Datta0 in #4979
- fix(studio): remove 300s cap on load_checkpoint (inherits 3600s default) by @TF-MTGE in #4922
- Studio: honor explicit GGUF ctx and default to 4096 when weights exceed VRAM by @danielhanchen in #5011
- Studio: make GGUF disk-space preflight cache-aware by @danielhanchen in #5012
- Studio: anchor ctx-slider warning threshold at 4096 when weights exceed VRAM by @danielhanchen in #5014
- studio: show HF model download progress in training start overlay by @danielhanchen in #4894
- studio: stream export worker output into the export dialog by @danielhanchen in #4897
- Fix num_items_in_batch GA for Gemma4 by @Datta0 in #4998
- studio: pin peft to 0.18.1 to fix export subprocess issues by @rolandtannous in #5015
- Studio: live model-load progress + rate/ETA on download and load by @danielhanchen in #5017
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #5004
- Fix bitsandbytes ROCm install by using pip instead of uv by @edamamez in #4966
- Studio: split model-load progress label across two rows by @danielhanchen in #5020
- Studio: hard-stop at n_ctx with a 'Context limit reached' toast by @danielhanchen in #5021
- [moe][gemma4] Target MoE for gemma4 by @Datta0 in #4913
- Add configurable PyTorch mirror via UNSLOTH_PYTORCH_MIRROR env var by @rolandtannous in #5024
- Studio: support GGUF variant selection for non-suffixed repos by @Imagineer99 in #5023
- fix: prevent offline freeze by fixing stats retry and forwarding local_files_only by @DavidSolanas in #5016
- Respect classification head skip list on pre-quantized 4-bit checkpoints (#5027) by @danielhanchen in #5034
- fix(rocm): tighten gfx regex to ignore generic ISA lines by @danielhanchen in #5033
- Fix grad-accum accepts_loss_kwargs detection for vision wrappers by @danielhanchen in #5036
- grpo_compute_loss_slow called with wrong positional args by @jonahsamost in #4887
- Gate trl disable_gradient_checkpointing warning on UNSLOTH_ENABLE_LOGGING by @danielhanchen in #5038
- Studio: refresh Downloaded GGUF list and recurse into variant subdirs by @danielhanchen in #5032
- feat: Add support for OLMo-3 model by @OnePunchMonk in #4678
- feat: Add cactus QAT scheme support by @OnePunchMonk in #4679
- Re-apply #4939: updated models template mappers by @rolandtannous in #4950
- Studio: add folder browser modal for Custom Folders by @danielhanchen in #5035
- Bump Studio installer minimum to 2026.4.5 by @danielhanchen in #5041
- fix Gemma4 flash attn disable by @mmathew23 in #5045
- BUG: fix _fix_chat_template for ChatML templates missing add_generation_prompt (#4150) by @kimimgo in #4426
- fix: use direct registry API for PATH writes instead of SetEnvironmentVariable by @Etherll in #4961
- Chat-template repair: warn-by-default, AST classification, dict support by @danielhanchen in #5049
- Restrict flash attn to <=256 head dim. Consolidate attn impl checks by @Datta0 in #5051
- Remove legacy venv Scripts entry from User PATH on upgrade by @danielhanchen in #5060
- Fix review findings for chat-template repair (#5049) by @danielhanchen in #5056
- Studio: Ollama support, recommended folders, Custom Folders UX polish by @danielhanchen in #5050
- feat(studio): replace navbar with collapsible sidebar by @wasimysaid in #4936
- fix audio dataset preview and finetuning by @CodeMan62 in #5043
- Chat first onboarding by @wasimysaid in #5063
- Fix onboarding followups by @wasimysaid in #5064
- Studio: Default Gemma fallback for chat + AI assist by @Imagineer99 in #5066
- fix: multi-GPU inference crash for bnb 4-bit/8-bit models by @danielhanchen in #5068
- Add Qwen3.6 inference defaults for Studio by @danielhanchen in #5065
- Add qwen3.6 script by @Manan17 in #5084
- Studio: forward standard OpenAI tools / tool_choice to llama-server by @rolandtannous in #5099
- fix(studio/chat): stop stream when trashing a thread from sidebar by @rolandtannous in #5067
- Studio: Local profile customization in settings and sync sidebar identity by @Imagineer99 in #5088
- Studio: Show LoRA live logs and update GGUF quant options by @Imagineer99 in #5058
- Studio: prefer mainstream clipboard copy over deprecated one by @G07cha in #5109
- Studio: Improve chat composition, fix scroll behaviour, and refine sidebar UX by @Imagineer99 in #5089
- Studio: forward standard OpenAI tools / tool_choice on /v1/responses (Codex compat) by @rolandtannous in #5122
- Studio: support images on /v1/messages (Anthropic-compat) by @rolandtannous in #5128
- Coerce TRL's tuple-cached _*_available flags to bool by @danielhanchen in #5129
- Studio: Smoother thread switching in chat by @Imagineer99 in #5126
- Studio: Replace assistant UI shared autoscroll with per-panel scrolling by @Imagineer99 in #5127
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #5117
- Fix tokenizer save gemma by @Datta0 in #5115
- update gemma4 chat templates by @Datta0 in #5116
- Bump installer floor to 2026.4.7 by @danielhanchen in #5134
- fix/llamacpp_prebuilt_install by @mmathew23 in #5135
- Studio: fix stale test_exception_result_cached test for vision cache by @danielhanchen in #5145
- fix: patch CONTROL type for special tokens in sentencepiece GGUF export by @octo-patch in #5080
- fix(install): clear STUDIO_LOCAL_* env on POSIX normal install by @danielhanchen in #5146
- Add tauri by @wasimysaid in #5144
- Studio: detect reasoning_effort and preserve_thinking in chat templates by @danielhanchen in #5149
New Contributors
- @cheehook made their first contribution in #4889
- @Ricardo-M-L made their first contribution in #4923
- @Barath19 made their first contribution in #5007
- @TF-MTGE made their first contribution in #4922
- @edamamez made their first contribution in #4966
- @DavidSolanas made their first contribution in #5016
- @jonahsamost made their first contribution in #4887
- @kimimgo made their first contribution in #4426
- @CodeMan62 made their first contribution in #5043
- @G07cha made their first contribution in #5109
- @octo-patch made their first contribution in #5080
Full Changelog: v0.1.36-beta...v0.1.37-beta