Hey guys, it's only been 2 days since our last release, but we've got many more important updates:
- Inference is now 20–30% faster. Previously, tool-calling and the repeat penalty could slow inference below normal speeds. Inference tokens/s should now be on par with `llama-server`/llama.cpp.
- Now auto-detects older or pre-existing models downloaded from LM Studio, Hugging Face, and similar sources.
- Inference tokens/s speed is now calculated correctly. Previously, tokens/s included startup time, which made the displayed speed look slower than it actually was. It should now reflect 'true' inference speed.
- CPU usage no longer spikes. Previously, the inline querier's identity changed every render, causing `useLiveQuery` to resubscribe continuously.
- Unsloth Studio now has a shutdown (x) button and shuts down properly. Previously, closing it after opening from the desktop icon would not close it properly. Now, launching from the shortcut also opens a terminal, and closing that terminal fully exits Unsloth Studio. If you still have it open from a previous session, you can restart your computer or run `lsof -i :8888` then `kill -9 <PID>`.
- Even better tool-calling and web search with reduced errors.
- Updated documentation with lots of new info on deleting models, uninstalling, etc.
- Cleaner, smarter install and setup logging across Windows and Linux. Output is now easier to read with consistent formatting, quieter by default for a smoother experience, and supports richer `--verbose` diagnostics when you want full technical detail.
- You can now view your training history.
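For the curious: the CPU-spike fix above comes down to referential identity. Hooks like `useLiveQuery` compare the querier function by reference, so an inline arrow function (a new object on every render) looks like a brand-new query each time and triggers a fresh subscription. Here's a minimal, self-contained sketch of the mechanism (the `liveQuery` stand-in and names are illustrative, not Unsloth's actual code):

```typescript
type Querier = () => string;

let subscriptions = 0;
let lastQuerier: Querier | null = null;

// Simplified stand-in for the hook's subscription logic:
// a reference-equality check decides whether to resubscribe.
function liveQuery(querier: Querier): void {
  if (querier !== lastQuerier) {
    subscriptions++; // tear down old subscription, create a new one
    lastQuerier = querier;
  }
}

// Buggy pattern: a new closure is created on every "render",
// so every call looks like a different query.
for (let i = 0; i < 3; i++) {
  liveQuery(() => "SELECT * FROM training_runs");
}
console.log(subscriptions); // 3: one resubscribe per render

// Fixed pattern: a stable reference (hoisted, or memoized in React
// via useCallback) subscribes exactly once.
subscriptions = 0;
lastQuerier = null;
const stableQuerier: Querier = () => "SELECT * FROM training_runs";
for (let i = 0; i < 3; i++) {
  liveQuery(stableQuerier);
}
console.log(subscriptions); // 1: subscribed once
```

The same reasoning applies to any hook that takes a callback and diffs it by reference: keep the callback's identity stable across renders, or pass an explicit dependency array where the hook supports one.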
What's Changed
- Bump installer min version to 2026.3.12 by @danielhanchen in #4600
- Fix Colab Studio launch and setup.ps1 box alignment by @danielhanchen in #4601
- Fix Colab huggingface-hub conflict, ensurepip fallback, bump to 2026.3.14 by @danielhanchen in #4603
- Update README.md by @rolandtannous in #4604
- fix: skip flex_attention for models with non-zero attention_dropout by @Abhinavexists in #4605
- Fix Colab setup skipping llama.cpp installation by @rolandtannous in #4618
- fix: show recommended models in search results by @Shine1i in #4615
- studio: align Dataset/Parameters/Training cards, fix expandable height, animate LoRA settings by @Imagineer99 in #4614
- fix: Windows installer fails on _yaml.pyd Access Denied (os error 5) by @Etherll in #4617
- studio: humanize ETA display for long training runs by @RadouaneElhajali in #4608
- fix: add python-json-logger to data-designer-deps by @Shine1i in #4627
- [Studio] Colab fix - Allow install_python_stack to run on Colab by @rolandtannous in #4633
- Fix repetition_penalty default causing 24% TPS drop in GGUF inference by @danielhanchen in #4634
- fix: install.sh Mac Intel compatibility + Studio no-torch support by @danielhanchen in #4624
- tests: add no-torch / Intel Mac test suite by @danielhanchen in #4646
- fix: use unsloth[huggingfacenotorch] instead of --no-deps in no-torch mode by @danielhanchen in #4647
- Fix Gemma3N audio training stride assertion with non-reentrant checkpointing by @danielhanchen in #4629
- Fix missing num_items_in_batch in unsloth_prediction_step by @danielhanchen in #4616
- Make Studio shortcuts launch in a visible terminal by @danielhanchen in #4638
- studio: setup log styling by @Imagineer99 in #4494
- Fix ~1.2s TTFT penalty when tools are enabled in Studio by @danielhanchen in #4639
- Fix GGUF GPU fit check to account for KV cache VRAM by @danielhanchen in #4623
- feat: update app icons to rounded logo by @Shine1i in #4640
- Streaming tool detection: guard late tool_calls, filter incomplete fragments by @danielhanchen in #4648
- fix: install no-torch runtime deps via requirements file by @danielhanchen in #4649
- Fix orphan server cleanup killing user's own llama-server by @danielhanchen in #4622
- fix: add auth + UX improvements to shutdown button by @Shine1i in #4642
- Fix inference failing for transformers 5.x models (trust_remote_code) by @danielhanchen in #4652
- fix: no-torch install deps without pulling torch transitively by @danielhanchen in #4650
- Detect always-on reasoning models and show Think button as locked-on by @danielhanchen in #4654
- fix: replace navbar shutdown text button with icon-only button by @Shine1i in #4655
- Fall back to parsing model name when HF API has no param count by @danielhanchen in #4656
- fix: disable OCR in pymupdf4llm PDF extraction by @Shine1i in #4659
- Fix HF cache default and show LM Studio models in chat/inference by @rolandtannous in #4653
- Bump minimum unsloth version to 2026.3.16 in install scripts by @danielhanchen in #4663
New Contributors
- @Abhinavexists made their first contribution in #4605
- @RadouaneElhajali made their first contribution in #4608
Full Changelog: v0.1.2-beta...v0.1.25-beta