Hey guys, it's only been 2 days since our last release, but we've got many more important updates:
- Inference is now 20–30% faster. Previously, tool-calling and the repeat penalty could slow inference below normal speeds. Inference tokens/s should now be on par with `llama-server`/llama.cpp.
- Now auto-detects older or pre-existing models downloaded from LM Studio, Hugging Face, and similar sources.
- Inference tokens/s speed is now calculated correctly. Previously, tokens/s included startup time, which made the displayed speed look slower than it actually was. It should now reflect 'true' inference speed.
- CPU usage no longer spikes. Previously, the inline querier's identity changed every render, causing `useLiveQuery` to resubscribe continuously.
- Unsloth Studio now has a shutdown (x) button and shuts down properly. Previously, closing it after opening from the desktop icon would not close it properly. Now, launching from the shortcut also opens a terminal, and closing that terminal fully exits Unsloth Studio. If you still have it open from a previous session, you can restart your computer or run `lsof -i :8888` then `kill -9 <PID>`.
- Even better tool-calling and web search with reduced errors.
- Updated documentation with lots of new info on deleting models, uninstalling, etc.
- Cleaner, smarter install and setup logging across Windows and Linux. Output is now easier to read with consistent formatting, quieter by default for a smoother experience, and supports richer `--verbose` diagnostics when you want full technical detail.
- You can now view your training history.
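For the curious: the CPU-spike fix above comes down to referential identity. Hooks like `useLiveQuery` compare the querier function by reference, so an inline arrow function (a new object on every render) looks like a brand-new query each time and triggers a fresh subscription. Here's a minimal, self-contained sketch of the mechanism (the `liveQuery` stand-in and names are illustrative, not Unsloth's actual code):

```typescript
type Querier = () => string;

let subscriptions = 0;
let lastQuerier: Querier | null = null;

// Simplified stand-in for the hook's subscription logic:
// a reference-equality check decides whether to resubscribe.
function liveQuery(querier: Querier): void {
  if (querier !== lastQuerier) {
    subscriptions++; // tear down old subscription, create a new one
    lastQuerier = querier;
  }
}

// Buggy pattern: a new closure is created on every "render",
// so every call looks like a different query.
for (let i = 0; i < 3; i++) {
  liveQuery(() => "SELECT * FROM training_runs");
}
console.log(subscriptions); // 3: one resubscribe per render

// Fixed pattern: a stable reference (hoisted, or memoized in React
// via useCallback) subscribes exactly once.
subscriptions = 0;
lastQuerier = null;
const stableQuerier: Querier = () => "SELECT * FROM training_runs";
for (let i = 0; i < 3; i++) {
  liveQuery(stableQuerier);
}
console.log(subscriptions); // 1: subscribed once
```

The same reasoning applies to any hook that takes a callback and diffs it by reference: keep the callback's identity stable across renders, or pass an explicit dependency array where the hook supports one.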
What's Changed
- Bump installer min version to 2026.3.12 by @danielhanchen in #4600
- Fix Colab Studio launch and setup.ps1 box alignment by @danielhanchen in #4601
- Fix Colab huggingface-hub conflict, ensurepip fallback, bump to 2026.3.14 by @danielhanchen in #4603
- Update README.md by @rolandtannous in #4604
- fix: skip flex_attention for models with non-zero attention_dropout by @Abhinavexists in #4605
- Fix Colab setup skipping llama.cpp installation by @rolandtannous in #4618
- fix: show recommended models in search results by @Shine1i in #4615
- studio: align Dataset/Parameters/Training cards, fix expandable height, animate LoRA settings by @Imagineer99 in #4614
- fix: Windows installer fails on _yaml.pyd Access Denied (os error 5) by @Etherll in #4617
- studio: humanize ETA display for long training runs by @RadouaneElhajali in #4608
- fix: add python-json-logger to data-designer-deps by @Shine1i in #4627
- [Studio] Colab fix - Allow install_python_stack to run on Colab by @rolandtannous in #4633
- Fix repetition_penalty default causing 24% TPS drop in GGUF inference by @danielhanchen in #4634
- fix: install.sh Mac Intel compatibility + Studio no-torch support by @danielhanchen in #4624
- tests: add no-torch / Intel Mac test suite by @danielhanchen in #4646
- fix: use unsloth[huggingfacenotorch] instead of --no-deps in no-torch mode by @danielhanchen in #4647
- Fix Gemma3N audio training stride assertion with non-reentrant checkpointing by @danielhanchen in #4629
- Fix missing num_items_in_batch in unsloth_prediction_step by @danielhanchen in #4616
- Make Studio shortcuts launch in a visible terminal by @danielhanchen in #4638
- studio: setup log styling by @Imagineer99 in #4494
- Fix ~1.2s TTFT penalty when tools are enabled in Studio by @danielhanchen in #4639
- Fix GGUF GPU fit check to account for KV cache VRAM by @danielhanchen in #4623
- feat: update app icons to rounded logo by @Shine1i in #4640
- Streaming tool detection: guard late tool_calls, filter incomplete fragments by @danielhanchen in #4648
- fix: install no-torch runtime deps via requirements file by @danielhanchen in #4649
- Fix orphan server cleanup killing user's own llama-server by @danielhanchen in #4622
- fix: add auth + UX improvements to shutdown button by @Shine1i in #4642
- Fix inference failing for transformers 5.x models (trust_remote_code) by @danielhanchen in #4652
- fix: no-torch install deps without pulling torch transitively by @danielhanchen in #4650
- Detect always-on reasoning models and show Think button as locked-on by @danielhanchen in #4654
- fix: replace navbar shutdown text button with icon-only button by @Shine1i in #4655
- Fall back to parsing model name when HF API has no param count by @danielhanchen in #4656
- fix: disable OCR in pymupdf4llm PDF extraction by @Shine1i in #4659
- Fix HF cache default and show LM Studio models in chat/inference by @rolandtannous in #4653
- Bump minimum unsloth version to 2026.3.16 in install scripts by @danielhanchen in #4663
New Contributors
- @Abhinavexists made their first contribution in #4605
- @RadouaneElhajali made their first contribution in #4608
Full Changelog: v0.1.2-beta...v0.1.25-beta