Hey everyone, please update Unsloth to get all the latest fixes and features! 🦥
- Unsloth now has its own 🐋 Docker image! Start training with no setup: Read our Guide • Docker image
- We collabed with NVIDIA for Blackwell and DGX Spark support. Read our Blackwell guide and DGX guide.
New model updates
- All Qwen3-VL models are now supported: Blogpost • SFT 8B notebook • GRPO 8B notebook
- IBM Granite-4.0 models are now supported. Granite-4.0 guide • Notebook
- OpenAI showcased our new gpt-oss RL notebook for autonomously solving the 2048 game. Blogpost • Notebook
- Read about our GLM-4.6 chat template fixes and how to run the model here
New features
- Introducing Quantization-Aware Training: We collabed with PyTorch for QAT, recovering as much as 70% of the accuracy lost to quantization (see the sketch after this list). Read blog
- Unsloth supports OpenEnv, allowing open RL environments. Blog coming soon • Notebook
- New customer support agent notebook to enable real-time analysis & solving of customer interactions. You'll also learn how to train models using data from Google Sheets.
- Support for Python 3.13, PyTorch 2.9, and the latest Hugging Face TRL and transformers releases is now fixed.
- Saving to TorchAO is supported as well:

```python
from torchao.quantization import Int4WeightOnlyConfig

model.save_pretrained_torchao("model", tokenizer, torchao_config = Int4WeightOnlyConfig())
```
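Relating to the QAT feature above, here is a minimal sketch of quantization-aware fine-tuning using torchao's quantizer API directly. This is an assumption about the underlying flow (Unsloth's integration may wrap it differently), and the toy model stands in for a real LLM:

```python
import torch
from torch import nn
from torchao.quantization.qat import Int8DynActInt4WeightQATQuantizer

# Toy stand-in for a real model; any nn.Module with Linear layers works.
model = nn.Sequential(nn.Linear(256, 256), nn.ReLU(), nn.Linear(256, 256))

quantizer = Int8DynActInt4WeightQATQuantizer()
model = quantizer.prepare(model)  # insert fake-quantize ops for training

# ... fine-tune as usual: fake-quant simulates int8 activations / int4
# weights so the network learns to tolerate quantization error ...

model = quantizer.convert(model)  # swap in real quantized weights
```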
Tip
Update Unsloth via `pip install --upgrade --force-reinstall --no-cache-dir --no-deps unsloth unsloth_zoo`
If you want PyTorch 2.9: `pip install --upgrade unsloth unsloth_zoo`
RL Improvements
- Fixed Standby consuming more VRAM than usual. Unsloth now auto-selects 80% to 95% of maximum GPU utilization when `os.environ["UNSLOTH_VLLM_STANDBY"] = "1"` is used (see the sketch after this list).
- Fixed GRPO training hangs with better environment timers - works on DGX Spark and all other GPUs.
- Fixed the GRPO `RuntimeError: shape '[1, 887, 1, 128]' is invalid for input of size 3633152` for all models.
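A minimal sketch of enabling standby mode. The environment variable comes from the notes above; the model name and other arguments are illustrative assumptions:

```python
import os
# Must be set BEFORE importing Unsloth so the flag is read at init time.
os.environ["UNSLOTH_VLLM_STANDBY"] = "1"

from unsloth import FastLanguageModel

# Illustrative model choice; standby lets vLLM share GPU memory with training.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/Qwen3-4B",
    max_seq_length = 2048,
    fast_inference = True,  # vLLM backend used by GRPO / RL workflows
)
```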
RL Environment functions
- New `execute_with_time_limit` function to force functions to execute within a time limit. E.g. with a 2 second time limit, use:

```python
from unsloth import execute_with_time_limit

@execute_with_time_limit(2)
def execute_strategy(strategy, game):
    return _execute_strategy(strategy, game)

try:
    execute_strategy(strategy, game)
except TimeoutError as e:
    print(f"Timed out with error = {str(e)}")
```

- To check if only Python standard modules are used in a function, use `check_python_modules`.
- Use `create_locked_down_function` to create a function without leakage of global variables.
- Use `Benchmarker` (i.e. `from unsloth import Benchmarker`) to benchmark functions accurately. It approximately wipes the L1 to L3 caches to reduce the chance of benchmark cheating.
- Use `launch_openenv` (i.e. `from unsloth import launch_openenv`) to launch a continuously reloaded OpenEnv environment process (to stop it from closing down). It will auto-find a port that is not in use. A hedged usage sketch of these helpers follows this list.
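Only the names and imports above are confirmed by these notes; the call shapes below (arguments and return values) are assumptions for illustration:

```python
from unsloth import (
    Benchmarker,
    check_python_modules,
    create_locked_down_function,
    launch_openenv,
)

def strategy(board):
    # A candidate function an RL policy might emit as code.
    return sorted(board)

# Assumed call shapes -- the exact signatures are not documented here.
report = check_python_modules(strategy)         # flags non-stdlib imports
locked = create_locked_down_function(strategy)  # no global-variable leakage

benchmarker = Benchmarker()                     # wipes L1-L3 caches per run
timing = benchmarker(lambda: locked(list(range(256))))

env = launch_openenv()  # keeps an OpenEnv process alive on a free port
```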
Bug fixes
- GPT-OSS BF16: the GPTOSSRouter now works with `load_in_4bit = True`, fixing `AttributeError: 'GptOssTopKRouter' object has no attribute 'weight'`.
- Mistral training fixed - the sentencepiece proto issue is resolved (any protobuf version works).
- Fixed evaluation, i.e. `UNSLOTH_RETURN_LOGITS="1"` works. Fixes #3126 #3071 (see the sketch after this list).
- Fixed `Output 0 of UnslothFusedLossBackward is a view and is being modified inplace.` for Gemma 3 and `transformers>=4.57.1`.
- If you see `ImportError: cannot import name '_Ink' from 'PIL._typing' (/usr/local/lib/python3.12/dist-packages/PIL/_typing.py)`, please update and use our new notebooks.
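A minimal sketch of the evaluation flag, assuming the usual set-before-import pattern; everything past the import is schematic:

```python
import os
# Ask Unsloth's fused-loss path to return raw logits during evaluation;
# set before importing unsloth so it is read when patches are applied.
os.environ["UNSLOTH_RETURN_LOGITS"] = "1"

import unsloth  # noqa: F401  (importing applies Unsloth's patches)
# ... build the model and trainer as usual; evaluation steps that need
# logits (e.g. compute_metrics) should now receive them.
```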
Don't forget to also join our Reddit: r/unsloth 🥰
What's Changed
- Fix loading as 8bit by @Etherll in #3384
- Nightly by @danielhanchen in #3392
- Nightly by @danielhanchen in #3394
- Update int8-int4 QAT config to use Int8DynamicActivationIntxWeightConfig by @metascroy in #3391
- Gemma 3 bug fixes by @danielhanchen in #3410
- Transformers Fix v4.57 rename from PretrainedConfig to PreTrainedConfig by @mmathew23 in #3445
- improve qat by @Etherll in #3446
- Fix eval metric issue by @pluesclues in #3420
- [Part2] Reinstate llama.cpp Compatibility and GGUF Conversion with Multiple Quantizations and Automated Ollama Modelfile Creation by @rolandtannous in #3356
- vLLM FP8 quantized support for SFT/GRPO by @Datta0 in #3414
- Fix by @danielhanchen in #3466
- AMD fixes by @danielhanchen in #3467
- Fix transformers 4.57.1 by @danielhanchen in #3473
- GRPO bug fixes by @danielhanchen in #3474
- EOL LF (unix line endings) normalization by @djsaunde in #3478
- Fix out of resources issue for llama3.2 sft on amd gpu by @wangxunx in #3455
- Bug fixes by @danielhanchen in #3483
- Bug fixes by @danielhanchen in #3484
- Patch sleep mode properly for trl by @Datta0 in #3492
- Sleep trl patch by @Datta0 in #3494
- fix cross entropy loss issue for small vocab size on amd gpu by @wangxunx in #3503
- Gemma 3n fix by @mmathew23 in #3499
- enable intel for torch2.8 by @leizhenyuan in #3381
- add code for intel qlora by @leizhenyuan in #3370
- fix for intel memory calculation by @leizhenyuan in #3513
- [intel] enable support 2.9 for intel xpu by @leizhenyuan in #3514
- FP8 training enhancements by @Datta0 in #3496
New Contributors
- @metascroy made their first contribution in #3391
- @djsaunde made their first contribution in #3478
- @wangxunx made their first contribution in #3455
Full Changelog: September-2025-v3...October-2025