github unslothai/unsloth October-2025
October Release + Unsloth Docker!

one day ago

Hey everyone, please update Unsloth to use the latest updates! 🦥

New model updates

New features

  • Introducing Quantization-Aware Training: We collaborated with PyTorch on QAT, recovering as much as 70% accuracy. Read blog
  • Unsloth supports OpenEnv to allow for open RL environments. Blog coming soon • Notebook
  • New customer support agent notebook to enable real-time analysis & solving of customer interactions. You'll also learn how to train models using data from Google Sheets.
  • Support for Python 3.13, PyTorch 2.9, and the latest Hugging Face TRL and transformers is now fixed.
  • Save to TorchAO supported as well:
from torchao.quantization import Int4WeightOnlyConfig
model.save_pretrained_torchao("model", tokenizer, torchao_config = Int4WeightOnlyConfig())

Tip

Update Unsloth via pip install --upgrade --force-reinstall --no-cache-dir --no-deps unsloth unsloth_zoo
If you want PyTorch 2.9: pip install --upgrade unsloth unsloth_zoo

RL Improvements

  1. Fixed Standby consuming more VRAM than usual. Unsloth now auto-selects a maximum GPU utilization between 80% and 95% when import os; os.environ["UNSLOTH_VLLM_STANDBY"] = "1" is used.
  2. Fixed GRPO training hangs with better environment timers - works on DGX Spark and all other GPUs.
  3. Fixed the GRPO error RuntimeError: shape '[1, 887, 1, 128]' is invalid for input of size 3633152 for all models.
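
A minimal sketch of enabling Standby from item 1. It assumes the variable has to be set before Unsloth is imported; the release notes do not state this ordering explicitly, and the commented-out import is only illustrative.

```python
import os

# Assumption: the flag is read when Unsloth is imported, so set it first.
os.environ["UNSLOTH_VLLM_STANDBY"] = "1"

# Import Unsloth only after the variable is in place (left commented out so
# this sketch runs without Unsloth installed).
# import unsloth

standby_enabled = os.environ.get("UNSLOTH_VLLM_STANDBY") == "1"
print(f"Standby requested: {standby_enabled}")
```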

RL Environment functions

  1. New execute_with_time_limit function to force functions to execute within a time limit. E.g. with a 2 second time limit, use:
from unsloth import execute_with_time_limit
@execute_with_time_limit(2)
def execute_strategy(strategy, game):
    return _execute_strategy(strategy, game)
try:
    execute_strategy(strategy, game)
except TimeoutError as e:
    print(f"Timed out with error = {str(e)}")
  2. To check if only Python standard modules are used in a function, use check_python_modules.
  3. Use create_locked_down_function to create a function without leakage of global variables.
  4. Use Benchmarker, i.e. from unsloth import Benchmarker, to benchmark functions accurately. It approximately wipes the L1 to L3 caches to reduce the chance of benchmark cheating.
  5. Use launch_openenv to launch a continuously reloaded OpenEnv environment process (to stop it from closing down), i.e. from unsloth import launch_openenv. It will automatically find a port that is not in use.

Bug fixes

  1. GPT-OSS BF16: the GPTOSSRouter now works with load_in_4bit = True. Fixes AttributeError: 'GptOssTopKRouter' object has no attribute 'weight'.
  2. Mistral training fixed - sentencepiece proto issue fixed (any protobuf version works)
  3. Fixed evaluation, i.e. UNSLOTH_RETURN_LOGITS="1" now works. Fixes #3126 and #3071.
  4. Fixed the error "Output 0 of UnslothFusedLossBackward is a view and is being modified inplace." for Gemma 3 and transformers>=4.57.1.
  5. If you see ImportError: cannot import name '_Ink' from 'PIL._typing' (/usr/local/lib/python3.12/dist-packages/PIL/_typing.py), please update and use our new notebooks.

Don't forget to also join our Reddit: r/unsloth 🥰

What's Changed

New Contributors

Full Changelog: September-2025-v3...October-2025
