github unslothai/unsloth September-2025-v3
gpt-oss Reinforcement Learning + Auto Kernel Notebook

5 hours ago

We’re introducing gpt-oss RL support and the fastest RL inference and lowest VRAM use vs. any implementation. Blog: https://docs.unsloth.ai/new/gpt-oss-reinforcement-learning

  • Unsloth now offers the fastest inference (~3x faster), lowest VRAM (50% less) and most context (8x longer) for gpt-oss RL vs. any implementation - with no accuracy loss.
  • Since RL on gpt-oss isn't yet vLLM compatible, we rewrote Transformers inference code to enable faster inference
  • gpt-oss-20b GSPO free Colab notebook
  • This notebook automatically creates faster matrix multiplication kernels and uses a new Unsloth reward function. We also show how to counteract reward-hacking which is one of RL's biggest challenges.
gptoss rl
  • We previously released Vision RL with GSPO support
  • ⚠️ Reminder to NOT use Flash Attention 3 for gpt-oss as it'll make your training loss wrong.
  • DeepSeek-V3.1-Terminus is here and you can run locally via our GGUF
    Read how our 3-bit GGUF beats Claude-4-Opus (thinking) on Aider Polyglot here
  • Magistral 1.2 is here and you can run it locally here or fine-tune it for free by using our Kaggle notebook
  • Fine-tuning the new Qwen3 models including Qwen3-VL, Qwen3-Omni and Qwen3-Next should work in Unsloth if you install the latest transformers. The models are big however so ensure you have enough VRAM.
  • BERT is now fixed! Feel free to use our BERT fine-tuning notebook
  • ⭐ We’re hosting a Developer event with Mistral AI & NVIDIA at Y Combinator’s Office in San Francisco on Oct 21. Come say hello!
  • We’re also joining Pytorch and AMD for a 2 day Virtual AI Agents Challenge with prizes. Join Hackathon

Don't forget to also join our Reddit: r/unsloth 🥰

What's Changed

New Contributors

Full Changelog: September-2025-v2...September-2025-v3

Don't miss a new unsloth release

NewReleases is sending notifications on new releases.