Our first release of 2026! This year we’ve got a lot of exciting things coming, and to kick things off we’re introducing faster MoE training, embedding model support, and ultra long context for Reinforcement Learning. We’ll also be launching our brand new UI very soon.
We’d like to thank all of you for 50K stars on GitHub! ⭐
We’ve also added support for many new models that you can now run and fine-tune locally, including DeepSeek-OCR 2, GLM-4.7-Flash, Kimi-2.5, and more.
🚀 Faster MoE training
You can now train MoE models 12× faster with 35% less VRAM and 6× longer context via our new Triton and math kernels, with no accuracy loss. gpt-oss-20b trains on just 12.8GB of VRAM, and Qwen3-30B-A3B (16-bit LoRA) uses 63GB.
Unsloth supports fast MoE training for gpt-oss, Qwen3 (30B, 235B, VL, Coder), DeepSeek R1/V3-architecture, and GLM (4.7, Flash) models.
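The faster path shouldn’t require code changes beyond updating Unsloth - the usual loading flow picks up the new kernels. A minimal sketch, where the model name and LoRA hyperparameters are illustrative rather than prescriptive:

```python
from unsloth import FastLanguageModel

# Load a supported MoE model; the new Triton/math kernels are applied
# on top of the standard Unsloth loading flow.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/gpt-oss-20b",  # illustrative; any supported MoE model
    max_seq_length = 4096,
    load_in_4bit = True,  # QLoRA; the release quotes 12.8GB VRAM for gpt-oss-20b
)

# Attach LoRA adapters; r/alpha/target_modules are illustrative defaults.
model = FastLanguageModel.get_peft_model(
    model,
    r = 16,
    lora_alpha = 16,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj"],
)
```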
🔎 Embedding models now train 2× faster
We collaborated with Hugging Face to enable 1.8-3.3× faster training for embedding, BERT, and classifier models, with 20% less VRAM, 2× longer context, and no accuracy loss vs. FA2 setups.
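Loading an embedding model for fine-tuning might look like the following - a minimal sketch assuming the new `FastSentenceTransformer` (added in #3719) mirrors the `FastLanguageModel.from_pretrained` pattern; the checkpoint and keyword names are assumptions, so check the embedding notebooks for the exact API:

```python
from unsloth import FastSentenceTransformer

# Assumed API: we take for granted that FastSentenceTransformer follows the
# from_pretrained pattern of FastLanguageModel -- verify against the notebooks.
model = FastSentenceTransformer.from_pretrained(
    model_name = "sentence-transformers/all-MiniLM-L6-v2",  # illustrative checkpoint
    max_seq_length = 512,
)

# Downstream training can then use the standard sentence-transformers stack,
# e.g. SentenceTransformerTrainer with MultipleNegativesRankingLoss.
```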
💡 Ultra Long Context RL is here
We’re introducing new batching algorithms that enable ~7× longer-context RL training (in some cases more than 12×) with no accuracy or speed degradation vs. other optimized setups that use FA3, custom kernels, and chunked losses.
Unsloth now trains gpt-oss QLoRA with 380K context on a single 192GB NVIDIA B200 GPU.
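The long-context path plugs into the usual GRPO flow. A minimal sketch, assuming a trl-style `GRPOTrainer` setup as in Unsloth’s RL notebooks; the 380K figure matches the B200 claim above, and the dataset and reward function are toy placeholders:

```python
from datasets import Dataset
from trl import GRPOConfig, GRPOTrainer
from unsloth import FastLanguageModel

# Long-context GRPO sketch; scale max_seq_length down for smaller GPUs.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/gpt-oss-20b",
    max_seq_length = 380_000,  # the release's gpt-oss QLoRA figure on a 192GB B200
    load_in_4bit = True,       # QLoRA
    fast_inference = True,     # vLLM-backed rollouts for RL
)
model = FastLanguageModel.get_peft_model(model, r = 16, lora_alpha = 16)

# Toy prompt dataset and length-penalty reward -- replace with real ones.
dataset = Dataset.from_dict({"prompt": ["Explain KV caching in one paragraph."]})

def reward_shorter(completions, **kwargs):
    return [-float(len(c)) for c in completions]

trainer = GRPOTrainer(
    model = model,
    processing_class = tokenizer,
    reward_funcs = [reward_shorter],
    args = GRPOConfig(output_dir = "outputs", num_generations = 4,
                      max_completion_length = 1024),
    train_dataset = dataset,
)
trainer.train()
```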
🔮 New models
- 🐳 DeepSeek-OCR 2 - Run and fine-tune the new OCR model.
- 🥝 Kimi-2.5 - Run the SOTA model locally with Unsloth GGUFs.
- ⚡ GLM-4.7-Flash - Run and fine-tune the best-in-class 30B LLM.
🎉 Extra Updates
- As part of our MoE release, Gemma-3 now uses Flex-Attention by default, and this also works in float16 settings (float16 previously produced infinities, which we fixed a while back). Gemma-3 now uses O(N) memory instead of O(N^2) and trains >3× faster, scaling even better with context length. Previous Unsloth versions would OOM.
- Vision fine-tuning now accepts mixed datasets containing both image+text and text-only samples - see the sketch after this list!
- `trl==0.27.1` and `transformers==5.1.0` are now well supported - coverage was previously 30% of our 120 notebooks but is now >80%, and we plan to reach 100% over the next few days.
- And many, many other bug fixes and updates!
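For the mixed vision data point above, one dataset can now interleave the two sample types. A minimal sketch in the conversational format used by Unsloth’s vision notebooks (field names follow that format, but verify against your notebook version):

```python
from PIL import Image

pil_image = Image.new("RGB", (64, 64))  # stand-in for a real image

# An image+text sample and a text-only sample can now live in one dataset.
image_sample = {
    "messages": [
        {"role": "user", "content": [
            {"type": "image", "image": pil_image},
            {"type": "text",  "text": "Describe this image."},
        ]},
        {"role": "assistant", "content": [
            {"type": "text", "text": "A plain black square."},
        ]},
    ],
}

text_only_sample = {
    "messages": [
        {"role": "user", "content": [
            {"type": "text", "text": "What is LoRA?"},
        ]},
        {"role": "assistant", "content": [
            {"type": "text", "text": "A parameter-efficient fine-tuning method."},
        ]},
    ],
}

train_dataset = [image_sample, text_only_sample]  # both kinds together
```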
📖 New Guides
- </> How To Use Claude Code + Codex with local LLMs: Guide
- 👾 Train & deploy to LM Studio for local inference: Guide
- 🎨 Run Diffusion image models with Unsloth GGUFs: Guide
Tip
Update Unsloth via `pip install --upgrade --force-reinstall --no-cache-dir --no-deps unsloth unsloth_zoo`
If you want PyTorch 2.9: `pip install --upgrade unsloth unsloth_zoo`
February is shaping up to be an amazing month for LLM releases, and we hope you’re just as excited as we are. 😊
What's Changed
- [FIX] [Transformers] VLM input embeds fix for gradients by @Datta0 in #3715
- [fbgemm] Silence tma fbgemm by @Datta0 in #3735
- [hf_hub] Token login by @Datta0 in #3739
- Do not overwrite slots by @Datta0 in #3752
- Fix VLM + DDP checkpointing by @djsaunde in #3751
- Enable 4-bit quantization on AMD Radeon GPUs by @sstamenk in #3748
- Nightly by @danielhanchen in #3753
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #3760
- Nightly by @danielhanchen in #3767
- Add missing import of inspect by @sstamenk in #3778
- Clarify NotImplementedError for fast_inference with full_finetuning by @Fizza-Mukhtar in #3768
- Update FUNDING.yml by @danielhanchen in #3792
- fix(trainer): import psutil to prevent NameError in _prepare_dataset by @alkinun in #3780
- fastrope fix for zero strided tensors by @f14-bertolotti in #3782
- Fix crash when trl.experimental.openenv is unavailable by @Fizza-Mukhtar in #3787
- Fix Boolean value of Tensor ambiguity error in mistral.py by @yurekami in #3790
- fix: add support for init_lora_weights="corda" in get_peft_model by @majiayu000 in #3794
- Fix correctness bugs in rl.py, rl_replacements.py, and vision.py by @danielhanchen in #3811
- Fix correctness bugs across multiple model files by @danielhanchen in #3813
- Fix 3D tensor support for bitsandbytes 8-bit matmul in forward pass by @Fizza-Mukhtar in #3806
- FIX: weight tying for LoRA embeddings and lm_head by @oKatanaaa in #3711
- Fix Gemma3 QAT training instability with int8-int4 scheme by @danielhanchen in #3818
- Add helpful error messages for fast_generate when fast_inference=False by @danielhanchen in #3820
- Bug fixes by @danielhanchen in #3821
- Make llama.cpp CURL dependency optional when building from source by @Fizza-Mukhtar in #3822
- remove redundant code of has_block by @ykaitao in #3832
- rl.py fixes: buffer reset, safer attribute access, typo fix by @danielhanchen in #3834
- Respect user quantization_config by @danielhanchen in #3835
- Fix vLLM PDL bug on Blackwell GPUs (B200/B100) by @danielhanchen in #3841
- Sync chat_template from tokenizer to vLLM by @danielhanchen in #3842
- remove unused variable BlockDiagonalCausalMask by @ykaitao in #3836
- Replace GitHub API check with vLLM version check for PDL fix by @danielhanchen in #3849
- GRPO: restore model mode after generate (stacked on #3754) by @danielhanchen in #3851
- Fix model training state restoration in GRPO trainer by @numb3r33 in #3754
- Unify Version usage and fix TRL version handling by @danielhanchen in #3843
- [ModelScope] Disable stats when modelscope is being used by @Datta0 in #3857
- Fix FBGEMM/CUTLASS errors on SM100 (Blackwell) GPUs by @danielhanchen in #3863
- Feature/raw text dataprep by @Vangmay in #3612
- Fix Kaggle telemetry misclassification when COLAB_ keys exist by @hnxnq7 in #3869
- reduce code duplication by _offload_frozen_module_for_training by @ykaitao in #3865
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #3881
- wrong number of dimensions by @f14-bertolotti in #3880
- Disable gradient checkpointing when explicitly off for vision by @ducviet00 in #3879
- [trl] use non lora model as base for RL by @Datta0 in #3895
- Chunk Across Batch and Context length for logprob calculations for grpo by @pluesclues in #3628
- add weight-only int8 QAT scheme and update tests for torchao 0.15.0 by @electroglyph in #3859
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #3905
- Fix vllm ipykernel patch by @pluesclues in #3907
- Handle Transformers 5 vLLM import errors by @danielhanchen in #3908
- add FastSentenceTransformer for easily finetuning SentenceTransformer models by @electroglyph in #3719
- Guard torch.compile on ROCm when triton_key is missing by @hnxnq7 in #3923
- Grpo compile settings update by @pluesclues in #3927
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #3937
- chore: Update outdated GitHub Actions version by @pgoslatara in #3936
- [trl] vllm trl topk fixup by @Datta0 in #3935
- [fix] qwen3-guard tokenizer by @Datta0 in #3959
- fix for intel devices torch compile configs by @leizhenyuan in #3952
- Use standard gradient checkpointing for small sequence lengths by @danielhanchen in #3867
- reduce code duplication by @ykaitao in #3877
- Fix TRL 0.27.0 GRPO compatibility and PEFT model handling by @danielhanchen in #3969
- Fix Vision GRPO string prompts and OpenEnv async compatibility by @danielhanchen in #3964
- Fix num_train_epochs=None causing TypeError in GRPOConfig by @danielhanchen in #3972
- Add TRL truncation regression and metadata loss fixes (Fixes 1 and 3) by @danielhanchen in #3971
- Add vLLM + torch < 2.9.0 + SM100 compatibility check by @danielhanchen in #3973
- Fix torchvision compatibility check for source builds and future torch versions by @danielhanchen in #3978
- Trl 0.27.0 update by @pluesclues in #3965
- Prefer flex attention when available by @danielhanchen in #3979
- Fix GPT-OSS BlockMask error during inference by @danielhanchen in #3982
- Silence third-party deprecation warnings and fix socket leak by @danielhanchen in #3983
- Silence non-actionable TRL trainer import failures by @danielhanchen in #3980
- Add PyTorch 2.10 and xformers 0.0.34 support by @danielhanchen in #3985
- [MoE] Improve moe kernels for unsloth fine tuning by @Datta0 in #3812
- Fix RuntimeError not caught when torchcodec fails to load by @danielhanchen in #3987
- Fix cutlass inductor options for PyTorch < 2.8.0 by @danielhanchen in #3988
- Disable torchcodec in transformers when FFmpeg is missing by @danielhanchen in #3989
- Update rl_replacements.py to filter through correct trl version by @pluesclues in #3990
- Fix multiprocessing crash on Windows/macOS and unify num_proc logic by @danielhanchen in #3999
- Fix triton 3.6.0 + torch 2.9.x torch.compile crash (missing cluster_dims) by @danielhanchen in #4001
- Add push_to_hub_gguf support for FastSentenceTransformer by @Etherll in #4002
- [Feature] separate gguf file path by @RektPunk in #3934
- Refactor Ollama template wiring and harden packing helpers by @mmangkad in #3890
- Fix multi-GPU loading for quantized models in distributed training by @Fizza-Mukhtar in #3917
- Fix broken documentation links, typos, and formatting in README by @danielhanchen in #4003
- fix: inputs_embeds ignored when input_ids is not None in _fast_prepare_inputs_for_generation by @siddhudonda in #3814
- Fix notebook compatibility for transformers 4.57.6 and TRL 0.22-0.27 by @danielhanchen in #3998
- Fix VLM model + text-only dataset ValueError in TRL 0.22.x by @danielhanchen in #4004
- Fix trl.experimental thin wrapper compilation and OOM from peft_config overwrite by @danielhanchen in #4006
- Fix dtype mismatch in fp16 + 4-bit/8-bit LoRA training by @danielhanchen in #4005
- Silence TRL's batch_size=1 padding-free warning in compiled trainer source by @danielhanchen in #4007
- Silence peft target_parameters RuntimeWarning for MoE models by @danielhanchen in #4008
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #4009
- Suppress vLLM v1 executor sleep/wake log messages by @danielhanchen in #4011
- Inject model reference for dynamic token_type_ids detection in SFTTrainer by @danielhanchen in #4012
- Fix EmbeddingGemma float16 NaN via FORCE_FLOAT32 for gemma3_text by @danielhanchen in #4014
- Fix #3397: Prevent trainer tokenization hang with safe num_proc by @Fizza-Mukhtar in #4013
- add llama.cpp prefix to gguf conversion help messages by @rolandtannous in #4016
- [Misc] Fixes by @Datta0 in #4015
- FP8: Load model on-the-fly in vLLM by @andrewor14 in #3717
- Fix Gemma3 4B training on transformers 5.x (token_type_ids) by @danielhanchen in #4017
- Fix warmup_ratio deprecation for transformers >= 5.0 by @danielhanchen in #4019
- Misc fixes by @Datta0 in #4018
Unsloth Zoo Changes
- Fix training crash when using DoRA + 4-bit quantization by @Etherll in unslothai/unsloth-zoo#394
- fix for #392, transformers 5 by @electroglyph in unslothai/unsloth-zoo#393
- fix: adds missing import for torch.distributed by @namekian-mystifier in unslothai/unsloth-zoo#422
- Fix dtype mismatch in full finetuning + float16 inference by @danielhanchen in unslothai/unsloth-zoo#424
- Fix undefined variable 'e' in Version() function by @danielhanchen in unslothai/unsloth-zoo#425
- Fix correctness bugs in logging_utils.py and loss_utils.py by @danielhanchen in unslothai/unsloth-zoo#426
- Fix execute_with_time_limit start_method bug by @danielhanchen in unslothai/unsloth-zoo#428
- Fix OpenEnv PYTHONPATH auto-detection for compatibility by @danielhanchen in unslothai/unsloth-zoo#429
- Fix VARIANT_KWARG_KEYS import for peft >= 0.18.0 by @danielhanchen in unslothai/unsloth-zoo#430
- Fix ZeroDivisionError in fused cross entropy when GPU memory exhausted by @GabrielArpini in unslothai/unsloth-zoo#432
- Only enable gradient checkpointing when requested by @danielhanchen in unslothai/unsloth-zoo#433
- Removing import check in compiler.py by @Vidit-Ostwal in unslothai/unsloth-zoo#431
Unsloth Notebooks Changes
- Add Gemma phone deployment notebook by @glee2429 in unslothai/notebooks#146
- Use stable executorch 1.0.0 and optimum-executorch v0.1.0 by @danielhanchen in unslothai/notebooks#151
- Update 2048 RL notebook with training results by @danielhanchen in unslothai/notebooks#152
- Update 2048 RL notebook with extended training results by @danielhanchen in unslothai/notebooks#153
- new GRPO update notebooks by @pluesclues in unslothai/notebooks#155
- gemma3 1b changes by @pluesclues in unslothai/notebooks#156
- nemo gym multi environment notebook by @cmunley1 in unslothai/notebooks#158
- Add LFM2.5 notebooks by @mlabonne in unslothai/notebooks#159
- Revert "Add LFM2.5 notebooks" by @danielhanchen in unslothai/notebooks#161
- Restore UNSLOTH_VLLM_STANDBY in Kaggle Gemma3 Vision GRPO by @danielhanchen in unslothai/notebooks#163
- Grpo update gemma notebooks correctly and news lines for notebooks by @pluesclues in unslothai/notebooks#157
- Add LFM2.5 notebooks (reopen #159) by @danielhanchen in unslothai/notebooks#164
- GLM 4.7 Flash finetuning notebook by @Datta0 in unslothai/notebooks#166
- Embedding models notebooks by @Etherll in unslothai/notebooks#160
- add Qwen3_Embedding_0.6B notebook by @Etherll in unslothai/notebooks#167
- [UPDATE] Update openenv notebooks to use the latest implementation by @burtenshaw in unslothai/notebooks#165
- Fix Vision GRPO chat template and Orpheus column removal by @danielhanchen in unslothai/notebooks#171
- update nemo gym notebooks by @cmunley1 in unslothai/notebooks#169
- Fix Vision GRPO notebooks and Orpheus TTS compatibility by @danielhanchen in unslothai/notebooks#172
- Add AMD known issues note by @hnxnq7 in unslothai/notebooks#168
- Update Dockerfile_DGX_Spark by @XEL-Maker in unslothai/notebooks#162
- Revert PR #165 - OpenEnv notebooks by @danielhanchen in unslothai/notebooks#179
- Fix update_all_notebooks.py script improvements by @danielhanchen in unslothai/notebooks#176
- Making qwen 2.5 7b compatible with old trl versions by @pluesclues in unslothai/notebooks#177
- Fix Ministral VL installation cells by @danielhanchen in unslothai/notebooks#181
- Improve update_all_notebooks.py: format preservation, cross-platform fixes, parallelization by @danielhanchen in unslothai/notebooks#183
- Refactor update_all_notebooks.py: reorder sections, CRLF handling, README categories by @danielhanchen in unslothai/notebooks#184
- Separate OCR into its own README section by @danielhanchen in unslothai/notebooks#185
- [MoE] notebooks for Colab by @Datta0 in unslothai/notebooks#187
New Contributors
- @sstamenk made their first contribution in #3748
- @Fizza-Mukhtar made their first contribution in #3768
- @alkinun made their first contribution in #3780
- @f14-bertolotti made their first contribution in #3782
- @yurekami made their first contribution in #3790
- @majiayu000 made their first contribution in #3794
- @ykaitao made their first contribution in #3832
- @numb3r33 made their first contribution in #3754
- @Vangmay made their first contribution in #3612
- @hnxnq7 made their first contribution in #3869
- @ducviet00 made their first contribution in #3879
- @electroglyph made their first contribution in #3859
- @pgoslatara made their first contribution in #3936
- @RektPunk made their first contribution in #3934
- @mmangkad made their first contribution in #3890
- @siddhudonda made their first contribution in #3814
Full Changelog: December-2025...February-2026