Qwen 3 support + bug fixes
Please update Unsloth via:

```bash
pip install --upgrade --force-reinstall "unsloth==2025.4.7" unsloth_zoo
```
Qwen 3 notebook: https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Qwen3_(14B)-Reasoning-Conversational.ipynb
There are also many bug fixes in this release!
The Qwen3 30B MoE model is also fine-tunable in Unsloth:
```python
from unsloth import FastModel
import torch

model, tokenizer = FastModel.from_pretrained(
    model_name = "unsloth/Qwen3-30B-A3B",
    max_seq_length = 2048,   # Choose any for long context!
    load_in_4bit = True,     # 4-bit quantization to reduce memory
    load_in_8bit = False,    # [NEW!] A bit more accurate, uses 2x memory
    full_finetuning = False, # [NEW!] We have full finetuning now!
    # token = "hf_...",      # use one if using gated models
)
```
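As a rough illustration of why `load_in_8bit` uses about 2x the memory of `load_in_4bit`: the weight footprint scales with bits per parameter. This is a back-of-envelope sketch (the 30e9 parameter count is an approximation from the model name, and real usage adds activations, KV cache, and quantization overhead):

```python
def weight_footprint_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate weight memory in GiB: params * bits / 8 bytes."""
    return num_params * bits_per_param / 8 / 2**30

params = 30e9  # approximate total parameters of Qwen3-30B-A3B
print(f"4-bit: ~{weight_footprint_gb(params, 4):.1f} GiB")
print(f"8-bit: ~{weight_footprint_gb(params, 8):.1f} GiB")
```

Whichever quantization you pick, the 8-bit footprint is exactly twice the 4-bit one, which is the trade-off the `load_in_8bit` comment refers to.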
## What's Changed
- GGUF saving by @danielhanchen in #2017
- Gemma 3 readme by @danielhanchen in #2019
- Update README.md by @danielhanchen in #2028
- bug fix #2008 - load_in_4bit = True + fast_inference = True by @void-mckenzie in #2039
- unsloth_fast_generate model is not defined fix by @KareemMusleh in #2051
- Ensure trust_remote_code propagates down to unsloth_compile_transformers by @CuppaXanax in #2075
- Show `peft_error` by @IsaacBreen in #2080
- Add generation prompt error message change by @KareemMusleh in #2046
- Many bug fixes by @danielhanchen in #2087
- fix: config.torch_dtype in LlamaModel_fast_forward_inference by @lurf21 in #2091
- Updating new FFT 8bit support by @shimmyshimmer in #2110
- Bug fixes by @danielhanchen in #2113
- Small fix by @danielhanchen in #2114
- fix(utils): add missing importlib import to fix NameError by @naliazheli in #2134
- Add QLoRA Train and Merge16bit Test by @jeromeku in #2130
- Fix Transformers 4.45 by @danielhanchen in #2151
- Bug Fixes by @danielhanchen in #2197
- Issues templates by @jeromeku in #2242
- Fix feature_request ISSUE_TEMPLATE by @jeromeku in #2250
- Registry refactor by @jeromeku in #2255
- Update README.md by @Kimizhao in #2267
- Update README.md by @jackswl in #2119
- Update bug_report.md by @shimmyshimmer in #2323
- feat: Support custom `auto_model` for wider model compatibility (Whisper, BERT, etc.) & `attn_implementation` support by @Etherll in #2263
- fix: improved error handling when llama.cpp build fails by @Hansehart in #2358
- Revert "fix: improved error handling when llama.cpp build fails" by @shimmyshimmer in #2375
- Fix saving 4bit for VLM by @Erland366 in #2381
- [WIP] Initial support for Qwen3. Will update when the model is released by @Datta0 in #2211
- Fixup qwen3 by @Datta0 in #2423
- Fixup qwen3 qk norm by @Datta0 in #2427
- Qwen3 inference fixes by @Datta0 in #2436
- Update mapper.py to add Qwen3 base by @Etherll in #2439
- Qwen 3, Bug Fixes by @danielhanchen in #2445
## New Contributors
- @void-mckenzie made their first contribution in #2039
- @CuppaXanax made their first contribution in #2075
- @IsaacBreen made their first contribution in #2080
- @lurf21 made their first contribution in #2091
- @naliazheli made their first contribution in #2134
- @jeromeku made their first contribution in #2130
- @Kimizhao made their first contribution in #2267
- @jackswl made their first contribution in #2119
- @Etherll made their first contribution in #2263
- @Hansehart made their first contribution in #2358
Full Changelog: 2025-03...May-2025