github unslothai/unsloth October-2024
Gradient Accumulation Fix


We fixed a gradient accumulation bug that was originally reported back in 2021 and recently rediscovered. Read more in our blog post: https://unsloth.ai/blog/gradient

We have a Colab Notebook for Llama 3.2 using the fixed trainer and a Kaggle Notebook as well.

In theory, training with a per-device batch size of bsz and ga gradient accumulation steps should be equivalent to full-batch training with an effective batch size of bsz * ga, but in practice the training losses do not match up.
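
The mismatch comes from how the loss is normalized. With naive accumulation, each mini-batch's cross-entropy loss is divided by that mini-batch's own non-padded token count before the per-step losses are averaged, so short mini-batches get over-weighted; full-batch training divides once by the total token count. Here is a minimal sketch with made-up numbers (not Unsloth's actual code) of the discrepancy and the corrected normalization:

loss_sums    = [12.0, 40.0]   # summed cross-entropy loss per accumulated mini-batch
token_counts = [4.0, 20.0]    # non-padded tokens per mini-batch

# Full-batch objective: one mean over every token in the effective batch.
full_batch = sum(loss_sums) / sum(token_counts)                               # 52 / 24 ≈ 2.17

# Naive accumulation: average the per-mini-batch means, which over-weights
# the mini-batch with fewer tokens.
naive = sum(l / t for l, t in zip(loss_sums, token_counts)) / len(loss_sums)  # (3 + 2) / 2 = 2.5

# Fix: normalize every mini-batch by the token count of the whole
# accumulated batch, which recovers the full-batch loss exactly.
fixed = sum(l / sum(token_counts) for l in loss_sums)                         # 52 / 24 ≈ 2.17

print(full_batch, naive, fixed)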

We fixed it in Unsloth!

To use Unsloth's fixed trainer with gradient accumulation, use:

from unsloth import unsloth_train
# trainer_stats = trainer.train() << Buggy if using gradient accumulation
trainer_stats = unsloth_train(trainer) # << Fixed gradient accumulation
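
For context, here is a fuller sketch of where the fixed call slots into a typical QLoRA fine-tuning run with gradient accumulation. The model name, dataset, and hyperparameters are illustrative only, and the exact SFTTrainer arguments may vary with your trl version:

from unsloth import FastLanguageModel, unsloth_train
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import load_dataset

max_seq_length = 2048

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/Llama-3.2-1B-Instruct",  # example model
    max_seq_length = max_seq_length,
    load_in_4bit = True,
)
model = FastLanguageModel.get_peft_model(  # attach LoRA adapters
    model,
    r = 16,
    lora_alpha = 16,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj"],
)

trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = load_dataset("imdb", split = "train"),  # any dataset with a "text" column
    dataset_text_field = "text",
    max_seq_length = max_seq_length,
    args = TrainingArguments(
        per_device_train_batch_size = 2,
        gradient_accumulation_steps = 4,   # effective batch size = 2 * 4 = 8
        max_steps = 60,
        learning_rate = 2e-4,
        output_dir = "outputs",
    ),
)

# trainer.train()                        # buggy loss normalization with gradient accumulation
trainer_stats = unsloth_train(trainer)   # Unsloth's fixed gradient accumulation path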

Please update Unsloth on local machines (not needed on Colab / Kaggle) via:

pip uninstall unsloth -y
pip install --upgrade --no-cache-dir "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"

Read our blog post: https://unsloth.ai/blog/gradient for more details!

What's Changed

New Contributors

Full Changelog: September-2024...October-2024
