Patch release v4.51.1
Since the release of Llama 4, we have fixed a few issues, which are now shipping in patch release v4.51.1:
- Fix flex attention for torch 2.6.0 (#37285)
- More fixes for Llama 4 post-training (#37329)
- Remove HQQ from caching allocator warmup (#37347)
- Fix `_init_weights` for BERT-derived models (#37341)
- Fix init empty weights without accelerate (#37337)
- Fix DeepSpeed with quantization (#37324)
- Fix Llama 4 training (#37319)
- Fix flex attention when optional args aren't passed (#37327)
- Multiple Llama 4 fixes (#37353)
Thanks all for your patience!