Patch release v4.48.1
Yet again we are dawned with a gradient accumulation fix! There is also a refactoring of the attention that let a small typo in, we made sure PHI is no longer broken!
Moonshine
had a small issue when wrapping generate so we removed that!
- [Phi] bias should be True (#35650) @ArthurZucker
- Fix condition when GA loss bug fix is not performed (#35651) @techkang
- Patch moonshine (#35731) @eustlb
🤗