Patch release resolving several critical issues related to the recent cache refactor, the flash attention refactor, and training in multi-GPU and multi-node settings:
- Resolve training bug with PEFT + gradient checkpointing #28031
- Resolve cache issue when going beyond context window for Mistral/Mixtral FA2 #28037
- Re-enable passing `config` to `from_pretrained` with FA #28043
- Fix resuming from checkpoint when using FSDP with FULL_STATE_DICT #27891
- Resolve bug when saving a checkpoint in the multi-node setting #28078