facebookresearch/xformers v0.0.16
Pip wheels, improvements to mem-eff and more


This release contains many improvements to memory_efficient_attention, and pip wheels are now available for Windows and Linux!
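
As a minimal usage sketch (the tensor shapes and values below are illustrative, not taken from the release notes), the improved operator can be called like so, assuming a wheel installed via `pip install xformers` and an available CUDA device:

```python
import torch
import xformers.ops as xops

# Shapes follow the [batch, seq_len, num_heads, head_dim] convention.
q = torch.randn(1, 1024, 8, 64, device="cuda", dtype=torch.float16)
k = torch.randn(1, 1024, 8, 64, device="cuda", dtype=torch.float16)
v = torch.randn(1, 1024, 8, 64, device="cuda", dtype=torch.float16)

# The best available backend (e.g. Flash-Attention or CUTLASS)
# is selected automatically for the given inputs.
out = xops.memory_efficient_attention(q, k, v)
print(out.shape)  # torch.Size([1, 1024, 8, 64])
```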

New Features

Improvements

  • Strip lineinfo from binaries, reducing the binary size [#549]
  • fMHA: Stricter input validation to avoid CUDA errors for unsupported inputs [#592] (see the sketch after this list)
  • fMHA/Flash-Attention: Updated to Dao-AILab/flash-attention@a1f49a2 with multiple changes from @TriDao that make the operator up to 20% faster
  • Updated triton dependency [#418]
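
To illustrate the stricter validation from [#592]: an unsupported input combination (a dtype mismatch, in this hypothetical example) should now surface as a Python exception before any kernel launches, rather than as an opaque CUDA error. The exact exception type and message below are assumptions, not quoted from the library.

```python
import torch
import xformers.ops as xops

q = torch.randn(1, 128, 8, 64, device="cuda", dtype=torch.float16)
k = torch.randn(1, 128, 8, 64, device="cuda", dtype=torch.float32)  # dtype mismatch
v = torch.randn(1, 128, 8, 64, device="cuda", dtype=torch.float16)

try:
    xops.memory_efficient_attention(q, k, v)
except ValueError as err:
    # Fails fast in Python instead of erroring out mid-kernel on the GPU.
    print(f"rejected up front: {err}")
```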

Bug fixes

  • Fixed compatibility with Python 3.7 [#541] - thanks to @susumuota
  • fMHA: Fixed strides for QKV gradients for cutlass attention [#535]
  • fMHA/Flash-Attention: Fixed the backward pass wrapper, where non-contiguous gradients could give wrong results [#548] (sketched below)
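
A sketch of the situation [#548] addresses (the shapes are illustrative): an upstream gradient that is non-contiguous in memory, e.g. produced by a transpose, reaching the Flash-Attention backward. With the fix, backward handles such gradients correctly:

```python
import torch
import xformers.ops as xops

q, k, v = (
    torch.randn(1, 128, 8, 64, device="cuda", dtype=torch.float16, requires_grad=True)
    for _ in range(3)
)
out = xops.memory_efficient_attention(q, k, v)

# A non-contiguous upstream gradient: allocate with heads and sequence
# swapped, then transpose back so the shape matches but the strides do not.
grad = torch.randn(1, 8, 128, 64, device="cuda", dtype=torch.float16).transpose(1, 2)
assert grad.shape == out.shape and not grad.is_contiguous()

out.backward(grad)  # q.grad, k.grad, v.grad are now computed correctly
```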
