Pre-built binary wheels are available for PyTorch 2.8.0.
### Added
- Support for the flash-attention package up to 2.8.2
- Speed improvements to `python -m xformers.profiler.find_slowest`
### Removed
- Removed the autograd backward pass for `merge_attentions`, as it is easy to use incorrectly.
- Attention biases are no longer `torch.Tensor` subclasses. This is no longer necessary for `torch.compile` to work, and was adding more complexity.
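For context on the `merge_attentions` entry: merging attention outputs computed over separate key/value chunks relies on each chunk's log-sum-exp (LSE) to reweight and combine the partial results. The sketch below illustrates that merging math in plain NumPy; the function names and signatures here are illustrative, not the xformers API.

```python
import numpy as np

def attention_with_lse(q, k, v):
    """Scaled dot-product attention that also returns the per-query
    log-sum-exp of the attention scores (needed for merging)."""
    scores = q @ k.T / np.sqrt(q.shape[-1])          # [Nq, Nk]
    scores_max = scores.max(axis=-1, keepdims=True)  # stabilize exp
    exp_scores = np.exp(scores - scores_max)
    out = (exp_scores @ v) / exp_scores.sum(axis=-1, keepdims=True)
    lse = np.log(exp_scores.sum(axis=-1)) + scores_max[:, 0]
    return out, lse

def merge_attention_chunks(outs, lses):
    """Combine per-chunk attention outputs into the output that full
    attention over all chunks' keys would have produced.

    outs: list of [Nq, D] partial outputs; lses: list of [Nq] LSEs.
    """
    lses = np.stack(lses)                       # [chunks, Nq]
    lse_total = np.log(np.exp(lses).sum(0))     # combined log-sum-exp
    weights = np.exp(lses - lse_total)          # per-chunk softmax weight
    return (weights[..., None] * np.stack(outs)).sum(0)
```

Because the merged result is a weighted sum in the exponentiated LSE domain, splitting the keys into chunks and merging reproduces full attention exactly (up to floating-point error), which is what makes chunked or distributed attention decompositions work.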