- torch updated to 2.6.0 (previously 2.4.1): 5-10% faster attention on Hopper GPUs.
- torch.compile no longer works together with BetterTransformer; we recommend disabling torch.compile for this model class (see the sketch below).
- flash-attn is now included in the Docker image for NVIDIA.
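For users serving a BetterTransformer-backed model on this release, the sketch below shows one way to opt out of torch.compile via the Python API. It assumes the `EngineArgs` fields `compile` and `bettertransformer` mirror the CLI's compile/bettertransformer switches; treat it as a starting point, not the definitive configuration.

```python
import asyncio

from infinity_emb import AsyncEmbeddingEngine, EngineArgs

# Sketch: serve a BetterTransformer-backed model with torch.compile disabled.
# The `compile` / `bettertransformer` fields are assumed to correspond to the
# CLI's compile and bettertransformer options in this release.
engine = AsyncEmbeddingEngine.from_args(
    EngineArgs(
        model_name_or_path="BAAI/bge-small-en-v1.5",
        engine="torch",
        bettertransformer=True,  # keep BetterTransformer kernels
        compile=False,           # avoid the torch.compile incompatibility noted above
    )
)

async def main() -> None:
    async with engine:  # starts and stops the engine
        embeddings, usage = await engine.embed(sentences=["Embed this sentence."])
        print(len(embeddings[0]), usage)

asyncio.run(main())
```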
What's Changed
- bump client version by @wirthual in #522
- add new st version by @michaelfeil in #523
- Version check step by @wirthual in #524
- README: add example for using local model with docker container by @wirthual in #528
- add vision client template by @wirthual in #526
- bump to 2.6 torch by @michaelfeil in #556
Full Changelog: 0.0.75...0.0.76