What's Changed
- add Flash-Attention + optimum-BetterTransformers by @michaelfeil in #20
- Improve real-time / sleep strategy: async/await for queues and result futures (slightly reduces latency) by @michaelfeil in #12
- add better FIFO queueing strategy - your requests now have an upper bound on how long they queue by @michaelfeil in #19
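
For readers unfamiliar with the pattern behind the queueing changes above, here is a minimal sketch of a FIFO request queue with awaited result futures. It is not this project's implementation: `worker`, `submit`, and `batch_size` are hypothetical names, and the "model call" is a placeholder that returns the text length. It only illustrates why awaiting the queue avoids sleep-based polling and why serving the oldest request first bounds queueing time by the work already ahead of it.

```python
import asyncio


async def worker(queue: asyncio.Queue, batch_size: int = 2) -> None:
    """Drain the queue in arrival order and resolve each request's future."""
    while True:
        # Wait until at least one request is queued (no busy-sleep polling).
        batch = [await queue.get()]
        # Take further requests that are already waiting, oldest first.
        while len(batch) < batch_size and not queue.empty():
            batch.append(queue.get_nowait())
        for text, future in batch:
            # Placeholder for the real model call (assumption): text length.
            future.set_result(len(text))
            queue.task_done()


async def submit(queue: asyncio.Queue, text: str) -> int:
    """Enqueue one request and await its result future."""
    future = asyncio.get_running_loop().create_future()
    await queue.put((text, future))
    return await future


async def main() -> list:
    queue = asyncio.Queue()
    task = asyncio.create_task(worker(queue))
    # Three concurrent requests; gather returns results in submission order.
    results = await asyncio.gather(*(submit(queue, t) for t in ["a", "bb", "ccc"]))
    task.cancel()
    return results


results = asyncio.run(main())
print(results)
```

Because `asyncio.Queue` is first-in, first-out, a newly arrived request can never overtake an older one in this sketch.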
Docs:
- Docs: Update README.md by @michaelfeil in #8
- Update description. Update pyproject.toml by @michaelfeil in #9
- Refactor model dir by @michaelfeil in #10
- Update README.md by @michaelfeil in #14
- Update README.md by @michaelfeil in #15
Full Changelog: 0.0.2rc0...0.0.3