Hey!
You can now optionally set the TOKEN_LIMIT environment variable to cap the length of the LLM input. Trimming the prompt this way helps smaller models perform better.
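A minimal sketch of using the new variable. TOKEN_LIMIT comes from this release; the value 2048 and the container invocation shown in the comment are illustrative assumptions, not part of the release notes.

```shell
# Cap LLM input at roughly 2048 tokens (the value here is illustrative).
export TOKEN_LIMIT=2048

# Pass it through to the container, e.g.:
#   docker run -e TOKEN_LIMIT="$TOKEN_LIMIT" <image>
echo "$TOKEN_LIMIT"
```

Leaving TOKEN_LIMIT unset keeps the previous behavior, since the variable is optional.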
What's Changed
- feat: add TOKEN_LIMIT environment variable for controlling maximum to… by @icereed in #161
- feat: update Docker workflow to build and push AMD64 and ARM64 images… by @icereed in #172
Full Changelog: v0.9.2...v0.10.0