What's Changed
- Add llama-2-70b GGML support by @oobabooga in #3285
- Bump bitsandbytes to 0.41.0 by @jllllll in #3258 -- improved speed
- Bump exllama module to 0.0.8 by @jllllll in #3256 -- expanded LoRA support
Bug fixes
Extensions
- [extensions/openai] Fixes for embeddings and tokens, better errors, docs update, plus images, logit_bias/logprobs, and more, by @matatonic in #3122