GPTQ Integration
You can now fine-tune GPTQ-quantized models with PEFT. For examples of using PEFT with a GPTQ model, see the colab notebook and the finetuning script.
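Below is a minimal sketch (not taken from the linked notebook or script) of attaching a LoRA adapter to a GPTQ-quantized checkpoint; the model id, target modules and hyperparameters are placeholders, and loading a GPTQ checkpoint assumes `auto-gptq` and `optimum` are installed.

```python
# Hedged sketch: fine-tuning a GPTQ-quantized model with LoRA.
# The model id and LoRA hyperparameters below are placeholders, not values from this release.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "TheBloke/Llama-2-7B-GPTQ"  # any GPTQ-quantized checkpoint

model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Prepare the quantized model for training (casts norms, enables gradient checkpointing, ...)
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # adjust to the base model's architecture
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
# From here, train as usual, e.g. with transformers.Trainer or TRL's SFTTrainer.
```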
Low-level API
This enables users and developers to use PEFT as a utility library, at least for injectable adapters (LoRA, IA3, AdaLoRA). It exposes an API to modify a model in place and inject the adapter layers; a usage sketch follows the list below.
- [`core`] PEFT refactor + introducing `inject_adapter_in_model` public method by @younesbelkada in #749
- [`Low-level-API`] Add docs about LLAPI by @younesbelkada in #836
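A minimal sketch of the low-level API on a toy module; the dummy model and LoRA hyperparameters are illustrative only.

```python
import torch
from peft import LoraConfig, inject_adapter_in_model

# Toy model used purely for illustration.
class DummyModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.embedding = torch.nn.Embedding(10, 10)
        self.linear = torch.nn.Linear(10, 10)
        self.lm_head = torch.nn.Linear(10, 10)

    def forward(self, input_ids):
        x = self.embedding(input_ids)
        x = self.linear(x)
        return self.lm_head(x)

lora_config = LoraConfig(
    r=64, lora_alpha=16, lora_dropout=0.1, bias="none", target_modules=["linear"]
)

model = DummyModel()
# Modifies the model in place: the `linear` layer is replaced by a LoRA layer.
model = inject_adapter_in_model(lora_config, model)

dummy_inputs = torch.LongTensor([[0, 1, 2, 3, 4, 5]])
dummy_outputs = model(dummy_inputs)
```

Note that, unlike `get_peft_model`, this returns a plain `torch.nn.Module` rather than a `PeftModel`, so PEFT utility methods such as `save_pretrained` are not attached.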
Support for XPU and NPU devices
You can now load and fine-tune PEFT adapters on more devices, including Intel XPU and Ascend NPU; a device-selection sketch follows the list below.
- Support XPU adapter loading by @abhilash1910 in #737
- Support Ascend NPU adapter loading by @statelesshz in #772
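A hedged sketch of loading an adapter onto one of these devices; the base model and adapter ids are placeholders, and device availability depends on the installed backend (e.g. intel-extension-for-pytorch for XPU, torch_npu for Ascend NPU).

```python
import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")  # placeholder base model
model = PeftModel.from_pretrained(base, "path/to/lora-adapter")   # placeholder adapter

# Pick an accelerator if the corresponding torch backend is present.
if hasattr(torch, "xpu") and torch.xpu.is_available():
    model = model.to("xpu")
elif hasattr(torch, "npu") and torch.npu.is_available():  # requires `import torch_npu`
    model = model.to("npu")
else:
    model = model.to("cpu")
```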
Mix-and-match LoRAs
Stable support and new ways of merging multiple LoRAs. There are currently three supported ways of merging LoRAs: `linear`, `svd` and `cat`; a usage sketch follows the entry below.
- Added additional parameters to mixing multiple LoRAs through SVD, added ability to mix LoRAs through concatenation by @kovalexal in #817
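A minimal sketch of combining two LoRA adapters with `add_weighted_adapter`; the base model and adapter paths are placeholders.

```python
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")  # placeholder base model
model = PeftModel.from_pretrained(base, "path/to/lora_A", adapter_name="lora_A")
model.load_adapter("path/to/lora_B", adapter_name="lora_B")

# combination_type can be "linear", "svd" or "cat".
model.add_weighted_adapter(
    adapters=["lora_A", "lora_B"],
    weights=[0.7, 0.3],
    adapter_name="mixed",
    combination_type="svd",
)
model.set_adapter("mixed")
```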
What's Changed
- Release version 0.5.0.dev0 by @pacman100 in #717
- Fix subfolder issue by @younesbelkada in #721
- Add falcon to officially supported LoRA & IA3 modules by @younesbelkada in #722
- revert change by @pacman100 in #731
- fix(pep561): include packaging type information by @aarnphm in #729
- [`Llama2`] Add disabling TP behavior by @younesbelkada in #728
- [`Patch`] patch trainable params for 4bit layers by @younesbelkada in #733
- FIX: Warning when initializing prompt encoder by @BenjaminBossan in #716
- ENH: Warn when disabling adapters and bias != 'none' by @BenjaminBossan in #741
- FIX: Disabling adapter works with modules_to_save by @BenjaminBossan in #736
- Updated Example in Class:LoraModel by @TianyiPeng in #672
- [`AdaLora`] Fix adalora inference issue by @younesbelkada in #745
- Add btlm to officially supported LoRA by @Trapper4888 in #751
- [`ModulesToSave`] add correct hook management for modules to save by @younesbelkada in #755
- Example notebooks for LoRA with custom models by @BenjaminBossan in #724
- Add tests for AdaLoRA, fix a few bugs by @BenjaminBossan in #734
- Add progressbar unload/merge by @BramVanroy in #753
- Support XPU adapter loading by @abhilash1910 in #737
- Support Ascend NPU adapter loading by @statelesshz in #772
- Allow passing inputs_embeds instead of input_ids by @BenjaminBossan in #757
- [`core`] PEFT refactor + introducing `inject_adapter_in_model` public method by @younesbelkada in #749
- Add adapter error handling by @BenjaminBossan in #800
- add lora default target module for codegen by @sywangyi in #787
- DOC: Update docstring of PeftModel.from_pretrained by @BenjaminBossan in #799
- fix crash when using torch.nn.DataParallel for LORA inference by @sywangyi in #805
- Peft model signature by @kiansierra in #784
- GPTQ Integration by @SunMarc in #771
- Only fail quantized Lora unload when actually merging by @BlackHC in #822
- Added additional parameters to mixing multiple LoRAs through SVD, added ability to mix LoRAs through concatenation by @kovalexal in #817
- TST: add test about loading custom models by @BenjaminBossan in #827
- Fix unbound error in ia3.py by @His-Wardship in #794
- [`Docker`] Fix gptq dockerfile by @younesbelkada in #835
- [`Tests`] Add 4bit slow training tests by @younesbelkada in #834
- [`Low-level-API`] Add docs about LLAPI by @younesbelkada in #836
- Type annotation fix by @vwxyzjn in #840
New Contributors
- @TianyiPeng made their first contribution in #672
- @Trapper4888 made their first contribution in #751
- @abhilash1910 made their first contribution in #737
- @statelesshz made their first contribution in #772
- @kiansierra made their first contribution in #784
- @BlackHC made their first contribution in #822
- @His-Wardship made their first contribution in #794
- @vwxyzjn made their first contribution in #840
Full Changelog: v0.4.0...v0.5.0