Highlights
Poly PEFT method
Parameter-efficient fine-tuning (PEFT) for cross-task generalization consists of pre-training adapters on a multi-task training set before few-shot adaptation to test tasks. Polytropon [Ponti et al., 2023] (`Poly`) jointly learns an inventory of adapters and a routing function that selects a (variable-size) subset of adapters for each task during both pre-training and few-shot adaptation. Put simply, you can think of it as a Mixture of Expert Adapters.
`MHR` (Multi-Head Routing) combines subsets of adapter parameters and outperforms `Poly` under a comparable parameter budget; by fine-tuning only the routing function and not the adapters (`MHR`-z), it achieves competitive performance with extreme parameter efficiency.
- Add Poly by @TaoSunVoyage in #1129
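A minimal sketch of how Poly might be configured with PEFT; the base model and the number of tasks/skills are illustrative assumptions, not values from the release:

```python
from transformers import AutoModelForSeq2SeqLM
from peft import PolyConfig, get_peft_model

# Example base model; any seq2seq (or causal) LM should work the same way.
base_model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-small")

config = PolyConfig(
    task_type="SEQ_2_SEQ_LM",
    poly_type="poly",  # Polytropon-style routing over the adapter inventory
    r=8,               # rank of each LoRA "skill" in the inventory
    n_tasks=4,         # number of tasks in the multi-task training set (illustrative)
    n_skills=4,        # size of the adapter (skill) inventory (illustrative)
    n_splits=1,        # values > 1 correspond to Multi-Head Routing (MHR)
)

model = get_peft_model(base_model, config)
model.print_trainable_parameters()
# During training, each batch should also carry a `task_ids` tensor so the
# router knows which task every example belongs to.
```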
LoRA improvements
You can now pass `all-linear` to the `target_modules` parameter of `LoraConfig` to target all linear layers, which the QLoRA paper showed performs better than targeting only the query and value attention layers.
- Add an option 'ALL' to include all linear layers as target modules by @SumanthRH in #1295
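A short sketch of the new option; the base model and LoRA hyperparameters are only examples:

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Example base model; substitute the model you are fine-tuning.
base_model = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")

config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules="all-linear",  # adapt every linear layer instead of hand-picking module names
    task_type="CAUSAL_LM",
)

model = get_peft_model(base_model, config)
model.print_trainable_parameters()
```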
Embedding layers of base models are now automatically saved when they are resized during fine-tuning with PEFT approaches like LoRA. This enables extending the tokenizer's vocabulary with special tokens, which is a common use case when doing the following (a short sketch follows below):
- Instruction finetuning with new tokens being added such as <|user|>, <|assistant|>, <|system|>, <|im_end|>, <|im_start|>, </s>, <s> to properly format the conversations
- Finetuning on a specific language wherein language-specific tokens are added, e.g., Korean tokens being added to the vocabulary for finetuning an LLM on Korean datasets.
- Instruction finetuning to return outputs in a certain format and enable agent behaviour, with new tokens such as <|FUNCTIONS|>, <|BROWSE|>, <|TEXT2IMAGE|>, <|ASR|>, <|TTS|>, <|GENERATECODE|>, <|RAG|>.
A good blog post to learn more about this is https://www.philschmid.de/fine-tune-llms-in-2024-with-trl.
- save the embeddings even when they aren't targetted but resized by @pacman100 in #1383
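A minimal sketch of the add-tokens-then-resize workflow, assuming a Mistral base model and LoRA on all linear layers; the model id, tokens, and output path are placeholders:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

# Example model id; the pattern is the same for other base models.
model_id = "mistralai/Mistral-7B-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Add chat-formatting special tokens and resize the embedding matrix to match.
tokenizer.add_special_tokens(
    {"additional_special_tokens": ["<|user|>", "<|assistant|>", "<|system|>"]}
)
model.resize_token_embeddings(len(tokenizer))

peft_model = get_peft_model(
    model, LoraConfig(r=16, target_modules="all-linear", task_type="CAUSAL_LM")
)

# Per this release, save_pretrained detects that the embeddings were resized and
# stores them alongside the adapter weights, so the adapter loads correctly later.
peft_model.save_pretrained("mistral-lora-with-new-tokens")
```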
New option `use_rslora` in `LoraConfig`. Use it for ranks greater than 32 and see an increase in fine-tuning performance (the same or better performance for ranks lower than 32 as well).
- Added the option to use the corrected scaling factor for LoRA, based on new research. by @Damjan-Kalajdzievski in #1244
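A small configuration sketch; the rank and alpha values are illustrative:

```python
from peft import LoraConfig

# rsLoRA changes the adapter scaling from lora_alpha / r to lora_alpha / sqrt(r),
# which stabilizes fine-tuning at higher ranks.
config = LoraConfig(
    r=64,
    lora_alpha=16,
    use_rslora=True,
    target_modules="all-linear",
    task_type="CAUSAL_LM",
)
```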
Documentation improvements
- Refactoring and updating of the concept guides. [docs] Concept guides by @stevhliu in #1269
- Improving task guides to focus more on how to use different PEFT methods and their nuances instead of focusing on different types of tasks. It condenses the individual guides into a single one to highlight the commonalities and differences, and refers to existing docs to avoid duplication. [docs] Task guides by @stevhliu in #1332
- DOC: Update docstring for the config classes by @BenjaminBossan in #1343
- LoftQ: edit README.md and example files by @yxli2123 in #1276
- [Docs] make add_weighted_adapter example clear in the docs. by @sayakpaul in #1353
- DOC Add PeftMixedModel to API docs by @BenjaminBossan in #1354
- [docs] Docstring link by @stevhliu in #1356
- QOL improvements and doc updates by @pacman100 in #1318
- Doc about AdaLoraModel.update_and_allocate by @kuronekosaiko in #1341
- DOC: Improve target modules description by @BenjaminBossan in #1290
- DOC Troubleshooting for unscaling error with fp16 by @BenjaminBossan in #1336
- DOC Extending the vocab and storing embeddings by @BenjaminBossan in #1335
- Improve documentation for the `all-linear` flag by @SumanthRH in #1357
- Fix various typos in LoftQ docs. by @arnavgarg1 in #1408
What's Changed
- Bump version to 0.7.2.dev0 post release by @BenjaminBossan in #1258
- FIX Error in log_reports.py by @BenjaminBossan in #1261
- Fix ModulesToSaveWrapper getattr by @zhangsheng377 in #1238
- TST: Revert device_map for AdaLora 4bit GPU test by @BenjaminBossan in #1266
- remove a duplicated description in peft BaseTuner by @butyuhao in #1271
- Added the option to use the corrected scaling factor for LoRA, based on new research. by @Damjan-Kalajdzievski in #1244
- feat: add apple silicon GPU acceleration by @NripeshN in #1217
- LoftQ: Allow quantizing models loaded on the CPU for LoftQ initialization by @hiyouga in #1256
- LoftQ: edit README.md and example files by @yxli2123 in #1276
- TST: Extend LoftQ tests to check CPU initialization by @BenjaminBossan in #1274
- Refactor and a couple of fixes for adapter layer updates by @BenjaminBossan in #1268
- [`Tests`] Add bitsandbytes installed from source on new docker images by @younesbelkada in #1275
- TST: Enable LoftQ 8bit tests by @BenjaminBossan in #1279
- [`bnb`] Add bnb nightly workflow by @younesbelkada in #1282
- Fixed several errors in StableDiffusion adapter conversion script by @kovalexal in #1281
- [docs] Concept guides by @stevhliu in #1269
- DOC: Improve target modules description by @BenjaminBossan in #1290
- [`bnb-nightly`] Address final comments by @younesbelkada in #1287
- [BNB] Fix bnb dockerfile for latest version by @SunMarc in #1291
- fix fsdp auto wrap policy by @pacman100 in #1302
- [BNB] fix dockerfile for single gpu by @SunMarc in #1305
- Fix bnb lora layers not setting active adapter by @tdrussell in #1294
- Mistral IA3 config defaults by @pacman100 in #1316
- fix the embedding saving for adaption prompt by @pacman100 in #1314
- fix diffusers tests by @pacman100 in #1317
- FIX Use torch.long instead of torch.int in LoftQ for PyTorch versions <2.x by @BenjaminBossan in #1320
- Extend merge_and_unload to offloaded models by @blbadger in #1190
- Add an option 'ALL' to include all linear layers as target modules by @SumanthRH in #1295
- Refactor dispatching logic of LoRA layers by @BenjaminBossan in #1319
- Fix bug when load the prompt tuning in inference. by @yileld in #1333
- DOC Troubleshooting for unscaling error with fp16 by @BenjaminBossan in #1336
- ENH: Add attribute to show targeted module names by @BenjaminBossan in #1330
- fix some args desc by @zspo in #1338
- Fix logic in target module finding by @s-k-yx in #1263
- Doc about AdaLoraModel.update_and_allocate by @kuronekosaiko in #1341
- DOC: Update docstring for the config classes by @BenjaminBossan in #1343
- fix `prepare_inputs_for_generation` logic for Prompt Learning methods by @pacman100 in #1352
- QOL improvements and doc updates by @pacman100 in #1318
- New transformers caching ETA now v4.38 by @BenjaminBossan in #1348
- FIX Setting active adapter for quantized layers by @BenjaminBossan in #1347
- DOC Extending the vocab and storing embeddings by @BenjaminBossan in #1335
- [Docs] make add_weighted_adapter example clear in the docs. by @sayakpaul in #1353
- DOC Add PeftMixedModel to API docs by @BenjaminBossan in #1354
- Add Poly by @TaoSunVoyage in #1129
- [docs] Docstring link by @stevhliu in #1356
- Added missing getattr dunder methods for mixed model by @kovalexal in #1365
- Handle resizing of embedding layers for AutoPeftModel by @pacman100 in #1367
- account for the new merged/unmerged weight to perform the quantization again by @pacman100 in #1370
- add mixtral in LoRA mapping by @younesbelkada in #1380
- save the embeddings even when they aren't targetted but resized by @pacman100 in #1383
- Improve documentation for the `all-linear` flag by @SumanthRH in #1357
- Fix LoRA module mapping for Phi models by @arnavgarg1 in #1375
- [docs] Task guides by @stevhliu in #1332
- Add generic PeftConfig constructor from kwargs by @sfriedowitz in #1398
- Fix various typos in LoftQ docs. by @arnavgarg1 in #1408
- Release: v0.8.0 by @pacman100 in #1406
New Contributors
- @butyuhao made their first contribution in #1271
- @Damjan-Kalajdzievski made their first contribution in #1244
- @NripeshN made their first contribution in #1217
- @hiyouga made their first contribution in #1256
- @tdrussell made their first contribution in #1294
- @blbadger made their first contribution in #1190
- @yileld made their first contribution in #1333
- @s-k-yx made their first contribution in #1263
- @kuronekosaiko made their first contribution in #1341
- @TaoSunVoyage made their first contribution in #1129
- @arnavgarg1 made their first contribution in #1375
- @sfriedowitz made their first contribution in #1398
Full Changelog: v0.7.1...v0.8.0