The one-click-installers have been merged into the repository. Migration instructions can be found here.
The updated one-click installer has an installation size several GB smaller and a more reliable update procedure.
What's Changed
- sd_api_pictures: Widen sliders for image size minimum and maximum by @GuizzyQC in #3326
- Bump exllama module to 0.0.9 by @jllllll in #3338
- Add an extension that makes chat replies longer by @oobabooga in #3363
- add chat instruction config for BaiChuan-chat model by @CrazyShipOne in #3332
- [extensions/openai] +Array input (batched), +Fixes by @matatonic in #3309 (see the batched request sketch after this list)
- Add a scrollbar to notebook/default textboxes, improve chat scrollbar style by @jparmstr in #3403
- Add auto_max_new_tokens parameter by @oobabooga in #3419 (see the API payload sketch after this list)
- Add the --cpu option for llama.cpp to prevent CUDA from being used by @oobabooga in #3432
- Use character settings from API properties if present by @rafa-9 in #3428
- Add standalone Dockerfile for NVIDIA Jetson by @toolboc in #3336
- More models: +StableBeluga2 by @matatonic in #3415
- [extensions/openai] include content-length for json replies by @matatonic in #3416
- Fix llama.cpp truncation by @jparmstr in #3400
- Remove unnecessary chat.js by @missionfloyd in #3445
- Add back silero preview (originally by @missionfloyd) by @oobabooga in #3446
- Add SSL certificate support by @oobabooga in #3453
- Bump bitsandbytes to 0.41.1 by @jllllll in #3457
- [Bug fix] Remove HTML tags from the prompt sent to Stable Diffusion by @SodaPrettyCold in #3151
- Fix: Mirostat fails on models split across multiple GPUs. by @Ph0rk0z in #3465
- Bump exllama wheels to 0.0.10 by @jllllll in #3467
- Create logs dir if missing when saving history by @jllllll in #3462
- Fix chat message order by @missionfloyd in #3461
- Add Classifier Free Guidance (CFG) for Transformers/ExLlama by @oobabooga in #3325 (see the CFG formula sketch after this list)
- Refactor everything by @oobabooga in #3481
- Use chat_instruct_command in API by @jllllll in #3482
- Make dockerfile respect specified cuda version by @sammcj in #3474
- Fix a typo that caused the llama.cpp model parameters to display "rms_norm_eps" incorrectly by @berkut1 in #3494
- Add option for named cloudflare tunnels by @Fredddi43 in #3364
- Fix superbooga when using regenerate by @oderwat in #3362
- Added the logic for starchat model series by @giprime in #3185
- Streamline GPTQ-for-LLaMa support by @jllllll in #3526
- Add Vicuna-v1.5 detection by @berkut1 in #3524
- ctransformers: another attempt by @cal066 in #3313
- Bump ctransformers wheel version by @jllllll in #3558
- ctransformers: move thread and seed parameters by @cal066 in #3543
- Unify the 3 interface modes by @oobabooga in #3554
- Various ctransformers fixes by @netrunnereve in #3556
- Add "save defaults to settings.yaml" button by @oobabooga in #3574
- Add the --disable_exllama option for AutoGPTQ by @clefever in #3545
- ctransformers: Fix up model_type name consistency by @cal066 in #3567
- Add a "Show controls" button to chat UI by @oobabooga in #3590
- Improved chat scrolling by @oobabooga in #3601
- fixes error when not specifying tunnel id by @ausboss in #3606
- Fix print CSS by @missionfloyd in #3608
- Bump llama-cpp-python by @oobabooga in #3610
- Bump llama_cpp_python_cuda to 0.1.78 by @jllllll in #3614
- Refactor the training tab by @oobabooga in #3619
- llama.cpp: make Stop button work with streaming disabled by @cebtenzzre in #3620
- Unescape last message by @missionfloyd in #3623
- Improve readability of download-model.py by @Thutmose3 in #3497
- Add probability dropdown to perplexity_colors extension by @SeanScripts in #3148
- Add a simple logit viewer by @oobabooga in #3636
- Fix whitespace formatting in perplexity_colors extension. by @tdrussell in #3643
- ctransformers: add mlock and no-mmap options by @cal066 in #3649
- Update requirements.txt by @tkbit in #3651
- Add missing extensions to Dockerfile by @sammcj in #3544
- Implement CFG for ExLlama_HF by @oobabooga in #3666
- Add CFG to llamacpp_HF (second attempt) by @oobabooga in #3678
- ctransformers: gguf support by @cal066 in #3685
- Fix ctransformers threads auto-detection by @jllllll in #3688
- Use separate llama-cpp-python packages for GGML support by @jllllll in #3697
- GGUF by @oobabooga in #3695
- Fix ctransformers model unload by @marella in #3711
- Add ffmpeg to the Docker image by @kelvie in #3664
- accept floating-point alpha value on the command line by @cebtenzzre in #3712
- Bump llama-cpp-python to 0.1.81 by @jllllll in #3716
- Make it possible to scroll during streaming by @oobabooga in #3721
- Bump llama-cpp-python to 0.1.82 by @jllllll in #3730
- Bump ctransformers to 0.2.25 by @jllllll in #3740
- Add max_tokens_second param by @oobabooga in #3533
- Update requirements.txt by @VishwasKukreti in #3725
- Update llama.cpp.md by @q5sys in #3702
- Bump llama-cpp-python to 0.1.83 by @jllllll in #3745
- Update download-model.py (Allow single file download) by @bet0x in #3732
- Allow downloading single file from UI by @missionfloyd in #3737
- Bump exllama to 0.0.14 by @jllllll in #3758
- Bump llama-cpp-python to 0.1.84 by @jllllll in #3854
- Update transformers requirement from ==4.32.* to ==4.33.* by @dependabot in #3865
- Bump exllama to 0.1.17 by @jllllll in #3847
- Exllama new rope settings by @Ph0rk0z in #3852
- fix lora training with alpaca_lora_4bit by @johnsmith0031 in #3853
- Improve instructions for CPUs without AVX2 by @netrunnereve in #3786
- improve docker builds by @sammcj in #3715
- Read GGUF metadata by @oobabooga in #3873 (see the header-parsing sketch after this list)
- Add ExLlamaV2 and ExLlamav2_HF loaders by @oobabooga in #3881
- silero_tts: Add language option by @missionfloyd in #3878
- Bump optimum from 1.12.0 to 1.13.1 by @dependabot in #3872
- Handle Chunked Transfer Encoding in `openai` Extension for Streaming Requests by @mcc311 in #3870
- add pygmalion-2 and mythalion support by @netrunnereve in #3821
- Read more GGUF metadata (scale_linear and freq_base) by @berkut1 in #3877
- Bump llama-cpp-python to 0.1.85 by @jllllll in #3887
- Bump ctransformers to 0.2.27 by @cal066 in #3893
- Bump exllamav2 from 0.0.0 to 0.0.1 by @dependabot in #3896
- Fix NTK (alpha) and RoPE scaling for exllamav2 and exllamav2_HF by @Panchovix in #3897
- Reorganize chat buttons by @oobabooga in #3892
- Make the chat input expand upwards by @oobabooga in #3920
- Fix TheEncrypted777 theme in light mode by @missionfloyd in #3917
- Fix pydantic version conflict in elevenlabs extension by @jllllll in #3927
- Allow custom tokenizer for llamacpp_HF loader by @JohanAR in #3941
- Add customizable ban tokens by @sALTaccount in #3899
- Better solution to chat UI by @missionfloyd in #3947
- Fix exllama tokenizers by @sALTaccount in #3954
- Adjust model variable if it includes a hf URL already by @kalomaze in #3919
- Fix issues #3822 and #3839 by @Touch-Night in #3827
- Add whisper api support for OpenAI extension by @wizd in #3958
- Add speechrecognition dependency for OpenAI extension by @fablerq in #3959
- token probs for non HF loaders by @sALTaccount in #3957
- Training PRO extension by @FartyPants in #3961
- Training extension - added target selector by @FartyPants in #3969
- Fix unexpected extensions load after gradio restart by @Touch-Night in #3965
- Update requirements.txt - Bump ExLlamav2 to v0.0.2 by @Thireus in #3970
- Simplified ExLlama cloning instructions and failure message by @jamesbraza in #3972
- Move hover menu shortcuts to right side by @missionfloyd in #3951
- [extensions/openai] load extension settings via `settings.yaml` by @wangcx18 in #3953
- Update accelerate requirement from ==0.22.* to ==0.23.* by @dependabot in #3981
- llama.cpp: fix ban_eos_token by @cebtenzzre in #3987
- Bump llama-cpp-python to 0.2.6 by @jllllll in #3982
- Stops the generation immediately when using the "Maximum number of tokens/second" setting by @BadisG in #3952
- Multiple histories for each character by @oobabooga in #4022
- Various one-click-installer updates and fixes by @jllllll in #4029
- Move one-click-installers into the repository by @oobabooga in #4028
- Training PRO extension update by @FartyPants in #4036
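
To illustrate the batched array input added to the openai extension in #3309: the endpoint mirrors the OpenAI completions API, so a list of prompts can be sent in a single request. This is a minimal sketch, assuming a default local install; the base URL and port are assumptions, not something this release pins down.

```python
# A sketch, not the extension's documented client: send a batched
# completion request to the OpenAI-compatible endpoint. The base URL
# and port below are assumptions for a default local install.
import requests

API_BASE = "http://127.0.0.1:5001/v1"  # assumed default for the openai extension

payload = {
    "prompt": [  # array input: both prompts are processed in one request
        "Question: What is 2 + 2?\nAnswer:",
        "Question: Name a primary color.\nAnswer:",
    ],
    "max_tokens": 32,
    "temperature": 0.7,
}

resp = requests.post(f"{API_BASE}/completions", json=payload, timeout=60)
resp.raise_for_status()
for choice in resp.json()["choices"]:
    print(choice["index"], choice["text"].strip())
```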
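Several of the new generation controls (auto_max_new_tokens from #3419, max_tokens_second from #3533, custom token bans from #3899) are plain request parameters in the built-in API. A minimal sketch, assuming the default blocking endpoint and that the parameter names match the UI settings:

```python
# A sketch, assuming the built-in blocking API at its default port and
# that these parameter names match the UI settings; adjust as needed.
import requests

payload = {
    "prompt": "Write a haiku about GGUF files.",
    "max_new_tokens": 200,
    "auto_max_new_tokens": True,  # #3419: expand max_new_tokens automatically
    "max_tokens_second": 5,       # #3533: throttle streaming to at most N tokens/second
    "custom_token_bans": "",      # #3899: comma-separated token IDs to never sample
}

resp = requests.post("http://127.0.0.1:5000/api/v1/generate", json=payload, timeout=120)
resp.raise_for_status()
print(resp.json()["results"][0]["text"])
```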
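The CFG work in #3325, #3666, and #3678 applies the standard classifier-free guidance rule to the logits at each decoding step: the conditional and unconditional (negative-prompt) logits are combined with a guidance scale. A minimal sketch of that formula, illustrative only and not the repository's exact code:

```python
# The standard classifier-free guidance combination rule, sketched with
# PyTorch tensors; illustrative only, not the repository's exact code.
import torch

def cfg_logits(cond_logits: torch.Tensor,
               uncond_logits: torch.Tensor,
               guidance_scale: float) -> torch.Tensor:
    # guidance_scale = 1.0 reduces to the ordinary conditional logits.
    return uncond_logits + guidance_scale * (cond_logits - uncond_logits)

cond = torch.randn(1, 32000)    # logits given the full prompt
uncond = torch.randn(1, 32000)  # logits given the negative (or empty) prompt
print(cfg_logits(cond, uncond, 1.5).shape)  # torch.Size([1, 32000])
```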
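For the GGUF metadata work in #3873 and #3877: GGUF stores its metadata in a small binary header at the start of the file. A minimal sketch that reads just the fixed-size fields, based on the public GGUF spec (version 2+, where the counts are 64-bit) rather than the loader's actual implementation:

```python
# A sketch based on the public GGUF spec (version 2+, little-endian,
# 64-bit counts); not the loader's actual metadata reader.
import struct
import sys

def read_gguf_header(path: str) -> dict:
    with open(path, "rb") as f:
        magic = f.read(4)
        if magic != b"GGUF":
            raise ValueError(f"{path} is not a GGUF file (magic={magic!r})")
        (version,) = struct.unpack("<I", f.read(4))
        tensor_count, kv_count = struct.unpack("<QQ", f.read(16))
    return {"version": version, "tensors": tensor_count, "metadata_kvs": kv_count}

if __name__ == "__main__":
    print(read_gguf_header(sys.argv[1]))
```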
New Contributors
- @CrazyShipOne made their first contribution in #3332
- @jparmstr made their first contribution in #3403
- @rafa-9 made their first contribution in #3428
- @toolboc made their first contribution in #3336
- @SodaPrettyCold made their first contribution in #3151
- @sammcj made their first contribution in #3474
- @berkut1 made their first contribution in #3494
- @Fredddi43 made their first contribution in #3364
- @oderwat made their first contribution in #3362
- @giprime made their first contribution in #3185
- @cal066 made their first contribution in #3313
- @clefever made their first contribution in #3545
- @ausboss made their first contribution in #3606
- @Thutmose3 made their first contribution in #3497
- @tdrussell made their first contribution in #3643
- @tkbit made their first contribution in #3651
- @marella made their first contribution in #3711
- @kelvie made their first contribution in #3664
- @VishwasKukreti made their first contribution in #3725
- @q5sys made their first contribution in #3702
- @bet0x made their first contribution in #3732
- @johnsmith0031 made their first contribution in #3853
- @mcc311 made their first contribution in #3870
- @JohanAR made their first contribution in #3941
- @sALTaccount made their first contribution in #3899
- @kalomaze made their first contribution in #3919
- @Touch-Night made their first contribution in #3827
- @wizd made their first contribution in #3958
- @fablerq made their first contribution in #3959
- @jamesbraza made their first contribution in #3972
- @wangcx18 made their first contribution in #3953
- @BadisG made their first contribution in #3952
Full Changelog: v1.5...v1.6