## Highlights
🏃‍♀️ Faster inference. We've shipped server-side changes that improve inference speed by up to 30%. This is the result of profiling the server's inference performance (see details in #224 and #225). The public swarm will become faster once everyone upgrades to the latest Petals version and restarts their servers.
🐞 Prompt-tuning bug fixes. We've shipped bug fixes for prompt-tuning notebooks (see details in #231).
🧑‍🏫 New pretrained model. We've added a new model, BLOOMZ-176B by BigScience, to the public swarm. You can run it (or host its blocks) by specifying bigscience/bloomz-petals as the model name.
- BLOOMZ is a version of BLOOM fine-tuned to follow human instructions in the zero-shot regime. See details in its model card and paper.
- The chatbot app now uses BLOOMZ by default. You can ask it to generate text or code, or to perform various tasks. It responds better than the regular BLOOM, which often went off-topic instead of actually doing the task you asked for.
## What's Changed
- Choose --num_blocks automatically for all models by @borzunov in #217
- Add one more link to the "Getting started" tutorial by @borzunov in #218
- Mention BLOOMZ in readme by @borzunov in #221
- Fix a typo in error message. by @zsc in #227
- Merge inference pools into one to increase inference speed by @justheuristic in #225
- Add citation to readme by @Muhtasham in #219
- Fix dtype error in fine-tuning notebooks by @artek0chumak in #231
- Prompt-tuning notebooks: suggest to use a smaller model for faster prototyping by @borzunov in #234
- Bump version to 1.1.2 by @borzunov in #244
## New Contributors
- @zsc made their first contribution in #227
- @Muhtasham made their first contribution in #219
**Full Changelog**: v1.1.1...v1.1.2