## Highlights
🏃‍♀️ Faster inference. We've shipped server-side changes that improve inference speed by up to 30%. This is the result of profiling the server's inference performance (see details in #224 and #225). The public swarm will become faster once everyone upgrades to the latest Petals version and restarts their servers.
🐞 Prompt-tuning bug fixes. We've shipped bug fixes for prompt-tuning notebooks (see details in #231).
🧑‍🏫 New pretrained model. We've added a new model, BLOOMZ-176B by BigScience, to the public swarm. You can run it (or host its blocks) by specifying bigscience/bloomz-petals as the model name.
- BLOOMZ is a version of BLOOM fine-tuned to follow human instructions in the zero-shot regime. See details in its model card and paper.
- The chatbot app now uses BLOOMZ by default. You can ask it to generate text or code, or to perform various tasks. It responds better than the regular BLOOM, which often went off-topic instead of actually doing the task you asked for.
## What's Changed
- Choose --num_blocks automatically for all models by @borzunov in #217
- Add one more link to the "Getting started" tutorial by @borzunov in #218
- Mention BLOOMZ in readme by @borzunov in #221
- Fix a typo in error message. by @zsc in #227
- Merge inference pools into one to increase inference speed by @justheuristic in #225
- Add citation to readme by @Muhtasham in #219
- Fix dtype error in fine-tuning notebooks by @artek0chumak in #231
- Prompt-tuning notebooks: suggest to use a smaller model for faster prototyping by @borzunov in #234
- Bump version to 1.1.2 by @borzunov in #244
## New Contributors
- @zsc made their first contribution in #227
- @Muhtasham made their first contribution in #219
**Full Changelog**: v1.1.1...v1.1.2