bigscience-workshop/petals v1.1.3 on GitHub

Highlights

🐞 Bug fixes. We have fixed a variety of minor issues related to timeout errors in the client, fine-tuning, and tensor parallelism.

⚙️ New options in the client. Added allowed_servers and max_retries options:

allowed_servers restricts the set of servers a client can use for its requests (e.g., to use only the servers you trust to process your data).
max_retries limits the number of retries a client makes before raising an exception (previously, clients kept retrying indefinitely).
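
To illustrate what these two options mean, here is a minimal, self-contained sketch in plain Python. It is not the Petals implementation: the helper names `filter_servers` and `call_with_retries`, the exponential backoff schedule, and the 1-second starting delay are all assumptions made for illustration. Only the 15-minute cap on the delay between retries comes from this release's changelog.

```python
import time
from typing import Callable, Iterable, List, Optional, Sequence, TypeVar

T = TypeVar("T")


def filter_servers(
    available: Iterable[str],
    allowed_servers: Optional[Sequence[str]] = None,
) -> List[str]:
    """Hypothetical helper: keep only servers listed in allowed_servers.

    None means "no restriction", so every available server is kept.
    """
    if allowed_servers is None:
        return list(available)
    allowed = set(allowed_servers)
    return [peer_id for peer_id in available if peer_id in allowed]


def call_with_retries(
    request: Callable[[], T],
    max_retries: Optional[int] = None,
    min_delay: float = 1.0,        # assumed starting delay, in seconds
    max_delay: float = 15 * 60.0,  # delay between retries capped at 15 min
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Hypothetical helper: retry a failing request with capped backoff.

    With max_retries=None the loop retries forever (the old behaviour
    described above); otherwise the last exception is re-raised once
    max_retries retries have been used up.
    """
    failures = 0
    while True:
        try:
            return request()
        except Exception:
            failures += 1
            if max_retries is not None and failures > max_retries:
                raise
            # Assumed schedule: double the delay after each failure, up to the cap.
            sleep(min(min_delay * 2 ** (failures - 1), max_delay))
```

In this sketch, `max_retries=3` means one initial attempt plus up to three retries, and passing a custom `sleep` callable makes the retry schedule easy to inspect without actually waiting.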

📚 FAQ. We have published an FAQ page that covers common questions about running clients and servers, as well as troubleshooting for common problems.

Fix typo in prompt-tuning-sst2.ipynb by @borzunov in #245
Minor changes to examples/prompt-tuning notebooks by @justheuristic in #247
Fix examples/sst, add cls_model embeddings by @justheuristic in #248
Fix TP crashing when hypo_ids are used by @borzunov in #249
Add allowed_servers, max_retries options to the client, improve logs by @borzunov in #235
Lower payload size threshold for stream handlers by @borzunov in #251
Improve reachability logs by @borzunov in #253
Link FAQ in readme by @borzunov in #260
Show visible maddrs for public swarm too by @borzunov in #263
Limit max delay between retries to 15 min by @borzunov in #264
Use get_logger(__name__) instead of get_logger(__file__) by @borzunov in #265
Improve "connect your GPU" message by @borzunov in #266
Fix use_chunked_forward="auto" on non-x86_64 machines by @borzunov in #267
Use inference mode in _MergedInferenceStep by @justheuristic in #275
Increase default request_timeout by @borzunov in #276

Full Changelog: v1.1.2...v1.1.3