Highlights
🐞 Bug fixes. We have fixed a variety of minor issues related to timeout errors in the client, fine-tuning, and tensor parallelism.
⚙️ New options in the client. Added allowed_servers and max_retries options:
- allowed_servers restricts the set of servers a client can use for its requests (e.g., to use only the servers you trust to process your data).
- max_retries limits the number of retries a client makes before raising an exception (previously, clients kept retrying indefinitely).
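
The semantics of the two options can be illustrated with a small standalone sketch. This is not the client's actual implementation: the helper names filter_servers and call_with_retries, the round-robin server choice, and the use of ConnectionError are all assumptions made for illustration. Only the option names allowed_servers and max_retries come from this release.

```python
from typing import Callable, Iterable, List, Optional, Sequence


def filter_servers(available: Iterable[str],
                   allowed_servers: Optional[Sequence[str]]) -> List[str]:
    """Keep only servers on the allow-list; None means no restriction."""
    if allowed_servers is None:
        return list(available)
    allowed = set(allowed_servers)
    return [peer for peer in available if peer in allowed]


def call_with_retries(request: Callable[[str], str],
                      servers: Sequence[str],
                      max_retries: Optional[int]) -> str:
    """Send a request, retrying on failure at most max_retries times.

    max_retries=None retries indefinitely (the behavior before this release).
    Servers are tried in round-robin order (an assumption of this sketch).
    """
    if not servers:
        raise RuntimeError("no allowed servers are available")
    attempt = 0
    while True:
        server = servers[attempt % len(servers)]
        try:
            return request(server)
        except ConnectionError:
            # Give up once the retry budget is exhausted.
            if max_retries is not None and attempt >= max_retries:
                raise
            attempt += 1
```

With max_retries=2, the sketch makes one initial attempt plus up to two retries and then re-raises the last error instead of looping forever.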
📚 FAQ. We have released the FAQ page that covers common questions about running clients and servers, as well as troubleshooting common problems.
What's Changed
- Fix typo in prompt-tuning-sst2.ipynb by @borzunov in #245
- Minor changes to examples/prompt-tuning notebooks by @justheuristic in #247
- Fix examples/sst, add cls_model embeddings by @justheuristic in #248
- Fix TP crashing when hypo_ids are used by @borzunov in #249
- Add allowed_servers, max_retries options to the client, improve logs by @borzunov in #235
- Lower payload size threshold for stream handlers by @borzunov in #251
- Improve reachability logs by @borzunov in #253
- Link FAQ in readme by @borzunov in #260
- Show visible maddrs for public swarm too by @borzunov in #263
- Limit max delay between retries to 15 min by @borzunov in #264
- Use get_logger(__name__) instead of get_logger(__file__) by @borzunov in #265
- Improve "connect your GPU" message by @borzunov in #266
- Fix use_chunked_forward="auto" on non-x86_64 machines by @borzunov in #267
- Use inference mode in _MergedInferenceStep by @justheuristic in #275
- Increase default request_timeout by @borzunov in #276
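
As a rough illustration of the "Limit max delay between retries to 15 min" entry, the sketch below generates retry delays that grow exponentially and are clamped to a ceiling. Only the 15-minute cap comes from these notes. The function name retry_delays, the 1-second starting delay, and the doubling factor are assumptions and may not match what the client does.

```python
from typing import Iterator

MAX_DELAY = 15 * 60  # seconds; the 15-minute cap named in the changelog entry


def retry_delays(min_delay: float = 1.0,
                 factor: float = 2.0,
                 max_delay: float = MAX_DELAY) -> Iterator[float]:
    """Yield exponentially growing delays that never exceed max_delay.

    min_delay and factor are hypothetical defaults chosen for this sketch.
    """
    delay = min_delay
    while True:
        yield min(delay, max_delay)
        # Clamp the stored value too, so it cannot grow without bound.
        delay = min(delay * factor, max_delay)
```

Without a cap, a doubling delay passes 15 minutes after about ten failures and keeps growing. With the cap, a client that has been failing for a long time still retries at least once every 15 minutes.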
Full Changelog: v1.1.2...v1.1.3