bigscience-workshop/petals v1.1.5: Faster fine-tuning, bug fixes, and more

Highlights

⏱ Faster fine-tuning. Fine-tuning now uses about half as much traffic (tensors are sent in bfloat16 by default) and builds routes using a heuristic that maximizes the swarm's throughput. This should address the timeout errors that could occur during fine-tuning.
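
As a rough illustration of where the ~2x saving comes from, the sketch below compares the payload size of one activation tensor in float32 (4 bytes per element) and bfloat16 (2 bytes per element). The helper `tensor_bytes` and the tensor shape are hypothetical, chosen only for the arithmetic; real traffic also includes protocol overhead that is not modeled here.

```python
# Bytes used per element by each dtype (standard sizes).
BYTES_PER_ELEMENT = {"float32": 4, "bfloat16": 2}


def tensor_bytes(batch_size: int, seq_len: int, hidden_size: int, dtype: str) -> int:
    """Raw payload size of a [batch, seq, hidden] tensor (hypothetical helper)."""
    return batch_size * seq_len * hidden_size * BYTES_PER_ELEMENT[dtype]


# Illustrative shape only; not taken from the release notes.
fp32_size = tensor_bytes(1, 128, 14336, "float32")
bf16_size = tensor_bytes(1, 128, 14336, "bfloat16")

print(f"float32:  {fp32_size} bytes")
print(f"bfloat16: {bf16_size} bytes")
print(f"ratio:    {fp32_size / bf16_size}")
```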

🐞 Bug fixes. On servers, this release fixes out-of-memory errors and network throughput evaluations that could freeze. On clients, it fixes slicing of RemoteSequential and no longer silently ignores unsupported .generate() kwargs (an error is raised instead). It also fixes warnings originating from hivemind.p2p and hivemind.compression.
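
The general pattern behind the .generate() fix, failing loudly on unknown keyword arguments instead of dropping them, can be sketched as follows. This is not the Petals implementation: the function `generate`, the set `SUPPORTED_KWARGS`, and the exception type are assumptions made for illustration.

```python
# Hypothetical set of supported options; the real list in Petals may differ.
SUPPORTED_KWARGS = {"max_new_tokens", "temperature", "top_k", "top_p"}


def generate(inputs, **kwargs):
    """Stand-in for a generation call that rejects unsupported kwargs."""
    unexpected = sorted(set(kwargs) - SUPPORTED_KWARGS)
    if unexpected:
        # Raising here prevents a typo or unsupported option from being ignored.
        raise TypeError(f"generate() got unexpected keyword arguments: {unexpected}")
    return {"inputs": inputs, **kwargs}


try:
    generate("Hello", num_beams=4)  # not in the hypothetical supported set
except TypeError as err:
    print(err)
```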

🛣️ Updated throughput formula. We have updated the throughput formula to reflect that servers hosting many blocks still run forward and backward passes through only one block at a time. Don't be surprised if your throughput is lower than in 1.1.4: the numbers are not directly comparable!
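
A minimal sketch of the idea, based on the changelog entry "Divide compute throughput by average no. of used blocks". Two details are assumptions rather than statements from these notes: that a request uses a uniformly random number of the server's blocks (so the average is `(num_blocks + 1) / 2`), and that the result is capped by network throughput. The function name and parameters are hypothetical.

```python
def estimate_throughput(compute_rps: float, network_rps: float, num_blocks: int) -> float:
    """Hypothetical throughput estimate for a server hosting `num_blocks` blocks."""
    # Assumption: requests traverse 1..num_blocks blocks with equal probability,
    # so the expected number of blocks used is (num_blocks + 1) / 2.
    average_blocks_used = (num_blocks + 1) / 2
    # From the changelog: compute throughput is divided by the average
    # number of used blocks.
    compute_bound = compute_rps / average_blocks_used
    # Assumption: the slower of compute and network determines the result.
    return min(compute_bound, network_rps)


# Hosting more blocks lowers the reported number even on identical hardware.
print(estimate_throughput(compute_rps=1000.0, network_rps=300.0, num_blocks=1))
print(estimate_throughput(compute_rps=1000.0, network_rps=300.0, num_blocks=9))
```

Under these assumptions, a server's reported throughput falls as it hosts more blocks, which is why values from 1.1.4 and 1.1.5 should not be compared directly.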

🖼️ Improved lower-level interfaces. We have refactored lower-level interfaces, such as RemoteSequential and RemoteSequenceManager, to be more reliable (e.g. when doing retries) and much easier to use. Some rarely used low-level functions in petals.dht_utils were removed.

What's Changed

Fix OOMs happening in case of accelerate >= 0.16.0 by @borzunov in #310
Refactor RemoteSequenceManager by @borzunov in #309
Update hivemind to 1.1.8, enable efficient bfloat16 encoding by @borzunov in #311
Replace .make_sequence(..., mode="random") with mode="max_throughput" by @borzunov in #313
Divide compute throughput by average no. of used blocks by @borzunov in #314
Raise error for unexpected .generate() kwargs by @borzunov in #315
Abort speedtest if it runs too long by @borzunov in #316
Bump version to 1.1.5 by @borzunov in #312

Full Changelog: v1.1.4...v1.1.5
