EXO v1.0.66 Release Notes
This is a stability release that fixes a regression with RDMA / Tensor Parallelism where models were getting stuck in the LOADING state. It also fixes a download edge case with GLM 4.7 Flash and an issue where nodes got stuck in UNKNOWN state / zombie states after periods of inactivity.
All models have been confirmed working with RDMA / Tensor Parallelism on various configurations (including Mac Minis, MacBooks and Mac Studios). Thank you to the users who reported bugs and helped us resolve these issues - it helps a lot!
Bug Fixes
- Use EXO shard instead of upstream shard for all models, loading models layer-by-layer, fixing models getting stuck in LOADING state, e.g. GLM-4.7-Flash-4bit, gpt-oss-120b-MXFP4-Q8 and Qwen3-Coder-480B-A35B-Instruct-8bit. Also fixes memory not being released when an instance is deleted. (#1291)
- Fix downloads getting stuck when model files change in the Hugging Face repo, e.g. GLM-4.7-Flash-4bit, which was updated upstream on Jan 25 (#1290)
- Always publish info gatherer events, preventing nodes getting stuck in UNKNOWN state / zombie states after a period of inactivity (#1283)
- Fix tool calls with empty text content (#1292)
MLX
- Upgrade mlx-lm to 0.30.5
Full Changelog: v1.0.65...v1.0.66