EXO v1.0.64 Release Notes

This release comes with support for GLM-4.7-Flash, IP-less RDMA discovery (removing the need for custom network locations) and OpenAI-compatible tool calling via the API. It also includes bug fixes for auto parallelism, fixing various models including Qwen, GPT-OSS and MiniMax that were getting stuck in LOADING / WARMING UP, as well as better error messages when things go wrong.

Model Support

Added support for GLM-4.7-Flash (#1214)

API

Added tool calling support to the OpenAI-compatible chat completions API (#1233)

UX

Add proxy and custom SSL certificate support for corporate networks (#1189)
More responsive node info by splitting information sources and state (#928, #1209)
Less spammy logs (#1218, #1225)
Better error messages when things go wrong (#1198, #1173, #1177)
Detect Thunderbolt Bridge cycles and show a native prompt to disable Thunderbolt Bridge (#1222)
Add the ability to filter nodes for placement in the dashboard by clicking them (#1248)

Bug Fixes

Fix placement edge cases for heterogeneous devices with lopsided memory availabilities (#1200)
Fix various issues with auto parallel for loading and sharding models (#1202, #1201, #1206, #1211, #1229, #1223)
Prepend think tags for certain models that include a think tag in their chat template e.g. GLM 4.7 (#1186)

Full Changelog: v1.0.63...v1.0.64

exo-explore/exo v1.0.64 on GitHub

EXO v1.0.64 Release Notes

Model Support

API

UX

Bug Fixes

exo-explore/exo v1.0.64
on GitHub