The Modular 26.3 release brings video generation to MAX with the Wan 2.1 / 2.2 diffusion models (image-to-video and video-to-video). It also adds a new multi-GPU Python API, max.experimental.sharding, which distributes models across a DeviceMesh using Replicated, Sharded, and Partial placement primitives, along with NVFP4 grouped-matmul kernels that now outperform FlashInfer on B200 for Kimi K2.5. New model support lands for Gemma 4 with multimodal vision, Qwen3 and Qwen3-VL (including MoE variants), MiniMax-M2 with 4×H100 multi-GPU serving, and FLUX.2 with TaylorSeer and TeaCache denoising caches for faster image-to-image generation.
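To make the placement vocabulary concrete, here is a toy, self-contained Python sketch of what Replicated and Sharded placements mean on a device mesh. This is not the max.experimental.sharding API; the DeviceMesh, Replicated, and Sharded classes below are simplified stand-ins that only show the distribution semantics.

```python
# Toy illustration of placement-style sharding; NOT the real MAX API.
from dataclasses import dataclass


@dataclass
class DeviceMesh:
    """A flat mesh of device names, e.g. ["gpu:0", "gpu:1"]."""
    devices: list


class Replicated:
    """Every device holds a full copy of the tensor."""
    def place(self, mesh, tensor):
        return {d: list(tensor) for d in mesh.devices}


class Sharded:
    """Split the tensor evenly across devices along axis 0.

    (A Partial placement, by contrast, would leave each device holding a
    partial result that still needs an all-reduce to recover the full value.)
    """
    def place(self, mesh, tensor):
        chunk = len(tensor) // len(mesh.devices)
        return {d: tensor[i * chunk:(i + 1) * chunk]
                for i, d in enumerate(mesh.devices)}


mesh = DeviceMesh(devices=["gpu:0", "gpu:1"])
weights = [1, 2, 3, 4]

full_copies = Replicated().place(mesh, weights)  # each device: [1, 2, 3, 4]
shards = Sharded().place(mesh, weights)          # gpu:0: [1, 2], gpu:1: [3, 4]
```

The real API operates on model tensors and GPUs rather than Python lists, but the core idea is the same: a placement describes how one logical tensor maps onto the mesh.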
The Mojo 1.0 beta 1 (v1.0.0b1) release introduces type refinement from compile-time assumptions: where conforms_to(T, Trait) clauses, comptime if, and comptime assert now narrow types automatically, eliminating manual trait_downcast calls throughout the standard library. Closure unification advances further: stateless closures auto-lift to function pointers, the ref capture convention is supported, and the new thin function effect cleanly distinguishes function pointer types from closure traits. The fn keyword now emits a deprecation warning (it becomes a hard error next release), negative indexing on standard collections has been removed in favor of cheap CPU bounds checks that are now on by default, and UnsafePointer is non-null by design, with nullability expressed via Optional[UnsafePointer[...]].
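The narrowing behavior might look roughly like the following. This is a speculative sketch inferred from the feature names in this summary (conforms_to, comptime if), not verified v1.0.0b1 code; the exact spelling may differ in the release.

```mojo
# Speculative sketch of comptime narrowing; syntax may differ in v1.0.0b1.
def maybe_print[T: AnyType](value: T):
    comptime if conforms_to(T, Stringable):
        # Inside this branch, T is refined to Stringable automatically,
        # so no manual trait_downcast(value, Stringable) is needed.
        print(String(value))
    else:
        print("value is not Stringable")
```

The point of the feature is the comment in the middle: once the compiler has proven the conformance in a branch or where clause, the refined type is available without an explicit downcast.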
Check out all the updates in the full MAX changelog and Mojo changelog.