mlc-ai/xgrammar v0.1.16 on GitHub

What's Changed

[Fix] Postpone CUDA import to the calling site by @Ubospica in #231
[Feature] Support model vocab size being smaller than the tokenizer's vocab size by @Ubospica in #237
[Style] Remove unused headers by @DarkSharpness in #219
Fall back to Triton if we fail to compile for CUDA by @zbowling in #223
[Feature] Build and run C++ Python tests by @DarkSharpness in #218
[Fix] Fix missing dependency in CI by @DarkSharpness in #239

Full Changelog: v0.1.15...v0.1.16