mlc-ai/xgrammar v0.1.23 on GitHub

Highlights

Significant speedup for grammars with repetition, e.g., a{1, 100}. Preprocessing such an expression is now O(1) instead of O(n)
Release the new serialization library
Fix bugs related to max_rollback_tokens
Fix bugs in CUDA kernels
Refactor: migrate the grammar backend to FSMs
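
To illustrate the repetition highlight above, here is a minimal Python sketch of why a dedicated repetition expression can make preprocessing independent of the bound. It is not xgrammar's implementation: the `Repeat` class and the `expand_naive` and `matches` helpers are hypothetical names invented for this example, and the sketch assumes the older cost came from expanding a{1, 100} into one element per possible repetition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Repeat:
    """Hypothetical compact node: `symbol` repeated between lo and hi times."""
    symbol: str
    lo: int
    hi: int


def expand_naive(symbol: str, lo: int, hi: int) -> list[tuple[str, bool]]:
    """Expand symbol{lo,hi} into lo required copies plus (hi - lo) optional ones.

    The result has hi elements, so building it costs O(hi).
    """
    return [(symbol, True)] * lo + [(symbol, False)] * (hi - lo)


def matches(rep: Repeat, text: str) -> bool:
    """Match text against a Repeat node by counting, without expanding it."""
    count = 0
    pos = 0
    step = len(rep.symbol)
    # Consume copies of the symbol until the upper bound is reached.
    while count < rep.hi and text.startswith(rep.symbol, pos):
        pos += step
        count += 1
    # Accept only if the whole text was consumed and the lower bound is met.
    return pos == len(text) and count >= rep.lo


# Building the compact node is constant work regardless of the bound.
rep = Repeat("a", 1, 100)
# The naive expansion grows with the upper bound.
expanded = expand_naive("a", 1, 100)
```

In this sketch the bound is stored as two integers, so raising it from 100 to 100000 changes nothing about the cost of constructing the node; only the naive expansion grows.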

[Refactor] JSON serializer and MemorySize by @DarkSharpness in #380
[Feature] Add a new expression to represent repetition to speed up. by @Seven-Streams in #368
[Minor] remove unnecessary debug file by @DarkSharpness in #381
Fix document error and add docs for serialization by @Ubospica in #384
Bind fill_next_token_bitmask against nb::ndarray by @Ahajha in #338
[Feature] support internal check option in cmake by @DarkSharpness in #370
[Refactor] Remove logics about max_rollback_tokens by @Ubospica in #385
Fix and improve apply_token_bitmask benchmark script by @Jialin in #391
Fix apply_bitmask logit for both CPU and triton versions when shape and stride doesn't match by @Jialin in #390
Fix apply_bit_mask cuda implementation by @Jialin in #394
[Fix]Fix the grammar_compiler. by @Seven-Streams in #395
[Feature] Migrate the parsing backend to fsms. by @Seven-Streams in #376
[Fix] Fix Multi-byte unicode characters in StructuralTagItem. by @Seven-Streams in #396
Bump to v0.1.23 by @Ubospica in #399

Full Changelog: v0.1.22...v0.1.23