OpenZL v0.2.0
This release bundles several months of work since the original v0.1.0 (October 2025). The headline changes are a reworked SDDL runtime and compiler, a native LZ codec promoted to the default for serial inputs, automatic chunking for very large inputs, and an improved graph visualizer.
SDDL2
SDDL has been rebuilt from the ground up to reach its intended design goals. Where the original demo was a thin runtime, SDDL2 is a real compiler: a parser feeds a semantic analyzer, the analyzer hands a typed AST to an optimizer, and the optimizer drives a code generator that emits VM bytecode.
A key result is instant-parse. When a record's layout can be fully determined from parameters and constants alone, the engine jumps directly to any field without scanning preceding bytes, enabling zero-copy access and multi-GB/s throughput.
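The idea behind direct field access can be illustrated with a minimal sketch. It is not SDDL2's implementation: the record layout, the field names, and the `field_offsets` and `read_field` helpers are all hypothetical, and the example only assumes that every field has a size known ahead of time.

```python
import struct

# Hypothetical fixed-size record layout: (field name, struct format code).
# These names are illustrative and do not come from SDDL or OpenZL.
LAYOUT = [("id", "I"), ("timestamp", "Q"), ("value", "d")]


def field_offsets(layout):
    """Precompute each field's byte offset and the total record size."""
    offsets, pos = {}, 0
    for name, code in layout:
        offsets[name] = (pos, code)
        pos += struct.calcsize("<" + code)  # "<" = little-endian, no padding
    return offsets, pos


OFFSETS, RECORD_SIZE = field_offsets(LAYOUT)


def read_field(buf, record_index, name):
    """Read one field of one record without scanning any earlier bytes."""
    start, code = OFFSETS[name]
    position = record_index * RECORD_SIZE + start
    return struct.unpack_from("<" + code, buf, position)[0]


# Build three packed records, then jump straight to the last one.
data = b"".join(struct.pack("<IQd", i, 1000 + i, i * 0.5) for i in range(3))
view = memoryview(data)  # a memoryview exposes the bytes without copying them
print(RECORD_SIZE, read_field(view, 2, "value"))
```

Because the offsets depend only on the layout, they are computed once; each lookup is then a multiplication and an addition, whatever the record index.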
The language itself has grown along with the toolchain. It now supports when blocks for conditional layouts, parameterized and anonymous records, member access on record fields, and bitwise and logical operators.
On the developer-experience side, a semantic analysis phase now catches undefined references, type mismatches, and arity mistakes at compile time—with source locations—rather than at runtime, and a VS Code syntax-highlighting extension is published for .sddl files.
A more thorough writeup of the redesign and current language features is available on the SDDL docs page.
A new native LZ codec
OpenZL now ships its own LZ codec, exposed as ZL_GRAPH_LZ and as the serial profile in zli. It is still under active development to expand its feature set and improve performance on small inputs. Today it supports the equivalent of zstd level 1 with a 64K window size.
OpenZL gives us an opportunity to redesign each stage of the LZ pipeline for speed. Its graph-based architecture also lets us mix and match the entropy-coding stages, rather than relying on a single pipeline that works well enough for every use case. Multiple stages can then be fused into a single operation to improve processing speed. In our benchmarks on the Silesia corpus, this makes OpenZL about 10% faster at compression and about 70% faster at decompression than Zstandard level 1.
| Compressor | Compression Ratio | Compression Speed | Decompression Speed |
|---|---|---|---|
| OpenZL LZ level 1 | 2.74 | 466 MB/s | 2288 MB/s |
| Zstd level 1 with 64K window size | 2.74 | 419 MB/s | 1254 MB/s |
| Zstd level 1 | 2.89 | 424 MB/s | 1345 MB/s |
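The quoted speedups can be rechecked from the table with a few lines of arithmetic. The sketch below uses only the figures listed above; the `speedup_percent` helper is our own naming.

```python
# Speeds in MB/s, copied from the benchmark table: (compression, decompression).
speeds = {
    "OpenZL LZ level 1": (466, 2288),
    "Zstd level 1 with 64K window size": (419, 1254),
    "Zstd level 1": (424, 1345),
}


def speedup_percent(new, old):
    """Relative speedup of `new` over `old`, in percent."""
    return (new / old - 1.0) * 100.0


ours_c, ours_d = speeds["OpenZL LZ level 1"]
for name in ("Zstd level 1", "Zstd level 1 with 64K window size"):
    base_c, base_d = speeds[name]
    print(
        f"vs {name}: "
        f"compression +{speedup_percent(ours_c, base_c):.1f}%, "
        f"decompression +{speedup_percent(ours_d, base_d):.1f}%"
    )
```

Against default Zstd level 1 this gives roughly +9.9% compression and +70.1% decompression. Against the 64K-window configuration, which matches the 2.74 compression ratio, the gap is roughly +11.2% and +82.5%.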
While we have exciting early results, there is still a lot of work left to do: support multiple compression levels, improve performance on small inputs, optimize ARM performance, support 32-bit offsets, add repeat offset encoding, support training LZ graphs, and much more. Stay tuned for improved LZ support in future OpenZL releases.
Support for very large inputs
OpenZL's zli now handles huge (multi-GB) inputs.
They are now split automatically into chunks of controllable size (roughly 16 MB by default) before compression, which keeps memory bounded, improves locality, and opens the door to parallel processing. SDDL2 has gained equivalent automatic chunking on the schema-driven path. New segmenters have been created or updated along the way — for CSV, Parquet, and standard numeric data — and all segmenters are now serializable and parameterizable, so a chosen layout can be persisted in a compressor and reused later.
This is applied transparently at compression time. Note that the training pipeline is separate and remains unaffected; it is therefore not designed to accept gigantic inputs as training material.
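Fixed-size chunking of this kind can be sketched in a few lines. This is an illustration only, not OpenZL's segmenter code: `iter_chunks` and `DEFAULT_CHUNK_BYTES` are hypothetical names, and the exact default value is an assumption based on the "roughly 16 MB" figure above.

```python
DEFAULT_CHUNK_BYTES = 16 * 1024 * 1024  # assumed value for "roughly 16 MB"


def iter_chunks(data, chunk_bytes=0):
    """Yield consecutive zero-copy slices of at most `chunk_bytes` bytes.

    A value of 0 selects the default size, mirroring the chunkByteSize=0
    behavior described in these notes.
    """
    if chunk_bytes < 0:
        raise ValueError("chunk_bytes must be non-negative")
    size = chunk_bytes or DEFAULT_CHUNK_BYTES
    view = memoryview(data)  # slicing a memoryview does not copy the bytes
    for start in range(0, len(view), size):
        yield view[start:start + size]


payload = bytes(range(256)) * 40  # 10240 bytes of sample data
chunks = list(iter_chunks(payload, chunk_bytes=4096))
print([len(c) for c in chunks])  # [4096, 4096, 2048]
```

Each chunk can then be handed to the compressor independently, so peak memory scales with the chunk size rather than the input size.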
Graph Visualizer improvements
The visualizer now distinguishes compression and decompression traces end-to-end.
A stream-preview pane lets you inspect the bytes that actually flow through each edge, with truncation controls so large streams stay manageable.
A settings panel collects display preferences in one place, and a full set of keyboard shortcuts (directional navigation, sorted traversal, expand and collapse, node selection) makes the tool practical to drive without a mouse.
Traces are now versioned, chunked compressions render correctly, and zli decompress can finally emit traces of its own through the new --trace and --trace-streams-dir flags.
Try out the new features at openzl.org/tools/trace
Miscellaneous
Several codecs join the catalog. Partition and bitpack now benefit from a fused decoder. Floating-point bitsplit gained dedicated encoders and decoders for fp16, fp32, fp64, and bf16, with specialized fast paths. There is a range-aware split (split_byrange), a length multiplexer, a sentinel codec, an lz4 graph, and small helpers like tryParseInt and splitByParam.
The public API has been cleaned up. A few historical names have been renamed to follow the new convention, and the public headers now compile cleanly under C99. ZL_compressBound() is also much tighter now, thanks to StoreOnExpansion, which prevents unwanted size expansion.
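To see why a store-on-expansion fallback tightens a worst-case bound, consider the generic sketch below. It uses Python's zlib and a made-up one-byte tag; it does not reflect OpenZL's frame format or how StoreOnExpansion is implemented, and every name in it is hypothetical.

```python
import zlib

RAW, DEFLATED = b"\x00", b"\x01"  # hypothetical one-byte frame tags


def compress_or_store(data: bytes) -> bytes:
    """Compress, but fall back to storing the input raw if that is smaller."""
    packed = zlib.compress(data)
    if len(packed) < len(data):
        return DEFLATED + packed
    return RAW + data  # compression would have expanded the input


def decompress(frame: bytes) -> bytes:
    """Invert compress_or_store by dispatching on the leading tag byte."""
    tag, body = frame[:1], frame[1:]
    return zlib.decompress(body) if tag == DEFLATED else body


def bound(n: int) -> int:
    """Worst-case frame size for n input bytes: the input plus the tag."""
    return n + 1


samples = [b"", b"\x07", b"abc" * 1000]
for sample in samples:
    frame = compress_or_store(sample)
    assert decompress(frame) == sample
    assert len(frame) <= bound(len(sample))
```

With the fallback in place, the bound no longer needs to cover the codec's own worst-case expansion, only the fixed framing overhead.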
Robustness benefits from a more thorough fuzzing setup. A new random graph builder, registry-driven fuzzing, and a dedicated compressor-deserialization fuzzer helped surface and fix deep issues that the fuzzers can now reach more easily.
Building and packaging are friendlier on more platforms. Windows now has precompiled binaries, the source compiles cleanly under MSVC, lz4 and xgboost are vendored as git submodules with the xgboost dependency now optional, and Python packaging has been set up so that wheels can be produced from the source tree.
A handful of behaviors have changed in ways worth checking before upgrading. The serial profile now defaults to ZL_GRAPH_LZ, zli defaults to permissive mode, the SDDL Record keyword is now lowercase record, and chunk-size flags now accept standard suffixes — with chunkByteSize=0 meaning "use the default" instead of "no chunking".
Full Changelog: v0.1.0...v0.2.0