cue-lang/cue v0.3.0-alpha1 on GitHub

This release implements the long-anticipated move of cue to the new evaluator. This particular alpha release only switches over the cue command. A follow-up release will switch over the API as well. This allows people to upgrade to the tool without having to worry about API breakage just yet.

Note that this is an alpha release. It includes an almost complete rewrite of the CUE core. Although the rewrite brings many benefits, it can be expected to need more ironing out. Expect new bugs in semantics and performance, as well as bugs from the old evaluator that are only now uncovered.

Overview of new capabilities

This release of the new evaluator mostly aims to address old bugs and doesn’t introduce many new things. However, this still means it addresses some important things:

Considerable performance improvements. More on that later.
Many bug fixes: the new evaluator is designed with embeddings, definitions, and optional constraints in mind, instead of them being an afterthought.
Structural cycle detection: this should put an end to stack overflows (but see caveats below).
Non-monotonic constraints: this allows things like struct.MinFields(3) to work. It also enables the implementation of things like not (for JSON Schema support), association lists, and all sorts of other goodness.
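
As a minimal sketch of what a non-monotonic constraint looks like in use, the snippet below applies a minimum-field-count validator to two structs. It assumes the builtin is imported from the `struct` package; the field names `ok` and `tooFew` are made up for illustration. A check like this can only be decided once all fields of a struct are known, because adding a field can turn a failing value into a passing one.

```cue
import "struct"

// Expected to pass: the struct ends up with three fields.
ok: struct.MinFields(3) & {a: 1, b: 2, c: 3}

// Expected to fail: only two fields are present.
tooFew: struct.MinFields(3) & {a: 1, b: 2}
```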

A more formal accounting of which existing issues this resolves is still forthcoming.

Under the hood it has an almost complete implementation of the following features:

JSON Schema semantics for pattern constraints (bulk optional fields) and additional constraints (allowing ...T in structs as well).
Embedded scalars: embedding a scalar is currently allowed only at the top level, but would be allowed anywhere, under the same rules that apply at the top level. For instance, { "foo", #field: 2 } would be allowed.

These features will be enabled in a later release (in most cases by simply relaxing the parser), once they are better tested.

Under the hood, the new evaluator has been broken up into about 10 components, instead of being one large monolith, separating out clear responsibilities. The largest package is now about 1/15th the size of the original implementation. This clarifies dependencies and responsibilities.

Things that are known to be broken or incompatible

Tooling overlay

Previously, *_tool.cue files were interpreted in a different namespace that lived as an overlay over the other configurations. This was hard to explain to users and also resulted in many awkward situations, like the inability to print _tool.cue files.

This functionality is not implemented in the new evaluator. This means the command section is merged in with the other files. This may cause clashes.

The idea behind not implementing it is that the new #-style definitions, which live in a different namespace, allow for another, better approach: commands can simply be declared as definitions to avoid clashes with regular fields.

For cases where the user has no control over the definition names, the idea is to allow a convention where tooling logic lives in a separate pkg_tool package that accompanies a regular package, not unlike a foo_test package in Go. To CUE this would then just be a regular package. Feedback appreciated.

`if` comprehension fix

The old evaluator silently ignored an error in the condition of an if clause, treating it as false. This was incorrect and led to several bugs. The new evaluator now treats errors as errors.

If the old behavior is desired, one can achieve the same result using disjunctions. One such case was:

foo: string
if len(foo) > 0 {
}

The old (faulty) behavior can be simulated by writing this as follows:

foo: string
if *(len(foo) > 0) | false {
}

Disjunction bug fix

The old evaluator sometimes allowed a selector to index into an unresolved disjunction, causing (null | { c: 2 }).c to resolve. This is against the spec and is always illegal in the new evaluator. So one may find cases where, for a disjunction d, one has to write (d&{}).c to first resolve the disjunction.
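
The following sketch spells out the case described above as a small configuration. The field names `d`, `bad`, and `good` are illustrative only.

```cue
d: null | {c: 2}

// Illegal in the new evaluator: d is still an unresolved disjunction here.
// bad: d.c

// Unifying with {} first eliminates the null disjunct, leaving {c: 2},
// so the selector can be applied.
good: (d & {}).c
```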

Error messages

Please don’t send bug reports on faulty error messages for this release.

The new evaluator implements a different approach to error message generation. This allows the creation of very tailored and detailed error messages. It is just not done yet for the most part. Net effect:
a lot of the line number information is missing,
the path is sometimes missing (but also improved in some cases, as the path information is now more reliable),
some of the error messages are still sloppy.

Reentrancy limitations are now enforced

This means CUE should no longer crash on structural cycles.

Please submit bug reports if CUE hangs in an infinite loop or has a stack overflow

The spec disallows structural cycles. This includes reentrancy of the form

foo: { n: int, out: (foo & {n: 2}).out + 1 }
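
For contrast, here is a minimal sketch of a function-like pattern that does not refer to itself and so should not count as a structural cycle. The names `#Inc` and `three` are hypothetical and chosen only for this illustration.

```cue
// A reusable definition that never refers to itself.
#Inc: {
	n:   int
	out: n + 1
}

// Instantiating it from a different field involves no self-reference.
three: (#Inc & {n: 2}).out
```

The difference from the disallowed form is that `out` depends only on the sibling field `n`, not on another instance of the enclosing struct.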

The new evaluator also has some performance stats built in. These showed that CUE really isn’t great at doing reentrancy and will quickly venture into unexpected exponential behavior.

CUE’s strength is to combine results of computation, not do the computation itself. So instead, the way forward for CUE is to make it easier to shell out to other languages like Go, Python, JavaScript and/or WASM.

That said, at the moment shelling out to other languages can only be done by creating custom invocations in the tooling layer. A convenient way to do this is not yet supported.

For the time being, we could consider supporting a structural cycle allowance e.g. as a command line flag or environment variable that allows a certain number of structural cycles to be ignored by CUE. This could help people transition off the use of reentrancy. Feedback appreciated.

Printing

Printing of CUE values has been completely rewritten, which inevitably results in differences. CUE now has a stricter distinction between values and expressions. This may cause printed values to be rendered differently. Also, printing of definitions still needs some work and can be expected to have some bugs.

There will always be inevitable cosmetic changes, but please report cases where changes in printing result in bad incompatibilities.

Field order

CUE used to print fields in the order that a field name was declared anywhere. This often resulted in fields being sorted in schema order, but this was clearly not always the case.

The new evaluator does a topological sort on fields before printing them. This means that fields will observe the order of fields of all structs that combine into a result. Insofar as ambiguity remains, the fields will be combined arbitrarily. In case different structs give conflicting orderings, the result is undefined.

The algorithm does not yet take the relative ordering of embedded structs into account.
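
A small sketch of how this ordering might play out, based only on the description above. The field names are made up, and the orders in the comments are what the stated rules suggest, not verified output.

```cue
a: {x: 1, y: 2}
b: {y: 2, z: 3}

// a orders x before y and b orders y before z, so a topological sort
// would be expected to print c as x, y, z.
c: a & b

p: {m: 1, n: 2}
q: {n: 2, m: 1}

// p and q disagree on the relative order of m and n, so the printed
// order of r is undefined.
r: p & q
```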

`tools/trim`

Trim relied on very intricate behavior of the API as well as various deprecated API features. It made assumptions that no longer held as the language progressed even in the old evaluator.

The code has been adjusted to use the new APIs. It now often does not remove the top-level element that was inserted by a comprehension. Otherwise it should largely give the same results. The new evaluator also fixes some inconsistencies, which resolves known bugs where the old evaluator failed to remove some fields in some cases.

Other things to watch out for

Cycle handling

One of the benefits of the “always evaluate everything” approach is that CUE can annotate a result of evaluation with cycle points. This means it is possible to rely on a single point where cycles are detected. This is great, as cycle detection is complex! Previously, stack overflow detection was implemented in many components, each with their own peculiarities.

This means that cycle detection has been removed from many components. As a result, components like the printer may contain faulty assumptions that still result in cycles. For instance, printing debug strings of values that are being evaluated may be treacherous. In general, though, the idea is that the evaluator clearly marks cyclic points in the graph, forcing an API to make a conscious decision about descending into a cycle.

Please submit bug reports if CUE hangs in an infinite loop or has a stack overflow

Note that although cycle detection is complex, it is not expensive. The implementation is based on unification algorithms that have cycle detection as a nearly free side-effect. The cost is currently minimal and can probably be improved still.

Performance characteristics

CUE now always evaluates an entire configuration value. This is required by the spec to be fully correct. It also simplifies things greatly, and is paramount in supporting certain new capabilities. It does mean that the API may have very different performance characteristics. cue.Unify may do more than expected, while cue.Validate now just collects errors from an evaluated tree so may run a lot faster.

An aim of the new evaluator was to remove any gratuitous exponential computation. The old evaluator could run into exponential behavior for aliases, for instance. The new algorithm is a non-copying unification algorithm: it therefore no longer needs to descend the graph to make copies. This was a performance quagmire in the old evaluator.

The new evaluator has been designed to allow for O(n) execution under certain circumstances (not using comprehensions or disjunctions without discriminators). It has not been implemented that way, though! Currently, the performance characteristics can be expected to be strictly better than those of the old implementation, barring a worse constant in some cases.

That said, there may be performance bugs that still need to be ironed out. Please report performance issues.

For one, there are still known cases where work is unnecessarily duplicated. The plan is to remove these so that each node only gets processed once.

Unexpected interpretation of `{}` / embedded scalars

The new evaluator is in an advanced state of allowing embedded scalars. For the most part, removing the restrictions in the parser will enable it.

There are a few kinks to work out though. For instance, what does [] & {} mean? One could argue that {} means {_} (embedded any value) and thus that [] & {} means []. One could also argue that {} is a struct by default. But as per the spec, defaults are not resolved before applying &, so even then [] & {} should be considered legal to mean [].

The new evaluator currently aims to mimic the old semantics as much as possible, but error messages may sometimes be unexpected as a consequence of the partial implementation of embedded scalars.

Changelog

0528376 all: adjustments package to work with new evaluator
7f52c10 all: switch to using cue port
0cfe411 cue/internal/adt: add builtin and import support
bdb45b3 cue: fix misspells
b2ea648 cue: hoist functionality to cue/internal/runtime
94fac7f cue: more renamings to minimize diffs
3efd10d cue: prepare to hoist index type
a675387 cue: refactor marshal to simplify move
03092d9 cue: rename fields to ease transition to package adt
0dcc335 cue: simplify implementation
b6a3a4b cue: use Kind and Op from Package adt
e2a58e9 cue: use higher-level API when possible
1d29ac3 internal/core/adt: add Resolve helper
651d379 internal/core/adt: add path location for majority of errors
f159888 internal/core/adt: apply Default also to list
88b4b1f internal/core/adt: distinguish dynamic fields from pattern constraints
7504519 internal/core/adt: finalize comprehension arcs
6a08830 internal/core/adt: initial commit
1f42c81 internal/core/compile: add new compiler
cf94469 internal/core/compile: allow resolution in custom scope
e986a8f internal/core/compile: separate out phase of let clauses
08a9fdb internal/core/convert: initial commit
6be2b4f internal/core/debug: add compact mode
17ade83 internal/core/eval: add Stats counters
cfc6c8c internal/core/eval: add more positions for closedness errors
d15def3 internal/core/eval: allow closedness override
d857f23 internal/core/eval: centralize type checking
b68adfe internal/core/eval: fix Environment linkage for Disjunctions
07e40c6 internal/core/eval: fix Evaluator sync bug
3acdde8 internal/core/eval: fix double increase of close ID
d464582 internal/core/eval: fix multi-conjunct pattern constraints
ff8373c internal/core/eval: fixes to closedness algorithm
711bbb1 internal/core/eval: handle structural cycles
0f935f8 internal/core/eval: implement core evaluator
9b4e65a internal/core/eval: tweak to disjunction
7903529 internal/core/eval: use cue's import support in tests
311f5bc internal/core/export: allow testing imports
18736bf internal/core/export: explicit generation of adt types
5378b08 internal/core/export: export let
0d12c33 internal/core/export: initial commit
7eafd8b internal/core/runtime: add Build method
5cf7756 internal/core/runtime: temporary helpers to aid transition
0ca4098 internal/core/subsume: initial implementation
38a19f8 internal/core/validate: initial implementation
489eb90 internal/core/validate: more cases to ignore concreteness
765f87e internal/core/validate: use defaults in concrete mode
6a495ae internal/core: add runtime hooks
50dac24 internal/cuetxtar: add WriteFile method
29dd250 internal/diff: use compact printing mode
bc76c5f internal/diff: use field Selector instead of Name
76d8c61 internal/diff: use new definitions in test
8912ee5 internal/legacy/cue: copy relevant cue files for new implementation
7d54042 internal/legacy/cue: update to new build
5a51083 internal/legacy/cue: use UnifyAccept and some other changes
fbc6f86 pkg/tool/os: check error of value
1dd08f3 tools/fix: remove support to rewrite old definitions
dd312be tools/trim: fix list element removal bug