github cue-lang/cue v0.2.0
Language Changes and Streamlined CLI semantics

3 years ago

This release makes several language changes, both minor and more substantial. They are the result of analyzing CUE’s interoperability with other languages and what it requires. These are expected to be the last major changes, fixing the look and feel of the language. In addition to the usual cue fmt-based rewrites, we now also provide a cue fix command to aid in the transition. The old formats are still supported for now. Minor language changes and additions are still planned on the path to stability.

The new language changes allow for a simpler JSON Schema mapping. This, in turn, allows for a more straightforward combination of schema and data on the command line.

New syntax for definitions

Before, definitions were indicated with a double colon ::. They lived in the same namespace as regular fields. This property complicated defining automated conversions of some other data formats to CUE. For instance, JSON Schema keeps separate sections for schema and definitions. As there is no enforced convention, as in Go, for naming the two kinds differently, one could not map these to the same struct. This forced schemas to be moved to another location, which turned out to be cumbersome and unnatural.

Another issue with the old syntax is that one could not determine from a reference a.b whether b would refer to a definition or regular field. Not a big problem per se, but it lacked clarity.

As of v0.2.0, definitions are denoted with a special identifier starting with #. As this was not a legal identifier before, such identifiers can safely be used exclusively for definitions. Effectively, definitions now have their own namespace. Other than that, definitions work as before: they still close a struct, they are not output during export, and you simply use their identifier to refer to them.

Before:

A :: b: int
b: c :: string

D: A
E: b.c

After:

#A: b: int
b: #c: string
D: #A
E: b.#c

The #c notation is just an identifier: no special syntax is needed to handle it. As before, "#foo" still denotes a regular field with a name starting with #.
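As a further illustration of these properties, here is a minimal sketch. The names #Point, origin and "#tag" are hypothetical and chosen only for this example.

```cue
// #Point is a definition: it closes the struct and is omitted from export.
#Point: {
  x: int
  y: int
}

// Refer to a definition by its identifier, like any other reference.
origin: #Point & {x: 0, y: 0}

// Because #Point is closed, an unknown field such as z is rejected.
// Uncommenting the next line should make evaluation fail:
// bad: #Point & {x: 1, y: 2, z: 3}

// A quoted label is a regular field whose name happens to start with #.
"#tag": "v1"
```

With cue export, one would expect only origin and "#tag" to appear in the output, since #Point lives in the definition namespace.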

Using an initial # to distinguish between definitions and regular fields was inspired by Go's use of initial casing to distinguish between exported and non-exported identifiers. Casing itself is not an option for CUE: because it is a JSON superset and focused on interop, there is little control over casing. As we explain later in the Hidden fields section, we also resurrect _ as a means of excluding fields and definitions from output.

The new notation may take a bit of getting used to, but the increased readability and the flexibility from the separate namespaces are big wins. So far the feedback has been overwhelmingly positive.

The Go, Protobuf, JSON Schema, and OpenAPI mappings, as well as the tutorials, have been ported to use #-style definitions. Be aware that any templates of your own that build on these need to be rewritten as well.

Limitations

The new definition syntax is more restricted than the old syntax. In general, it is no longer allowed to have dynamic definition names. For instance, the syntax disallows using interpolations for creating names. Also bulk optional fields only apply to regular fields. We think this actually benefits static analysis and is overall a good change. If need be, though, the old semantics can be simulated by containing a struct within a definition used as a map. In case this proves to be too limiting, we have a possible language change up our sleeves that would allow dynamic definitions again.
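One possible reading of the workaround mentioned above is sketched below: the definition name itself stays static, while the dynamically named entries are regular fields inside it. All names here (env, #Schemas, prodConfig, cfg) are hypothetical, and this is an assumption about the intended pattern rather than an officially documented recipe.

```cue
env: "prod"

// The definition name is static; the entries inside it are regular
// fields, so interpolated names are still allowed there.
#Schemas: {
  "\(env)Config": {
    replicas: int
  }
}

// Select the dynamically named entry like any other field.
cfg: #Schemas.prodConfig & {replicas: 3}
```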

Old style definitions will keep being supported up to the next minor release. Use cue fix to rewrite current files (see below).

API changes

The existing API assumed definitions and fields lived in the same namespace and is therefore inherently broken. Struct.FieldByName has been deliberately broken to force users to disambiguate. To evaluate a reference, it is recommended to use Dereference instead of passing the result of Reference to Lookup.

There are some good ideas for making the API considerably simpler and more powerful. The intent is to build this on top of the new adt package, which is being developed as part of the evaluator rewrite planned for the next minor release.

cue fix

To aid in the conversion to #-style definitions, the cue fix command is introduced. This command does a module-wide update of packages by default, but allows specifying individual files and packages.

This rewrite is not piggybacked on cue fmt. Unlike previous rewrites, the definition rewrite requires evaluation. In addition, not all current definitions are legal in the new syntax, and the converter will not remap those. Unfortunately, due to API limitations, the converter also cannot handle all legal cases. We hope that such cases will be rare, though.

Cases that cannot be handled are clearly marked, along with locations that may reference the definition. Definitions with issues are marked with a deliberately obscure @tmpNoExportNewDef attribute, indicating they need manual fixing.

There is also a possibility there will be undetected breakage. This may happen if a selection is made in a disjunction where the field may be either a definition or regular field. The fact that this is even possible indicates that the old model was probably too flexible and possibly more bug prone. With new-style definitions such shenanigans are no longer possible.

Unfortunately, cue fix currently does not handle updating multiple packages within the same directory at once unless manually specified. We intend to allow this at some point.

Tool semantics changes

Previously, the cue tool handled schemas and data differently. In practice, the distinction is not all that clear, making this distinction somewhat “forced”. The cue tool now treats data and schemas as the same thing and always unifies them without distinction by default. This greatly simplifies its use and makes it a very powerful tool for operating directly on non-CUE schema.

There are two exceptions to the new “always unify all files” rules: 1) with cue vet multiple data files are still individually verified against a single schema. 2) the -d option still separates schema from data files. One could argue that the latter isn’t really necessary anymore, as we will see next.

JSON Schema mapping

The following command

cue export schema.json data.yaml

converts schema.json (which has a proper value for $schema to detect the format) to a schema and data.yaml to data, and unifies the results. This will naturally fail if the contents of the data don’t correspond to the schema. There is no special logic needed to detect these cases, other than knowing when to interpret a data file as JSON Schema.

Note that #-style definitions were key to making this possible. Without them, there would be no obvious way to map JSON Schema to the root of a config, as merging fields and schema into the same namespace could result in conflicts.
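To make the unification concrete, here is a hand-written CUE sketch of roughly what the two inputs amount to once combined. The field names are hypothetical, and the schema part is written by hand for illustration; it is not the converter's actual output.

```cue
// Assumed shape of the schema side: entries from the JSON Schema
// definitions section land in the # namespace, while root properties
// become regular fields at the root of the config.
#address: {
  city: string
}
name:     string
address?: #address

// Assumed shape of the data side: plain concrete fields.
name: "Alice"
address: city: "Zurich"
```

Because the definition and the regular fields live in separate namespaces, a data field named address and a schema definition named #address can coexist at the root without clashing.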

OpenAPI merging

Another example of the new capabilities is that one can now merge schemas as one would merge data. For instance, the following command merges two OpenAPI files and then outputs the result again as OpenAPI:

cue def openapi: file1.json file2.json --out openapi

The user won’t even see the intermediate CUE.

Hidden fields

This release also officially resurrects hidden fields (identifiers written as, for example, _foo). They were removed from the spec because, with the introduction of definitions, they were believed to be an incongruent feature. They fit quite nicely, however, with the new-style definitions. They were also still in use and proved to be more useful than expected. As they were never really removed from the implementation, nothing changes. However, their comeback is now official.

For clarity, both hidden fields and definitions are not shown in exported output. They serve different functions, though.

A definition defines a new composite type, the most general specification of something. Definitions are at the opposite end of the spectrum from concrete instances. As definitions are supposed to fully define a schema, CUE can use them to catch typos in field names by detecting fields that are not supposed to be there.

A hidden field can be any value and is not subjected to such scrutiny. They are used to define fields that are not converted to output without making them a definition.

Although this is not yet enforced, hidden fields are local to a package. To make the matrix complete, identifiers starting with _# denote hidden definitions, that is, definitions local to a package. These can be used today, although this restriction is also not yet enforced.
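The four kinds of fields described in this section can be put side by side in a small sketch. All names below are hypothetical.

```cue
// Hidden field: can be any value, is not closed, and is not exported.
_defaultPort: 8080

// Hidden definition: a definition intended to be local to the package.
_#Labels: {
  team: string
}

// Definition: a closed schema that is not exported.
#Service: {
  name:   string
  port:   int
  labels: _#Labels
}

// Regular field: the only part that appears in exported output.
web: #Service & {
  name: "web"
  port: _defaultPort
  labels: team: "infra"
}
```

Exporting this should yield only the web field. Misspelling a field inside web (for example nmae) would be reported as an error because #Service is closed, whereas the hidden field _defaultPort is not subject to such checks.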

Streamlined syntax for list comprehension

List comprehensions are now written as

[ for x in src { x + 1 } ]

This brings them in line with field comprehensions, where the value also comes after the comprehension clauses. Overall, this greatly increases readability.

This also harmonizes the syntax. It simplifies both spec and implementation, and prepares for some of the constructs mentioned in the query proposal.

The one thing that may require some getting used to is the curly braces alluding to the value being a struct. If the curly braces contain a single scalar value, however, it is promoted to be the result of an iteration. CUE users may already be familiar with this construct at the top-level scope, where “embedded scalars” are the final result of evaluation. This construct was originally invented to allow omitting curly braces, while still being a superset of JSON (Jsonnet uses the same trick). It turns out, however, that this trick can be quite useful when generalized to apply to any struct, not just the top-level one. We continue to investigate this possibility. Consider this the first application.

Using cue fmt will automatically update old list comprehensions to use the new format.
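A short sketch of the new form follows. The field names are hypothetical, and the old-style form is shown only in a comment for comparison.

```cue
src: [1, 2, 3]

// Old style, value first (rewritten by cue fmt):
//   incremented: [ x + 1 for x in src ]

// New style: clauses first, then the value in curly braces.
incremented: [ for x in src { x + 1 } ]

// Clauses can be chained, as in field comprehensions.
bigTimesTen: [ for x in src if x > 1 { x * 10 } ]

// The braces can also hold an actual struct instead of a scalar.
wrapped: [ for x in src { value: x } ]
```

Here incremented should evaluate to [2, 3, 4] and bigTimesTen to [20, 30], while wrapped yields a list of structs, one per element of src.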

Added limitations on bulk optional fields

Previously, it was allowed to have a bulk optional field alongside regular fields within a struct:

a: {
  [string]: T
  foo: S
}

Many people expected T to apply only to fields other than foo, not to foo itself. Under the current semantics, however, T applies to all fields, including foo. The expected behavior is, in fact, how JSON Schema’s patternProperties and additionalProperties work. With hindsight, that approach turns out to have many benefits.

The plan is to transition to these new semantics. We will first disallow bulk optional fields alongside other fields altogether. This can easily be worked around by singling out such fields by wrapping them in curly braces:

a: {
  { [string]: T }
  foo: S
}

Bulk optional fields of this style

a: [string]: int

are effectively already singled out and can remain unchanged.

The resulting embedding means the same thing in CUE. After the original semantics of mixing these fields has been banned for some time, we can reintroduce the original syntax, but now with “fallback” semantics analogous to JSON Schema.
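A concrete instance of the singled-out form, with int standing in for T and hypothetical field names, may help show what the current semantics imply:

```cue
a: {
  // Singled-out bulk optional field: constrains every string-labeled field.
  {[string]: int}

  // Under the current semantics the int constraint also applies to foo.
  foo: 1

  // A non-int value would therefore conflict today:
  // bar: "one"
}
```

Under the planned “fallback” semantics, by contrast, an explicitly declared field such as foo would no longer be subject to the bulk constraint.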

For this release cue fmt simply rewrites bulk optional fields to be singled out. Forcing the rewrite is planned for a next release, followed by dropping support, waiting and then changing the semantics.

API: astutil.Sanitize

A noteworthy addition to the API is astutil.Sanitize. This function aids automated rewrites of CUE ASTs. Adding fields to arbitrary CUE ASTs may inadvertently shadow references. Inserting references to imports may cause similar issues. Not handling these cases correctly was a common source of bugs. Sanitize shares code with the actual resolver and can therefore be expected to be more accurate than any parallel implementation.

The Sanitizer automatically detects and resolves shadowed fields and fixes the AST in place accordingly. It uses the following trick: like in Go, the CUE resolver uses an extra field in ast.Ident (a reference) to mark what an identifier points to. The Sanitizer runs resolution on an AST which already may have such fields filled out from previous resolutions. If the result is not consistent, it will modify the AST to make it so.

Users can manually set such resolutions as well. For instance, setting the resolution to an ast.ImportSpec will cause the spec to be inserted automatically or the node to resolve to a preexisting equivalent one.

More functionality can be added in the future. The key is that all converters can share the same logic. It has already been used to fix import inserting in JSON Schema and to fix import disambiguation in the protobuf converter.

CI improvements

CUE uses GitHub Actions for its CI. The configuration for that was YAML. The lack of validation caused some breakages here and there when it needed to be adjusted. The configuration is now written in CUE! We'll let you know how it goes.

We also now have a TryBot for the Gerrit repo, which should also prevent breakages in the future.

Deprecations

Space-separated labels are no longer supported and will give a compiler error complaining about a missing :.

Legacy (“back-style”) field comprehensions are now also no longer supported.
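For readers migrating older files, the following sketch shows the supported replacements. The removed forms appear only in comments and are reconstructed from memory of the older syntax, so treat them as approximate. All field names are hypothetical.

```cue
// Space-separated labels along the lines of
//   a b c: 1
// are rejected; write the nesting with colons instead.
a: b: c: 1

src: {x: 1, y: 2}

// Legacy field comprehensions put the field before the clauses,
// roughly like
//   out: { "\(k)": v for k, v in src }
// The supported form puts the clauses first.
out: {
  for k, v in src {
    "\(k)": v
  }
}
```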

Some APIs have deliberately been broken to force handling the distinction between identifiers and strings.

Changelog

f1c757f Update broken link in tutorial docs README.md
238d821 all: re-run go generate using Go 1.14.3
4c7d062 ci: add -f (fail) flag to curl calls in GitHub Actions trigger
37b1de2 ci: define repository_dispatch build
6bbacf9 ci: fix bad testscript file name and update CI with module get check
8fcefc8 ci: fix error in workflow generation task
8f1d260 ci: fix the GitHub actions repository_dispatch workflow
99f4ce0 ci: fix up Yaml error in specifying constraint on generate step
063fa12 ci: fix up some build breakages
3527700 ci: re-run go generate against stable Go version as part of CI
a801188 ci: tidy up and refactor GitHub workflow specifications
65163a0 ci: use CUE for GitHub Actions workflow specifications
a846fcc cmd/cue/cmd: add fix command for new-style definitions
d8d240d cmd/cue/cmd: bug fix: allow vet to combine schema and data
c114dbb cmd/cue/cmd: fix build breakage
dc62e36 cmd/cue/cmd: fix typo 'filetype' -> 'filetypes'
8f09dde cmd/cue/cmd: fix windows test breakage
50bcd97 cmd/cue/cmd: fmt rewrites alias to let clauses
5114891 cmd/cue/cmd: merge data files by default
5163831 cmd/cue/cmd: print full error messages in tools mode
99117ba cmd/cue/cmd: remove unused func mustParseFlags
8be0ffe cmd/cue/cmd: update "get go" to use new-style definitions
7ed2fc1 cmd: fix cli.Print usage in docstring
b19b41a cue/ast/astutil: add and use Santize function
00a8bb9 cue/ast/astutil: refactor to prepare for Sanitize func
afe86c1 cue/ast: allow LetClause in NewStruct
b70cc97 cue/ast: print StructLit in regular mode by default
a5c0866 cue/build: fix typo reference to cue/load package
9954ecd cue/cmd/cue: disallow quoted identifiers by default
aeb0a8c cue/format: don't simplify when parent field has comment
8938f35 cue/format: fix formatting of generated ImportsDecl
989e984 cue/parser: allow keywords in more places
8040ce7 cue/parser: remove support for legacy field comprehensions
c1c4c63 cue/parser: remove support for space-separated labels
7491622 cue/testdata: replace spaces in txtar file names
dbf1c00 cue: allow # and disallow #0 as identifiers
56c994d cue: disallow bulk optional fields with other fields
1370f0a cue: implement "front-style" list comprehensions
724da18 cue: introduce let declaration
fb7bdab cue: make test for Expr more precise
dd262ee cue: merge attributes for top-level fields
12927e8 cue: prevent crash in JSON conversion
0240d4e cue: some API adjustments for new-style definitions
9c9cdba cue: support for #-style definitions
f416ee8 cue: support new-style definitions in LookupDef
8837685 cue: uniform handling of definitions
b7083ff cue: use _# instead of #_ for hidden definition
e027d80 deps: put all tool deps in root go.mod file
c174a08 doc/ref/spec.md: allow ... in an ast.File
86e1a64 doc/ref/spec.md: fix doc bug in Label production
cb8f4f5 doc/ref/spec: alternative syntax for definitions
de0c53d doc/ref/spec: harmonize list and struct comprehension
f395f12 encoding/jsonschema: fix a few json schema bugs
b5b7521 encoding/jsonschema: improve handling of $id and $ref
faf1dfc encoding/jsonschema: introduce extra pass for metadata
53e5561 encoding/jsonschema: use astutil.Sanitize to handle imports
435989a encoding/jsonschema: use new definition mapping
a85238c encoding/openapi: make title and version defaults
98aaff9 encoding/openapi: update to new-style definitions
0212bb9 encoding/protobuf: fix typo in example
2b118fb encoding/protobuf: move to new-style definitions
3970555 encoding/yaml: use idiomatic indentation (regression)
be0649e internal/filetypes: print attributes for definitions
4d8d154 internal: factor out FileComment code
0d4abd7 testdata: convert old tests to txtar format
be60cd9 tools/trim: fix package doc
