🚀 Features
Support Unix domain socket (UDS) communication for coprocessors (Issue #5739)
Many coprocessor deployments run side-by-side with the router, typically on the same host (for example, within the same Kubernetes pod).
This change brings coprocessor communication to parity with subgraphs by adding Unix domain socket (UDS) support. When the router and coprocessor are co-located, communicating over a Unix domain socket bypasses the full TCP/IP network stack and uses shared host memory instead, which can meaningfully reduce latency compared to HTTP over TCP.
Add redact_query_validation_errors supergraph config option (PR #8888)
The new redact_query_validation_errors option in the supergraph configuration section replaces all query validation errors with a single generic error:
{
  "message": "invalid query",
  "extensions": {
    "code": "UNKNOWN_ERROR"
  }
}
Support multiple @listSize directives on the same field (PR #8872)
Warning
Multiple @listSize directives on a field only take effect after Federation supports repeatable @listSize in the supergraph schema. Until then, composition continues to expose at most one directive per field. This change makes the router ready for that Federation release.
The router now supports multiple @listSize directives on a single field, enabling more flexible cost estimation when directives from different subgraphs are combined during federation composition.
- The router processes all @listSize directives on a field (stored as Vec<ListSizeDirective> instead of Option<ListSizeDirective>).
- When multiple directives specify assumedSize values, the router uses the maximum value for cost calculation.
- Existing schemas with single directives continue to work exactly as before.
This change prepares the router for federation's upcoming support for repeatable @listSize directives, and maintains full compatibility with current non-repeatable directive schemas.
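As a rough illustration of the maximum rule described above, the following Python sketch models each directive as a small record and picks the largest assumedSize. The ListSizeDirective class and effective_assumed_size function are hypothetical stand-ins for illustration, not the router's Rust types.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class ListSizeDirective:
    """Hypothetical stand-in for one @listSize directive on a field."""
    assumed_size: Optional[int] = None


def effective_assumed_size(directives: Sequence[ListSizeDirective]) -> Optional[int]:
    """Return the largest assumedSize among the directives, or None if none set one."""
    sizes = [d.assumed_size for d in directives if d.assumed_size is not None]
    return max(sizes) if sizes else None
```

With a single directive the function returns that directive's value, which mirrors the statement that existing single-directive schemas behave as before.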
Add parser recursion and lexical token metrics (PR #8845)
The router now emits two new metrics: apollo.router.operations.recursion for the recursion level reached, and apollo.router.operations.lexical_tokens for the number of lexical tokens in a query.
Support subgraph-level demand control (PR #8829)
Subgraph-level demand control lets you enforce per-subgraph query cost limits in the router, in addition to the existing global cost limit for the whole supergraph. This helps you protect specific backend services that have different capacity or cost profiles from being overwhelmed by expensive operations.
When a subgraph-specific cost limit is exceeded, the router:
- Still runs the rest of the operation, including other subgraphs whose cost is within limits.
- Skips calls to only the over-budget subgraph, and composes the response as if that subgraph had returned null, instead of rejecting the entire query.
Per-subgraph limits apply to the total work for that subgraph in a single operation. For each request, the router tracks the aggregate estimated cost per subgraph across the entire query plan. If the same subgraph is fetched multiple times (for example, through entity lookups, nested fetches, or conditional branches), those costs are summed together and the subgraph's limit is enforced against that total.
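The aggregation described above can be sketched in a few lines of Python. This is a simplified model, not the router's implementation: the function names are hypothetical, and a query plan is reduced to a flat list of (subgraph, estimated cost) pairs.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def aggregate_costs(fetches: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Sum the estimated cost per subgraph across every fetch in a query plan."""
    totals: Dict[str, float] = defaultdict(float)
    for subgraph, cost in fetches:
        totals[subgraph] += cost
    return dict(totals)


def over_budget(totals: Dict[str, float], limits: Dict[str, float]) -> Set[str]:
    """Return subgraphs whose summed cost exceeds their limit.

    A subgraph without an entry in `limits` is treated as unlimited.
    """
    return {
        name
        for name, total in totals.items()
        if name in limits and total > limits[name]
    }
```

For instance, with made-up costs, two products fetches costing 4 and 3 sum to 7, so a products limit of 6 is exceeded even though neither fetch exceeds it alone.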
Configuration
demand_control:
  enabled: true
  mode: enforce
  strategy:
    static_estimated:
      max: 10
      list_size: 10
      actual_cost_mode: by_subgraph
      subgraphs: # everything from here down is new (all fields optional)
        all:
          max: 8
          list_size: 10
        subgraphs:
          products:
            max: 6
            # list_size omitted, 10 implied because of all.list_size
          reviews:
            list_size: 50
            # max omitted, 8 implied because of all.max
Example
Consider a topProducts query that fetches a list of products from a products subgraph and then performs an entity lookup for each product in a reviews subgraph. Assume the products cost is 10 and the reviews cost is 5, leading to a total estimated cost of 15 (10 + 5).
Previously, you could only restrict that query via demand_control.strategy.static_estimated.max:
- If you set it to 15 or higher, the query executes.
- If you set it below 15, the query is rejected.
Subgraph-level demand control enables much more granular control. In addition to demand_control.strategy.static_estimated.max, which operates as before, you can also set per-subgraph limits.
For example, if you set max = 20 and reviews.max = 2, the query passes the aggregate check (15 < 20) and executes on the products subgraph (no limit specified), but doesn't execute against the reviews subgraph (5 > 2). The result is composed as if the reviews subgraph had returned null.
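To make the arithmetic in this example concrete, here is a minimal Python sketch using the same numbers. The plan_subgraph_calls function is hypothetical and only models the two checks described in the text (the aggregate limit, then each per-subgraph limit); it is not how the router is implemented.

```python
from typing import Dict, Optional


def plan_subgraph_calls(
    costs: Dict[str, float],
    global_max: float,
    subgraph_max: Dict[str, float],
) -> Optional[Dict[str, bool]]:
    """Return None if the whole query is rejected, else {subgraph: should_call}."""
    if sum(costs.values()) > global_max:
        return None  # aggregate limit exceeded: the entire query is rejected
    # A subgraph with no configured limit is always called.
    return {
        name: name not in subgraph_max or cost <= subgraph_max[name]
        for name, cost in costs.items()
    }


costs = {"products": 10, "reviews": 5}
decision = plan_subgraph_calls(costs, global_max=20, subgraph_max={"reviews": 2})
print(decision)  # {'products': True, 'reviews': False}
```

Lowering global_max to 14 in this sketch returns None, which corresponds to the whole query being rejected as in the earlier bullets.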
By @carodewig in #8829
Improve @listSize directive parsing and nested path support (PR #8893)
Demand control cost calculation now supports:
- Array-style parsing for @listSize sizing (for example, list arguments)
- Nested input paths when resolving list size from query arguments
- Nested field paths in the sizedFields argument on @listSize for more accurate cost estimation
These changes are backward compatible with existing schemas and directives.
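One way to picture nested input paths and list arguments is the Python sketch below. It rests on assumptions: the dotted path syntax, the resolve_list_size name, and the rule that a list argument contributes its length are all illustrative choices, not confirmed details of the router's parser.

```python
from typing import Any, Mapping, Optional


def resolve_list_size(arguments: Mapping[str, Any], path: str) -> Optional[int]:
    """Follow a dotted path through nested arguments and return a list size.

    An integer at the end of the path is used directly; a list contributes
    its length. Anything else, including a missing key, yields None.
    """
    value: Any = arguments
    for key in path.split("."):
        if not isinstance(value, Mapping) or key not in value:
            return None
        value = value[key]
    if isinstance(value, bool):
        return None  # bool is a subclass of int in Python; exclude it explicitly
    if isinstance(value, int):
        return value
    if isinstance(value, list):
        return len(value)
    return None
```

Under these assumptions, a path such as input.page.first would resolve to the integer stored at that position in the query arguments.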
Add coprocessor hooks for connector request and response stages (PR #8869)
You can now configure a coprocessor hook for the ConnectorRequest and ConnectorResponse stages of the router lifecycle.
coprocessor:
  url: http://localhost:3007
  connector:
    all:
      request:
        uri: true
        headers: true
        body: true
        context: all
        service_name: true
      response:
        headers: true
        body: true
        context: all
        service_name: true
By @andrewmcgivery in #8869
🐛 Fixes
Pass variables to introspection queries (PR #8816)
Introspection queries now receive variables, enabling @include and @skip directives during introspection.
Log warning instead of returning error for non-UTF-8 headers in externalize_header_map (PR #8828)
- The router now emits a warning log with the name of the header instead of returning an error.
- The remaining valid headers are returned, which is more consistent with the router's default behavior when a coprocessor isn't used.
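The warn-and-continue behavior can be sketched in Python as follows. This is only a model: externalize_headers is a hypothetical name (the router's function is written in Rust), and dropping the whole header when any of its values fails to decode is an assumption.

```python
import logging
from typing import Dict, List, Mapping

logger = logging.getLogger("coprocessor_sketch")


def externalize_headers(raw: Mapping[str, List[bytes]]) -> Dict[str, List[str]]:
    """Decode header values as UTF-8, skipping (and logging) headers that fail."""
    decoded: Dict[str, List[str]] = {}
    for name, values in raw.items():
        try:
            decoded[name] = [v.decode("utf-8") for v in values]
        except UnicodeDecodeError:
            # Warn with the header name only, then continue with the rest.
            logger.warning("skipping header with non-UTF-8 value: %s", name)
    return decoded
```

The key design point is that one bad header no longer fails the whole map; callers still receive every header that decoded cleanly.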
By @rohan-b99 in #8828
Place http_client span attributes on the http_request span (PR #8798)
Attributes configured under telemetry.instrumentation.spans.http_client are now added to the http_request span instead of subgraph_request.
Given this config:
telemetry:
  instrumentation:
    spans:
      http_client:
        attributes:
          http.request.header.content-type:
            request_header: "content-type"
          http.response.header.content-type:
            response_header: "content-type"
Both attributes are now placed on the http_request span.
By @rohan-b99 in #8798
Validate ObjectValue variable fields against input type definitions (PR #8821 and PR #8884)
The router now validates individual fields of input object variables against their type definitions. Previously, variable validation checked that the variable itself was present but didn't validate the fields within the object.
Example:
## schema ##
input MessageInput {
  content: String
  author: String
}
type Receipt {
  id: ID!
}
type Query {
  send(message: MessageInput): Receipt
}
## query ##
query($msg: MessageInput) {
  send(message: $msg) {
    id
  }
}
## input variables ##
{
  "msg": {
    "content": "Hello",
    "author": "Me",
    "unknownField": "unknown"
  }
}
This request previously passed validation because the variable msg was present in the input, but the fields of msg weren't validated against the MessageInput type.
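The kind of per-field check involved can be approximated with the Python sketch below. It is a deliberately simplified model with hypothetical names: the MessageInput type from the example above is reduced to a dictionary of field names and Python types, and only unknown fields and basic type mismatches are reported.

```python
from typing import Any, Dict, List

# Hypothetical, simplified view of the MessageInput type from the example above.
MESSAGE_INPUT: Dict[str, type] = {"content": str, "author": str}


def validate_input_object(value: Dict[str, Any], fields: Dict[str, type]) -> List[str]:
    """Return a list of problems found in an input object variable."""
    errors: List[str] = []
    for name, field_value in value.items():
        if name not in fields:
            errors.append(f"unknown field '{name}'")
        elif field_value is not None and not isinstance(field_value, fields[name]):
            errors.append(f"field '{name}' has the wrong type")
    return errors


variables = {"msg": {"content": "Hello", "author": "Me", "unknownField": "unknown"}}
print(validate_input_object(variables["msg"], MESSAGE_INPUT))
# ["unknown field 'unknownField'"]
```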
Warning
To opt out of this behavior, set the supergraph.strict_variable_validation config option to measure.
Enabled:
supergraph:
  strict_variable_validation: enforce
Disabled:
supergraph:
  strict_variable_validation: measure
By @conwuegb in #8821 and #8884
Increase internal Redis timeout from 5s to 10s (PR #8863)
Because mTLS handshakes can be slow in some environments, the internal Redis timeout is now 10s (previously 5s). The connection "unresponsive" threshold is also increased from 5s to 10s.
By @aaronArinder in #8863
Enforce and log operation limits for cached query plans (PR #8810)
The router now logs the operation-limits warning for cached query plans as well, ensuring the query text is included whenever limits are exceeded. This also fixes a case where a cached plan could bypass enforcement after changing warn_only from true to false during a hot reload.
By @rohan-b99 in #8810
Prevent duplicate content-type headers in connectors (PR #8867)
When you override the content-type header in a connector @source directive, the router no longer appends the default value. The custom header value now properly replaces the default.
For example:
@source(
  name: "datasetInsightsAPI"
  http: {
    headers: [
      { name: "Content-Type", value: "application/vnd.iaas.v1+json" },
    ]
  }
)
Previously resulted in:
content-type: application/json, application/vnd.iaas.v1+json
Now correctly results in:
content-type: application/vnd.iaas.v1+json
By @andrewmcgivery in #8867
Prevent duplicate tags in router spans added by dynamic attributes (PR #8865)
When dynamic attributes are added via SpanDynAttribute::insert, SpanDynAttribute::extend, LogAttributes::insert, LogAttributes::extend, EventAttributes::insert, or EventAttributes::extend and the key already exists, the router now replaces the existing value instead of creating duplicate attributes.
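The replace-instead-of-duplicate semantics can be illustrated with a short Python sketch. Storing attributes as a list of key/value pairs and the function names used here are assumptions made for illustration; they do not describe the router's internal types.

```python
from typing import List, Tuple

Attributes = List[Tuple[str, str]]


def insert_attribute(attrs: Attributes, key: str, value: str) -> None:
    """Set key to value, replacing an existing entry rather than appending a duplicate."""
    for i, (existing_key, _) in enumerate(attrs):
        if existing_key == key:
            attrs[i] = (key, value)
            return
    attrs.append((key, value))


def extend_attributes(attrs: Attributes, new_items: Attributes) -> None:
    """Apply insert_attribute for each new item, so later values win."""
    for key, value in new_items:
        insert_attribute(attrs, key, value)
```

A naive append would leave two entries for the same key, which is the duplicate-tag symptom this fix addresses.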
By @rohan-b99 in #8865
Compute actual demand control costs across all subgraph fetches (PR #8827)
The demand control feature estimates query costs by summing together the cost of each subgraph operation, capturing any intermediate work that must be completed to return a complete response.
Previously, the actual query cost computation only considered the final response shape and didn't include any of the intermediate work in its total.
The router now computes the actual query cost as the sum of all subgraph response costs. This more accurately reflects the work done per operation and enables a more meaningful comparison between actual and estimated costs.
To disable the new actual cost computation behavior, set the router configuration option demand_control.strategy.static_estimated.actual_cost_mode to response_shape:
demand_control:
  enabled: true
  mode: enforce
  strategy:
    static_estimated:
      max: 10
      list_size: 10
      actual_cost_mode: by_subgraph # the default value
      # actual_cost_mode: response_shape # revert to prior actual cost computation mode
By @carodewig in #8827
📚 Documentation
Correct response caching FAQ for schema updates and multi-root-field caching (PR #8794)
Updated the response caching FAQ to accurately describe caching behavior:
- Clarify that schema updates generate new cache keys, so old entries no longer receive cache hits (they're effectively expired from your perspective), instead of implying that stale data might be served.
- Correct the multi-root-field caching explanation to state that the router caches the entire subgraph response as a single unit, not separately per root field.
- Add clarification that the configured TTL is a fallback when subgraph responses don't include Cache-Control: max-age headers.
- Change example TTL from 300s to 5m for better readability.
By @the-gigi-apollo in #8794