🚀 Features
Connectors: support for traffic shaping (PR #6737)
Traffic shaping is now supported for connectors. To target a specific source, use the subgraph_name.source_name under the new connector.sources property of traffic_shaping. Settings under connector.all will apply to all connectors. deduplicate_query is not supported at this time.
Example config:
traffic_shaping:
  connector:
    all:
      timeout: 5s
    sources:
      connector-graph.random_person_api:
        global_rate_limit:
          capacity: 20
          interval: 1s
        experimental_http2: http2only
        timeout: 1s
By @andrewmcgivery in #6737
Connectors: Support TLS configuration (PR #6995)
Connectors now support TLS configuration for using custom certificate authorities and client certificate authentication.
tls:
  connector:
    sources:
      connector-graph.random_person_api:
        certificate_authorities: ${file.ca.crt}
        client_authentication:
          certificate_chain: ${file.client.crt}
          key: ${file.client.key}
By @andrewmcgivery in #6995
Update JWT handling (PR #6930)
This PR updates JWT handling in the AuthenticationPlugin:
- Users may now set a new config option config.authentication.router.jwt.on_error.
  - When set to the default Error, JWT-related errors will be returned to users (the current behavior).
  - When set to Continue, JWT errors will instead be ignored, and JWT claims will not be set in the request context.
- When JWTs are processed, whether processing succeeds or fails, the request context will contain a new variable apollo::authentication::jwt_status which notes the result of processing.
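As a rough sketch of how the new option might appear in a router config, the fragment below sets on_error to Continue. The JWKS URL is a hypothetical placeholder, and the surrounding jwks block is assumed to match your existing JWT setup; only on_error and its two values come from the note above.

```yaml
authentication:
  router:
    jwt:
      jwks:
        # Hypothetical JWKS endpoint, replace with your identity provider's URL.
        - url: https://example.com/.well-known/jwks.json
      # Error (default): return JWT-related errors to the client.
      # Continue: ignore JWT errors and leave JWT claims unset in the request context.
      on_error: Continue
```

With Continue, downstream plugins or scripts would likely need to inspect apollo::authentication::jwt_status in the request context to tell a missing token apart from a failed one.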
Add batching.maximum_size configuration option to limit maximum client batch size (PR #7005)
Add an optional maximum_size parameter to the batching configuration.
- When specified, the router will reject requests which contain more than maximum_size queries in the client batch.
- When unspecified, the router performs no size checking (the current behavior).
If the number of queries provided exceeds the maximum batch size, the entire batch fails with error code 422 (Unprocessable Content). For example:
{
  "errors": [
    {
      "message": "Invalid GraphQL request",
      "extensions": {
        "details": "Batch limits exceeded: you provided a batch with 3 entries, but the configured maximum router batch size is 2",
        "code": "BATCH_LIMIT_EXCEEDED"
      }
    }
  ]
}
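A configuration that would produce the response above for a three-query batch might look like the following sketch. The enabled and mode keys are assumed from the router's existing batching configuration; maximum_size is the new option, and the value 2 is chosen only to match the example error.

```yaml
batching:
  enabled: true
  mode: batch_http_link
  # Reject any client batch containing more than 2 queries.
  # Omit this key to keep the previous behavior (no size checking).
  maximum_size: 2
```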
By @carodewig in #7005
Introduce PQ manifest hot_reload option for local manifests (PR #6987)
This change introduces a persisted_queries.hot_reload configuration option to allow the router to hot reload local PQ manifest changes.
If you configure local_manifests, you can set hot_reload to true to automatically reload manifest files whenever they change. This lets you update local manifest files without restarting the router.
persisted_queries:
  enabled: true
  local_manifests:
    - ./path/to/persisted-query-manifest.json
  hot_reload: true
Note: This change explicitly does not piggyback on the existing --hot-reload flag.
By @trevor-scheer in #6987
Add support to get/set URI scheme in Rhai (Issue #6897)
This adds support for reading and writing the URI scheme via request.uri.scheme and request.subgraph.uri.scheme in Rhai, making it possible to switch between http and https for subgraph fetches. For example:
fn subgraph_service(service, subgraph) {
    service.map_request(|request| {
        // Log the current scheme; it may be unset
        log_info(`${request.subgraph.uri.scheme}`);
        if request.subgraph.uri.scheme == {} {
            log_info("Scheme is not explicitly set");
        }
        // Rewrite the subgraph URI piece by piece
        request.subgraph.uri.scheme = "https";
        request.subgraph.uri.host = "api.apollographql.com";
        request.subgraph.uri.path = "/api/graphql";
        request.subgraph.uri.port = 1234;
        log_info(`${request.subgraph.uri}`);
    });
}
By @starJammer in #6906
Add router config validate subcommand (PR #7016)
Adds a new router config validate subcommand to allow validation of a router config file without fully starting up the Router.
./router config validate <path-to-config-file.yaml>
By @andrewmcgivery in #7016
Enable remote proxy downloads of the Router
This lets users without direct download access specify a remote proxy mirror location for downloading Apollo Router releases from GitHub.
By @LongLiveCHIEF in #6667
Add metric to measure cardinality overflow frequency (PR #6998)
Adds a new counter metric, apollo.router.telemetry.metrics.cardinality_overflow, that is incremented when the cardinality overflow log from opentelemetry-rust occurs. This log means that a metric in a batch has exceeded a cardinality of 2000 and that any excess attributes will be ignored.
By @rregitsky in #6998
Add metrics for value completion errors (PR #6905)
When the router encounters a value completion error, it is not included in the GraphQL errors array, making it harder to observe. To surface this issue more visibly, the router now counts value completion errors via the metric instruments apollo.router.graphql.error and apollo.router.operations.error, distinguishable via the code attribute with value RESPONSE_VALIDATION_FAILED.
By @timbotnik in #6905
Add apollo.router.pipelines metrics (PR #6967)
When the router reloads, either via schema change or config change, a new request pipeline is created.
Existing request pipelines are closed once their requests finish. However, this may not happen if there are ongoing long requests that do not finish, such as Subscriptions.
To enable debugging when request pipelines are being kept around, a new gauge metric has been added:
- apollo.router.pipelines - The number of request pipelines active in the router
  - schema.id - The Apollo Studio schema hash associated with the pipeline.
  - launch.id - The Apollo Studio launch id associated with the pipeline (optional).
  - config.hash - The hash of the configuration.
By @BrynCooke in #6967
Add apollo.router.open_connections metric (PR #7023)
To help users diagnose when connections are keeping pipelines hanging around, the following metric has been added:
- apollo.router.open_connections - The number of open connections to the router
  - schema.id - The Apollo Studio schema hash associated with the pipeline.
  - launch.id - The Apollo Studio launch id associated with the pipeline (optional).
  - config.hash - The hash of the configuration.
  - server.address - The address that the router is listening on.
  - server.port - The port that the router is listening on if not a unix socket.
  - http.connection.state - Either active or terminating.
You can use this metric to monitor when connections are held open by long-running requests or keepalive messages.
By @BrynCooke in #7023
Add span events to error spans for connectors and demand control plugin (PR #6727)
New span events have been added to trace spans which include errors. These span events include the GraphQL error code that relates to the error. So far, this only includes errors generated by connectors and the demand control plugin.
Changes to experimental error metrics (PR #6966)
In 2.0.0, an experimental metric telemetry.apollo.errors.experimental_otlp_error_metrics was introduced to track errors with additional attributes. A few related changes are included here:
- Sending these metrics now also respects the subgraph's send flag, e.g. telemetry.apollo.errors.subgraph.[all|(subgraph name)].send.
- A new configuration option telemetry.apollo.errors.subgraph.[all|(subgraph name)].redaction_policy has been added. This flag only applies when redact is set to true. When set to ErrorRedactionPolicy.Strict, error redaction will behave as it has in the past. Setting this to ErrorRedactionPolicy.Extended will allow the extensions.code value from subgraph errors to pass through redaction and be sent to Studio.
- A warning about incompatibility of error telemetry with connectors will be suppressed when this feature is enabled, since it does support connectors when using the new mode.
By @timbotnik in #6966
🐛 Fixes
Export gauge instruments (Issue #6859)
Previously in router 2.x, when using the router's OTel meter_provider() to report metrics from Rust plugins, gauge instruments such as those created using .u64_gauge() weren't exported. The router now exports these instruments.
Use batch_processor config for Apollo metrics PeriodicReader (PR #7024)
The Apollo OTLP batch_processor configurations telemetry.apollo.batch_processor.scheduled_delay and telemetry.apollo.batch_processor.max_export_timeout now also control the Apollo OTLP PeriodicReader export interval and timeout, respectively. This update brings parity between Apollo OTLP metrics and non-Apollo OTLP exporter metrics.
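For illustration, the two settings named above sit under telemetry.apollo.batch_processor. The durations in this sketch are arbitrary example values, not recommendations or defaults.

```yaml
telemetry:
  apollo:
    batch_processor:
      # Now also used as the Apollo OTLP PeriodicReader export interval.
      scheduled_delay: 5s
      # Now also used as the Apollo OTLP PeriodicReader export timeout.
      max_export_timeout: 30s
```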
By @rregitsky in #7024
Reduce Brotli encoding compression level (Issue #6857)
The Brotli encoding compression level has been changed from 11 to 4 to improve performance and mimic other compression algorithms' fast setting. Level 4 is also a much more reasonable choice for dynamic workloads.
By @carodewig in #7007
CPU count inference improvements for cgroup environments (PR #6787)
This fixes an issue where the fleet_detector plugin would not correctly infer the CPU limits for a system which used cgroup or cgroup2.
By @nmoutschen in #6787
Separate entity keys and representation variables in entity cache key (Issue #6673)
This fix separates the entity keys and representation variable values in the cache key to avoid issues with, for example, @requires.
Important
If you have enabled Distributed query plan caching, this release contains changes which necessarily alter the hashing algorithm used for the cache keys. On account of this, you should anticipate additional cache regeneration cost when updating between these versions while the new hashing algorithm comes into service.
Replace Rhai-specific hot-reload functionality with general hot-reload (PR #6950)
In Router 2.0, the Rhai hot-reload capability was not working. This was because of architectural improvements to the router which meant that the entire service stack was no longer re-created for each request.
The fix adds the Rhai source files to the primary list of elements watched by the router (configuration, schema, etc.) and removes the old Rhai-specific file-watching logic.
If --hot-reload is enabled, the router will reload on changes to Rhai source code, just as it does for changes to configuration.
📃 Configuration
Make experimental OTLP error metrics feature flag non-experimental (PR #7033)
Because the OTLP error metrics feature is being promoted to preview from experimental, this change updates its feature flag name from experimental_otlp_error_metrics to preview_extended_error_metrics.
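A config using the renamed flag might look like the sketch below. The location under telemetry.apollo.errors follows the path given for the experimental flag in 2.0.0; the value enabled is an assumption and should be confirmed against the configuration reference.

```yaml
telemetry:
  apollo:
    errors:
      # Previously: experimental_otlp_error_metrics
      # The value "enabled" is assumed, not taken from this changelog.
      preview_extended_error_metrics: enabled
```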
Tip
All notable changes to Router v2.x after its initial release will be documented in this file. To see previous history, see the changelog prior to v2.0.0.