github apollographql/router v2.1.0


🚀 Features

Connectors: support for traffic shaping (PR #6737)

Traffic shaping is now supported for connectors. To target a specific source, use the subgraph_name.source_name under the new connector.sources property of traffic_shaping. Settings under connector.all will apply to all connectors. deduplicate_query is not supported at this time.

Example config:

traffic_shaping:
  connector:
    all:
      timeout: 5s
    sources:
      connector-graph.random_person_api:
        global_rate_limit:
          capacity: 20
          interval: 1s
        experimental_http2: http2only
        timeout: 1s

By @andrewmcgivery in #6737

Connectors: Support TLS configuration (PR #6995)

Connectors now support TLS configuration for custom certificate authorities and client certificate authentication.

tls:
  connector:
    sources:
      connector-graph.random_person_api:
        certificate_authorities: ${file.ca.crt}
        client_authentication:
          certificate_chain: ${file.client.crt}
          key: ${file.client.key}

By @andrewmcgivery in #6995

Update JWT handling (PR #6930)

This PR updates JWT handling in the AuthenticationPlugin:

  • Users may now set a new config option config.authentication.router.jwt.on_error.
    • When set to the default Error, JWT-related errors will be returned to users (the current behavior).
    • When set to Continue, JWT errors will instead be ignored, and JWT claims will not be set in the request context.
  • When JWTs are processed, whether processing succeeds or fails, the request context will contain a new variable apollo::authentication::jwt_status which notes the result of processing.
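
As a rough sketch, a configuration that lets requests proceed when JWT processing fails might look like the following. It assumes on_error accepts the values exactly as spelled above, and the JWKS URL is a placeholder that does not come from these notes.

```yaml
authentication:
  router:
    jwt:
      jwks:
        # Placeholder URL (hypothetical); substitute your own JWKS endpoint.
        - url: https://auth.example.com/.well-known/jwks.json
      # Assumed spelling, taken from the description above. The default is Error.
      on_error: Continue
```

With a setting like this, downstream logic would likely need to consult apollo::authentication::jwt_status rather than assume that JWT claims are present in the request context.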

By @Velfi in #6930

Add batching.maximum_size configuration option to limit maximum client batch size (PR #7005)

Add an optional maximum_size parameter to the batching configuration.

  • When specified, the router will reject requests which contain more than maximum_size queries in the client batch.
  • When unspecified, the router performs no size checking (the current behavior).

If the number of queries provided exceeds the maximum batch size, the entire batch fails with error code 422 (Unprocessable Content). For example:

{
  "errors": [
    {
      "message": "Invalid GraphQL request",
      "extensions": {
        "details": "Batch limits exceeded: you provided a batch with 3 entries, but the configured maximum router batch size is 2",
        "code": "BATCH_LIMIT_EXCEEDED"
      }
    }
  ]
}
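
A configuration consistent with the error above (a limit of 2) could be sketched as follows. It assumes batching is turned on through the existing enabled and mode options; only maximum_size is new in this release.

```yaml
batching:
  enabled: true
  # Assumed to be the existing client batching mode; not described in these notes.
  mode: batch_http_link
  # Reject any client batch containing more than 2 queries.
  maximum_size: 2
```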

By @carodewig in #7005

Introduce PQ manifest hot_reload option for local manifests (PR #6987)

This change introduces a persisted_queries.hot_reload configuration option to allow the router to hot reload local PQ manifest changes.

If you configure local_manifests, you can set hot_reload to true to automatically reload manifest files whenever they change. This lets you update local manifest files without restarting the router.

persisted_queries:
  enabled: true
  local_manifests:
    - ./path/to/persisted-query-manifest.json
  hot_reload: true

Note: This change explicitly does not piggyback on the existing --hot-reload flag.

By @trevor-scheer in #6987

Add support to get/set URI scheme in Rhai (Issue #6897)

This adds support for reading and writing the URI scheme through request.uri.scheme and request.subgraph.uri.scheme in Rhai, making it possible to switch between http and https for subgraph fetches. For example:

fn subgraph_service(service, subgraph){
    service.map_request(|request|{
        log_info(`${request.subgraph.uri.scheme}`);
        if request.subgraph.uri.scheme == {} {
            log_info("Scheme is not explicitly set");
        }
        request.subgraph.uri.scheme = "https";
        request.subgraph.uri.host = "api.apollographql.com";
        request.subgraph.uri.path = "/api/graphql";
        request.subgraph.uri.port = 1234;
        log_info(`${request.subgraph.uri}`);
    });
}

By @starJammer in #6906

Add router config validate subcommand (PR #7016)

Adds a new router config validate subcommand that validates a router config file without fully starting the router.

./router config validate <path-to-config-file.yaml>

By @andrewmcgivery in #7016

Enable remote proxy downloads of the Router

This enables users without direct download access to specify a remote proxy mirror location for the GitHub download of
the Apollo Router releases.

By @LongLiveCHIEF in #6667

Add metric to measure cardinality overflow frequency (PR #6998)

Adds a new counter metric, apollo.router.telemetry.metrics.cardinality_overflow, that is incremented whenever opentelemetry-rust emits its cardinality overflow log. This log means that a metric in a batch has exceeded a cardinality of 2000 and that any excess attributes will be ignored.

By @rregitsky in #6998

Add metrics for value completion errors (PR #6905)

When the router encounters a value completion error, it is not included in the GraphQL errors array, making it harder to observe. To surface these errors more visibly, the router now counts them via the metric instruments apollo.router.graphql.error and apollo.router.operations.error, distinguishable via the code attribute with value RESPONSE_VALIDATION_FAILED.

By @timbotnik in #6905

Add apollo.router.pipelines metrics (PR #6967)

When the router reloads, either via schema change or config change, a new request pipeline is created.
Existing request pipelines are closed once their requests finish. However, this may not happen if there are ongoing long requests that do not finish, such as Subscriptions.

To enable debugging when request pipelines are being kept around, a new gauge metric has been added:

  • apollo.router.pipelines - The number of request pipelines active in the router
    • schema.id - The Apollo Studio schema hash associated with the pipeline.
    • launch.id - The Apollo Studio launch id associated with the pipeline (optional).
    • config.hash - The hash of the configuration.

By @BrynCooke in #6967

Add apollo.router.open_connections metric (PR #7023)

To help users diagnose when connections are keeping pipelines alive, the following metric has been added:

  • apollo.router.open_connections - The number of connections open in the router
    • schema.id - The Apollo Studio schema hash associated with the pipeline.
    • launch.id - The Apollo Studio launch id associated with the pipeline (optional).
    • config.hash - The hash of the configuration.
    • server.address - The address that the router is listening on.
    • server.port - The port that the router is listening on if not a unix socket.
    • http.connection.state - Either active or terminating.

You can use this metric to monitor when connections are open via long running requests or keepalive messages.

By @BrynCooke in #7023

Add span events to error spans for connectors and demand control plugin (PR #6727)

New span events have been added to trace spans which include errors. These span events include the GraphQL error code that relates to the error. So far, this only includes errors generated by connectors and the demand control plugin.

By @bonnici in #6727

Changes to experimental error metrics (PR #6966)

In 2.0.0, an experimental metric telemetry.apollo.errors.experimental_otlp_error_metrics was introduced to track errors with additional attributes. A few related changes are included here:

  • Sending these metrics now also respects the subgraph's send flag, e.g. telemetry.apollo.errors.subgraph.[all|(subgraph name)].send.
  • A new configuration option telemetry.apollo.errors.subgraph.[all|(subgraph name)].redaction_policy has been added. This flag only applies when redact is set to true. When set to ErrorRedactionPolicy.Strict, error redaction will behave as it has in the past. Setting this to ErrorRedactionPolicy.Extended will allow the extensions.code value from subgraph errors to pass through redaction and be sent to Studio.
  • A warning about incompatibility of error telemetry with connectors will be suppressed when this feature is enabled, since it does support connectors when using the new mode.
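
The options above might be combined as in the sketch below. The value spellings (enabled, extended) are assumptions, since these notes refer to the policy only as ErrorRedactionPolicy.Strict and ErrorRedactionPolicy.Extended. The flag name used here is the renamed one listed under Configuration in these notes.

```yaml
telemetry:
  apollo:
    errors:
      # Renamed in this release from experimental_otlp_error_metrics.
      # The value "enabled" is an assumption.
      preview_extended_error_metrics: enabled
      subgraph:
        all:
          send: true
          redact: true
          # Only applies when redact is true. The lowercase spelling is an assumption.
          redaction_policy: extended
```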

By @timbotnik in #6966

🐛 Fixes

Export gauge instruments (Issue #6859)

Previously in router 2.x, when using the router's OTel meter_provider() to report metrics from Rust plugins, gauge instruments such as those created using .u64_gauge() weren't exported. The router now exports these instruments.

By @yanns in #6865

Use batch_processor config for Apollo metrics PeriodicReader (PR #7024)

The Apollo OTLP batch_processor configurations telemetry.apollo.batch_processor.scheduled_delay and telemetry.apollo.batch_processor.max_export_timeout now also control the Apollo OTLP PeriodicReader export interval and timeout, respectively. This update brings parity between Apollo OTLP metrics and non-Apollo OTLP exporter metrics.
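
For illustration, the two settings might be configured as shown below. The durations are arbitrary example values, not recommendations.

```yaml
telemetry:
  apollo:
    batch_processor:
      # Also used as the PeriodicReader export interval for Apollo OTLP metrics.
      scheduled_delay: 5s
      # Also used as the PeriodicReader export timeout.
      max_export_timeout: 30s
```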

By @rregitsky in #7024

Reduce Brotli encoding compression level (Issue #6857)

The Brotli encoding compression level has been changed from 11 to 4 to improve performance and mimic other compression algorithms' fast setting. This value is also a much more reasonable value for dynamic workloads.

By @carodewig in #7007

CPU count inference improvements for cgroup environments (PR #6787)

This fixes an issue where the fleet_detector plugin would not correctly infer the CPU limits for a system that uses cgroup or cgroup2.

By @nmoutschen in #6787

Separate entity keys and representation variables in entity cache key (Issue #6673)

This fix separates the entity keys and representation variable values in the cache key to avoid issues with, for example, @requires.

Important

If you have enabled Distributed query plan caching, this release contains changes which necessarily alter the hashing algorithm used for the cache keys. On account of this, you should anticipate additional cache regeneration cost when updating between these versions while the new hashing algorithm comes into service.

By @bnjjj in #6888

Replace Rhai-specific hot-reload functionality with general hot-reload (PR #6950)

In Router 2.0, the Rhai hot-reload capability was not working. This was because of architectural improvements to the router, which meant that the entire service stack was no longer re-created for each request.

The fix adds Rhai source files to the primary list of elements watched by the router (configuration, schema, and so on) and removes the old Rhai-specific file watching logic.

If --hot-reload is enabled, the router will reload on changes to Rhai source code just like it would for changes to configuration, for example.
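
As a minimal sketch, a Rhai setup that would be covered by this behavior might be configured as follows. The directory and file names are hypothetical.

```yaml
rhai:
  # Hypothetical directory containing Rhai source files.
  scripts: ./rhai
  # Hypothetical entry-point script inside that directory.
  main: main.rhai
```

When the router is started with --hot-reload, edits to files under the scripts directory would then trigger the same reload as a configuration change.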

By @garypen in #6950

📃 Configuration

Make experimental OTLP error metrics feature flag non-experimental (PR #7033)

Because the OTLP error metrics feature is being promoted to preview from experimental, this change updates its feature flag name from experimental_otlp_error_metrics to preview_extended_error_metrics.

By @MerylC in #7033

Tip

All notable changes to Router v2.x after its initial release will be documented in this file. To see previous history, see the changelog prior to v2.0.0.
