github apollographql/router v1.50.0

3 days ago

🚀 Features

Support local persisted query manifests for use with offline licenses (Issue #4587)

Adds experimental support for passing persisted query manifests to use instead of the hosted Uplink version.

For example:

persisted_queries:
  enabled: true
  log_unknown: true
  experimental_local_manifests: 
    - ./persisted-query-manifest.json
  safelist:
    enabled: true
    require_id: false

By @lleadbet in #5310

Support conditions on standard telemetry events (Issue #5475)

Enables setting conditions on standard events.
For example:

telemetry:
  instrumentation:
    events:
      router:
        request:
          level: info
          condition: # Only log the router request if you sent `x-log-request` with the value `enabled`
            eq:
            - request_header: x-log-request
            - "enabled"
        response: off
        error: error
        # ...

Not supported for batched requests.
By @bnjjj in #5476

Make status_code available for router_service responses in Rhai scripts (Issue #5357)

Adds response.status_code on Rhai router_service responses. Previously, status_code was only available on subgraph_service responses.

For example:

fn router_service(service) {
    let f = |response| {
        if response.is_primary() {
            print(response.status_code);
        }
    };

    service.map_response(f);
}

By @IvanGoncharov in #5358

Add new values for the supergraph query selector (PR #5433)

Adds support for four new values for the supergraph query selector:

  • aliases: the number of aliases in the query
  • depth: the depth of the query
  • height: the height of the query
  • root_fields: the number of root fields in the query

You can use this data to understand how your graph is used and to help determine where to set limits.

For example:

telemetry:
  instrumentation:
    instruments:
      supergraph:
        'query.depth':
          description: 'The depth of the query'
          value:
            query: depth
          unit: unit
          type: histogram

By @garypen in #5433

Add the ability to drop metrics using otel views (PR #5531)

You can use otel views to drop specific metrics that you don't want sent to your APM.

telemetry:
  exporters:
    metrics:
      common:
        service_name: apollo-router
        views:
          - name: apollo_router_http_request_duration_seconds # Name of the instrument to edit. Wildcards are allowed in names; use '*' to target all instruments
            aggregation: drop

By @bnjjj in #5531

Add operation_name selector for router service in custom telemetry (PR #5392)

Adds an operation_name selector for the router service.
Previously, accessing operation_name was only possible through the response_context router service selector.

For example:

telemetry:
  instrumentation:
    instruments:
      router:
        http.server.request.duration:
          attributes:
            graphql.operation.name:
              operation_name: string

By @bnjjj in #5392

๐Ÿ› Fixes

Fix Cache-Control aggregation and age calculation in entity caching (PR #5463)

Enhances the reliability of caching behaviors in the entity cache feature by:

  • Ensuring the proper calculation of max-age and s-maxage fields in the Cache-Control header sent to clients.
  • Setting appropriate default values if a subgraph does not provide a Cache-Control header.
  • Guaranteeing that the Cache-Control header is aggregated consistently, even if the plugin is disabled entirely or on specific subgraphs.

By @Geal in #5463

Fix telemetry events when trace isn't sampled and preserve attribute types (PR #5464)

Improves accuracy and performance of event telemetry by:

  • Displaying custom event attributes even if the trace is not sampled
  • Preserving original attribute type instead of converting it to string
  • Ensuring http.response.body.size and http.request.body.size attributes are treated as numbers, not strings

โš ๏ธ Exercise caution if you have monitoring enabled on your logs, as attribute types may have changed. For example, attributes like http.response.status_code are now numbers (200) instead of strings ("200").

By @bnjjj in #5464

Enable coprocessors for subscriptions (PR #5542)

Ensures that coprocessors correctly handle subscriptions by preventing skipped data from being overwritten.

By @bnjjj in #5542

Improve accuracy of query_planning.plan.duration (PR #5)

Previously, the apollo.router.query_planning.plan.duration metric included processing time beyond query planning itself. The additional time included pooling time, which is already accounted for in the metric. With this change, apollo.router.query_planning.plan.duration reflects only the query planning duration.

For example, before the change, metrics reported:

2024-06-21T13:37:27.744592Z WARN  apollo.router.query_planning.plan.duration 0.002475708
2024-06-21T13:37:27.744651Z WARN  apollo.router.query_planning.total.duration 0.002553958

2024-06-21T13:37:27.748831Z WARN  apollo.router.query_planning.plan.duration 0.001635833
2024-06-21T13:37:27.748860Z WARN  apollo.router.query_planning.total.duration 0.001677167

Post-change metrics now accurately reflect:

2024-06-21T13:37:27.743465Z WARN  apollo.router.query_planning.plan.duration 0.00107725
2024-06-21T13:37:27.744651Z WARN  apollo.router.query_planning.total.duration 0.002553958

2024-06-21T13:37:27.748299Z WARN  apollo.router.query_planning.plan.duration 0.000827
2024-06-21T13:37:27.748860Z WARN  apollo.router.query_planning.total.duration 0.001677167
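
To make the difference concrete, the gap between the two metrics can be computed from the sample lines above. This is only an illustrative sketch: the helper name non_planning_time is hypothetical, and reading total minus plan as "time spent outside planning" is an assumption about how the two metrics relate.

```python
PLAN = "apollo.router.query_planning.plan.duration"
TOTAL = "apollo.router.query_planning.total.duration"

# First pair of sample lines from before and after the change.
BEFORE = """\
2024-06-21T13:37:27.744592Z WARN  apollo.router.query_planning.plan.duration 0.002475708
2024-06-21T13:37:27.744651Z WARN  apollo.router.query_planning.total.duration 0.002553958
"""
AFTER = """\
2024-06-21T13:37:27.743465Z WARN  apollo.router.query_planning.plan.duration 0.00107725
2024-06-21T13:37:27.744651Z WARN  apollo.router.query_planning.total.duration 0.002553958
"""


def non_planning_time(lines: str) -> float:
    """Return total.duration minus plan.duration for one pair of lines."""
    values = {}
    for line in lines.splitlines():
        parts = line.split()
        if len(parts) != 4:
            continue  # skip blank or unexpected lines
        _timestamp, _level, name, value = parts
        values[name] = float(value)
    return values[TOTAL] - values[PLAN]


print(f"before: {non_planning_time(BEFORE):.9f}")
print(f"after:  {non_planning_time(AFTER):.9f}")
```

In the first sample pair, the gap between the two metrics grows from about 0.00008 to about 0.00148, because the plan duration no longer absorbs the extra processing time.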

By @xuorig and @lrlna in #5530

Remove deno_crypto package due to security vulnerability (Issue #5484)

Removes deno_crypto due to the vulnerability reported in curve25519-dalek.
Since the router exclusively used deno_crypto for generating UUIDs using the package's random number generator, this vulnerability had no impact on the router.

By @Geal in #5483

Add content-type header to failed auth checks (Issue #5496)

Adds content-type header when returning AUTH_ERROR from authentication service.

By @andrewmcgivery in #5497

Implement manual caching for AWS Security Token Service credentials (PR #5508)

In the AWS Security Token Service (STS), the CredentialsProvider chain includes caching, but this functionality was missing for AssumeRoleProvider.
This change introduces a custom CredentialsProvider that functions as a caching layer with these rules:

  • Cache Expiry: Retrieved credentials are cached until their credentials.expiry() time if one is specified, or indefinitely if not.
  • Automatic Refresh: Five minutes before cached credentials expire, an attempt is made to fetch updated credentials.
  • Retry Mechanism: If credential retrieval fails, another attempt is scheduled after a one-minute interval.
  • (Coming soon, not included in this change) Manual Refresh: The CredentialsProvider will expose a refresh_credentials() function. This can be manually invoked, for instance, upon receiving a 401 error during a subgraph call.
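
The first three rules above can be sketched as a small cache. This is an illustration in Python, not the router's Rust implementation: the names Credentials, CachingProvider, and get are hypothetical, and the sketch refreshes lazily when get is called, whereas the actual provider may schedule refreshes in the background.

```python
from dataclasses import dataclass
from typing import Callable, Optional

REFRESH_MARGIN_S = 5 * 60  # refresh five minutes before expiry
RETRY_DELAY_S = 60         # retry one minute after a failed fetch


@dataclass
class Credentials:
    token: str
    expiry: Optional[float]  # absolute time in seconds, None if no expiry


class CachingProvider:
    """Hypothetical caching layer around a credentials fetch function."""

    def __init__(self, fetch: Callable[[], Credentials]):
        self._fetch = fetch
        self._cached: Optional[Credentials] = None
        # Time of the next fetch attempt; None means nothing is scheduled.
        self._next_attempt: Optional[float] = None

    def _should_fetch(self, now: float) -> bool:
        if self._next_attempt is not None:
            return now >= self._next_attempt
        # Nothing scheduled: fetch only if the cache is still empty.
        return self._cached is None

    def get(self, now: float) -> Optional[Credentials]:
        if self._should_fetch(now):
            try:
                creds = self._fetch()
            except Exception:
                # Retry mechanism: try again after one minute.
                self._next_attempt = now + RETRY_DELAY_S
            else:
                self._cached = creds
                # Cache expiry and automatic refresh: credentials without
                # an expiry are kept indefinitely.
                self._next_attempt = (
                    None if creds.expiry is None
                    else creds.expiry - REFRESH_MARGIN_S
                )
        return self._cached
```

Passing the current time into get keeps the sketch deterministic. Note that a failed refresh keeps serving the previously cached credentials until a later attempt succeeds.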

By @o0Ignition0o in #5508

📃 Configuration

Align entity caching configuration structure for subgraph overrides (PR #5474)

Aligns the entity cache configuration structure to the same all/subgraphs override pattern found in other parts of the router configuration. For example, see the header propagation configuration.
An automated configuration migration is provided so existing usage is unaffected.

By @Geal in #5474

Restrict custom instrument values to relevant stages (PR #5472)

Previously, custom instruments at each request lifecycle stage could specify unrelated values, like using event_unit for a router instrument. Now, only relevant values for each stage are allowed.

Additionally, GraphQL instruments no longer need to specify field_event. There is no automatic migration for this change since GraphQL instruments are still experimental.

telemetry:
  instrumentation:
    instruments:
      graphql:
        # OLD definition of a custom instrument that measures the number of fields
        my.unit.instrument:
          value: field_unit # Changes to unit
        
        # NEW definition
        my.unit.instrument:
          value: unit 

        # OLD  
        my.custom.instrument:
          value: # Changes to not require `field_custom`
            field_custom:
              list_length: value
        # NEW
        my.custom.instrument:
          value: 
            list_length: value

The following misconfiguration is no longer possible:

router_instrument:
  value:
    event_custom:
      request_header: foo

By @BrynCooke in #5472

🛠 Maintenance

Add cost information to protobuf traces (PR #5430)

Exports query cost information on Apollo protobuf traces if experimental_demand_control is enabled. Also displays exported information in GraphOS Studio.

By @BrynCooke in #5430

Improve xtask release process (PR #5275)

Introduces a new xtask command to automate the release process by:

  • Following the commands defined in our RELEASE_CHECKLIST.md file
  • Storing the current state of the process in the .release-state.json file
  • Prompting the user regularly for new info.

These changes remove a lot of the manual environment variable setup and command copying previously required.

Execute the new command by running cargo xtask release start, then calling cargo xtask release continue at each step.

By @Geal in #5275

Isolate usage of hyper v0.14 types for future compatibility (PR #5175)

Isolates usage of hyper types in response to the recent release of hyper v1.0. The new major version introduced improvements along with breaking changes. The goal is to reduce the impact of these breaking changes for the upcoming Router upgrade to the new hyper, and ensure that future upgrades are straightforward.

This change only affects internal code and doesn't affect the router's public API or execution.

By @Geal in #5175

Introduce fuzz testing comparison between the router and monolithic subgraph (PR #5302)

Implements a fuzzer that can run on any router configuration to enhance router robustness and battle test new features.

Adds a router fuzzing target that compares the result of a query sent to the router with the result from a monolithic subgraph, using a supergraph schema that points all subgraphs to that same monolith.

The monolithic subgraph consolidates code from typical subgraphs like accounts, products, reviews, and inventory (taken from the starstuff repository).
This setup allows the subgraph to directly handle queries traditionally handled by individual subgraphs.
The invariant we check is that we should get the same result by sending the query to the subgraph directly or through a router that will artificially cut up the query into multiple subgraph requests, according to the supergraph schema.

To execute it:

  • Start a router using the schema fuzz/subgraph/supergraph.graphql
  • Start the subgraph with cargo run --release in fuzz/subgraph. It will start a subgraph on port 4005.
  • Start the fuzzer from the repo root with cargo +nightly fuzz run router

By @Geal in #5302

📚 Documentation

Add telemetry docs pages for Dynatrace (PR #5533)

Adds telemetry documentation for Dynatrace metrics and trace exporters.

By @andrewmcgivery in #5533

Fix docs for 'exists' condition (PR #5446)

Fixes documentation example for the exists condition.
The condition expects a single selector instead of an array.

For example:

telemetry:
  instrumentation:
    instruments:
      router:
        my.instrument:
          value: duration
          type: counter
          unit: s
          description: "my description"
          # ...
          # This instrument will only be mutated if the condition evaluates to true
          condition:
            exists:
              request_header: x-req-header

By @bnjjj in #5446

🧪 Experimental

Add experimental extended reference reporting configuration (PR #5331)

Adds an experimental configuration to turn on extended references in Apollo usage reports, including references to input object fields and enum values.

This new configuration (telemetry.apollo.experimental_apollo_metrics_reference_mode: extended) only works when experimental_apollo_metrics_generation_mode: new is configured.
Apollo doesn't yet recommend these configurations in production while we continue to verify that the new functionality works as expected.

By @bonnici in #5331

Add experimental field metric reporting configuration (PR #5443)

Adds an experimental configuration to report field usage metrics to GraphOS Studio without requiring subgraphs to support federated tracing (ftv1).

The reported field usage data doesn't currently appear in GraphOS Studio.

telemetry:
  apollo:
    experimental_local_field_metrics: true

There is currently a small performance impact from enabling this feature.

By @tninesling, @Geal, @bryn in #5443

Add experimental h2c communication capability for communicating with coprocessor (Issue #5299)

Allows HTTP/2 Cleartext (h2c) communication with coprocessors for scenarios where the networking architecture/mesh connections don't support or require TLS for outbound communications from the router.

Introduces a new coprocessor.client configuration. The first and currently only option is experimental_http2. The available option settings are the same as the experimental_http2 traffic shaping settings.

  • disable - disable HTTP/2, use HTTP/1.1 only
  • enable - HTTP URLs use HTTP/1.1, HTTPS URLs use TLS with either HTTP/1.1 or HTTP/2 based on the TLS handshake
  • http2only - HTTP URLs use h2c, HTTPS URLs use TLS with HTTP/2
  • not set - defaults to enable

Note

Configuring experimental_http2: http2only where the network doesn't support HTTP/2 results in a failed coprocessor connection!
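
The option table above can be read as a lookup from the setting and the coprocessor URL scheme to the protocols that may be used. The following sketch only encodes that table for illustration: the function name allowed_protocols and the example URLs are hypothetical, and it is not router code.

```python
from typing import List
from urllib.parse import urlsplit


def allowed_protocols(url: str, experimental_http2: str = "enable") -> List[str]:
    """Return the protocols the option table allows for a coprocessor URL.

    The default argument mirrors the "not set" row, which behaves
    like `enable`.
    """
    scheme = urlsplit(url).scheme
    if scheme not in ("http", "https"):
        raise ValueError("unsupported URL scheme: %r" % scheme)
    tls = scheme == "https"

    if experimental_http2 == "disable":
        return ["HTTP/1.1"]
    if experimental_http2 == "enable":
        # With TLS, the handshake decides between HTTP/1.1 and HTTP/2.
        return ["HTTP/1.1", "HTTP/2"] if tls else ["HTTP/1.1"]
    if experimental_http2 == "http2only":
        # Plain HTTP URLs use HTTP/2 Cleartext (h2c).
        return ["HTTP/2"] if tls else ["h2c"]
    raise ValueError("unknown setting: %r" % experimental_http2)


# Hypothetical coprocessor addresses.
print(allowed_protocols("http://coprocessor.internal:8081", "http2only"))
print(allowed_protocols("https://coprocessor.internal:8443"))
```

Read this way, http2only is the only setting that yields h2c, and only for plain http URLs.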

By @theJC in #5300
