apollographql/router v1.49.0 on GitHub

🚀 Features

Override tracing span names using custom span selectors (Issue #5261)

Adds the ability to override span names by setting the otel.name attribute with any custom telemetry selector.

This example changes the span name to router:

telemetry:
  instrumentation:
    spans:
      router:
        otel.name:
          static: router # Override the span name to router

By @bnjjj in #5365

Add descriptions and units to standard instruments (PR #5407)

Adds descriptions and units to the standard instruments available in the router. These descriptions and units are copied directly from the OpenTelemetry semantic conventions and are needed for better integration with APMs.

By @bnjjj in #5407

Add `with_lock()` method to `Extensions` to help avoid timing issues (PR #5360)

For those who write custom Rust plugins, we've introduced with_lock(), which explicitly restricts the lifetime of the Extensions lock.

Without this method, it was too easy to hold the Extensions lock for longer than intended. This was a source of bugs in the router and made many tests flaky.
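
The scoped-lock idea can be sketched with the standard library alone. This is a minimal illustration, not the router's implementation: SharedExtensions, its HashMap payload, and record_and_read are hypothetical names, and the real Extensions type and the exact with_lock() signature may differ.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for a lock-protected extensions map; not the router's type.
#[derive(Clone, Default)]
struct SharedExtensions {
    inner: Arc<Mutex<HashMap<String, String>>>,
}

impl SharedExtensions {
    /// Runs `f` while holding the lock and releases the lock as soon as `f` returns.
    fn with_lock<T>(&self, f: impl FnOnce(&mut HashMap<String, String>) -> T) -> T {
        let mut guard = self.inner.lock().expect("extensions lock poisoned");
        f(&mut *guard)
    }
}

/// Stores a value and reads it back in two short critical sections.
fn record_and_read(ext: &SharedExtensions) -> Option<String> {
    ext.with_lock(|map| {
        map.insert("operation".to_string(), "GetUser".to_string());
    });
    // The lock is free again here, so the second acquisition cannot deadlock.
    ext.with_lock(|map| map.get("operation").cloned())
}
```

Because the guard never escapes the closure, it cannot accidentally stay alive across later code, which is the failure mode described above.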

By @garypen in #5360

Add support for `unix_ms_now` in Rhai customizations (Issue #5182)

Rhai customizations can now use the unix_ms_now() function to obtain the current time in milliseconds since the Unix epoch.

For example:

fn supergraph_service(service) {
    let now = unix_ms_now();
}

By @shaikatzz in #5181

🐛 Fixes

Improve error message produced when subgraph responses don't include an expected `content-type` header value (Issue #5359)

To enhance debuggability when a subgraph response lacks an expected content-type header value, the error message now includes additional details.

Examples:

HTTP fetch failed from 'test': subgraph response contains invalid 'content-type' header value \"application/json,application/json\"; expected content-type: application/json or content-type: application/graphql-response+json

HTTP fetch failed from 'test': subgraph response does not contain 'content-type' header; expected content-type: application/json or content-type: application/graphql-response+json
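
As a rough illustration of how such messages could be produced, the sketch below validates a content-type value and builds error strings shaped like the examples above. It is an assumption-laden reconstruction rather than the router's code: check_content_type and ACCEPTED are hypothetical names, and comparing only the media type before any `;` parameter is an assumed rule.

```rust
/// Media types accepted from a subgraph, as listed in the messages above.
const ACCEPTED: [&str; 2] = ["application/json", "application/graphql-response+json"];

/// Hypothetical check: Ok when the media type is accepted, otherwise an
/// error string shaped like the examples above.
fn check_content_type(service: &str, header: Option<&str>) -> Result<(), String> {
    let expected = "expected content-type: application/json or content-type: application/graphql-response+json";
    match header {
        None => Err(format!(
            "HTTP fetch failed from '{service}': subgraph response does not contain 'content-type' header; {expected}"
        )),
        Some(value) => {
            // Compare only the media type, ignoring parameters such as `; charset=utf-8`.
            let media_type = value.split(';').next().unwrap_or("").trim();
            if ACCEPTED.iter().any(|a| media_type.eq_ignore_ascii_case(a)) {
                Ok(())
            } else {
                Err(format!(
                    "HTTP fetch failed from '{service}': subgraph response contains invalid 'content-type' header value {value:?}; {expected}"
                ))
            }
        }
    }
}
```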

By @IvanGoncharov in #5223

Performance improvements for demand control (PR #5405)

Removes unneeded logic from the hot path of the demand control feature, recently released in public preview, to improve performance.

By @BrynCooke in #5405

Skip hashing the entire schema on every query plan cache lookup (PR #5374)

This fixes performance issues when looking up query plans for large schemas.

Important

If you have enabled Distributed query plan caching, this release changes the hashing algorithm used for the cache keys. As a result, expect additional cache regeneration cost when upgrading to this version while the new hashing algorithm comes into service.
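
The general technique of hashing the schema once and reusing the digest in each cache key can be sketched as follows. This is only an illustration under stated assumptions: PlanCacheKeys and its methods are hypothetical, and DefaultHasher stands in for whatever hashing algorithm the router actually uses.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical key builder; the names and the hasher are illustrative only.
struct PlanCacheKeys {
    /// Digest of the schema text, computed once when the schema is loaded.
    schema_digest: u64,
}

impl PlanCacheKeys {
    fn new(schema_sdl: &str) -> Self {
        let mut hasher = DefaultHasher::new();
        schema_sdl.hash(&mut hasher);
        Self { schema_digest: hasher.finish() }
    }

    /// Builds a per-query key from the small precomputed digest, so the
    /// schema text itself is never hashed again on lookup.
    fn key_for(&self, query: &str) -> u64 {
        let mut hasher = DefaultHasher::new();
        self.schema_digest.hash(&mut hasher);
        query.hash(&mut hasher);
        hasher.finish()
    }
}
```

In a scheme like this, the lookup cost depends on the query length rather than the schema size. It also shows why changing how keys are derived invalidates existing entries: the same schema and query produce a different key.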

By @Geal in #5374

Optimize GraphQL instruments (PR #5375)

When processing selectors for GraphQL instruments, heap allocations should be avoided for optimal performance. This change removes Vec allocations that were previously performed per field, yielding significant performance improvements.

By @BrynCooke in #5375

Log metrics overflow as a warning rather than an error (Issue #5173)

If a metric has too high a cardinality, the following is displayed as a warning instead of an error:

OpenTelemetry metric error occurred: Metrics error: Warning: Maximum data points for metric stream exceeded/ Entry added to overflow

By @bnjjj in #5287

Add support for `response_context` selectors for error conditions (PR #5288)

Provides the ability to use response_context selectors when configuring custom instruments for error conditions. For example:

http.server.request.timeout:
  type: counter
  value: unit
  description: "request in timeout"
  unit: request
  attributes:
    graphql.operation.name:
      response_context: operation_name
  condition:
    eq:
    - "request timed out"
    - error: reason

By @bnjjj in #5288

Fix inaccurate `apollo_router_opened_subscriptions` counter (PR #5363)

Fixes the apollo_router_opened_subscriptions counter, which previously only incremented. The counter now also decrements.
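
One common way to keep such a count accurate is to tie the decrement to the lifetime of a guard value. The sketch below shows that pattern with the standard library; it is not taken from the router, and OpenSubscriptions and SubscriptionGuard are hypothetical names.

```rust
use std::sync::atomic::{AtomicI64, Ordering};
use std::sync::Arc;

/// Hypothetical count of currently open subscriptions; not the router's code.
#[derive(Clone, Default)]
struct OpenSubscriptions(Arc<AtomicI64>);

/// Guard that represents one open subscription.
struct SubscriptionGuard(Arc<AtomicI64>);

impl OpenSubscriptions {
    /// Increments the count and returns a guard tied to the subscription's lifetime.
    fn open(&self) -> SubscriptionGuard {
        self.0.fetch_add(1, Ordering::Relaxed);
        SubscriptionGuard(Arc::clone(&self.0))
    }

    /// Returns the number of subscriptions that are open right now.
    fn current(&self) -> i64 {
        self.0.load(Ordering::Relaxed)
    }
}

impl Drop for SubscriptionGuard {
    /// Decrements the count when the subscription ends, however it ends.
    fn drop(&mut self) {
        self.0.fetch_sub(1, Ordering::Relaxed);
    }
}
```

Placing the decrement in Drop means early returns and error paths cannot skip it.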

By @bnjjj in #5363

🛠 Maintenance

Skip GraphOS tests when Apollo key not present (PR #5362)

Some tests require APOLLO_KEY and APOLLO_GRAPH_REF to execute successfully. These tests are now skipped when those environment variables are not set, which allows external contributors to run the entire test suite successfully.
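
A skip of this kind can be sketched as an early return guarded by an environment check. The helper names below (env_vars_present, graphos_dependent_check) are hypothetical, and the router's test suite may implement the skip differently.

```rust
use std::env;

/// Returns true only when every named environment variable is set and non-empty.
fn env_vars_present(names: &[&str]) -> bool {
    names
        .iter()
        .all(|name| env::var(name).map(|v| !v.is_empty()).unwrap_or(false))
}

/// Hypothetical test body: returns early (a "skip") when credentials are missing.
fn graphos_dependent_check() -> &'static str {
    if !env_vars_present(&["APOLLO_KEY", "APOLLO_GRAPH_REF"]) {
        eprintln!("skipping: APOLLO_KEY and APOLLO_GRAPH_REF are not both set");
        return "skipped";
    }
    // Assertions that need GraphOS access would go here.
    "ran"
}
```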

By @BrynCooke in #5362

📚 Documentation

Standard instrument configuration documentation for subgraphs (PR #5422)

Added documentation about standard instruments available at the subgraph service level:

http.client.request.body.size - A histogram of request body sizes for requests handled by subgraphs.
http.client.request.duration - A histogram of request durations for requests handled by subgraphs.
http.client.response.body.size - A histogram of response body sizes for requests handled by subgraphs.

These instruments are configurable in router.yaml:

telemetry:
  instrumentation:
    instruments:
      subgraph:
        http.client.request.body.size: true # (default false)
        http.client.request.duration: true # (default false)
        http.client.response.body.size: true # (default false)

By @bnjjj in #5422

Update docs frontmatter for consistency and discoverability (PR #5164)

Makes title case consistent for page titles, and adds or updates subtitles and meta-descriptions for better discoverability.

By @Meschreiber in #5164

🧪 Experimental

Warm query plan cache using persisted queries on startup (Issue #5334)

Adds support for the router to use persisted queries to warm the query plan cache upon startup using a new experimental_prewarm_query_plan_cache configuration option under persisted_queries.

To enable:

persisted_queries:
  enabled: true
  experimental_prewarm_query_plan_cache: true

By @lleadbet in #5340

Apollo reporting signature enhancements (PR #5062)

Adds a new experimental configuration option to turn on some enhancements for the Apollo reporting stats report key:

Signatures will include the full normalized form of input objects
Signatures will include aliases
Some small normalization improvements

This new configuration option (telemetry.apollo.experimental_apollo_signature_normalization_algorithm) only takes effect when experimental_apollo_metrics_generation_mode is set to new. We don't yet recommend enabling it while we continue to verify that the new functionality works as expected.

By @bonnici in #5062

Add experimental support for sending traces to Studio via OTLP (PR #4982)

As the ecosystem around OpenTelemetry (OTel) has been expanding rapidly, we are evaluating a migration of Apollo's internal tracing system to use an OTel-based protocol.

In the short-term, benefits include:

A comprehensive way to visualize the router execution path in GraphOS Studio.
Additional spans that were previously not included in Studio traces, such as query parsing, planning, execution, and more.
Additional metadata such as subgraph fetch details, router idle / busy timing, and more.

Long-term, we see this as a strategic enhancement to consolidate these two disparate tracing systems. This will pave the way for future enhancements to more easily plug into the Studio trace visualizer.

Configuration

This change adds a new configuration option, experimental_otlp_tracing_sampler, which can be used to send a percentage of traces via OTLP instead of the native Apollo Usage Reporting protocol. Supported values:

always_off (default): send all traces via Apollo Usage Reporting protocol.
always_on: send all traces via OTLP.
0.0 - 1.0: the ratio of traces to send via OTLP (0.5 = 50 / 50).

Note that this sampler is applied only after the common tracing sampler, so it acts on traces that have already been sampled. For example:

Sample 1% of traces, send all sampled traces via OTLP:

telemetry:
  apollo:
    # Send all traces via OTLP
    experimental_otlp_tracing_sampler: always_on

  exporters:
    tracing:
      common:
        # Sample traces at 1% of all traffic
        sampler: 0.01
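
The interaction between the two samplers comes down to simple arithmetic, sketched below. This models only the documented behavior (the OTLP sampler acts on traces already kept by the common sampler); OtlpSampler and traffic_split are hypothetical names, not router APIs.

```rust
/// Hypothetical model of the `experimental_otlp_tracing_sampler` values.
#[derive(Clone, Copy, Debug, PartialEq)]
enum OtlpSampler {
    AlwaysOff,
    AlwaysOn,
    Ratio(f64),
}

impl OtlpSampler {
    /// Fraction of already-sampled traces that go out via OTLP.
    fn otlp_share(self) -> f64 {
        match self {
            OtlpSampler::AlwaysOff => 0.0,
            OtlpSampler::AlwaysOn => 1.0,
            OtlpSampler::Ratio(r) => r.clamp(0.0, 1.0),
        }
    }
}

/// Returns (fraction of all traffic sent via OTLP,
///          fraction of all traffic sent via Apollo Usage Reporting).
fn traffic_split(common_sampler: f64, otlp: OtlpSampler) -> (f64, f64) {
    let sampled = common_sampler.clamp(0.0, 1.0);
    let via_otlp = sampled * otlp.otlp_share();
    (via_otlp, sampled - via_otlp)
}
```

With the configuration above (common sampler 0.01, always_on), 1% of all traffic is traced and all of it goes via OTLP. With a ratio of 0.5 instead, about 0.5% of all traffic would go through each protocol.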

By @timbotnik in #4982

Set Apollo metrics generation mode to `new` by default (PR #5265)

Changes the default value of experimental_apollo_metrics_generation_mode to new. All metrics show that identical signatures are generated in this mode.

By @bonnici in #5265