grafana/tempo v2.7.0-rc.0

Pre-release · 5 days ago

Deprecations

  • Tempo serverless features are now deprecated and will be removed in an upcoming release #4017 @electron0zero

Breaking Changes

  • Added a maximum spans per span set limit to prevent queries from overwhelming the read path. Users can set max_spans_per_span_set to 0 to restore the old behavior (see the configuration sketch after this list). #4275 (@carles-grafana)
  • The dynamic injection of the X-Scope-OrgID header for metrics generator remote writes has changed. If the header is already set in per-tenant overrides or in the global Tempo configuration, it is honored and not overwritten. #4021 (@mdisibio)
  • Migrate from OpenTracing to OpenTelemetry instrumentation. Removed the use_otel_tracer configuration option. Use the OpenTelemetry environment variables to configure the span exporter #3646 (@andreasgerstmayr)
  • Update the OpenTelemetry dependencies to v0.116.0 #4466 (@yvrhdn)
    After this update, the OpenTelemetry Collector receiver will bind to localhost instead of all interfaces (0.0.0.0).
    Because of this, Tempo installations running inside Docker have to update the address they listen on; see the receiver sketch after this list.
    For more details on this change, see #4465
    For more information about the security risk this change addresses, see https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/security-best-practices.md#safeguards-against-denial-of-service-attacks
  • Removed querier_forget_delay setting from the frontend. This configuration option did nothing. #3996 (@joe-elliott)
  • Use Prometheus fast regexp for TraceQL regular expression matchers. #4329 (@joe-elliott)
    All regular expression matchers are now fully anchored: span.foo =~ "bar" is now evaluated as span.foo =~ "^bar$". See the TraceQL example after this list.
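
A minimal configuration sketch for the new span-set limit, assuming it is exposed under the query-frontend search settings (the exact location is an assumption; check the configuration reference for your version, and note that per-tenant overrides may also apply):

    query_frontend:
        search:
            # 0 disables the limit and restores the pre-2.7 behavior
            max_spans_per_span_set: 0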
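
For Docker installations affected by the new localhost default, a minimal sketch of explicitly binding the OTLP receiver back to all interfaces (the receiver name and ports shown are the common defaults and are assumptions about your setup; weigh the security guidance linked above before exposing 0.0.0.0):

    distributor:
        receivers:
            otlp:
                protocols:
                    grpc:
                        endpoint: "0.0.0.0:4317"
                    http:
                        endpoint: "0.0.0.0:4318"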
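
An illustrative TraceQL comparison for the anchoring change; a filter that previously relied on substring matching now needs explicit wildcards:

    Previously matched any value containing "bar":
        { span.foo =~ "bar" }
    Equivalent substring match under the new anchored semantics:
        { span.foo =~ ".*bar.*" }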

Changes

  • [CHANGE] Disable gRPC compression in the querier and distributor for performance reasons #4429 (@carles-grafana)
    If you would like to re-enable it, we recommend 'snappy'. Use the following settings:
ingester_client:
    grpc_client_config:
        grpc_compression: "snappy"
metrics_generator_client:
    grpc_client_config:
        grpc_compression: "snappy"
querier:
    frontend_worker:
        grpc_client_config:
            grpc_compression: "snappy"
  • [CHANGE] SLO: include request cancellations within SLO #4355 (@electron0zero)
    Request cancellations are exposed under the result label in tempo_query_frontend_queries_total and tempo_query_frontend_queries_within_slo_total, with completed or canceled values to differentiate between completed and canceled requests. See the PromQL sketch after this list.
  • [CHANGE] Update default config values to better align with production workloads #4340 (@electron0zero)
  • [CHANGE] tempo-cli: add support for /api/v2/traces endpoint #4127 (@electron0zero)
    BREAKING CHANGE tempo-cli now uses the /api/v2/traces endpoint by default;
    use the --v1 flag to query the /api/traces endpoint, which was the default in previous versions.
  • [CHANGE] TraceByID: don't allow concurrent_shards greater than query_shards. #4074 (@electron0zero)
  • [CHANGE] BREAKING CHANGE The dynamic injection of the X-Scope-OrgID header for metrics generator remote writes has changed. If the header is already set in per-tenant overrides or in the global Tempo configuration, it is honored and not overwritten. #4021 (@mdisibio)
  • [CHANGE] BREAKING CHANGE Migrate from OpenTracing to OpenTelemetry instrumentation. Removed the use_otel_tracer configuration option. Use the OpenTelemetry environment variables to configure the span exporter. #4028, #3646 (@andreasgerstmayr)
    To continue using the Jaeger exporter, set the following environment variable: OTEL_TRACES_EXPORTER=jaeger. See the environment-variable sketch after this list.
  • [CHANGE] No longer send the final diff in GRPC streaming. Instead we rely on the streamed intermediate results. #4062 (@joe-elliott)
  • [CHANGE] Update Go to 1.23.3 #4146 #4147 #4380 (@javiermolinar @mdisibio)
  • [CHANGE] Return 422 for TRACE_TOO_LARGE queries #4160 (@zalegrala)
  • [CHANGE] Tighten file permissions #4251 (@zalegrala)
  • [CHANGE] Drop the max live traces log message and rate-limit the trace too large message. #4418 (@joe-elliott)
  • [CHANGE] Update the OpenTelemetry dependencies to v0.116.0 #4466 (@yvrhdn)
    BREAKING CHANGE After this update, the OpenTelemetry Collector receiver will bind to localhost instead of all interfaces (0.0.0.0).
    Because of this, Tempo installations running inside Docker have to update the address they listen on.
    For more details on this change, see #4465
    For more information about the security risk this change addresses, see https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/security-best-practices.md#safeguards-against-denial-of-service-attacks
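
A hypothetical PromQL sketch that uses the new result label to compute the SLO ratio over completed requests only (the rate window is arbitrary):

    sum(rate(tempo_query_frontend_queries_within_slo_total{result="completed"}[5m]))
      /
    sum(rate(tempo_query_frontend_queries_total{result="completed"}[5m]))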
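
An illustrative environment setup for the OpenTracing-to-OpenTelemetry migration, using the standard OpenTelemetry SDK variables (the OTLP endpoint is a placeholder):

    # keep exporting spans to Jaeger, as noted above
    OTEL_TRACES_EXPORTER=jaeger

    # or export OTLP to a collector instead
    OTEL_TRACES_EXPORTER=otlp
    OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4317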

Features

  • [FEATURE] tempo-cli: support dropping multiple traces in a single operation #4266 (@ndk)
  • [FEATURE] Discarded span logging via log_discarded_spans (see the configuration sketch after this list) #3957 (@dastrobu)
  • [FEATURE] TraceQL support for instrumentation scope #3967 (@ie-pham)
  • [FEATURE] Export cost attribution usage metrics from distributor #4162 (@mdisibio)
  • [FEATURE] TraceQL metrics: avg_over_time (see the query example after this list) #4073 (@javiermolinar)
  • [FEATURE] Limit tags and tag values search #4320 (@javiermolinar)
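
A minimal sketch of enabling discarded-span logging, assuming the option sits on the distributor next to the existing received-span logging settings (field placement is an assumption; consult the configuration reference):

    distributor:
        log_discarded_spans:
            enabled: true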
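
An illustrative TraceQL metrics query for the new function; the service name is a placeholder, and min_over_time and max_over_time (listed under Enhancements) follow the same shape:

    { resource.service.name = "my-service" } | avg_over_time(duration)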

Enhancements

  • [ENHANCEMENT] TraceQL: Add range condition for byte predicates #4198 (@ie-pham)
  • [ENHANCEMENT] Add throughput and SLO metrics in the tags and tag values endpoints #4148 (@electron0zero)
  • [ENHANCEMENT] BREAKING CHANGE Add maximum spans per span set. Users can set max_spans_per_span_set to 0 to obtain the old behavior. #4275 (@carles-grafana)
  • [ENHANCEMENT] Add query-frontend limit for max length of query expression #4397 (@electron0zero)
  • [ENHANCEMENT] distributor: return trace id length when it is invalid #4407 (@carles-grafana)
  • [ENHANCEMENT] Changed the log level of the TempoDB Find operation (lookup by trace ID) from INFO to DEBUG to reduce excessive/unwanted logs in log search. #4179 (@Aki0x137)
  • [ENHANCEMENT] Pushdown collection of results from generators in the querier #4119 (@electron0zero)
  • [ENHANCEMENT] The span multiplier now also sources its value from the resource attributes. #4210 (@JunhoeKim)
  • [ENHANCEMENT] TraceQL: Attribute iterators collect matched array values #3867 (@electron0zero, @stoewer)
  • [ENHANCEMENT] Allow returning partial traces that exceed the MaxBytes limit for V2 #3941 (@javiermolinar)
  • [ENHANCEMENT] Added new middleware to validate request query values #3993 (@javiermolinar)
  • [ENHANCEMENT] Prevent massive allocations in the frontend if there is not sufficient pressure from the query pipeline. #3996 (@joe-elliott)
    BREAKING CHANGE Removed querier_forget_delay setting from the frontend. This configuration option did nothing.
  • [ENHANCEMENT] Update metrics-generator config in Tempo distributed docker compose example to serve TraceQL metrics #4003 (@javiermolinar)
  • [ENHANCEMENT] Reduce allocs related to marshalling dedicated columns repeatedly in the query frontend. #4007 (@joe-elliott)
  • [ENHANCEMENT] Improve performance of TraceQL queries #4114 (@mdisibio)
  • [ENHANCEMENT] Improve performance of TraceQL queries #4163 (@mdisibio)
  • [ENHANCEMENT] Improve performance of some TraceQL queries using select() operation #4438 (@mdisibio)
  • [ENHANCEMENT] Reduce memory usage of classic histograms in the span-metrics and service-graphs processors #4232 (@mdisibio)
  • [ENHANCEMENT] Implement simple Fetch by key for cache items #4032 (@javiermolinar)
  • [ENHANCEMENT] TraceQL metrics queries: add min_over_time #3975 (@javiermolinar)
  • [ENHANCEMENT] TraceQL metrics queries: add max_over_time #4065 (@javiermolinar)
  • [ENHANCEMENT] Write tenantindex as proto and json with a preference for proto #4072 (@zalegrala)
  • [ENHANCEMENT] Pool zstd encoding/decoding for tempodb/backend #4208 (@zalegrala)
  • [ENHANCEMENT] Send semver version in /api/status/buildinfo for cloud deployments #4110 (@Aki0x137)
  • [ENHANCEMENT] Add completed block validation on startup. #4256 (@joe-elliott)
  • [ENHANCEMENT] Speedup DistinctString and ScopedDistinctString collectors #4109 (@electron0zero)
  • [ENHANCEMENT] Speedup collection of results from ingesters in the querier #4100 (@electron0zero)
  • [ENHANCEMENT] Speedup DistinctValue collector and exit early for ingesters #4104 (@electron0zero)
  • [ENHANCEMENT] Add disk caching in ingester SearchTagValuesV2 for completed blocks #4069 (@electron0zero)
  • [ENHANCEMENT] Add a max flush attempts and metric to the metrics generator #4254 (@joe-elliott)
  • [ENHANCEMENT] Collection of query-frontend changes to reduce allocs. #4242 (@joe-elliott)
  • [ENHANCEMENT] Added insecure-skip-verify option in tempo-cli to skip SSL certificate validation when connecting to the S3 backend. #4259 (@faridtmammadov)
  • [ENHANCEMENT] Add invalid_utf8 to reasons spanmetrics will discard spans. #4293 (@zalegrala)
  • [ENHANCEMENT] Reduce frontend and querier allocations by dropping HTTP headers early in the pipeline. #4298 (@joe-elliott)
  • [ENHANCEMENT] Reduce ingester working set by improving prealloc behavior. #4344, #4369 (@joe-elliott)
    Added tunable prealloc env vars PREALLOC_BKT_SIZE, PREALLOC_NUM_BUCKETS, PREALLOC_MIN_BUCKET and the metric tempo_ingester_prealloc_miss_bytes_total to observe and tune prealloc behavior. See the PromQL sketch after this list.
  • [ENHANCEMENT] Use Prometheus fast regexp for TraceQL regular expression matchers. #4329 (@joe-elliott)
    BREAKING CHANGE All regular expression matchers will now be fully anchored. span.foo =~ "bar" will now be evaluated as span.foo =~ "^bar$"
  • [ENHANCEMENT] Reuse generator code to better refuse "too large" traces. #4365 (@joe-elliott)
    This causes the ingester to refuse traces more aggressively and correctly. Also added two metrics to better track live trace bytes consumed per tenant:
    tempo_metrics_generator_live_trace_bytes and tempo_ingester_live_trace_bytes.
  • [ENHANCEMENT] Reduce goroutines in all non-querier components. #4484 (@joe-elliott)
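
A hypothetical PromQL expression for watching the new miss metric while tuning the prealloc environment variables (the window is arbitrary):

    sum(rate(tempo_ingester_prealloc_miss_bytes_total[5m]))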

Bugfixes

  • [BUGFIX] Handle invalid TraceQL query filter in tag values v2 disk cache #4392 (@electron0zero)
  • [BUGFIX] Replace hedged requests roundtrips total with a counter. #4063 #4078 (@galalen)
  • [BUGFIX] Metrics generators: Correctly drop from the ring before stopping ingestion to reduce drops during a rollout. #4101 (@joe-elliott)
  • [BUGFIX] Correctly handle 400 Bad Request and 404 Not Found in gRPC streaming #4144 (@mapno)
  • [BUGFIX] Correctly handle Authorization header in gRPC streaming #4419 (@mdisibio)
  • [BUGFIX] Pushes a 0 to the classic histogram's counter when the series is new to allow Prometheus to start from a non-null value. #4140 (@mapno)
  • [BUGFIX] Fix counter samples being downsampled by backdating the initial sample to the previous minute when the series is new #4236 (@javiermolinar)
  • [BUGFIX] Fix TraceQL metrics returning incorrect data for falsy queries and unscoped attributes #4409 (@mdisibio)
  • [BUGFIX] Fix TraceQL metrics time range handling at the cutoff between recent and backend data #4257 (@mdisibio)
  • [BUGFIX] Fix several issues with exemplar values for TraceQL metrics #4366 #4404 (@mdisibio)
  • [BUGFIX] Skip computing exemplars for instant queries. #4204 (@javiermolinar)
  • [BUGFIX] Gave context to orphaned spans related to various maintenance processes. #4260 (@joe-elliott)
  • [BUGFIX] Initialize histogram buckets to 0 to avoid downsampling. #4366 (@javiermolinar)
  • [BUGFIX] Utilize S3Pass and S3User parameters in tempo-cli options, which were previously unused in the code. #4259 (@faridtmammadov)
  • [BUGFIX] Fixed an issue in the generator where the first batch was counted 2x against a traces size. #4365 (@joe-elliott)
  • [BUGFIX] Fix compaction bug in SingleBinaryMode that could lead to 2x, 3x, etc. TraceQL metrics results #4446 (@mdisibio)
  • [BUGFIX] Unstable compactors can occasionally duplicate data. Check for job ownership during compaction and cancel a job if ownership changes. #4420 (@joe-elliott)
