Deprecations
- Tempo serverless features are now deprecated and will be removed in an upcoming release. #4017 (@electron0zero)
Breaking Changes
- Added maximum spans per span set to prevent queries from overwhelming the read path. Users can set max_spans_per_span_set to 0 to obtain the old behavior. #4275 (@carles-grafana)
- The dynamic injection of the X-Scope-OrgID header for metrics generator remote-writes is changed. If the header is already set in per-tenant overrides or global Tempo configuration, then it is honored and not overwritten. #4021 (@mdisibio)
- Migrate from OpenTracing to OpenTelemetry instrumentation. Removed the use_otel_tracer configuration option. Use the OpenTelemetry environment variables to configure the span exporter. #4028, #3646 (@andreasgerstmayr)
  To continue using the Jaeger exporter, use the following environment variable: OTEL_TRACES_EXPORTER=jaeger
- Update the OpenTelemetry dependencies to v0.116.0. #4466 (@yvrhdn)
  After this update, the OpenTelemetry Collector receiver will listen on localhost instead of all interfaces (0.0.0.0).
  Because of this, Tempo installations running inside Docker have to update the address they listen on.
  For more details on this change, see #4465.
  For more information about the security risk this change addresses, see https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/security-best-practices.md#safeguards-against-denial-of-service-attacks
- Removed the querier_forget_delay setting from the frontend. This configuration option did nothing. #3996 (@joe-elliott)
- Use Prometheus fast regexp for TraceQL regular expression matchers. #4329 (@joe-elliott)
  All regular expression matchers will now be fully anchored. span.foo =~ "bar" will now be evaluated as span.foo =~ "^bar$". To keep the previous substring matching, write the wildcards explicitly, for example span.foo =~ ".*bar.*".
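For the OpenTelemetry receiver change listed above, a Docker deployment would typically set the receiver endpoints explicitly. The following is a minimal sketch, assuming the receivers are configured under the distributor block and that the default OTLP ports are in use (4317 for gRPC, 4318 for HTTP); keep only the protocols your installation actually enables.

```yaml
# Sketch only: make the OTLP receivers listen on all interfaces so that
# other containers can reach Tempo. The ports are the OTLP defaults and
# are an assumption about your setup.
distributor:
  receivers:
    otlp:
      protocols:
        grpc:
          endpoint: "0.0.0.0:4317"
        http:
          endpoint: "0.0.0.0:4318"
```

Listening on 0.0.0.0 restores the exposure that the new default removes, so this is only appropriate where the network around the container is trusted.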
Changes
- [CHANGE] Disable gRPC compression in the querier and distributor for performance reasons #4429 (@carles-grafana)
  If you would like to re-enable it, we recommend 'snappy'. Use the following settings:

      ingester_client:
        grpc_client_config:
          grpc_compression: "snappy"
      metrics_generator_client:
        grpc_client_config:
          grpc_compression: "snappy"
      querier:
        frontend_worker:
          grpc_client_config:
            grpc_compression: "snappy"

- [CHANGE] slo: include request cancellations within SLO #4355 (@electron0zero)
  Request cancellations are exposed under the result label in tempo_query_frontend_queries_total and tempo_query_frontend_queries_within_slo_total, with completed or canceled values to differentiate between completed and canceled requests.
- [CHANGE] update default config values to better align with production workloads #4340 (@electron0zero)
- [CHANGE] tempo-cli: add support for /api/v2/traces endpoint #4127 (@electron0zero)
  BREAKING CHANGE: The tempo-cli now uses the /api/v2/traces endpoint by default. Use the --v1 flag to use the /api/traces endpoint, which was the default in previous versions.
- [CHANGE] TraceByID: don't allow concurrent_shards greater than query_shards. #4074 (@electron0zero)
- [CHANGE] BREAKING CHANGE: The dynamic injection of the X-Scope-OrgID header for metrics generator remote-writes is changed. If the header is already set in per-tenant overrides or global Tempo configuration, then it is honored and not overwritten. #4021 (@mdisibio)
- [CHANGE] BREAKING CHANGE: Migrate from OpenTracing to OpenTelemetry instrumentation. Removed the use_otel_tracer configuration option. Use the OpenTelemetry environment variables to configure the span exporter. #4028, #3646 (@andreasgerstmayr)
  To continue using the Jaeger exporter, use the following environment variable: OTEL_TRACES_EXPORTER=jaeger
- [CHANGE] No longer send the final diff in gRPC streaming. Instead we rely on the streamed intermediate results. #4062 (@joe-elliott)
- [CHANGE] Update Go to 1.23.3 #4146 #4147 #4380 (@javiermolinar @mdisibio)
- [CHANGE] Return 422 for TRACE_TOO_LARGE queries #4160 (@zalegrala)
- [CHANGE] Tighten file permissions #4251 (@zalegrala)
- [CHANGE] Drop max live traces log message and rate limit trace too large. #4418 (@joe-elliott)
- [CHANGE] Update the OpenTelemetry dependencies to v0.116.0 #4466 (@yvrhdn)
  BREAKING CHANGE: After this update, the OpenTelemetry Collector receiver will listen on localhost instead of all interfaces (0.0.0.0).
  Because of this, Tempo installations running inside Docker have to update the address they listen on.
  For more details on this change, see #4465.
  For more information about the security risk this change addresses, see https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/security-best-practices.md#safeguards-against-denial-of-service-attacks
Features
- [FEATURE] tempo-cli: support dropping multiple traces in a single operation #4266 (@ndk)
- [FEATURE] Discarded span logging (log_discarded_spans) #3957 (@dastrobu)
- [FEATURE] TraceQL support for instrumentation scope #3967 (@ie-pham)
- [FEATURE] Export cost attribution usage metrics from distributor #4162 (@mdisibio)
- [FEATURE] TraceQL metrics: avg_over_time #4073 (@javiermolinar)
- [FEATURE] TraceQL metrics queries: add min_over_time #3975 (@javiermolinar)
- [FEATURE] TraceQL metrics queries: add max_over_time #4065 (@javiermolinar)
- [FEATURE] Limit tags and tag values search #4320 (@javiermolinar)
Enhancements
- [ENHANCEMENT] TraceQL: Add range condition for byte predicates #4198 (@ie-pham)
- [ENHANCEMENT] Add throughput and SLO metrics in the tags and tag values endpoints #4148 (@electron0zero)
- [ENHANCEMENT] BREAKING CHANGE: Add maximum spans per span set. Users can set max_spans_per_span_set to 0 to obtain the old behavior. #4275 (@carles-grafana)
- [ENHANCEMENT] Add query-frontend limit for max length of query expression #4397 (@electron0zero)
- [ENHANCEMENT] distributor: return trace id length when it is invalid #4407 (@carles-grafana)
- [ENHANCEMENT] Changed log level from INFO to DEBUG for the TempoDB Find operation using traceId to reduce excessive/unwanted logs in log search. #4179 (@Aki0x137)
- [ENHANCEMENT] Pushdown collection of results from generators in the querier #4119 (@electron0zero)
- [ENHANCEMENT] The span multiplier now also sources its value from the resource attributes. #4210 (@JunhoeKim)
- [ENHANCEMENT] TraceQL: Attribute iterators collect matched array values #3867 (@electron0zero, @stoewer)
- [ENHANCEMENT] Allow returning partial traces that exceed the MaxBytes limit for V2 #3941 (@javiermolinar)
- [ENHANCEMENT] Added new middleware to validate request query values #3993 (@javiermolinar)
- [ENHANCEMENT] Prevent massive allocations in the frontend if there is not sufficient pressure from the query pipeline. #3996 (@joe-elliott)
  BREAKING CHANGE: Removed the querier_forget_delay setting from the frontend. This configuration option did nothing.
- [ENHANCEMENT] Update metrics-generator config in Tempo distributed docker compose example to serve TraceQL metrics #4003 (@javiermolinar)
- [ENHANCEMENT] Reduce allocs related to marshalling dedicated columns repeatedly in the query frontend. #4007 (@joe-elliott)
- [ENHANCEMENT] Improve performance of TraceQL queries #4114 (@mdisibio)
- [ENHANCEMENT] Improve performance of TraceQL queries #4163 (@mdisibio)
- [ENHANCEMENT] Improve performance of some TraceQL queries using select() operation #4438 (@mdisibio)
- [ENHANCEMENT] Reduce memory usage of classic histograms in the span-metrics and service-graphs processors #4232 (@mdisibio)
- [ENHANCEMENT] Implement simple Fetch by key for cache items #4032 (@javiermolinar)
- [ENHANCEMENT] Write tenantindex as proto and json with a preference for proto #4072 (@zalegrala)
- [ENHANCEMENT] Pool zstd encoding/decoding for tempodb/backend #4208 (@zalegrala)
- [ENHANCEMENT] Send semver version in api/status/buildinfo for cloud deployments #4110 (@Aki0x137)
- [ENHANCEMENT] Add completed block validation on startup. #4256 (@joe-elliott)
- [ENHANCEMENT] Speedup DistinctString and ScopedDistinctString collectors #4109 (@electron0zero)
- [ENHANCEMENT] Speedup collection of results from ingesters in the querier #4100 (@electron0zero)
- [ENHANCEMENT] Speedup DistinctValue collector and exit early for ingesters #4104 (@electron0zero)
- [ENHANCEMENT] Add disk caching in ingester SearchTagValuesV2 for completed blocks #4069 (@electron0zero)
- [ENHANCEMENT] Add a max flush attempts and metric to the metrics generator #4254 (@joe-elliott)
- [ENHANCEMENT] Collection of query-frontend changes to reduce allocs. #4242 (@joe-elliott)
- [ENHANCEMENT] Added insecure-skip-verify option in tempo-cli to skip SSL certificate validation when connecting to the S3 backend. #4259 (@faridtmammadov)
- [ENHANCEMENT] Add invalid_utf8 to reasons spanmetrics will discard spans. #4293 (@zalegrala)
- [ENHANCEMENT] Reduce frontend and querier allocations by dropping HTTP headers early in the pipeline. #4298 (@joe-elliott)
- [ENHANCEMENT] Reduce ingester working set by improving prealloc behavior. #4344, #4369 (@joe-elliott)
  Added tunable prealloc env vars PREALLOC_BKT_SIZE, PREALLOC_NUM_BUCKETS, PREALLOC_MIN_BUCKET and metric tempo_ingester_prealloc_miss_bytes_total to observe and tune prealloc behavior.
- [ENHANCEMENT] Use Prometheus fast regexp for TraceQL regular expression matchers. #4329 (@joe-elliott)
  BREAKING CHANGE: All regular expression matchers will now be fully anchored. span.foo =~ "bar" will now be evaluated as span.foo =~ "^bar$".
- [ENHANCEMENT] Reuse generator code to better refuse "too large" traces. #4365 (@joe-elliott)
  This will cause the ingester to more aggressively and correctly refuse traces. Also added two metrics to better track bytes consumed per tenant in the ingester: tempo_metrics_generator_live_trace_bytes and tempo_ingester_live_trace_bytes.
- [ENHANCEMENT] Reduce goroutines in all non-querier components. #4484 (@joe-elliott)
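The max_spans_per_span_set limit listed above can be lifted in configuration. The following is a minimal sketch, assuming the option sits under query_frontend.search; that location is not stated in these notes, so check the configuration reference for your version before relying on it.

```yaml
# Sketch only: 0 disables the per-span-set cap and restores the
# pre-upgrade behavior. The query_frontend.search location is an assumption.
query_frontend:
  search:
    max_spans_per_span_set: 0
```

Disabling the cap also removes the read-path protection it was added for, so a small positive value may be a safer starting point than 0.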
Bugfixes
- [BUGFIX] Handle invalid TraceQL query filter in tag values v2 disk cache #4392 (@electron0zero)
- [BUGFIX] Replace hedged requests roundtrips total with a counter. #4063 #4078 (@galalen)
- [BUGFIX] Metrics generators: Correctly drop from the ring before stopping ingestion to reduce drops during a rollout. #4101 (@joe-elliott)
- [BUGFIX] Correctly handle 400 Bad Request and 404 Not Found in gRPC streaming #4144 (@mapno)
- [BUGFIX] Correctly handle Authorization header in gRPC streaming #4419 (@mdisibio)
- [BUGFIX] Pushes a 0 to classic histogram's counter when the series is new to allow Prometheus to start from a non-null value. #4140 (@mapno)
- [BUGFIX] Fix counter samples being downsampled by backdating the initial sample to the previous minute when the series is new #4236 (@javiermolinar)
- [BUGFIX] Fix traceql metrics returning incorrect data for falsey queries and unscoped attributes #4409 (@mdisibio)
- [BUGFIX] Fix traceql metrics time range handling at the cutoff between recent and backend data #4257 (@mdisibio)
- [BUGFIX] Fix several issues with exemplar values for traceql metrics #4366 #4404 (@mdisibio)
- [BUGFIX] Skip computing exemplars for instant queries. #4204 (@javiermolinar)
- [BUGFIX] Gave context to orphaned spans related to various maintenance processes. #4260 (@joe-elliott)
- [BUGFIX] Initialize histogram buckets to 0 to avoid downsampling. #4366 (@javiermolinar)
- [BUGFIX] Utilize S3Pass and S3User parameters in tempo-cli options, which were previously unused in the code. #4259 (@faridtmammadov)
- [BUGFIX] Fixed an issue in the generator where the first batch was counted 2x against a trace's size. #4365 (@joe-elliott)
- [BUGFIX] Fix compaction bug in SingleBinaryMode that could lead to 2x, 3x, etc. TraceQL metrics results #4446 (@mdisibio)
- [BUGFIX] Unstable compactors can occasionally duplicate data. Check for job ownership during compaction and cancel a job if ownership changes. #4420 (@joe-elliott)