Breaking Changes
- [CHANGE] BREAKING CHANGE We are no longer publishing rpm and deb packages due to an internal change to the handling of signing keys. This can be restored if we find that folks are actually using these packages. #5684 (@joe-elliott)
- [CHANGE] BREAKING CHANGE Migrated Tempo Vulture and Integration Tests from the deprecated Jaeger agent/exporter to the standard OTLP exporter. Vulture now pushes traces to the Tempo OTLP gRPC endpoint. #5058 (@iamrajiv, @javiermolinar)
- [CHANGE] BREAKING CHANGE TraceQL Metrics buckets are calculated based on data in the past. #5366 (@ruslan-mikhailov)
- [CHANGE] BREAKING CHANGE Fix incorrect TraceQL metrics results when series labels include strings and integers with the same textual representation. This also changes the TraceQL metrics responses of `/api/metrics/query_range` and `/api/metrics/query` to remove the redundant `prom_labels` field, which was the source of the error. There may be an interruption to TraceQL metrics queries during rollout while components are running the previous version. #5659 (@mdisibio)
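The label collision described in the last item can be pictured with a small sketch. This is only an illustration of the general problem, not Tempo's implementation: the function names, the attribute name, and the keying scheme are all hypothetical.

```python
# Hypothetical illustration only: these helpers are not part of Tempo.

def text_key(labels):
    """Build a series key from each value's text form only (collision-prone)."""
    return tuple((name, str(value)) for name, value in sorted(labels.items()))


def typed_key(labels):
    """Build a series key that also records each value's type."""
    return tuple(
        (name, type(value).__name__, str(value))
        for name, value in sorted(labels.items())
    )


# Two distinct series: one label value is the string "200", the other the integer 200.
string_series = {"span.http.status_code": "200"}
int_series = {"span.http.status_code": 200}

# Keyed by text alone, the two series are indistinguishable.
collides = text_key(string_series) == text_key(int_series)

# Keyed by type and text, they stay separate.
distinct = typed_key(string_series) != typed_key(int_series)

print(collides, distinct)
```

Any client that keyed series on the removed `prom_labels` field would need to switch to the remaining label data in the response.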
Changes
- [CHANGE] Return Bad Request from all frontend endpoints if the tenant can't be extracted. #5480 (@carles-grafana)
- [CHANGE] Do not count cached querier responses for SLO metrics such as inspected bytes. #5185 (@carles-grafana)
- [CHANGE] Adjust the definition of `tempo_metrics_generator_processor_service_graphs_expired_edges` to exclude edges that are counted in the service graph. #5319 (@joe-elliott)
- [CHANGE] Command `tempo-cli analyse block(s)` excludes attributes with array values. #5380 (@stoewer)
- [CHANGE] Remove .005s and add a 1.5s bucket to all request duration histograms. #5492 (@joe-elliott)
- [CHANGE] Improve tempo writes dashboard. #5500 (@javiermolinar)
- [CHANGE] Upgrade Tempo to go 1.25.0. #5548 (@javiermolinar)
- [CHANGE] Drop tracing bridges in favor of OTEL only tracing. #5594 (@zalegrala)
- [CHANGE] Enable HTTP writes in the multi-tenant example. #5297 (@carles-grafana)
- [CHANGE] Upgrade Tempo to go 1.25.1 #5685 (@electron0zero)
Features
- [FEATURE] Add MCP Server support. #5212 (@joe-elliott)
- [FEATURE] Add query hints `sample=true` and `sample=0.xx` which can speed up TraceQL metrics queries by sampling a subset of the data to provide an approximate result. #5469 (@mdisibio)
- [FEATURE] New block encoding vParquet5-preview1 with low-resolution timestamp columns for better TraceQL metrics performance. This format is in development and breaking changes are expected before final release. #5495 (@mdisibio)
- [FEATURE] New block encoding vParquet5-preview2 with dedicated attribute columns for integers. This format is in development and breaking changes are expected before final release. #5639 (@stoewer)
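As a rough sketch of how the sampling hint might be passed to a TraceQL metrics query, the snippet below builds a request URL for `/api/metrics/query_range`. Several details are assumptions not stated in these notes: the `with (...)` hint syntax, the `q`, `start`, `end`, and `step` parameter names, the example query, and the `localhost:3200` address. Check them against the Tempo documentation for your version.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def query_range_url(base_url, traceql, start, end, step):
    """Build a query_range URL; parameter names are assumed, not confirmed here."""
    params = {"q": traceql, "start": start, "end": end, "step": step}
    return f"{base_url}/api/metrics/query_range?{urlencode(params)}"


# Assumed hint syntax: hints are appended to the query in a `with (...)` clause.
query = "{ } | rate() with (sample=true)"

# Hypothetical local Tempo address and time range (Unix seconds).
url = query_range_url("http://localhost:3200", query, 1700000000, 1700003600, "60s")

# Round-trip check: the encoded query decodes back to the original string.
decoded = parse_qs(urlsplit(url).query)["q"][0]
print(url)
print(decoded == query)
```

Because sampled queries return approximate results, it may be worth comparing them against an unsampled run before relying on them for alerting.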
Enhancement
- [ENHANCEMENT] Add counter `query_frontend_bytes_inspected_total`, which shows the total number of bytes read from disk and object storage. #5310 (@carles-grafana)
- [ENHANCEMENT] Add histograms `spans_distance_in_future_seconds` / `spans_distance_in_past_seconds` that count spans with end timestamp in the future / past. While spans in the future are accepted, they are invalid and may not be found using the Search API. #4936 (@carles-grafana)
- [ENHANCEMENT] Add support for scope in cost-attribution usage tracker. #5646 (@electron0zero)
- [ENHANCEMENT] Add alert for high error rate reported by vulture. #5206 (@ruslan-mikhailov)
- [ENHANCEMENT] Support the new `db.namespace` attribute for service-graph DB nodes. #5602 (@gouthamve)
- [ENHANCEMENT] TraceQL metrics performance increase for simple queries. #5247 (@mdisibio)
- [ENHANCEMENT] TraceQL search and metrics performance increase. #5280 (@mdisibio)
- [ENHANCEMENT] TraceQL performance improvement. #5218 (@mdisibio)
- [ENHANCEMENT] TraceQL `compare()` performance improvement. #5419 (@mdisibio)
- [ENHANCEMENT] Align traceql attribute struct for better performance. #5240 (@mdisibio)
- [ENHANCEMENT] Drop invalid prometheus label names in the `spanmetrics` processor. #5122 (@KyriosGN0)
- [ENHANCEMENT] Improve logging and tracing in the write path to include tenant info. #5436 (@javiermolinar)
- [ENHANCEMENT] Added usage tracker example. #5356 (@javiermolinar)
- [ENHANCEMENT] Add Stop method. #5293 (@stephanos)
- [ENHANCEMENT] Use peer attributes to determine the name of a client service virtual node in the service graph. #5381 (@MartenM)
- [ENHANCEMENT] Put actual size for writing to backend. #5413 (@ruslan-mikhailov)
- [ENHANCEMENT] Upgrade Azurite and Fake-gcs-server to latest version. #5512 (@javiermolinar)
- [ENHANCEMENT] Make block ordering deterministic. #5411 (@rajiv-singh)
- [ENHANCEMENT] Improve exemplar selection in `quantile_over_time()`. #5278 (@zalegrala)
- [ENHANCEMENT] Measure bytes received before limits and publish it as `tempo_distributor_ingress_bytes_total`. #5601 (@mapno)
- [ENHANCEMENT] Add total size logging functionality to track trace #5625 (@sienna011022)
Bugfix
- [BUGFIX] Fix Tempo configuration options that were always overridden by the config overrides section. #5202 (@KyriosGN0)
- [BUGFIX] Correctly apply trace idle period in ingesters and add the concept of trace live period. #5346 (@joe-elliott)
- [BUGFIX] Fix invalid YAML output from `/status/runtime_config` endpoint by adding document separator. #5371 (@iamrajiv)
- [BUGFIX] Fix panic in `query_range` HTTP handling that could be triggered by cancellations or other errors. #5667 (@mdisibio)
- [BUGFIX] Do not allow very small steps. #5441 (@ruslan-mikhailov)
- [BUGFIX] Fix incorrect TraceQL string comparison of strings starting with numbers. #5658 (@mdisibio)
- [BUGFIX] Fix incorrect results in TraceQL compare() for spans with array attributes #5519 (@ruslan-mikhailov)
- [BUGFIX] Fix cache collision for incomplete query in SearchTagValuesV2 #5549 (@ruslan-mikhailov)
- [BUGFIX] Fix for structural operator with empty left-hand spanset. #5578 (@ruslan-mikhailov)
- [BUGFIX] Fix deadlock on invalid query to `api/v2/search/tags` (SearchTagsV2). #5607 (@ruslan-mikhailov)
- [BUGFIX] Fixed incorrect root span detection when spans have a child_of link but no parent. #3634 (@mexirica)
- [BUGFIX] Prevent metrics-generator WAL deletion when tenant is empty. #5586 (@sienna011022)
- [BUGFIX] Fix docker-compose port configuration for Alloy gRPC (`4319` → `4317`). #5536
- [BUGFIX] Fix panic error from empty span id. #5464
- [BUGFIX] Return Bad Request from frontend if the provided tag is invalid in `SearchTagValuesV2` endpoint. #5493 (@carles-grafana)
Tempo Rearchitecture [EXPERIMENTAL]
- [CHANGE] BREAKING CHANGE Drop unused `backend_scheduler.tenant_measurement_interval`, use `backend_scheduler.compaction.measure_interval` instead. #5328 (@zalegrala)
- [CHANGE] Allow configuration of `min`/`max` input blocks for compaction provider. #5373 (@zalegrala)
- [CHANGE] BREAKING CHANGE Require a minimum time between tenant sorting in backend-scheduler. #5410 (@zalegrala)
  The configuration for `backend_scheduler.provider.compaction.backoff` has been removed.
  Additionally, the `compaction_tenant_backoff_total` metric has been renamed to `compaction_empty_tenant_cycle_total` for clarity.
- [CHANGE] Shard backend-scheduler work files locally and modify backend work structure to accommodate sharding approach. #5412 (@zalegrala)
- [CHANGE] Change worker to shut down after the current job, waiting `30s` by default. #5460 (@zalegrala)
- [CHANGE] Assert max live traces limits in the local-blocks processor. #5170 (@mapno)
- [ENHANCEMENT] Add new header to vulture requests to query live stores. #5663 (@javiermolinar)
- [ENHANCEMENT] Add endpoint for partition downscaling. #4913 (@mapno)
- [ENHANCEMENT] Add backend scheduler and worker to the resources dashboard. #5206 (@javiermolinar)
- [ENHANCEMENT] Allow configuring the group lag exporter update time. #5431 (@javiermolinar)
- [ENHANCEMENT] Implement a `listOffset` by partition client. #5415 (@javiermolinar)
- [ENHANCEMENT] Do not compact unfinished blocks. #5390 (@ruslan-mikhailov)
- [ENHANCEMENT] Allow block-builder to operate over empty partitions. #5581 (@ruslan-mikhailov)
- [ENHANCEMENT] Refactor method to wait for a healthy broker. #5618 (@javiermolinar)
- [ENHANCEMENT] Remove duplicated metric to count the number of processed records. #5654 (@javiermolinar)
- [ENHANCEMENT] Don't enqueue records to kafka when the context has been cancelled. #5499 (@javiermolinar)
- [ENHANCEMENT] Include backendwork dashboard and include additional alert. #5159 (@zalegrala)
- [ENHANCEMENT] Add live store to `jsonnet` lib. #5591 #5606 #5609 (@mapno)
- [BUGFIX] Fix race condition between compaction provider and backend-scheduler. #5409 (@zalegrala)
- [BUGFIX] Correctly support req.AllowPartialTrace and tenant limits in the live store when requesting trace by id. #5680 (@joe-elliott)