Tempo v3.0.0-rc.1 Release Notes
Tempo 3.0 is a major release candidate focused on the new ingest/write architecture, removal of deprecated 2.x components, migration tooling, TraceQL metrics improvements, and live-store/block-builder correctness and observability fixes.
Breaking Changes
- Remove duplicate "compaction" prefix from CompactorConfig CLI flags. Affected flags: compaction.block-retention, compaction.max-objects-per-block, compaction.max-block-bytes, compaction.compaction-window by @electron0zero in #6909
- Enable RetryInfo by default. distributor.retry_after_on_resource_exhausted now defaults to 5s (was 0) so OTLP clients receive a retry hint on ResourceExhausted errors by @electron0zero in #7088
  Set to 0 to disable cluster-wide, or set the per-tenant override ingestion.retry_info_enabled: false to disable for a single tenant.
- Centralize block and WAL config: block_builder and live_store now always use storage.trace.block settings; per-module block config fields are removed by @stoewer in #6647
- Remove OpenCensus receiver by @javiermolinar in #6523
- Remove legacy mem-ballast-size-mbs CLI flag by @orkhan-huseyn in #6403
- tempo-cli: Support relative time (now, now-1h) for start/end args and standardize on RFC3339 in all commands by @electron0zero in #6458
  The query search command no longer accepts timestamps without timezone (e.g. 2024-01-01T00:00:00); use RFC3339 (e.g. 2024-01-01T00:00:00Z) or relative time instead.
- Consolidate read configuration for recent data cutoff. query_frontend.search.query_ingesters_until is removed in favor of only query_frontend.search.query_backend_after by @mapno in #6507
- Remove deprecated querier.query_live_store config. This field must be removed from configs on upgrade by @javiermolinar in #7048
- Optimize TraceQL AST by rewriting conditions on the same attribute to their array equivalent by @stoewer in #6353
  Slightly changes the array matching semantics of != and !~ operators and introduces stricter rules for regex literals.
- Remove partition ring livestore config by @javiermolinar in #6981
- Remove ingester module by @javiermolinar in #6959
- Remove ingest.enabled config by @javiermolinar in #6873
- Disable legacy (flat, unscoped) overrides by default. Tempo will refuse to start if legacy overrides are detected. Set enable_legacy_overrides: true or -config.enable-legacy-overrides=true to opt back in temporarily. Legacy overrides will be removed in a future release by @electron0zero in #6741
- Remove remaining app ingester config by @javiermolinar in #6667
- Remove span-metrics leftovers and lazy-init generator clients by @javiermolinar in #6618
- Decommission livestore MetricsGenerator query service by @javiermolinar in #6615
- Remove metrics-generator localblocks processor and related local block storage plumbing by @javiermolinar in #6555
- Remove ingesters by @javiermolinar in #6504
- Remove ingesters and compactor alerts by @javiermolinar in #6369
- Removed v2 block encoding and compactor component by @joe-elliott in #6273
  This includes the removal of the following CLI commands which were v2 specific: list block, list index, view index, gen index, gen bloom.
- SpanMetricsSummary is removed and querier code simplified by @javiermolinar in #6496 and #6510
- Sets the "all" target to be 3.0 compatible and removes the "scalable-single-binary" target by @joe-elliott in #6283
- Clean up enterprise jsonnet by @javiermolinar in #6505
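
To illustrate the tempo-cli timestamp change listed above, the following Python sketch accepts relative times and timezone-qualified RFC3339 timestamps and rejects timestamps that carry no timezone. It is not tempo-cli's implementation: the function name parse_time_arg and the set of relative units (s, m, h, d) are assumptions made for the example.

```python
import re
from datetime import datetime, timedelta, timezone

# Hypothetical relative-time grammar: "now" or "now-<N><unit>".
# The unit set below is an assumption, not taken from tempo-cli.
_RELATIVE = re.compile(r"^now(?:-(\d+)([smhd]))?$")
_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}


def parse_time_arg(value, now=None):
    """Return a timezone-aware datetime for a start/end argument."""
    now = now or datetime.now(timezone.utc)
    match = _RELATIVE.match(value)
    if match:
        amount, unit = match.groups()
        if amount is None:
            return now
        return now - timedelta(**{_UNITS[unit]: int(amount)})
    # Older Python versions do not accept a trailing "Z" in
    # datetime.fromisoformat(), so normalize it to an explicit UTC offset.
    text = value[:-1] + "+00:00" if value.endswith("Z") else value
    parsed = datetime.fromisoformat(text)
    if parsed.tzinfo is None:
        raise ValueError(f"timestamp {value!r} has no timezone; use RFC3339")
    return parsed
```

With this sketch, parse_time_arg("2024-01-01T00:00:00Z") and parse_time_arg("now-1h") succeed, while parse_time_arg("2024-01-01T00:00:00") raises ValueError.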
Changes
- Stop publishing 32-bit ARM binary archives. Release artifacts continue to include amd64 and arm64 binaries by @javiermolinar in #7106
- Upgrade Tempo to Go 1.26.0 by @stoewer in #6443
- Allow duplicate dimensions for span metrics and service graphs. This is a valid use case if using different instrumentation libraries, with spans having "deployment.environment" and others "deployment_environment", for example by @carles-grafana in #6288
- Update default max duration for TraceQL metrics queries up to one day by @javiermolinar in #6285
- Set TraceQL query metrics checks by default in Vulture by @javiermolinar in #6275
- Make Tempo single-binary example use the local backend by @javiermolinar in #7033
- Bump ingestion limits by @javiermolinar in #7034
- TraceQL metrics - change default step intervals to align with new vParquet5 timestamp columns by @mdisibio in #6413
- Remove all traces of ingesters from the dashboards by @javiermolinar in #6352
- jsonnet: Add emptyDir data volume to block-builder StatefulSet by @mapno in #6648
- Add quick checks to tempo mixin runbook by @javiermolinar in #6696
- Deprecate metrics-generator no-local-blocks by @javiermolinar in #6707
- Own local block and partition ring helpers by @javiermolinar in #6808
- Track invalid trace and span id discards by @javiermolinar in #6799
- Deprecate query_frontend.rf1_after and query all blocks regardless of replication factor for non-metrics paths. Simplifies 2.x to 3.0 migration by @mapno in #6969
- Flush blocks to backend storage from the Live store in single binary mode by @javiermolinar in #6941
- Remove stale config from the examples by @javiermolinar in #6980
- tempo-cli: Rewrite migrate overrides-config and add migrate overrides-per-tenant command to help migrate legacy flat overrides to the new scoped format by @electron0zero in #6793
- Decouple livestore from metrics-generator by @javiermolinar in #6506 and #6535
- Expose otlp http and grpc ports for Docker examples by @javiermolinar in #6296
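
The duplicate-dimensions change above concerns attribute names that differ in the source spans but end up as the same metric label. The sketch below shows how that can happen, assuming label names are sanitized by replacing every character outside [a-zA-Z0-9_] with an underscore. This is a common Prometheus-style rule, and Tempo's actual sanitizer may differ; the helper names sanitize_label and group_by_label are hypothetical.

```python
import re
from collections import defaultdict


def sanitize_label(name):
    """Assumed rule: replace characters not valid in a label name with '_'."""
    return re.sub(r"[^a-zA-Z0-9_]", "_", name)


def group_by_label(dimensions):
    """Group configured dimension names by the label they sanitize to."""
    groups = defaultdict(list)
    for dim in dimensions:
        groups[sanitize_label(dim)].append(dim)
    return dict(groups)


dims = ["deployment.environment", "deployment_environment", "service.version"]
collisions = {k: v for k, v in group_by_label(dims).items() if len(v) > 1}
print(collisions)
# {'deployment_environment': ['deployment.environment', 'deployment_environment']}
```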
Features
- Add span profiling support via otelpyroscope. Enable with span_profiling: true (or -span-profiling CLI flag) to attach pprof labels to OTel spans by @simonswine in #7063
- Add tempo-cli migrate config command for migrating Tempo 2.x configs to 3.0 by @mapno in #6982
- jsonnet: Add KEDA-based horizontal pod autoscaling support for microservices deployment by @mapno in #6970
- Add automemlimit support for automatic GOMEMLIMIT configuration. Enable with memory.automemlimit_enabled: true by @oleg-kozlyuk-grafana in #6313
- Support comparison operators in TraceQL Metrics queries by @ruslan-mikhailov in #6474
- metrics-generator: Add span filtering to service graphs through filter_policies by @javiermolinar in #6453
- Add new include_any filter policy for spanmetrics filter by @javiermolinar in #6392
- Add span_multiplier_key to overrides. This allows tenants to specify the attribute key used for span multiplier values to compensate for head-based sampling by @carles-grafana in #6260
- metrics-generator: Add per-label limiter to control cardinality by @electron0zero in #6414
  Adds max_cardinality_per_label per-tenant override and new metrics to estimate per-label cardinality demand.
- Add an extension mechanism for per-tenant overrides by @stoewer in #6758
- Extend TraceRedactor interface to support hiding complete traces via ErrTraceHidden by @stoewer in #6811
- Single-binary mode: push distributor local ingest directly to live-store and metrics-generator without Kafka by @javiermolinar in #6729
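
The span_multiplier_key override above lets a tenant name the attribute that carries a span multiplier. The sketch below shows the general idea of weighting span counts by such an attribute to compensate for head-based sampling. The attribute key "sampling.multiplier", the representation of a span as a plain dict of attributes, and the fallback to 1 for missing or invalid values are all assumptions for illustration, not Tempo's documented behavior.

```python
import math


def span_weight(attributes, multiplier_key):
    """Return how many original spans this sampled span stands for."""
    raw = attributes.get(multiplier_key)
    try:
        weight = float(raw)
    except (TypeError, ValueError):
        return 1.0  # assumed fallback: count the span once
    if not math.isfinite(weight) or weight <= 0:
        return 1.0
    return weight


def estimated_span_count(spans, multiplier_key):
    """Sum the weights of all sampled spans."""
    return sum(span_weight(attrs, multiplier_key) for attrs in spans)


sampled = [
    {"sampling.multiplier": 10},   # kept 1 in 10
    {"sampling.multiplier": "4"},  # kept 1 in 4, value sent as a string
    {},                            # no multiplier attribute
]
print(estimated_span_count(sampled, "sampling.multiplier"))  # 15.0
```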
Enhancements
- Support OR conditions for tag name and tag value autocomplete (search tags v2) by @ie-pham in #6827
- Expose MinIO retry settings via S3 config by @rwhitty in #6561
- Reduce default livestore WAL size and align query defaults: max_block_duration 1m to 30s, max_block_bytes 100MiB to 50MiB, complete_block_timeout 1h to 20m, metrics query_backend_after 30m to 15m by @zhxiaogg in #6974
- Enable native histogram emission for all promauto-registered histograms, including tempo_request_duration_seconds. Both classic and native formats are emitted simultaneously; existing scrapers are unaffected by @zalegrala in #6910
- tempo-cli: Add --header flag to query api commands for custom headers by @Nouuu in #6768
- tempo-cli: add redact command to submit trace redaction jobs to the backend scheduler by @zalegrala in #6832
- Block builder: deduplicate spans within traces during block creation and track removed duplicates via tempo_block_builder_spans_deduped_total metric by @zhxiaogg in #6539
- metrics-generator: Support extracting span multiplier from W3C tracestate OTel probability sampling threshold via enable_tracestate_span_multiplier config option by @csmarchbanks in #6684
- Add new alerts and runbooks entries by @javiermolinar in #6276
- Double the maximum number of dedicated string columns in vParquet5 and update tempo-cli to determine the optimum number for the data by @mdisibio in #6282
- TraceQL metrics - experimental faster read path for most metrics queries, accessible behind the query hint spanonly_fetch=true when unsafe_query_hints is enabled by @mdisibio in #6359
- TraceQL metrics - add new per-tenant override to opt-in or opt-out of the new experimental faster read path for most metrics queries by @mdisibio in #6849
- Vulture: extend data consistency checks to include more strings, integers, and blobs, at resource/span/event scopes, and perform deeper trace content check by @mdisibio in #6731
- Improve attribute truncating observability by @javiermolinar in #6400
- Log truncated oversized attributes by @carles-grafana in #6467
- livestore: make trace_too_large log line an insight by @carles-grafana in #6371
- Remove live-store partition owner from ring on shutdown to prevent stale owner entries by @oleg-kozlyuk-grafana in #6409
- Improved live store readiness check and added readiness_target_lag and readiness_max_wait config parameters. If readiness_target_lag is set, live store will not report /ready until Kafka lag is brought under the specified value by @oleg-kozlyuk-grafana and @ruslan-mikhailov in #6238 and #6405
- Expose a new histogram metric to track the jobs per query distribution by @javiermolinar in #6343
- Do deep validation for filter policies in user configurable overrides API by @electron0zero in #6407
- Allow span_name_sanitization to be set via user-configurable overrides API by @Logiraptor in #6411
- Add fail_on_high_lag parameter to allow live-store to fail if it is lagged by @ruslan-mikhailov and @carles-grafana in #6363, #6567 and #7066
- Add support for per-tenant left-padding of trace IDs by @mapno in #6489
- Add new metric for generator ring size: tempo_distributor_metrics_generator_tenant_ring_size by @zalegrala in #5686
- Remove explicit runtime.GC() calls in vParquet5 compactor/block creation and CLI by @oleg-kozlyuk-grafana in #6603
- Reduce allocations in extendReuseSlice growth path during WAL writes and block creation by @mapno in #6863
- Implemented anti-affinity for pods in same livestore zone by @zhxiaogg in #6757
- Livestore: skipped WAL complete op during shutdown by @zhxiaogg in #6839
- Add metric to track livestore block cut reasons by @zhxiaogg in #6922
- Enable async parquet read mode for WAL completion path by @zhxiaogg in #6967
- metrics-generator: add leave_consumer_group_on_shutdown to send LeaveGroup on shutdown for immediate partition reassignment instead of waiting for session timeout by @zalegrala in #6575
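
The enable_tracestate_span_multiplier option above derives a span multiplier from the OpenTelemetry probability sampling threshold carried in the W3C tracestate header. Under the OpenTelemetry probability sampling specification, the "ot" entry can carry a "th" value of up to 14 hex digits with trailing zeros omitted. It is a rejection threshold T out of 2^56, so the sampling probability is (2^56 - T) / 2^56 and the multiplier is its inverse. The sketch below follows that arithmetic; it is not Tempo's code, the function names are hypothetical, and returning 1 when no valid threshold is present is an assumption.

```python
import re

MAX_THRESHOLD = 1 << 56  # thresholds are 56-bit values
_HEX = re.compile(r"[0-9a-f]{1,14}")


def threshold_from_tracestate(tracestate):
    """Extract the rejection threshold from the 'ot' tracestate entry."""
    for entry in tracestate.split(","):
        key, _, value = entry.strip().partition("=")
        if key != "ot":
            continue
        for field in value.split(";"):
            name, _, digits = field.partition(":")
            if name == "th" and _HEX.fullmatch(digits):
                # Trailing zeros are omitted on the wire; restore them.
                return int(digits.ljust(14, "0"), 16)
    return None


def span_multiplier(tracestate):
    """Return 1 / sampling probability, or 1.0 if no threshold is found."""
    threshold = threshold_from_tracestate(tracestate)
    if threshold is None:
        return 1.0  # assumed default: treat the span as unsampled-adjusted
    return MAX_THRESHOLD / (MAX_THRESHOLD - threshold)


print(span_multiplier("ot=th:8"))  # 2.0 (50% sampling)
print(span_multiplier("ot=th:c"))  # 4.0 (25% sampling)
```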
Bugfixes
- Fix tempo-vulture ignoring -tempo-push-tls flag in normal operating mode by @zachfi in #6976
- livestore: check readiness before lag for SearchRecent and QueryRange queries by @zhxiaogg in #6911
- Fix integer overflow in query parameters by using strconv.ParseUint instead of strconv.Atoi/strconv.ParseInt for unsigned integer fields by @ricardbejarano in #6612
- Fix live-store SearchTagValuesV2 disk cache never being populated on complete blocks by @mapno in #6858
- Fix dedicated columns fallback in block_builder and live_store to use storage.trace.block.parquet_dedicated_columns when not set via overrides by @stoewer in #6647
- Force live-store to rehydrate from Kafka lookback period when local data is missing (e.g. PVC wipe, new node) instead of resuming from the committed consumer group offset by @oleg-kozlyuk-grafana in #6428
- fix: reload span_name_sanitization overrides during runtime by @electron0zero in #6435
- fix: live store honor the config options for block and WAL versions by @mdisibio in #6509
- fix: block builder honor the global storage block config for block and WAL versions by @Harry-kp in #6532
- fix: normalize allowlist headers when building the allowlist map by @javiermolinar in #6481
- fix: bug related to dedicated column filtering by @stoewer in #6586
- fix: compactor deduped spans metric uses wrong type (gauge instead of counter) by @bejaratommy in #6576
- metrics-generator: Fix active-series counter underflow in local series limiter when overflow series are deleted by @carles-grafana in #6568
- fix: skip per-label limiter and sanitizer for target_info and host_info metrics in metrics-generator by @electron0zero in #6660
- fix(traceql): err on division by zero by @Proximyst in #6580
- fix(traceql): stop intPow from hanging by @Proximyst in #6581
- fix(traceql): Fix incorrect search results for some queries on new blob columns by @mdisibio in #6815
- fix(vparquet5): Fix buffer-reuse bug where event attributes in dedicated columns could be persisted on additional spans and events by @mdisibio in #6914
- fix: race condition where remove_owner_on_shutdown flag was set too late (after context cancellation had already triggered the lifecycler's shutdown), causing the partition owner to remain in the ring by @oleg-kozlyuk-grafana in #6693
- Return 400 instead of 500 when query_range or query_instant requests have unparseable start/end parameters by @ruslan-mikhailov in #6694
- fix: correct block-builder fetch metrics to use counters instead of gauges by @WinterCabbage in #6578
- Log tenant on receiver push errors by @javiermolinar in #6780
- Fix race conditions in WAL block by @ruslan-mikhailov in #6773
- metrics-generator: Fix target_info being skipped when resource attributes have empty values by @carles-grafana in #6774
- metrics-generator: Drain old series on metric replacement to prevent limiter leak and permanent overflow by @carles-grafana in #6653
- live-store: fixed unsuccessful deregistering from membership/partition rings during shutdown by @zhxiaogg in #6848
- fix: respect context cancellation when reading WAL block iterator by @zhxiaogg in #6928
- Complete lifecycler shutdown on errors by @javiermolinar in #6906
- livestore: fix concurrent WAL writes from periodic and shutdown flushes by @zhxiaogg in #6972
- live-store: fix race conditions for tag values endpoint by @ruslan-mikhailov in #7000
- live-store: correct backoff duration calculation by @ruslan-mikhailov in #6999
- vulture: fix for recent traces when query_end_cutoff is enabled by @ruslan-mikhailov in #7018
- Fix live-store producing WAL blocks exceeding max_block_bytes when flushing large batches of idle traces by @ruslan-mikhailov in #6971
- live-store: skip lookback replay when partition is Inactive during scaling down by @zhxiaogg in #7101
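
The integer-overflow fix above replaces signed parsing with strconv.ParseUint for unsigned query parameters. The following is a Python analogue of that kind of validation, not Tempo's Go code: the helper name parse_uint and the 32-bit default width are illustrative. Like ParseUint, it rejects a sign prefix and any value that does not fit the target width, rather than letting it wrap around or go negative.

```python
def parse_uint(text, bits=32):
    """Parse a base-10 unsigned integer that must fit in `bits` bits."""
    # Only plain ASCII digits are allowed: no sign, spaces, or separators.
    if not (text.isascii() and text.isdigit()):
        raise ValueError(f"invalid unsigned integer: {text!r}")
    value = int(text)
    if value >= 1 << bits:
        raise ValueError(f"{text!r} does not fit in {bits} bits")
    return value


print(parse_uint("4294967295"))  # 4294967295, the largest 32-bit value
```

Here parse_uint("4294967296") and parse_uint("-1") both raise ValueError, which a request handler could translate into a 400 response.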
New Contributors
Thanks to the following first-time contributors:
- @evan361425 made their first contribution in #5968
- @mihaelmiklec made their first contribution in #6442
- @Harry-kp made their first contribution in #6532
- @bejaratommy made their first contribution in #6576
- @jasuade made their first contribution in #6610
- @antonio-mazzini made their first contribution in #6609
- @orkhan-huseyn made their first contribution in #6403
- @ricardbejarano made their first contribution in #6612
- @rwhitty made their first contribution in #6561
- @WinterCabbage made their first contribution in #6578
- @csmarchbanks made their first contribution in #6684
- @gounthar made their first contribution in #6756
- @Nouuu made their first contribution in #6768
- @EoinTrial made their first contribution in #6905
- @sethmccombs made their first contribution in #7108
Full Changelog: v2.10.0-rc.0...v3.0.0-rc.1