v0.32.0-rc.0 is out after a long wait, as we were busy fixing a rather challenging issue!
Thank you to everyone who contributed to this release. It wouldn't be possible without you.
Some of the highlights include support for PromQL query explanations in the UI when using the thanos
PromQL engine, AZ-aware replication for Receive and other new flags, tools bucket replicate
improvements, and lots of optimizations and bug/race fixes!
Do take note of some of the breaking metric name changes.
You can find the changelog with all of the details below. Let's also celebrate all our new contributors!
Please try it out and let us know if you spot any problems! Full-release/next rc will be in 3 days!
Changes
Added
- #6437 Receive: make tenant stats limit configurable
- #6369 Receive: add az-aware replication support for Ketama algorithm
- #6185 Tracing: OTLP tracing supports configuring service_name.
- #6192 Store: add flag `bucket-web-label` to select the label to use as timeline title in web UI
- #6195 Receive: add flag `tsdb.too-far-in-future.time-window` to prevent clock-skewed samples from polluting the TSDB head and blocking all valid incoming samples.
- #6273 Mixin: Allow specifying an instance name filter in dashboards
- #6163 Receiver: Add hidden flag `--receive-forward-max-backoff` to configure the max backoff for forwarding requests.
- #5777 Receive: Allow specifying tenant-specific external labels in Router Ingestor.
- #6352 Store: Expose store gateway query stats in series response hints.
- #6420 Index Cache: Cache expanded postings.
- #6441 Compact: Compactor will set `index_stats` in `meta.json` file with max series and chunk size information.
- #6466 Mixin (Receive): add limits alerting for configuration reload and meta-monitoring.
- #6467 Mixin (Receive): add alert for tenant reaching head series limit.
- #6528 Index Cache: Add histogram metric `thanos_store_index_cache_stored_data_size_bytes` for item size.
- #6560 Thanos ruler: add flag to optionally disable adding Thanos params when querying metrics
- #6574 Tools: Add min and max compactions range flags to `bucket replicate` command.
- #6593 Store: Add `thanos_bucket_store_chunk_refetches_total` metric to track number of chunk refetches.
- #6264 Query: Add Thanos logo in navbar
- #6234 Query: Add ability to switch between `thanos` and `prometheus` engines dynamically via UI and API.
- #6346 Query: Add ability to generate SQL-like query explanations when `thanos` engine is used.
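To illustrate the idea behind the `tsdb.too-far-in-future.time-window` flag from #6195, here is a minimal sketch of a "too far in the future" check. It is not the Thanos implementation: the function name, the millisecond units, and the assumption that a zero window disables the check are all hypothetical.

```python
import time


def is_too_far_in_future(sample_ts_ms, window_ms, now_ms=None):
    """Return True if a sample timestamp lies beyond now + window.

    Hypothetical helper for illustration only; all values are in milliseconds.
    """
    if window_ms <= 0:
        # Assumption: a zero (or negative) window turns the check off.
        return False
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    return sample_ts_ms > now_ms + window_ms
```

With a check of this shape, a sample stamped far ahead by a skewed clock would be rejected up front, before it can advance the head's maximum time and cause correctly timestamped samples to be rejected.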
Fixed
- #6503 *: Change the engine behind `ContentPathReloader` to be completely independent of any filesystem concept. This effectively fixes configuration reload when used with Kubernetes ConfigMaps, Secrets, or other volume mounts.
- #6456 Store: fix crash when computing set matches from regex pattern
- #6427 Receive: increased log level for failed uploads to `error`
- #6172 query-frontend: return JSON formatted errors for invalid PromQL expression in the split by interval middleware.
- #6171 Store: fix error handling on limits.
- #6183 Receiver: fix off-by-one in multitsdb flush that resulted in empty blocks if the head only contained one sample
- #6197 Exemplar OTel: Fix exemplar for otel to use traceId instead of spanId and sample only if trace is sampled
- #6207 Receive: Remove the shipper once a tenant has been pruned.
- #6216 Receiver: removed hard-coded value of EnableExemplarStorage flag and set it according to max-exemplar value.
- #6222 mixin(Receive): Fix tenant series received dashboard widget.
- #6218 mixin(Store): handle ResourceExhausted as a non-server error. As a consequence, this error won't contribute to Store's grpc errors alerts.
- #6271 Receive: Fix segfault in `LabelValues` during head compaction.
- #6306 Tracing: OTLP tracing now uses the OTEL_TRACES_SAMPLER env variable
- #6330 Store: Fix inconsistent error for series limits.
- #6342 Cache/Redis: Upgrade `rueidis` to v1.0.2 to improve error handling while shrinking a redis cluster.
- #6325 Store: return gRPC resource exhausted error for byte limiter.
- #6399 *: Fix double-counting bug in http_request_duration metric
- #6428 Report gRPC connection errors in the logs.
- #6519 Reloader: Use timeout for initial apply.
- #6509 Store Gateway: Remove `memWriter` from `fileWriter` to reduce memory usage when syncing index headers.
- #6556 Thanos compact: respect block-files-concurrency setting when downsampling
- #6592 Query Frontend: fix bugs in vertical sharding `without` and `union` function to allow more queries to be shardable.
- #6317 *: Fix internal label deduplication bug, by resorting store response set.
- #6189 Rule: Fix panic when calling API `/api/v1/rules?type=alert`.
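As a rough illustration of how an off-by-one like the one fixed in #6183 can produce an empty block, the sketch below assumes a block covers a half-open time range `[mint, maxt)`. This is an assumption for illustration, not code taken from Thanos, and the helper name is hypothetical.

```python
def samples_in_block(timestamps, mint, maxt):
    """Select samples for a block covering the half-open range [mint, maxt)."""
    return [t for t in timestamps if mint <= t < maxt]


head = [1000]  # a head that holds a single sample
lo, hi = min(head), max(head)

# Using the head's max time as the exclusive upper bound drops the only sample.
buggy = samples_in_block(head, lo, hi)

# Extending the upper bound by one keeps it.
fixed = samples_in_block(head, lo, hi + 1)
```

When the head holds exactly one sample, its minimum and maximum times are equal, so an exclusive upper bound set to the maximum time selects nothing.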
Changed
- #6049 Compact: breaking ⚠️ Replace group with resolution in compact metrics to avoid cardinality explosion on compact metrics for large numbers of groups.
- #6168 Receiver: Make ketama hashring fail early when configured with number of nodes lower than the replication factor.
- #6201 Query-Frontend: Disable absent and absent_over_time for vertical sharding.
- #6212 Query-Frontend: Disable scalar for vertical sharding.
- #6107 Change default user id in container image from 0(root) to 1001
- #6228 Conditionally generate debug messages in ProxyStore to avoid memory bloat.
- #6231 mixins: Add code/grpc-code dimension to error widgets.
- #6244 mixin(Rule): Add rule evaluation failures to the Rule dashboard.
- #6303 Store: added and started using streamed snappy encoding for postings lists instead of the block-based one. This leads to constant memory usage during decompression and approximately halves memory usage when decompressing a postings list in index cache.
- #6071 Query Frontend: breaking ⚠️ Add experimental native histogram support, for which we updated and aligned with the Prometheus common model. This model is used for caching, so a cache reset is required.
- #6163 Receiver: changed default max backoff from 30s to 5s for forwarding requests. Can be configured with `--receive-forward-max-backoff`.
- #6327 *: breaking ⚠️ Use histograms instead of summaries for instrumented handlers.
- #6322 Logging: Avoid expensive log.Valuer evaluation for disallowed levels.
- #6358 Query: Add +Inf bucket to query duration metrics
- #6363 Store: Check context error when expanding postings.
- #6405 Index Cache: Change postings cache key to include the encoding format used so that older Thanos versions would not try to decode it during the deployment of a new version.
- #6479 Store: breaking ⚠️ Rename `thanos_bucket_store_cached_series_fetch_duration_seconds` to `thanos_bucket_store_series_fetch_duration_seconds` and `thanos_bucket_store_cached_postings_fetch_duration_seconds` to `thanos_bucket_store_postings_fetch_duration_seconds`.
- #6474 Store/Compact: Reduce a large amount of `Exists` API calls against object storage when synchronizing meta files in favour of a recursive `Iter` call.
- #6548 Objstore: Bump minio-go to v7.0.61.
- #6187 *: Unify gRPC flags for all servers.
- #6267 Query: Support unicode external label truncation.
- #6371 Query: Only keep name in UI `store_matches` param.
- #6609 *: Bump `go4.org/intern` to fix Go 1.21 builds.
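Dashboards and alert rules that reference the metrics renamed in #6479 need updating. A minimal sketch of one way to rewrite query strings is shown below. The metric names come from the entry above, while `RENAMED_METRICS`, `migrate_query`, and the sample query are hypothetical and exist only for illustration.

```python
# Old name -> new name, as listed in the #6479 changelog entry.
RENAMED_METRICS = {
    "thanos_bucket_store_cached_series_fetch_duration_seconds":
        "thanos_bucket_store_series_fetch_duration_seconds",
    "thanos_bucket_store_cached_postings_fetch_duration_seconds":
        "thanos_bucket_store_postings_fetch_duration_seconds",
}


def migrate_query(expr):
    """Replace old metric names in a query string with their new names.

    Plain substring replacement also covers suffixed series such as `_sum`,
    because each old name is a prefix of its suffixed variants.
    """
    for old, new in RENAMED_METRICS.items():
        expr = expr.replace(old, new)
    return expr


# Hypothetical query used only to demonstrate the rewrite.
old_query = "rate(thanos_bucket_store_cached_series_fetch_duration_seconds_sum[5m])"
new_query = migrate_query(old_query)
```

The rewrite is idempotent because neither new name contains an old name, so running it twice over the same rule files should be harmless.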
Removed
- #6496 *: Remove unnecessary configuration reload from `ContentPathReloader` and improve its tests.
- #6432 Receive: Remove duplicated `gopkg.in/fsnotify.v1` dependency.
- #6332 *: Remove unmaintained `gzip` dependency.
New Contributors
- @tao12345666333 made their first contribution in #6169
- @MichaHoffmann made their first contribution in #6168
- @thib-ack made their first contribution in #6189
- @jianwu made their first contribution in #6197
- @justinjung04 made their first contribution in #6212
- @timo-42 made their first contribution in #6107
- @jacobbaungard made their first contribution in #6228
- @hiteshwani29 made their first contribution in #6216
- @kaleidoscopica made their first contribution in #6233
- @samruddhikhandale made their first contribution in #6232
- @pavdmyt made their first contribution in #6243
- @hackeramitkumar made their first contribution in #6264
- @jnyi made their first contribution in #6195
- @mickeyzzc made their first contribution in #6267
- @willnewby made their first contribution in #6275
- @junotx made their first contribution in #6278
- @Etienne-M made their first contribution in #6282
- @alexqyle made their first contribution in #6287
- @naveadkazi made their first contribution in #6294
- @wallee94 made their first contribution in #6299
- @thibaultmg made their first contribution in #6330
- @shayyxi made their first contribution in #6306
- @aimuz made their first contribution in #6345
- @rgarcia89 made their first contribution in #6386
- @alxric made their first contribution in #6369
- @jpds made their first contribution in #6351
- @mhoffm-aiven made their first contribution in #6456
- @heliapb made their first contribution in #6477
- @sigmaris made their first contribution in #6481
- @dbason made their first contribution in #6483
- @M3t0r made their first contribution in #6487
- @testwill made their first contribution in #6499
- @captncraig made their first contribution in #6519
- @zenbeam made their first contribution in #6555
- @TomMD made their first contribution in #6571
- @jonjohnsonjr made their first contribution in #6609
Full Changelog: v0.31.0...v0.32.0-rc.0