Highlights:
- Added new Thanos component: Query Frontend, responsible for response caching, query scheduling and parallelization (based on Cortex Query Frontend).
- Added various new, improved UIs to Thanos based on React: Querier BuildInfo & Flags, Ruler UI, BlockViewer.
- Optimized Sidecar, Store, Receive, Ruler data retrieval with new TSDB ChunkIterator, capping chunks to 120 samples, fixed various leaks.
- Fixed sample limit on Store Gateway.
- Added S3 Server Side Encryption options.
- Tons of other important fixes!
Thanks to all contributors! 🤗
Fixed
- #2665 Swift: Fix issue with missing Content-Type HTTP headers.
- #2800 Query: Fix handling of `--web.external-prefix` and `--web.route-prefix`.
- #2834 Query: Fix rendered JSON state value for rules and alerts, which should be in lowercase.
- #2866 Receive, Querier: Fixed leaks on receive and querier Store API Series, which were leaking on errors.
- #2937 Receive: Fix auto-configuration of `--receive.local-endpoint`.
- #2895 Compact: Fix increment of `thanos_compact_downsample_total` metric for downsample of 5m resolution blocks.
- #2858 Store: Fix `--store.grpc.series-sample-limit` implementation. The limit is now applied to the sum of all samples fetched across all queried blocks via a single Series call, instead of applying it individually to each block.
- #2936 Compact: Fix ReplicaLabelRemover panic when replicaLabels are not specified.
- #2956 Store: Fix fetching of chunks bigger than 16000 bytes.
- #2970 Store: Upgrade minio-go/v7 to fix slowness when running on EKS.
- #2957 Rule: breaking ⚠️ Now sets all of the relevant fields properly; avoids a panic when `/api/v1/rules` is called and the time zone is not UTC; the `rules` field is now an empty array if no rules have been defined in a rule group. Thanos Rule's `/api/v1/rules` endpoint no longer returns the old, deprecated `partial_response_strategy`. The old, deprecated value has been fixed to `WARN` for quite some time. Please use `partialResponseStrategy`.
- #2976 Query: Better rounding for incoming query timestamps.
- #2929 Mixin: Fix expression for 'unhealthy sidecar' alert and increase the timeout to 10 minutes.
- #3024 Query: Consider group name and file for deduplication.
- #3012 Ruler,Receiver: Fix TSDB to delete blocks in an atomic way.
- #3046 Ruler,Receiver: Fixed framing of StoreAPI responses, which were previously sent one chunk at a time.
- #3095 Ruler: Update the manager when all rule files are removed.
- #3105 Querier: Fix overwriting `maxSourceResolution` when auto downsampling is enabled.
- #3010 Querier: Added `--query.lookback-delta` flag to override the default lookback delta in PromQL. The lookback delta should be set to at least 2 times the slowest scrape interval. If unset, the PromQL default of 5m is used.
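
To illustrate the behavior change described in #2858 above, here is a minimal Go sketch contrasting a limit applied per block with a limit applied to the running total of one Series call. It is not the Thanos implementation: `sampleLimiter`, `checkPerCall`, `checkPerBlock` and `errLimitExceeded` are hypothetical names, and treating a limit of 0 as "unlimited" is an assumption of this sketch.

```go
package main

import (
	"errors"
	"fmt"
)

// errLimitExceeded is a hypothetical error used only in this sketch.
var errLimitExceeded = errors.New("sample limit exceeded")

// sampleLimiter is a hypothetical counter of samples seen so far.
type sampleLimiter struct {
	limit, seen uint64
}

// reserve adds n samples to the running total and fails once the total
// exceeds the limit. A limit of 0 means "no limit" in this sketch.
func (l *sampleLimiter) reserve(n uint64) error {
	l.seen += n
	if l.limit > 0 && l.seen > l.limit {
		return errLimitExceeded
	}
	return nil
}

// checkPerCall shares one limiter across all blocks of a single call,
// so the limit applies to the sum of samples (behavior after the fix).
func checkPerCall(limit uint64, samplesPerBlock []uint64) error {
	l := &sampleLimiter{limit: limit}
	for _, n := range samplesPerBlock {
		if err := l.reserve(n); err != nil {
			return err
		}
	}
	return nil
}

// checkPerBlock uses a fresh limiter for each block, so the limit
// applies to every block individually (behavior before the fix).
func checkPerBlock(limit uint64, samplesPerBlock []uint64) error {
	for _, n := range samplesPerBlock {
		l := &sampleLimiter{limit: limit}
		if err := l.reserve(n); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	blocks := []uint64{400, 400, 400} // samples fetched from three blocks
	fmt.Println("per block:", checkPerBlock(1000, blocks))
	fmt.Println("per call: ", checkPerCall(1000, blocks))
}
```

With a limit of 1000, three blocks of 400 samples each pass the per-block check but fail the per-call check, which is why a query that previously succeeded may now hit the limit.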
Added
- #2305 Receive,Sidecar,Ruler: Propagate correct (stricter) MinTime for TSDBs that have no block.
- #2849 Query, Ruler: Added request logging for HTTP server side.
- #2832 ui React: Add runtime and build info page
- #2926 API: Add new blocks HTTP API to serve blocks metadata. The status endpoints (`/api/v1/status/flags`, `/api/v1/status/runtimeinfo` and `/api/v1/status/buildinfo`) are now available on all components with an HTTP API.
- #2892 Receive: Receiver fails when the initial upload fails.
- #2865 ui: Migrate Thanos Ruler UI to React
- #2964 Query: Add time range parameters to label APIs. Add `start` and `end` fields to Store API `LabelNamesRequest` and `LabelValuesRequest`.
- #2996 Sidecar: Add `reloader_config_apply_errors_total` metric. Add new flags `--reloader.watch-interval` and `--reloader.retry-interval`.
- #2973 Add Thanos Query Frontend component.
- #2980 Bucket Viewer: Migrate block viewer to React.
- #2725 Add bucket index operation durations: `thanos_bucket_store_cached_series_fetch_duration_seconds` and `thanos_bucket_store_cached_postings_fetch_duration_seconds`.
- #2931 Query: Allow passing a `storeMatch[]` to select matching stores when debugging the querier. See documentation.
Changed
- #2893 Store: Rename metric `thanos_bucket_store_cached_postings_compression_time_seconds` to `thanos_bucket_store_cached_postings_compression_time_seconds_total`.
- #2915 Receive,Ruler: Enable TSDB directory locking by default. Add a new flag (`--tsdb.no-lockfile`) to override behavior.
- #2902 Querier UI: Separate dedupe and partial response checkboxes per panel in new UI.
- #2991 Store: breaking ⚠️ `operation` label value `getrange` changed to `get_range` for `thanos_store_bucket_cache_operation_requests_total` and `thanos_store_bucket_cache_operation_hits_total` to be consistent with bucket operation metrics.
- #2876 Receive,Ruler: Updated TSDB and switched to ChunkIterators instead of sample iterators, which avoids unnecessary decoding / encoding.
- #3064 s3: breaking ⚠️ Add SSE/SSE-KMS/SSE-C configuration. The S3 `encrypt_sse: true` option is now deprecated in favour of `sse_config`. If you used `encrypt_sse`, the migration strategy is to set up the following block:

      sse_config:
        type: SSE-S3