github cortexproject/cortex v1.16.0
Cortex 1.16.0

This release contains 227 contributions from 27 contributors. We also have 10 new contributors. Thank you all for your contributions!

Some notable changes in this release are:

  • Store Gateway multilevel index cache
  • Object storage backend for runtime config
  • Disable specific rule groups in Ruler
  • List rules supports filtering by rule name, rule group and file
  • Allow tenant shard size to be a percent of total instances for Querier and Store Gateway
  • Various improvements to metrics
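
To illustrate the shard-size item above, here is a minimal sketch of how a shard size could be resolved when it may be given either as an absolute count or as a share of all instances. It assumes that a value strictly between 0 and 1 is read as a fraction of the total and rounded up; the `dynamicShardSize` helper is hypothetical and is not taken from the Cortex code base.

```go
package main

import (
	"fmt"
	"math"
)

// dynamicShardSize is a hypothetical helper, not the Cortex implementation.
// Assumption: a value in (0, 1) is a fraction of totalInstances, rounded up;
// any other value is treated as an absolute instance count.
func dynamicShardSize(value float64, totalInstances int) int {
	if value > 0 && value < 1 {
		return int(math.Ceil(value * float64(totalInstances)))
	}
	return int(value)
}

func main() {
	// A quarter of 10 instances, rounded up to a whole instance.
	fmt.Println(dynamicShardSize(0.25, 10))
	// An absolute shard size is passed through unchanged.
	fmt.Println(dynamicShardSize(3, 10))
}
```

Rounding up rather than down keeps a small fraction from resolving to zero instances, which in many sharding schemes would mean "sharding disabled".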

Cortex

  • [CHANGE] AlertManager: include reason label in cortex_alertmanager_notifications_failed_total. #5409
  • [CHANGE] Ruler: Added user label to cortex_ruler_write_requests_total, cortex_ruler_write_requests_failed_total, cortex_ruler_queries_total, and cortex_ruler_queries_failed_total metrics. #5312
  • [CHANGE] Alertmanager: Validating new fields on the PagerDuty AM config. #5290
  • [CHANGE] Ingester: Creating label native-histogram-sample on the cortex_discarded_samples_total to keep track of discarded native histogram samples. #5289
  • [CHANGE] Store Gateway: Rename cortex_bucket_store_cached_postings_compression_time_seconds to cortex_bucket_store_cached_postings_compression_time_seconds_total. #5431
  • [CHANGE] Store Gateway: Rename cortex_bucket_store_cached_series_fetch_duration_seconds to cortex_bucket_store_series_fetch_duration_seconds and cortex_bucket_store_cached_postings_fetch_duration_seconds to cortex_bucket_store_postings_fetch_duration_seconds. Add new metric cortex_bucket_store_chunks_fetch_duration_seconds. #5448
  • [CHANGE] Store Gateway: Remove idle_timeout, max_conn_age, pool_size, min_idle_conns fields for Redis index cache and caching bucket. #5448
  • [CHANGE] Store Gateway: Add flag -store-gateway.sharding-ring.zone-stable-shuffle-sharding to enable store gateway to use zone stable shuffle sharding. #5489
  • [CHANGE] Bucket Index: Add series_max_size and chunk_max_size to bucket index. #5489
  • [CHANGE] StoreGateway: Rename cortex_bucket_store_chunk_pool_returned_bytes_total and cortex_bucket_store_chunk_pool_requested_bytes_total to cortex_bucket_store_chunk_pool_operation_bytes_total. #5552
  • [CHANGE] Query Frontend/Querier: Make build info API disabled by default and add feature flag api.build-info-enabled to enable it. #5533
  • [CHANGE] Purger: Do not use S3 tenant KMS key when uploading deletion marker. #5575
  • [CHANGE] Ingester: Shipper always allows uploading compacted blocks to ship OOO compacted blocks. #5625
  • [CHANGE] DDBKV: Change metric name from dynamodb_kv_read_capacity_total to dynamodb_kv_consumed_capacity_total and include Delete, Put, Batch dimension. #5487
  • [CHANGE] Compactor: Adding the userId on the compact dir path. #5524
  • [CHANGE] Ingester: Remove deprecated ingester metrics. #5472
  • [CHANGE] Query Frontend: Expose -querier.max-subquery-steps to configure subquery max steps check. By default, the limit is set to 0, which is disabled. #5656
  • [FEATURE] Store Gateway: Implementing multi level index cache. #5451
  • [FEATURE] Ruler: Add support for disabling rule groups. #5521
  • [FEATURE] Support object storage backends for runtime configuration file. #5292
  • [FEATURE] Ruler: Add support for Limit field on RuleGroup. #5528
  • [FEATURE] AlertManager: Add support for Webex, Discord and Telegram Receiver. #5493
  • [FEATURE] Ingester: Added -admin-limit-message to customize the message contained in limit errors. #5460
  • [FEATURE] AlertManager: Update version to v0.26.0 and bring in Microsoft Teams receiver. #5543
  • [FEATURE] Store Gateway: Support lazy expanded posting optimization. Added new flag blocks-storage.bucket-store.lazy-expanded-postings-enabled and new metrics cortex_bucket_store_lazy_expanded_postings_total, cortex_bucket_store_lazy_expanded_posting_size_bytes_total and cortex_bucket_store_lazy_expanded_posting_series_overfetched_size_bytes_total. #5556
  • [FEATURE] Store Gateway: Add max_downloaded_bytes_per_request to limit max bytes to download per store gateway request. #5179
  • [FEATURE] Added 2 flags -alertmanager.alertmanager-client.grpc-max-send-msg-size and -alertmanager.alertmanager-client.grpc-max-recv-msg-size to configure alert manager grpc client message size limits. #5338
  • [FEATURE] Querier/StoreGateway: Allow the tenant shard sizes to be a percent of total instances. #5393
  • [FEATURE] Added the flag -alertmanager.api-concurrency to configure alert manager api concurrency limit. #5412
  • [FEATURE] Store Gateway: Add -store-gateway.sharding-ring.keep-instance-in-the-ring-on-shutdown to skip unregistering instance from the ring in shutdown. #5421
  • [FEATURE] Ruler: Support for filtering rules in the API. #5417
  • [FEATURE] Compactor: Add -compactor.ring.tokens-file-path to store generated tokens locally. #5432
  • [FEATURE] Query Frontend: Add -frontend.retry-on-too-many-outstanding-requests to re-enqueue 429 requests if there are multiple query-schedulers available. #5496
  • [FEATURE] Store Gateway: Add -blocks-storage.bucket-store.max-inflight-requests for store gateways to reject further series requests upon reaching the limit. #5553
  • [FEATURE] Store Gateway: Support filtered index cache. #5587
  • [ENHANCEMENT] Update go version to 1.21.3. #5630
  • [ENHANCEMENT] Store Gateway: Add cortex_bucket_store_block_load_duration_seconds histogram to track time to load blocks. #5580
  • [ENHANCEMENT] Querier: retry chunk pool exhaustion error in querier rather than query frontend. #5569
  • [ENHANCEMENT] Alertmanager: Added flag -alertmanager.alerts-gc-interval to configure alerts Garbage collection interval. #5550
  • [ENHANCEMENT] Query Frontend: Enable vertical sharding on binary expressions. #5507
  • [ENHANCEMENT] Query Frontend: Include user agent as part of query frontend log. #5450
  • [ENHANCEMENT] Query: Set CORS Origin headers for Query API #5388
  • [ENHANCEMENT] Query Frontend: Add cortex_rejected_queries_total metric for throttled queries. #5356
  • [ENHANCEMENT] Query Frontend: Optimize the decoding of SampleStream. #5349
  • [ENHANCEMENT] Compactor: Check ctx done when uploading visit marker. #5333
  • [ENHANCEMENT] AlertManager: Add cortex_alertmanager_dispatcher_aggregation_groups and cortex_alertmanager_dispatcher_alert_processing_duration_seconds metrics for dispatcher. #5592
  • [ENHANCEMENT] Store Gateway: Added new flag blocks-storage.bucket-store.series-batch-size to control how many series to fetch per batch in Store Gateway. #5582
  • [ENHANCEMENT] Querier: Log query stats when querying store gateway. #5376
  • [ENHANCEMENT] Ruler: Add cortex_ruler_rule_group_load_duration_seconds and cortex_ruler_rule_group_sync_duration_seconds metrics. #5609
  • [ENHANCEMENT] Ruler: Add contextual info and query statistics to log #5604
  • [ENHANCEMENT] Distributor/Ingester: Add span on push path #5319
  • [ENHANCEMENT] Query Frontend: Reject subquery with too small step size. #5323
  • [ENHANCEMENT] Compactor: Exposing Thanos accept-malformed-index to Cortex compactor. #5334
  • [ENHANCEMENT] Log: Avoid expensive log.Valuer evaluation for disallowed levels. #5297
  • [ENHANCEMENT] Improving Performance on the API Gzip Handler. #5347
  • [ENHANCEMENT] Dynamodb: Add puller-sync-time to allow different pull time for ring. #5357
  • [ENHANCEMENT] Emit querier max_concurrent as a metric. #5362
  • [ENHANCEMENT] Avoid sorting tokens on lifecycler autoJoin. #5394
  • [ENHANCEMENT] Do not resync blocks in running store gateways during rollout deployment and container restart. #5363
  • [ENHANCEMENT] Store Gateway: Add new metrics cortex_bucket_store_sent_chunk_size_bytes, cortex_bucket_store_postings_size_bytes and cortex_bucket_store_empty_postings_total. #5397
  • [ENHANCEMENT] Add jitter to lifecycler heartbeat. #5404
  • [ENHANCEMENT] Store Gateway: Add config estimated_max_series_size_bytes and estimated_max_chunk_size_bytes to address data overfetch. #5401
  • [ENHANCEMENT] Distributor/Ingester: Add experimental -distributor.sign_write_requests flag to sign the write requests. #5430
  • [ENHANCEMENT] Store Gateway/Querier/Compactor: Handling CMK Access Denied errors. #5420 #5442 #5446
  • [ENHANCEMENT] Alertmanager: Add the alert name to the error log when it gets throttled. #5456
  • [ENHANCEMENT] Querier: Retry store gateway on different zones when zone awareness is enabled. #5476
  • [ENHANCEMENT] Compactor: allow unregister_on_shutdown to be configurable. #5503
  • [ENHANCEMENT] Querier: Batch adding series to query limiter to optimize locking. #5505
  • [ENHANCEMENT] Store Gateway: add metric cortex_bucket_store_chunk_refetches_total for number of chunk refetches. #5532
  • [ENHANCEMENT] BasicLifeCycler: allow final-sleep during shutdown #5517
  • [ENHANCEMENT] All: Handling CMK Access Denied errors. #5420 #5542
  • [ENHANCEMENT] Querier: Retry store gateway client connection closing gRPC error. #5558
  • [ENHANCEMENT] QueryFrontend: Add generic retry for all APIs. #5561
  • [ENHANCEMENT] Querier: Check context before notifying scheduler and frontend. #5565
  • [ENHANCEMENT] QueryFrontend: Add metric for number of series requests. #5373
  • [ENHANCEMENT] Store Gateway: Add histogram metrics for total time spent fetching series and chunks per request. #5573
  • [ENHANCEMENT] Store Gateway: Check context in multi level cache. Add cortex_store_multilevel_index_cache_fetch_duration_seconds and cortex_store_multilevel_index_cache_backfill_duration_seconds to measure fetch and backfill latency. #5596
  • [ENHANCEMENT] Ingester: Added new ingester TSDB metrics cortex_ingester_tsdb_head_samples_appended_total, cortex_ingester_tsdb_head_out_of_order_samples_appended_total, cortex_ingester_tsdb_snapshot_replay_error_total, cortex_ingester_tsdb_sample_ooo_delta and cortex_ingester_tsdb_mmap_chunks_total. #5624
  • [ENHANCEMENT] Query Frontend: Handle context error before decoding and merging responses. #5499
  • [ENHANCEMENT] Store-Gateway and AlertManager: Add a wait_instance_time_out to context to avoid waiting forever. #5581
  • [BUGFIX] Compactor: Fix possible division by zero during compactor config validation. #5535
  • [BUGFIX] Ruler: Validate if rule group can be safely converted back to rule group yaml from protobuf message #5265
  • [BUGFIX] Querier: Convert gRPC ResourceExhausted status code from store gateway to 422 limit error. #5286
  • [BUGFIX] Alertmanager: Route web-ui requests to the alertmanager distributor when sharding is enabled. #5293
  • [BUGFIX] Storage: Bucket index updater should ignore meta not found for partial blocks. #5343
  • [BUGFIX] Ring: Add JOINING state to read operation. #5346
  • [BUGFIX] Compactor: Partial block with only visit marker should be deleted even if there is no deletion marker. #5342
  • [BUGFIX] KV: Etcd calls will no longer block indefinitely and will now time out after the DialTimeout period. #5392
  • [BUGFIX] Ring: Allow RF greater than number of zones to select more than one instance per zone #5411
  • [BUGFIX] Store Gateway: Fix bug in store gateway ring comparison logic. #5426
  • [BUGFIX] Ring: Fix bug in consistency of Get func in a scaling zone-aware ring. #5429
  • [BUGFIX] Compactor: Fix retry on markers. #5441
  • [BUGFIX] Query Frontend: Fix bug of failing to cancel downstream request context in query frontend v2 mode (query scheduler enabled). #5447
  • [BUGFIX] Alertmanager: Remove the user id from state replication key metric label value. #5453
  • [BUGFIX] Compactor: Avoid cleaner concurrency issues checking global markers before all blocks. #5457
  • [BUGFIX] DDBKV: Disallow instance with older timestamp to update instance with newer timestamp. #5480
  • [BUGFIX] DDBKV: When no change detected in ring, retry the CAS until there is change. #5502
  • [BUGFIX] Fix bug on objstore when configured to use S3 fips endpoints. #5540
  • [BUGFIX] Ruler: Fix bug where a failure to load a single RuleGroup would prevent rulers from syncing all RuleGroups. #5563
  • [BUGFIX] Query Frontend: Fix query string being omitted in query stats log. #5655
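
The multi-level index cache entries above (#5451, #5596) mention fetch and backfill. As a rough illustration of that general pattern only, the sketch below queries cache tiers in order and, on a hit in a slower tier, copies the value into the faster tiers that missed. The `cacheLevel` and `multiLevelCache` types and the key format are hypothetical in-memory stand-ins, not the Cortex or Thanos types.

```go
package main

import "fmt"

// cacheLevel is a hypothetical in-memory stand-in for one cache tier.
type cacheLevel map[string][]byte

// multiLevelCache holds tiers ordered from fastest to slowest.
type multiLevelCache struct {
	levels []cacheLevel
}

// fetch returns the value from the first tier that holds key and
// backfills every faster tier that missed on the way there.
func (m *multiLevelCache) fetch(key string) ([]byte, bool) {
	for i, level := range m.levels {
		if v, ok := level[key]; ok {
			for j := 0; j < i; j++ {
				m.levels[j][key] = v // backfill the faster tiers
			}
			return v, true
		}
	}
	return nil, false
}

func main() {
	l1 := cacheLevel{}                              // e.g. an in-process tier
	l2 := cacheLevel{"postings:a": []byte("x")}     // e.g. a remote tier
	c := &multiLevelCache{levels: []cacheLevel{l1, l2}}

	v, ok := c.fetch("postings:a")
	// After the fetch, l1 also holds the entry.
	fmt.Println(string(v), ok, len(l1))
}
```

A real implementation would also need context cancellation, batching and size limits on backfill; this sketch leaves those out to keep the lookup order visible.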
