github cortexproject/cortex v1.5.0
Cortex 1.5.0

Changelog

Cortex

  • [CHANGE] Blocks storage: update the default HTTP configuration values for the S3 client to the upstream Thanos default values. #3244
    • -blocks-storage.s3.http.idle-conn-timeout is set to 90 seconds.
    • -blocks-storage.s3.http.response-header-timeout is set to 2 minutes.
  • [CHANGE] Improved shuffle sharding support in the write path. This work introduced some config changes: #3090
    • Introduced -distributor.sharding-strategy CLI flag (and its respective sharding_strategy YAML config option) to explicitly specify which sharding strategy should be used in the write path
    • -experimental.distributor.user-subring-size flag renamed to -distributor.ingestion-tenant-shard-size
    • user_subring_size limit YAML config option renamed to ingestion_tenant_shard_size
  • [CHANGE] Lowered the log level of the "blank Alertmanager configuration; using fallback" message from Info to Debug. #3205
  • [CHANGE] Zone-awareness replication for time-series now should be explicitly enabled in the distributor via the -distributor.zone-awareness-enabled CLI flag (or its respective YAML config option). Before, zone-aware replication was implicitly enabled if a zone was set on ingesters. #3200
  • [CHANGE] Removed the deprecated CLI flag -config-yaml. You should use -schema-config-file instead. #3225
  • [CHANGE] Enforced the HTTP method required by some API endpoints, which previously (incorrectly) allowed any method. #3228
    • GET /
    • GET /config
    • GET /debug/fgprof
    • GET /distributor/all_user_stats
    • GET /distributor/ha_tracker
    • GET /all_user_stats
    • GET /ha-tracker
    • GET /api/v1/user_stats
    • GET /api/v1/chunks
    • GET <legacy-http-prefix>/user_stats
    • GET <legacy-http-prefix>/chunks
    • GET /services
    • GET /multitenant_alertmanager/status
    • GET /status (alertmanager microservice)
    • GET|POST /ingester/ring
    • GET|POST /ring
    • GET|POST /store-gateway/ring
    • GET|POST /compactor/ring
    • GET|POST /ingester/flush
    • GET|POST /ingester/shutdown
    • GET|POST /flush
    • GET|POST /shutdown
    • GET|POST /ruler/ring
    • POST /api/v1/push
    • POST <legacy-http-prefix>/push
    • POST /push
    • POST /ingester/push
  • [CHANGE] Renamed the CLI flags that configure the network interface names from which the instance IP is automatically detected. #3295
    • -compactor.ring.instance-interface renamed to -compactor.ring.instance-interface-names
    • -store-gateway.sharding-ring.instance-interface renamed to -store-gateway.sharding-ring.instance-interface-names
    • -distributor.ring.instance-interface renamed to -distributor.ring.instance-interface-names
    • -ruler.ring.instance-interface renamed to -ruler.ring.instance-interface-names
  • [CHANGE] Renamed -<prefix>.redis.enable-tls CLI flag to -<prefix>.redis.tls-enabled, and its respective YAML config option from enable_tls to tls_enabled. #3298
  • [CHANGE] Increased default -<prefix>.redis.timeout from 100ms to 500ms. #3301
  • [CHANGE] cortex_alertmanager_config_invalid has been removed in favor of cortex_alertmanager_config_last_reload_successful. #3289
  • [CHANGE] Query-frontend: POST requests whose body size exceeds 10MiB will be rejected. The max body size can be customised via -frontend.max-body-size. #3276
  • [FEATURE] Shuffle sharding: added support for shuffle-sharding queriers in the query-frontend. When configured (-frontend.max-queriers-per-tenant globally, or using the per-tenant limit max_queriers_per_tenant), each tenant's requests will be handled by a different set of queriers. #3113 #3257
  • [FEATURE] Shuffle sharding: added support for shuffle-sharding ingesters on the read path. When ingesters shuffle-sharding is enabled and -querier.shuffle-sharding-ingesters-lookback-period is set, queriers will fetch in-memory series from the minimum set of required ingesters, selecting only ingesters which may have received series since 'now - lookback period'. #3252
  • [FEATURE] Query-frontend: added compression config to support results cache with compression. #3217
  • [FEATURE] Add OpenStack Swift support to blocks storage. #3303
  • [FEATURE] Added support for applying Prometheus relabel configs on series received by the distributor. A metric_relabel_configs field has been added to the per-tenant limits configuration. #3329
  • [FEATURE] Support for Cassandra client SSL certificates. #3384
  • [ENHANCEMENT] Ruler: Introduces two new limits -ruler.max-rules-per-rule-group and -ruler.max-rule-groups-per-tenant to control the number of rules per rule group and the total number of rule groups for a given user. They are disabled by default. #3366
  • [ENHANCEMENT] Allow specifying multiple comma-separated Cortex services in the -target CLI option (or its respective YAML config option). For example, -target=all,compactor can be used to start the Cortex single binary with the compactor as well. #3275
  • [ENHANCEMENT] Expose additional HTTP configs for the S3 backend client. New flags are listed below: #3244
    • -blocks-storage.s3.http.idle-conn-timeout
    • -blocks-storage.s3.http.response-header-timeout
    • -blocks-storage.s3.http.insecure-skip-verify
  • [ENHANCEMENT] Added cortex_query_frontend_connected_clients metric to show the number of workers currently connected to the frontend. #3207
  • [ENHANCEMENT] Shuffle sharding: improved shuffle sharding in the write path. Shuffle sharding now should be explicitly enabled via -distributor.sharding-strategy CLI flag (or its respective YAML config option) and guarantees stability, consistency, shuffling and balanced zone-awareness properties. #3090 #3214
  • [ENHANCEMENT] Ingester: added new metric cortex_ingester_active_series to track active series more accurately. Also added options to control whether active series tracking is enabled (-ingester.active-series-enabled, defaults to false), how often this metric is updated (-ingester.active-series-update-period), and the max idle time after which a series is considered inactive (-ingester.active-series-idle-timeout). #3153
  • [ENHANCEMENT] Store-gateway: added zone-aware replication support to blocks replication in the store-gateway. #3200
  • [ENHANCEMENT] Store-gateway: exported new metrics. #3231
    • cortex_bucket_store_cached_series_fetch_duration_seconds
    • cortex_bucket_store_cached_postings_fetch_duration_seconds
    • cortex_bucket_stores_gate_queries_max
  • [ENHANCEMENT] Added -version flag to Cortex. #3233
  • [ENHANCEMENT] Hash ring: added instance registered timestamp to the ring. #3248
  • [ENHANCEMENT] Reduce tail latency by smoothing out spikes in rate of chunk flush operations. #3191
  • [ENHANCEMENT] Use Cortex as User Agent in HTTP requests issued by the Configs DB client. #3264
  • [ENHANCEMENT] Experimental Ruler API: Fetch rule groups from object storage in parallel. #3218
  • [ENHANCEMENT] Chunks GCS object storage client uses the fields selector to limit the payload size when listing objects in the bucket. #3218 #3292
  • [ENHANCEMENT] Added shuffle sharding support to ruler. Added new metric cortex_ruler_sync_rules_total. #3235
  • [ENHANCEMENT] Return an explicit error when the store-gateway is explicitly requested without a blocks storage engine. #3287
  • [ENHANCEMENT] Ruler: only load rules that belong to the ruler. Improves rules syncing performance when ruler sharding is enabled. #3269
  • [ENHANCEMENT] Added -<prefix>.redis.tls-insecure-skip-verify flag. #3298
  • [ENHANCEMENT] Added cortex_alertmanager_config_last_reload_successful_seconds metric to show timestamp of last successful AM config reload. #3289
  • [ENHANCEMENT] Blocks storage: reduced number of bucket listing operations to list block content (applies to newly created blocks only). #3363
  • [ENHANCEMENT] Ruler: Include the tenant ID on the notifier logs. #3372
  • [ENHANCEMENT] Blocks storage Compactor: Added -compactor.enabled-tenants and -compactor.disabled-tenants to explicitly enable or disable compaction of specific tenants. #3385
  • [ENHANCEMENT] Blocks storage ingester: Creating checkpoint only once even when there are multiple Head compactions in a single Compact() call. #3373
  • [BUGFIX] Blocks storage ingester: Read repair memory-mapped chunks file which can end up being empty on abrupt shutdowns combined with faulty disks. #3373
  • [BUGFIX] Blocks storage ingester: Close TSDB resources on failed startup preventing ingester OOMing. #3373
  • [BUGFIX] No-longer-needed ingester operations for queries triggered by queriers and rulers are now canceled. #3178
  • [BUGFIX] Ruler: directories in the configured rules-path will be removed on startup and shutdown in order to ensure they don't persist between runs. #3195
  • [BUGFIX] Handle hash-collisions in the query path. #3192
  • [BUGFIX] Check for postgres rows errors. #3197
  • [BUGFIX] Ruler Experimental API: Don't allow rule groups without names or empty rule groups. #3210
  • [BUGFIX] Experimental Alertmanager API: Do not allow empty Alertmanager configurations or bad template filenames to be submitted through the configuration API. #3185
  • [BUGFIX] Reduce failures to update heartbeat when using Consul. #3259
  • [BUGFIX] When using ruler sharding, moving all user rule groups from one ruler to a different one and then back could end up with some user groups not being evaluated at all. #3235
  • [BUGFIX] Fixed shuffle sharding consistency when zone-awareness is enabled and the shard size is increased or instances in a new zone are added. #3299
  • [BUGFIX] Use a valid grpc header when logging IP addresses. #3307
  • [BUGFIX] Fixed the metric cortex_prometheus_rule_group_duration_seconds in the Ruler, which was not reporting any values. #3310
  • [BUGFIX] Fixed gRPC connections leaking in rulers when ruler sharding is enabled and the APIs are called. #3314
  • [BUGFIX] Fixed Gossip memberlist members joining when addresses are configured using DNS-based service discovery. #3360
  • [BUGFIX] Ingester: fail to start an ingester running the blocks storage, if unable to load any existing TSDB at startup. #3354
  • [BUGFIX] Blocks storage: Avoid deletion of blocks in the ingester which are not shipped to the storage yet. #3346
  • [BUGFIX] Fix common prefixes returned by List method of S3 client. #3358
  • [BUGFIX] Honor configured timeout in Azure and GCS object clients. #3285
  • [BUGFIX] Blocks storage: Avoid creating blocks larger than configured block range period on forced compaction and when TSDB is idle. #3344
  • [BUGFIX] Shuffle sharding: fixed max global series per user/metric limit when shuffle sharding and -distributor.shard-by-all-labels=true are both enabled in distributor. When using these global limits you should now set -distributor.sharding-strategy and -distributor.zone-awareness-enabled to ingesters too. #3369
  • [BUGFIX] Slow query logging: request parameters were not logged when using a downstream server. #3276
  • [BUGFIX] Fixed tenant detection in the ruler and alertmanager API when running without auth. #3343
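
Several [CHANGE] entries above rename or replace CLI flags, so existing command lines need updating before upgrading. The following is a minimal Python sketch of a helper that rewrites old flag names to their 1.5.0 equivalents. It is not part of Cortex: the function names (migrate_flag, migrate_args) are hypothetical, the mapping is taken only from the entries listed above, and it assumes flags are written with a single leading dash, either as -flag=value or as -flag followed by a separate value.

```python
# Exact renames taken from the changelog entries above (old flag -> new flag).
# -config-yaml was removed rather than renamed; the changelog names
# -schema-config-file as its replacement, so it is treated the same way here.
EXACT_RENAMES = {
    "-experimental.distributor.user-subring-size": "-distributor.ingestion-tenant-shard-size",
    "-compactor.ring.instance-interface": "-compactor.ring.instance-interface-names",
    "-store-gateway.sharding-ring.instance-interface": "-store-gateway.sharding-ring.instance-interface-names",
    "-distributor.ring.instance-interface": "-distributor.ring.instance-interface-names",
    "-ruler.ring.instance-interface": "-ruler.ring.instance-interface-names",
    "-config-yaml": "-schema-config-file",
}

# Renames that apply to any -<prefix>, matched on the end of the flag name.
SUFFIX_RENAMES = {
    ".redis.enable-tls": ".redis.tls-enabled",
}


def migrate_flag(arg):
    """Return arg with its flag name rewritten to the 1.5.0 name, if renamed."""
    # Split "-flag=value" into name and value; a bare "-flag" or a plain
    # value such as "eth0" passes through with an empty separator.
    name, sep, value = arg.partition("=")
    if name in EXACT_RENAMES:
        return EXACT_RENAMES[name] + sep + value
    for old_suffix, new_suffix in SUFFIX_RENAMES.items():
        if name.startswith("-") and name.endswith(old_suffix):
            return name[: -len(old_suffix)] + new_suffix + sep + value
    return arg


def migrate_args(args):
    """Apply migrate_flag to every element of an argument list."""
    return [migrate_flag(a) for a in args]


if __name__ == "__main__":
    old = ["-target=all", "-ruler.ring.instance-interface=eth0", "-config-yaml", "schema.yaml"]
    print(migrate_args(old))
```

The helper only renames flags. It does not cover behavioural changes such as the new Redis timeout default or the need to enable zone-awareness explicitly, which still have to be reviewed by hand.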

Blocksconvert

  • [ENHANCEMENT] Blocksconvert – Builder: download plan file locally before processing it. #3209
  • [ENHANCEMENT] Blocksconvert – Cleaner: added new tool for deleting chunks data. #3283
  • [ENHANCEMENT] Blocksconvert – Scanner: support for scanning specific date-range only. #3222
  • [ENHANCEMENT] Blocksconvert – Scanner: metrics for tracking progress. #3222
  • [ENHANCEMENT] Blocksconvert – Builder: retry block upload before giving up. #3245
  • [ENHANCEMENT] Blocksconvert – Scanner: upload plans concurrently. #3340
  • [BUGFIX] Blocksconvert: fix chunks ordering in the block. Chunks in different order than series work just fine in TSDB blocks at the moment, but it's not consistent with what Prometheus does and future Prometheus and Cortex optimizations may rely on this ordering. #3371
