github grafana/mimir mimir-2.6.0
2.6.0

21 months ago

This release contains 259 PRs from 40 authors, including new contributors breadly7, bubu11e, Đurica Yuri Nikolić, Felix Beuke, Jack, klagroix, Martin Chodur, Ørjan Ommundsen, Sascha Sternheim, Wu Zhiyuan. Thank you!

Grafana Mimir version 2.6.0 release notes

Grafana Labs is excited to announce version 2.6 of Grafana Mimir.

The highlights that follow include the top features, enhancements, and bugfixes in this release. For the complete list of changes, see the changelog.

Features and enhancements

  • Lower memory usage in store-gateway by streaming series results
    The store-gateway can now stream results back to the querier instead of buffering them. This is expected to greatly reduce peak memory consumption while keeping latency the same. The feature is still experimental, but Grafana Labs already runs it in production with no known issues. To enable it, set the -blocks-storage.bucket-store.batch-series-size configuration option; if you want to try it out, we recommend a value of 5000.

  • Improved stability in store-gateway by removing mmap usage
    The store-gateway can now use an alternate code path to read index-headers that does not use memory-mapped files. This is expected to improve the stability of the store-gateway. The feature is still experimental, but Grafana Labs already runs it in production with no known issues. To enable it, set -blocks-storage.bucket-store.index-header.stream-reader-enabled=true.

Alertmanager improvements

  • Webex support
    Alertmanager can now use Webex to send alerts.

  • tenantID template function
    A new template function tenantID, returning the ID of the tenant owning the alert, has been added.

  • grafanaExploreURL template function
    A new template function grafanaExploreURL, returning the URL to the Grafana explore page with range query, has been added.

Helm chart improvements

The Grafana Mimir and Grafana Enterprise Metrics Helm chart is now released independently. See the corresponding documentation for more information.

Important changes

In Grafana Mimir 2.6 we have removed the following previously deprecated or experimental configuration options:

  • The CLI flag -blocks-storage.bucket-store.max-concurrent-reject-over-limit and its respective YAML configuration option blocks_storage.bucket_store.max_concurrent_reject_over_limit.
  • The CLI flag -query-frontend.align-querier-with-step and its respective YAML configuration option frontend.align_querier_with_step.

The following configuration options are deprecated and will be removed in Grafana Mimir 2.8:

  • The CLI flag -store.max-query-length and its respective YAML configuration option limits.max_query_length have been replaced with -querier.max-partial-query-length and limits.max_partial_query_length.
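
The rename above can be applied to configuration files ahead of the 2.8 removal. What follows is a minimal sketch in Python, not an official migration tool: it assumes the configuration has already been parsed into nested dictionaries (for example with a YAML library), and the helper name migrate_max_query_length is hypothetical and not part of Mimir or mimirtool.

```python
import copy

# Deprecated YAML option and its replacement, as named in the release notes.
OLD_KEY = "max_query_length"
NEW_KEY = "max_partial_query_length"


def migrate_max_query_length(config: dict) -> dict:
    """Return a copy of `config` with limits.max_query_length renamed.

    Hypothetical helper: it only touches the `limits` block and never
    overwrites a value already set under the new name.
    """
    migrated = copy.deepcopy(config)  # leave the caller's dict untouched
    limits = migrated.get("limits")
    if not isinstance(limits, dict) or OLD_KEY not in limits:
        return migrated  # nothing to migrate
    old_value = limits.pop(OLD_KEY)
    # Prefer a value the operator already set under the new name.
    limits.setdefault(NEW_KEY, old_value)
    return migrated
```

For example, migrate_max_query_length({"limits": {"max_query_length": "30d"}}) returns {"limits": {"max_partial_query_length": "30d"}}. Keeping an existing max_partial_query_length value is a design choice of this sketch, made so that an explicit new setting is never silently replaced by the deprecated one.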

The following experimental options and features are now stable:

  • The CLI flag -query-frontend.max-total-query-length and its respective YAML configuration option limits.max_total_query_length.
  • The CLI flags -distributor.request-rate-limit and -distributor.request-burst-limit and their respective YAML configuration options limits.request_rate_limit and limits.request_rate_burst.
  • The CLI flag -ingester.max-global-exemplars-per-user and its respective YAML configuration option limits.max_global_exemplars_per_user.
  • The CLI flag -ingester.tsdb-config-update-period and its respective YAML configuration option ingester.tsdb_config_update_period.
  • The API endpoint /api/v1/query_exemplars.

Bug fixes

  • Alertmanager: Fix template spurious deletion with relative data dir. PR 3604
  • Security: Update prometheus/exporter-toolkit for CVE-2022-46146. PR 3675
  • Security: Update golang.org/x/net for CVE-2022-41717. PR 3755
  • Debian package: Fix post-install, environment file path and user creation. PR 3720
  • Memberlist: Fix panic during Mimir startup when Mimir receives gossip message before it's ready. PR 3746
  • Update github.com/thanos-io/objstore to address issue with Multipart PUT on s3-compatible Object Storage. PR 3802 PR 3821
  • Querier: Canceled requests are no longer reported as "consistency check" failures. PR 3837 PR 3927
  • Distributor: Don't panic when metric_relabel_configs in overrides contains null element. PR 3868
  • Ingester, Compactor: Fix panic that can occur when compaction fails. PR 3955

Changelog

2.6.0

Grafana Mimir

  • [CHANGE] Querier: Introduce -querier.max-partial-query-length to limit the time range for partial queries at the querier level and deprecate -store.max-query-length. #3825 #4017
  • [CHANGE] Store-gateway: Remove experimental -blocks-storage.bucket-store.max-concurrent-reject-over-limit flag. #3706
  • [CHANGE] Ingester: If shipping is enabled, block retention is now relative to the time of upload to cloud storage. If shipping is disabled, block retention is relative to the creation time of the block instead of the mintime of the last block created. #3816
  • [CHANGE] Query-frontend: Deprecated CLI flag -query-frontend.align-querier-with-step has been removed. #3982
  • [FEATURE] Store-gateway: streaming of series. The store-gateway can now stream results back to the querier instead of buffering them. This is expected to greatly reduce peak memory consumption while keeping latency the same. You can enable this feature by setting -blocks-storage.bucket-store.batch-series-size to a value in the high thousands (5000-10000). This is still an experimental feature and is subject to a changing API and instability. #3540 #3546 #3587 #3606 #3611 #3620 #3645 #3355 #3697 #3666 #3687 #3728 #3739 #3751 #3779 #3839
  • [FEATURE] Alertmanager: Added support for the Webex receiver. #3758
  • [FEATURE] Limits: Added the -validation.separate-metrics-group-label flag. This allows further separation of the cortex_discarded_samples_total metric by an additional group label - which is configured by this flag to be the value of a specific label on an incoming timeseries. Active groups are tracked and inactive groups are cleaned up on a defined interval. The maximum number of groups tracked is controlled by the -max-separate-metrics-groups-per-user flag. #3439
  • [FEATURE] Overrides-exporter: Added experimental ring support to overrides-exporter via -overrides-exporter.ring.enabled. When enabled, the ring is used to establish a leader replica for the export of limit override metrics. #3908 #3953
  • [FEATURE] Ephemeral storage (experimental): Mimir can now accept samples into "ephemeral storage". Such samples are available for querying for a short amount of time (-blocks-storage.ephemeral-tsdb.retention-period, defaults to 10 minutes) and are then removed from memory. To use ephemeral storage, the distributor must be configured with the -distributor.ephemeral-series-enabled option. Series matching -distributor.ephemeral-series-matchers are marked for storing into ephemeral storage in ingesters. Each tenant needs to have ephemeral storage enabled through the -ingester.max-ephemeral-series-per-user limit, which defaults to 0 (no ephemeral storage). Ingesters have a new -ingester.instance-limits.max-ephemeral-series limit for the total number of series in ephemeral storage across all tenants. If ingestion of samples into ephemeral storage fails, the cortex_discarded_samples_total metric uses values prefixed with ephemeral- for the reason label. Ephemeral storage can be queried by using {__mimir_storage__="ephemeral"} as the metric selector. The following new metrics related to ephemeral storage are introduced: #3897 #3922 #3961 #3997 #4004
    • cortex_ingester_ephemeral_series
    • cortex_ingester_ephemeral_series_created_total
    • cortex_ingester_ephemeral_series_removed_total
    • cortex_ingester_ingested_ephemeral_samples_total
    • cortex_ingester_ingested_ephemeral_samples_failures_total
    • cortex_ingester_memory_ephemeral_users
    • cortex_ingester_queries_ephemeral_total
    • cortex_ingester_queried_ephemeral_samples
    • cortex_ingester_queried_ephemeral_series
  • [ENHANCEMENT] Added new metric thanos_shipper_last_successful_upload_time: Unix timestamp (in seconds) of the last successful TSDB block uploaded to the bucket. #3627
  • [ENHANCEMENT] Ruler: Added -ruler.alertmanager-client.tls-enabled configuration for alertmanager client. #3432 #3597
  • [ENHANCEMENT] Activity tracker logs now have component=activity-tracker label. #3556
  • [ENHANCEMENT] Distributor: remove labels with empty values. #2439
  • [ENHANCEMENT] Query-frontend: track query HTTP requests in the Activity Tracker. #3561
  • [ENHANCEMENT] Store-gateway: Add experimental alternate implementation of index-header reader that does not use memory mapped files. The index-header reader is expected to improve stability of the store-gateway. You can enable this implementation with the flag -blocks-storage.bucket-store.index-header.stream-reader-enabled. #3639 #3691 #3703 #3742 #3785 #3787 #3797
  • [ENHANCEMENT] Query-scheduler: add cortex_query_scheduler_cancelled_requests_total metric to track the number of requests that are already cancelled when dequeued. #3696
  • [ENHANCEMENT] Store-gateway: add cortex_bucket_store_partitioner_extended_ranges_total metric to keep track of the ranges that the partitioner decided to overextend and merge in order to save API calls to the object storage. #3769
  • [ENHANCEMENT] Compactor: Auto-forget unhealthy compactors after ten failed ring heartbeats. #3771
  • [ENHANCEMENT] Ruler: change default value of -ruler.for-grace-period from 10m to 2m and update help text. The new default value reflects how we operate Mimir at Grafana Labs. #3817
  • [ENHANCEMENT] Ingester: Added experimental flags to force usage of postings for matchers cache. These flags will be removed in the future and it's not recommended to change them. #3823
    • -blocks-storage.tsdb.head-postings-for-matchers-cache-ttl
    • -blocks-storage.tsdb.head-postings-for-matchers-cache-size
    • -blocks-storage.tsdb.head-postings-for-matchers-cache-force
  • [ENHANCEMENT] Ingester: Improved series selection performance when some of the matchers do not match any series. #3827
  • [ENHANCEMENT] Alertmanager: Add new template function tenantID, returning the ID of the tenant owning the alert. #3758
  • [ENHANCEMENT] Alertmanager: Add new template function grafanaExploreURL, returning the URL to the Grafana explore page with range query. #3849
  • [ENHANCEMENT] Reduce overhead of debug logging when filtered out. #3875
  • [ENHANCEMENT] Update Docker base images from alpine:3.16.2 to alpine:3.17.1. #3898
  • [ENHANCEMENT] Ingester: Add new /ingester/tsdb_metrics endpoint to return tenant-specific TSDB metrics. #3923
  • [ENHANCEMENT] Query-frontend: CLI flag -query-frontend.max-total-query-length and its associated YAML configuration is now stable. #3882
  • [ENHANCEMENT] Ruler: rule groups now support optional and experimental align_evaluation_time_on_interval field, which causes all evaluations to happen on interval-aligned timestamp. #4013
  • [ENHANCEMENT] Query-scheduler: ring-based service discovery is now stable. #4028
  • [BUGFIX] Log the names of services that are not yet running, rather than "unsupported value type", when calling /ready while some services are not running. #3625
  • [BUGFIX] Alertmanager: Fix template spurious deletion with relative data dir. #3604
  • [BUGFIX] Security: update prometheus/exporter-toolkit for CVE-2022-46146. #3675
  • [BUGFIX] Security: update golang.org/x/net for CVE-2022-41717. #3755
  • [BUGFIX] Debian package: Fix post-install, environment file path and user creation. #3720
  • [BUGFIX] memberlist: Fix panic during Mimir startup when Mimir receives gossip message before it's ready. #3746
  • [BUGFIX] Store-gateway: fix cortex_bucket_store_partitioner_requested_bytes_total metric to not double count overlapping ranges. #3769
  • [BUGFIX] Update github.com/thanos-io/objstore to address issue with Multipart PUT on s3-compatible Object Storage. #3802 #3821
  • [BUGFIX] Distributor, Query-scheduler: Make sure ring metrics include a cortex_ prefix as expected by dashboards. #3809
  • [BUGFIX] Querier: canceled requests are no longer reported as "consistency check" failures. #3837 #3927
  • [BUGFIX] Distributor: don't panic when metric_relabel_configs in overrides contains null element. #3868
  • [BUGFIX] Distributor: don't panic when OTLP histograms don't have any buckets. #3853
  • [BUGFIX] Ingester, Compactor: fix panic that can occur when compaction fails. #3955
  • [BUGFIX] Store-gateway: return Canceled rather than Aborted error when the calling querier cancels the request. #4007
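
The ingester block-retention change listed above (#3816) amounts to a change of reference time. A minimal sketch in Python under stated assumptions: the Block type, its field names, and is_expired are hypothetical and do not mirror Mimir's code. Treating a block that has not yet been uploaded as never expired while shipping is enabled is also an assumption, not something the changelog states.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Block:
    """Hypothetical view of a local TSDB block held by an ingester."""
    created_at: datetime
    uploaded_at: Optional[datetime] = None  # None until shipped to object storage


def is_expired(block: Block, retention: timedelta, now: datetime,
               shipping_enabled: bool) -> bool:
    """Decide whether a local block has outlived its retention period."""
    if shipping_enabled:
        if block.uploaded_at is None:
            # Assumption: a block that was never uploaded is kept.
            return False
        reference = block.uploaded_at  # retention counts from upload time
    else:
        reference = block.created_at  # retention counts from creation time
    return now - reference >= retention
```

With a 13-hour retention, a block created 20 hours ago and uploaded 2 hours ago is kept when shipping is enabled, because only 2 hours have passed since upload, and is expired when shipping is disabled, because 20 hours have passed since creation.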

Mixin

  • [ENHANCEMENT] Alerts: Added MimirIngesterInstanceHasNoTenants alert that fires when an ingester replica is not receiving write requests for any tenant. #3681
  • [ENHANCEMENT] Alerts: Extended MimirAllocatingTooMuchMemory to check read-write deployment containers. #3710
  • [ENHANCEMENT] Alerts: Added MimirAlertmanagerInstanceHasNoTenants alert that fires when an alertmanager instance owns no tenants. #3826
  • [ENHANCEMENT] Alerts: Added MimirRulerInstanceHasNoRuleGroups alert that fires when a ruler replica is not assigned any rule group to evaluate. #3723
  • [ENHANCEMENT] Support for baremetal deployment for alerts and scaling recording rules. #3719
  • [ENHANCEMENT] Dashboards: querier autoscaling now supports multiple scaled objects (configurable via $._config.autoscale.querier.hpa_name). #3962
  • [BUGFIX] Alerts: Fixed MimirIngesterRestarts alert when Mimir is deployed in read-write mode. #3716
  • [BUGFIX] Alerts: Fixed MimirIngesterHasNotShippedBlocks and MimirIngesterHasNotShippedBlocksSinceStart alerts for when Mimir is deployed in read-write or monolithic modes and updated them to use new thanos_shipper_last_successful_upload_time metric. #3627
  • [BUGFIX] Alerts: Fixed MimirMemoryMapAreasTooHigh alert when Mimir is deployed in read-write mode. #3626
  • [BUGFIX] Alerts: Fixed MimirCompactorSkippedBlocksWithOutOfOrderChunks matching on non-existent label. #3628
  • [BUGFIX] Dashboards: Fix Rollout Progress dashboard incorrectly using Gateway metrics when Gateway was not enabled. #3709
  • [BUGFIX] Tenants dashboard: Make it compatible with all deployment types. #3754
  • [BUGFIX] Alerts: Fixed MimirCompactorHasNotUploadedBlocks to not fire if compactor has nothing to do. #3793
  • [BUGFIX] Alerts: Fixed MimirAutoscalerNotActive to not fire if scaling metric is 0, to avoid false positives on scaled objects with 0 min replicas. #3999

Jsonnet

  • [CHANGE] Replaced the deprecated policy/v1beta1 with policy/v1 when configuring a PodDisruptionBudget for read-write deployment mode. #3811
  • [CHANGE] Removed -server.http-write-timeout default option value from querier and query-frontend, as it defaults to a higher value in the code now, and cannot be lower than -querier.timeout. #3836
  • [CHANGE] Replaced -store.max-query-length with -query-frontend.max-total-query-length in the query-frontend config. #3879
  • [CHANGE] Changed default mimir_backend_data_disk_size from 100Gi to 250Gi. #3894
  • [ENHANCEMENT] Update rollout-operator to v0.2.0. #3624
  • [ENHANCEMENT] Add user_24M and user_32M classes to operations config. #3367
  • [ENHANCEMENT] Update memcached image from memcached:1.6.16-alpine to memcached:1.6.17-alpine. #3914
  • [ENHANCEMENT] Allow configuring the ring for overrides-exporter. #3995
  • [BUGFIX] Apply ingesters and store-gateways per-zone CLI flags overrides to read-write deployment mode too. #3766
  • [BUGFIX] Apply overrides-exporter CLI flags to mimir-backend when running Mimir in read-write deployment mode. #3790
  • [BUGFIX] Fixed mimir-write and mimir-read Kubernetes service to correctly balance requests among pods. #3855 #3864 #3906
  • [BUGFIX] Fixed ruler-query-frontend and mimir-read gRPC server configuration to force clients to periodically re-resolve the backend addresses. #3862
  • [BUGFIX] Fixed mimir-read CLI flags to ensure query-frontend configuration takes precedence over querier configuration. #3877

Mimirtool

  • [ENHANCEMENT] Update mimirtool config convert to work with Mimir 2.4, 2.5, 2.6 changes. #3952
  • [ENHANCEMENT] Mimirtool is now available to install through Homebrew with brew install mimirtool. #3776
  • [ENHANCEMENT] Added --concurrency to mimirtool rules sync command. #3996
  • [BUGFIX] Fix summary output from mimirtool rules sync to display correct number of groups created and updated. #3918

Documentation

  • [BUGFIX] Querier: Remove assertion that the -querier.max-concurrent flag must also be set for the query-frontend. #3678
  • [ENHANCEMENT] Update the documentation on migrating from Cortex. #3662
  • [ENHANCEMENT] Query-scheduler: documented how to migrate from DNS-based to ring-based service discovery. #4028

Tools

All changes in this release: mimir-2.5.0...mimir-2.6.0
