Altinity/clickhouse-operator release-0.22.0 on GitHub

Added

Support volume re-provisioning. If a volume is broken and its PVC reports it as lost, the operator re-provisions the volume
When a new CHI is created, all hosts are created in parallel
Allow turning off waiting for running queries to complete. This can be done either in the operator configuration or in the CHI itself:
In operator configuration:

spec:
  reconcile:
    host:
      wait:
        queries: "false"

In CHI:

spec:
  reconciling:
    policy: nowait

When changes are applied to clusters with a lot of shards, the change is probed on the first node only. If successful, it is applied to up to 50% of shards concurrently. This can be configured in operator configuration:

reconcile:
  # Reconcile runtime settings
  runtime:
    # Max number of concurrent CHI reconciles in progress
    reconcileCHIsThreadsNumber: 10

    # The operator reconciles shards concurrently in each CHI with the following limitations:
    #   1. Number of shards being reconciled (and thus having hosts down) in each CHI concurrently
    #      can not be greater than 'reconcileShardsThreadsNumber'.
    #   2. Percentage of shards being reconciled (and thus having hosts down) in each CHI concurrently
    #      can not be greater than 'reconcileShardsMaxConcurrencyPercent'.
    #   3. The first shard is always reconciled alone. Concurrency starts from the second shard and onward.
    # Thus limiting number of shards being reconciled (and thus having hosts down) in each CHI by both number and percentage

    # Max number of concurrent shard reconciles within one CHI in progress
    reconcileShardsThreadsNumber: 5
    # Max percentage of concurrent shard reconciles within one CHI in progress
    reconcileShardsMaxConcurrencyPercent: 50
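
The interaction of the two limits above can be illustrated with a small Python sketch. This is not the operator's code: the function names, the rounding (floor), the use of the total shard count as the percentage base, and the fixed-size batches are all assumptions made for illustration. It only shows how a thread cap and a percentage cap combine, with the first shard handled alone.

```python
def max_concurrent_shards(total_shards: int, threads: int = 5, max_percent: int = 50) -> int:
    """Illustrative upper bound on shards reconciled at the same time.

    `threads` mirrors reconcileShardsThreadsNumber and `max_percent` mirrors
    reconcileShardsMaxConcurrencyPercent. Flooring the percentage and taking
    it from the total shard count are assumptions of this sketch.
    """
    if total_shards <= 0:
        return 0
    by_percent = total_shards * max_percent // 100
    # Assumption: always allow at least one shard so the reconcile can progress.
    return max(1, min(threads, by_percent))


def reconcile_batches(total_shards: int, threads: int = 5, max_percent: int = 50) -> list[list[int]]:
    """Group shard indices into batches: the first shard alone, then limited batches.

    A real worker pool would pick up the next shard as soon as a slot frees up;
    fixed batches are a simplification used here to keep the example short.
    """
    if total_shards <= 0:
        return []
    limit = max_concurrent_shards(total_shards, threads, max_percent)
    batches = [[0]]  # the first shard is always reconciled alone
    rest = list(range(1, total_shards))
    for start in range(0, len(rest), limit):
        batches.append(rest[start:start + limit])
    return batches
```

Under these assumptions, a CHI with 4 shards and the default values shown above would be limited by the percentage (2 shards at a time), while a CHI with 20 shards would be limited by the thread count (5 shards at a time).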

Operator-related metrics are exposed to Prometheus now:

clickhouse_operator_chi_reconciles_started
clickhouse_operator_chi_reconciles_completed
clickhouse_operator_chi_reconciles_timings

clickhouse_operator_host_reconciles_started
clickhouse_operator_host_reconciles_completed
clickhouse_operator_host_reconciles_restarts
clickhouse_operator_host_reconciles_errors
clickhouse_operator_host_reconciles_timings

clickhouse_operator_pod_add_events
clickhouse_operator_pod_update_events
clickhouse_operator_pod_delete_events

Changed

fix typo in operator_installation_details.md by @seeekr in #1219
Set operator release date for createdAt CSV field by @dmvolod in #1223
Fix type for exclude and include fields in 70-chop-config.yaml example by @dmvolod in #1222
change dashboard refresh rate to 1m and add min_duration_ms, max_duration_ms dashboard variables, rename query_type to query_kind by @Slach in #1235
add securityContext to helm chart by @farodin91 in #1255
metrics-exporter collects all hosts and queries in parallel

Fixed

Fixed a bug where the operator could break multiple nodes if an incorrect configuration was deployed several times in a row
Fixed a bug where the schema could not be created on new nodes if the nodes took too long to start
Fixed a bug where services were not reconciled in rare cases

New Contributors

@seeekr made their first contribution in #1219
@dmvolod made their first contribution in #1223
@farodin91 made their first contribution in #1255

Full Changelog: release-0.21.3...release-0.22.0