Added
- Support volume re-provisioning. If a volume is broken and its PVC reports it as lost, the operator re-provisions the volume
- When a new CHI is created, all hosts are created in parallel
- Waiting for running queries to complete can now be turned off, either in the operator configuration or in the CHI itself:
  In operator configuration:
    spec:
      reconcile:
        host:
          wait:
            queries: "false"
  In CHI:
    spec:
      reconciling:
        policy: nowait
- When changes are applied to clusters with a lot of shards, the change is first probed on the first node only. If successful, it is then applied to the remaining shards concurrently, up to 50% of shards at a time. This can be configured in operator configuration:
    reconcile:
      # Reconcile runtime settings
      runtime:
        # Max number of concurrent CHI reconciles in progress
        reconcileCHIsThreadsNumber: 10
        # The operator reconciles shards concurrently in each CHI with the following limitations:
        # 1. Number of shards being reconciled (and thus having hosts down) in each CHI concurrently
        #    can not be greater than 'reconcileShardsThreadsNumber'.
        # 2. Percentage of shards being reconciled (and thus having hosts down) in each CHI concurrently
        #    can not be greater than 'reconcileShardsMaxConcurrencyPercent'.
        # 3. The first shard is always reconciled alone. Concurrency starts from the second shard and onward.
        # Thus limiting number of shards being reconciled (and thus having hosts down) in each CHI by both number and percentage
        # Max number of concurrent shard reconciles within one CHI in progress
        reconcileShardsThreadsNumber: 5
        # Max percentage of concurrent shard reconciles within one CHI in progress
        reconcileShardsMaxConcurrencyPercent: 50
- Operator-related metrics are exposed to Prometheus now:
    clickhouse_operator_chi_reconciles_started
    clickhouse_operator_chi_reconciles_completed
    clickhouse_operator_chi_reconciles_timings
    clickhouse_operator_host_reconciles_started
    clickhouse_operator_host_reconciles_completed
    clickhouse_operator_host_reconciles_restarts
    clickhouse_operator_host_reconciles_errors
    clickhouse_operator_host_reconciles_timings
    clickhouse_operator_pod_add_events
    clickhouse_operator_pod_update_events
    clickhouse_operator_pod_delete_events
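
To make the shard concurrency rules from the reconcile runtime settings above more concrete, here is a rough Python sketch of how the two limits might combine. It is not the operator's implementation: the function name `shard_batches` is hypothetical, the percentage is assumed to round down with a minimum of one shard, and fixed batches are a simplification of what is presumably a worker pool.

```python
def shard_batches(num_shards, threads=5, max_percent=50):
    """Group shard indexes into illustrative reconcile batches.

    threads     mirrors reconcileShardsThreadsNumber
    max_percent mirrors reconcileShardsMaxConcurrencyPercent
    """
    if num_shards <= 0:
        return []

    # Rule 3: the first shard is always reconciled alone.
    batches = [[0]]

    # Rules 1 and 2: concurrency is capped by both the absolute number
    # and the percentage of shards. Rounding down and the floor of one
    # shard are assumptions made for this sketch.
    by_percent = max(1, num_shards * max_percent // 100)
    limit = max(1, min(threads, by_percent))

    # Remaining shards are processed in groups of at most `limit`.
    rest = list(range(1, num_shards))
    for start in range(0, len(rest), limit):
        batches.append(rest[start:start + limit])
    return batches


if __name__ == "__main__":
    for batch in shard_batches(8):
        print(batch)
```

With the values shown in the configuration (5 threads, 50%), an 8-shard cluster is limited by the percentage (4 shards at a time), while a 20-shard cluster is limited by the thread count (5 shards at a time).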
Changed
- fix typo in operator_installation_details.md by @seeekr in #1219
- Set operator release date for createdAt CSV field by @dmvolod in #1223
- Fix type for exclude and include fields in 70-chop-config.yaml example by @dmvolod in #1222
- change dashboard refresh rate 1m and add min_duration_ms, max_duration_ms dashboard variables, rename query_type to query_kind by @Slach in #1235
- add securityContext to helm chart by @farodin91 in #1255
- metrics-exporter collects all hosts and queries in parallel
Fixed
- Fixed a bug where the operator could break multiple nodes if an incorrect configuration was deployed several times in a row
- Fixed a bug where the schema could not be created on new nodes if the nodes took too long to start
- Fixed a bug where services were not reconciled in rare cases
New Contributors
- @seeekr made their first contribution in #1219
- @dmvolod made their first contribution in #1223
- @farodin91 made their first contribution in #1255
Full Changelog: release-0.21.3...release-0.22.0