Helmfile v1.6.0

This release introduces helmfile doctor — an AI-assisted diff analyzer that
reads your helmfile diff output and asks an LLM to summarize the changes and
flag risks before you apply them. We also ship parallel kubedog tracking
so resource convergence now happens alongside (not after) helm execution.

🩺 `helmfile doctor`: AI-assisted diff analysis

helmfile doctor runs helmfile diff, then sends the diff to any
OpenAI-compatible Chat Completions endpoint to produce a structured risk
report. It is designed to drop into a CI pipeline before helmfile apply so a
human reviewer (or a gate) gets a fast, opinionated second opinion on what is
about to change.

Quick start

# Configure via env (lowest precedence)...
export HELMFILE_LLM_API_KEY="sk-..."
export HELMFILE_LLM_MODEL="gpt-4o"

# ...or helmfile.yaml...
llm:
  baseURL: "https://api.openai.com/v1"
  apiKey: {{ env "OPENAI_API_KEY" }}
  model: "gpt-4o"

# ...or flags (highest precedence)
helmfile doctor --llm-model claude-3-5-sonnet

helmfile doctor

Example output:

# Helmfile Doctor Report

## Summary
Upgrades the checkout Deployment from v1.4 to v1.5 and raises the replica
count from 3 to 5. The database StatefulSet is unchanged.

## Risks

### 🔴 [HIGH] data-loss
The PVC `data-checkout-0` is marked for deletion ...
**Suggestion:** `kubectl get pvc data-checkout-0 -o yaml` before applying.

### 🟡 [MEDIUM] downtime
No PodDisruptionBudget found for the checkout Deployment ...
**Suggestion:** Add a PDB before scaling.

---
Model: gpt-4o | Duration: 8.2s | Secrets redacted: 3

How it works

Runs helmfile diff (with --context defaulting to 3 so the model gets
enough surrounding YAML to ground its analysis).
Runs the diff through a defense-in-depth secret redactor (see below).
Sends the redacted diff to the LLM with a system prompt that frames it as a
senior Kubernetes/Helm reviewer and locks the output to a known JSON schema.
Renders a markdown report (or --output json for programmatic consumption).

Risk model

The model evaluates the diff across six categories and three severity levels:

Category	What it catches
`data-loss`	PVCs, databases, stateful workloads deleted/recreated
`security`	New privileges, host networking, plaintext secrets
`breaking-change`	Renamed values, dropped labels, apiVersion downgrades
`downtime`	Missing PDBs, rolling-update storms, missing readiness gates
`performance`	Huge resource requests, removed HPA, expensive sidecars
`best-practice`	Missing namespace, hardcoded images, misaligned labels

Severity drives the exit code, making doctor a CI gate:

0 — success, or only low/medium risks, or LLM call failed (degraded mode).
2 — at least one high risk and --force was not passed.
(helm-diff's own "detected changes" exit-2 is intentionally swallowed —
changes are doctor's whole job.)
1 — other error (state load failure, helm-diff runtime failure, etc.).

Pass --force to keep the report but skip the high-risk gate.

Secret safety

Secrets are always redacted before any byte leaves the process — there is
no opt-out. This is enforced in two layers:

--show-secrets is silently ignored; the diff config is wrapped so
ShowSecrets() returns false, making helm-diff itself emit <REDACTED>.
A built-in SecretRedactor then strips any residual secret-looking content
(Secret resource data: blocks, sensitive key names like password /
apiKey / token, free-form long base64, and JWT-shaped tokens). The
redaction count is always shown in the report footer so you can spot
unexpected leaks.

JSON output (--output json) exposes only post-redaction diffs — doctor never
echoes raw pre-redaction content through stdout or JSON.

Graceful degradation

When no LLM is configured (no HELMFILE_LLM_API_KEY / model / llm: block /
--llm-* flags), doctor degrades to a plain helmfile diff with
--show-secrets forced off — byte-for-byte identical behavior, just safer.

Configuration precedence

env (HELMFILE_LLM_*)  <  helmfile.yaml (llm:)  <  CLI flags (--llm-*)

Flag	Purpose
`--llm-base-url`	OpenAI-compatible endpoint URL
`--llm-api-key`	API key (prefer `helmfile.yaml` + `{{ env }}` over the CLI)
`--llm-model`	Model id (`gpt-4o`, `claude-3-5-sonnet` via gateway, ...)
`--llm-timeout`	Per-request timeout (default 60s)
`--llm-max-tokens`	Completion cap (default 4096)
`--force`	Skip the high-risk exit-2 gate
`--output`	Report format: `text` (default) or `json`
`--diff-output`	helm-diff plugin output format (renamed from `--output`)

Most helmfile diff flags are accepted for parity. See helmfile doctor --help.

See #2660.

⚡ Parallel kubedog tracking with progress printer

With --track-mode kubedog, resource tracking now runs in parallel with
helm instead of waiting for helm to finish. Helmfile templates the release
upfront, launches the kubedog tracker in a goroutine, and streams live progress
while helm installs/upgrades.

Safety valves protect against the known upstream-kubedog races:

Cluster-convergence confirmation — when kubedog's resource graph stalls,
helmfile queries the live API to confirm convergence and cancels the tracker.
helm-killer — if the cluster confirms all resources converged but helm is
wedged on its hook waiter, helmfile deliberately interrupts the stuck helm
subprocess and treats it as success.
Hard timeout — a tracker that never returns within the release timeout is
treated as a failure.
Buffered helm output — helm's stdout is captured into a per-release buffer
and replayed as a single block so it never interleaves with kubedog progress.

See #2654.

🐛 Bug fixes

Fix OCI chart dependency resolution when the chart path contains underscores.
Paths like oci://registry/charts_my_app were being mis-split, breaking
helmfile deps. #2648
Resolve symlinked plugin directories in GetPluginVersion. Plugin
directories reached through symlinks (e.g. via XDG_DATA_DIRS) are now
followed correctly, fixing spurious "plugin not installed" errors.
#2661

📦 Dependencies

bump github.com/aws/aws-sdk-go-v2/service/s3 1.103.3 → 1.104.0
bump github.com/containerd/containerd 1.7.32 → 1.7.33
bump github.com/helmfile/vals 0.44.1 → 0.44.2
bump github.com/helmfile/chartify 0.26.5 → 0.27.0
bump helm to v4.2.2 (and v3.21.2 for the v3 track)
bump actions/checkout v6 → v7

📚 Docs

Fix a duplicated word in the hcl_funcs log description.
#2647 — thanks @s3onghyun
(first contribution!)
Small documentation indentation fixes.
#2655 — thanks @fiete2017
(first contribution!)

Full Changelog: v1.5.5...v1.6.0

helmfile/helmfile v1.6.0 on GitHub