github cloudposse/atmos v1.216.0-rc.1

pre-release · 4 hours ago

🚀 Enhancements

fix: respect workdir path for generate: writes and hook-triggered terraform @zack-is-cool (#2309)

Summary

Fixes a cluster of bugs in provision.workdir.enabled: true mode covering file generation, hook dispatch, store hook correctness, and repeated-apply terraform init prompts.


Bug 1 – generate: writes to base component directory instead of workdir

resolveAndProvisionComponentPath called autoGenerateComponentFiles before provisionComponentSource. Generated files (e.g. locals_override.tf) were written to components/terraform/<component>/ instead of the JIT workdir.

Fix: swap the call order so the source is provisioned first, then generate into the returned (workdir) path.
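The ordering fix can be illustrated with a minimal sketch. It is not the atmos implementation: the function signatures, the in-memory `files` map standing in for the filesystem, and the helper names `provisionSource` and `generateInto` are all assumptions made for illustration. Only the order of operations (provision, then generate into the returned path) comes from the notes above.

```go
package main

import (
	"fmt"
	"path"
)

// files maps a file path to its content; it stands in for the filesystem.
type files map[string]string

// provisionSource is a hypothetical stand-in for provisionComponentSource.
// It returns the directory terraform will run in: the JIT workdir when one
// is enabled, otherwise the base component directory.
func provisionSource(baseDir, workdir string, workdirEnabled bool) string {
	if workdirEnabled {
		return workdir
	}
	return baseDir
}

// generateInto is a hypothetical stand-in for autoGenerateComponentFiles.
// It writes every generated file into dir.
func generateInto(fs files, dir string, generated map[string]string) {
	for name, content := range generated {
		fs[path.Join(dir, name)] = content
	}
}

// resolveAndProvision provisions first, then generates into the returned
// path, so generated files can never land in the base component directory
// when a workdir is in use.
func resolveAndProvision(fs files, baseDir, workdir string, workdirEnabled bool, generated map[string]string) string {
	dir := provisionSource(baseDir, workdir, workdirEnabled)
	generateInto(fs, dir, generated)
	return dir
}

func main() {
	fs := files{}
	dir := resolveAndProvision(
		fs,
		"components/terraform/null-label",
		".workdir/terraform/demo-null-label",
		true,
		map[string]string{"locals_override.tf": "locals {}\n"},
	)
	fmt.Println("generated into:", dir)
	for p := range fs {
		fmt.Println("wrote:", p)
	}
}
```

With the buggy order, generation would have run before the workdir path was known and would have had only the base directory to write to.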


Bug 2 – hooks and output executor used base component directory

extractComponentPath always returned the base component directory because _workdir_path is a runtime key absent from freshly-described sections. Hooks calling terraform output would fail with "no such file or directory" when trying to write backend.tf.json to a path that doesn't exist.

Fix: check provision.workdir.enabled in sections and rebuild the deterministic workdir path via workdir.BuildPath.
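A rough sketch of that lookup follows. The real signature of workdir.BuildPath is not shown in these notes, so `buildWorkdirPath` below is a hypothetical stand-in; its `.workdir/terraform/<stack>-<component>` layout is inferred from the directories listed by the reproduction script. The `sections` shape is likewise an assumption.

```go
package main

import (
	"fmt"
	"path"
)

// isWorkdirEnabled reports whether the described sections contain
// provision.workdir.enabled: true.
func isWorkdirEnabled(sections map[string]any) bool {
	provision, ok := sections["provision"].(map[string]any)
	if !ok {
		return false
	}
	workdir, ok := provision["workdir"].(map[string]any)
	if !ok {
		return false
	}
	enabled, _ := workdir["enabled"].(bool)
	return enabled
}

// buildWorkdirPath is a hypothetical stand-in for workdir.BuildPath.
// The layout is assumed from the repro script's directory listing.
func buildWorkdirPath(basePath, stack, component string) string {
	return path.Join(basePath, ".workdir", "terraform", stack+"-"+component)
}

// extractComponentPath rebuilds the workdir path from configuration instead
// of relying on a runtime-only key such as _workdir_path.
func extractComponentPath(sections map[string]any, basePath, componentsDir, stack, component string) string {
	if isWorkdirEnabled(sections) {
		return buildWorkdirPath(basePath, stack, component)
	}
	return path.Join(basePath, componentsDir, component)
}

func main() {
	sections := map[string]any{
		"provision": map[string]any{
			"workdir": map[string]any{"enabled": true},
		},
	}
	fmt.Println(extractComponentPath(sections, "/repo", "components/terraform", "demo", "null-label"))
	fmt.Println(extractComponentPath(map[string]any{}, "/repo", "components/terraform", "demo", "null-label"))
}
```

Because the path is derived only from values present in freshly described sections, a hook running outside the original terraform invocation resolves the same directory.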


Bug 3 – hooks fired on every event regardless of events: list

RunAll had no event matching, so all hooks ran regardless of their events: list. YAML uses hyphens (after-terraform-apply) but Go HookEvent constants use dots (after.terraform.apply).

Fix: added MatchesEvent() with hyphen→dot normalisation. Hooks with no events: field match all events to preserve backward compatibility with configs written before event filtering existed.
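A minimal sketch of the matching rule is shown below. The `Hook` struct is reduced to its events list and the `normalize` helper is a hypothetical name; the actual types in pkg/hooks may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// HookEvent uses the dot-separated form described for the Go constants.
type HookEvent string

const AfterTerraformApply HookEvent = "after.terraform.apply"

// Hook is a simplified stand-in; only the events: list matters here.
type Hook struct {
	Events []string
}

// normalize converts the hyphenated YAML form to the dotted form.
func normalize(event string) string {
	return strings.ReplaceAll(event, "-", ".")
}

// MatchesEvent reports whether the hook should run for the given event.
// A hook with no events matches everything, for backward compatibility.
func (h Hook) MatchesEvent(event HookEvent) bool {
	if len(h.Events) == 0 {
		return true
	}
	want := normalize(string(event))
	for _, e := range h.Events {
		if normalize(e) == want {
			return true
		}
	}
	return false
}

func main() {
	apply := Hook{Events: []string{"after-terraform-apply"}}
	plan := Hook{Events: []string{"before-terraform-plan"}}
	legacy := Hook{} // written before event filtering existed

	fmt.Println(apply.MatchesEvent(AfterTerraformApply))  // true
	fmt.Println(plan.MatchesEvent(AfterTerraformApply))   // false
	fmt.Println(legacy.MatchesEvent(AfterTerraformApply)) // true
}
```

Normalising both sides means a config may use either spelling without the caller caring which one it received.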


Bug 4 – store hook used wrong output getter and wrong error sentinels

The store hook always used GetOutput (which runs terraform init) regardless of when it fires. Running init after apply with a closed stdin triggers state-migration prompts. Additionally, errors used ErrNilTerraformOutput for both retrieval failures and missing keys, and included no context about which hook or event caused the failure.

Fix: RunE now selects the getter based on the event. after- events use GetOutputSkipInit (workdir already initialised); before- events use GetOutput (init may not have run yet). The IsPostExecution() helper on HookEvent encodes the contract. Error messages now include hook name, event, output key, component, and stack. Correct sentinels: ErrTerraformOutputFailed for retrieval errors, ErrTerraformOutputNotFound for missing keys.
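The getter selection and the two sentinels could look roughly like the sketch below. The `outputGetter` type, `selectGetter`, `readOutput`, the error message wording, and the prefix-based `IsPostExecution` check are assumptions for illustration; only the sentinel names and the after-/before- contract come from the description above.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

type HookEvent string

// IsPostExecution reports whether the event fires after the terraform
// command, in which case the workdir is assumed to be initialised already.
func (e HookEvent) IsPostExecution() bool {
	return strings.HasPrefix(string(e), "after.")
}

var (
	ErrTerraformOutputFailed   = errors.New("terraform output retrieval failed")
	ErrTerraformOutputNotFound = errors.New("terraform output key not found")
)

// outputGetter returns all outputs of a component (hypothetical shape).
type outputGetter func(component, stack string) (map[string]any, error)

// selectGetter picks the skip-init getter for post-execution events.
func selectGetter(event HookEvent, getOutput, getOutputSkipInit outputGetter) outputGetter {
	if event.IsPostExecution() {
		return getOutputSkipInit
	}
	return getOutput
}

// readOutput wraps failures with the matching sentinel plus full context.
func readOutput(get outputGetter, hook string, event HookEvent, key, component, stack string) (any, error) {
	where := fmt.Sprintf("hook=%s event=%s key=%s component=%s stack=%s", hook, event, key, component, stack)
	outputs, err := get(component, stack)
	if err != nil {
		return nil, fmt.Errorf("%w: %s: %v", ErrTerraformOutputFailed, where, err)
	}
	value, ok := outputs[strings.TrimPrefix(key, ".")]
	if !ok {
		return nil, fmt.Errorf("%w: %s", ErrTerraformOutputNotFound, where)
	}
	return value, nil
}

func main() {
	withInit := func(component, stack string) (map[string]any, error) {
		return nil, errors.New("init prompted for state migration")
	}
	skipInit := func(component, stack string) (map[string]any, error) {
		return map[string]any{"id": "eg-test-demo"}, nil
	}
	event := HookEvent("after.terraform.apply")
	get := selectGetter(event, withInit, skipInit)

	value, err := readOutput(get, "store-outputs", event, ".id", "null-label", "demo")
	fmt.Println(value, err)

	_, err = readOutput(get, "store-outputs", event, ".missing", "null-label", "demo")
	fmt.Println(errors.Is(err, ErrTerraformOutputNotFound), err)
}
```

Keeping the two sentinels distinct lets callers use errors.Is to tell "terraform could not be queried" apart from "the output key does not exist".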


Bug 5 – "Do you want to migrate all workspaces?" prompt on every apply

This was caused by three interacting problems:

  1. -reconfigure added whenever WorkdirPathKey was set: WorkdirPathKey is set for both a preserved workdir (TTL not expired) and a wiped/re-provisioned workdir (TTL=0s or expired). Checking it unconditionally added -reconfigure even when .terraform/ was intact.

  2. init_run_reconfigure: true overriding the preserved-workdir guard: even after scoping -reconfigure to WorkdirReprovisionedKey, the global InitRunReconfigure flag bypassed the check and always added -reconfigure.

  3. cleanTerraformWorkspace deleting .terraform/environment for workdir components: this function was designed for backend-switching on non-workdir components. For workdir components it deleted the active workspace record before every init, causing OpenTofu to see orphaned terraform.tfstate.d/<workspace>/ directories with no active workspace and prompt for migration.

When combined: -reconfigure tells OpenTofu to ignore the saved backend and treat init as fresh. A fresh-init with existing workspace state dirs triggers the migration prompt even when the backend is unchanged.

Fix (three parts):

  • Introduce WorkdirReprovisionedKey (_workdir_reprovisioned), set only by vendorToTarget (source wiped) or SyncDir with file changes (workdir synced). This is the correct signal that .terraform/ was actually cleared.
  • For workdir components with a preserved workdir, ignore InitRunReconfigure: the backend is always generated deterministically from the same stack config and never changes between runs. -reconfigure is only added when WorkdirReprovisionedKey is set or the subcommand is workspace.
  • Skip cleanTerraformWorkspace for workdir-enabled components: the backend is consistent, so there is no reason to clear the workspace record.
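The three-part fix amounts to a small decision table, sketched below under stated assumptions. The `initState` struct and the functions `needsReconfigure`, `shouldCleanWorkspace`, and `buildInitArgs` are simplified, hypothetical stand-ins rather than the atmos code, and the handling of the workspace subcommand reflects one reading of the second bullet.

```go
package main

import "fmt"

// initState gathers the inputs that decide how terraform init is invoked.
type initState struct {
	WorkdirEnabled       bool   // provision.workdir.enabled
	WorkdirReprovisioned bool   // _workdir_reprovisioned was set this run
	InitRunReconfigure   bool   // global init_run_reconfigure setting
	SubCommand           string // e.g. "apply", "workspace"
}

// needsReconfigure decides whether -reconfigure is passed to init.
func needsReconfigure(s initState) bool {
	if !s.WorkdirEnabled {
		// Non-workdir components keep honouring the global flag.
		return s.InitRunReconfigure
	}
	// Workdir components ignore the global flag: reconfigure only when
	// .terraform/ was actually cleared, or for the workspace subcommand.
	return s.WorkdirReprovisioned || s.SubCommand == "workspace"
}

// shouldCleanWorkspace reports whether .terraform/environment may be
// removed before init. Workdir components keep their workspace record.
func shouldCleanWorkspace(s initState) bool {
	return !s.WorkdirEnabled
}

// buildInitArgs assembles a simplified init argument list.
func buildInitArgs(s initState) []string {
	args := []string{"init"}
	if needsReconfigure(s) {
		args = append(args, "-reconfigure")
	}
	return args
}

func main() {
	preserved := initState{WorkdirEnabled: true, InitRunReconfigure: true, SubCommand: "apply"}
	wiped := initState{WorkdirEnabled: true, WorkdirReprovisioned: true, SubCommand: "apply"}
	plain := initState{InitRunReconfigure: true, SubCommand: "apply"}

	fmt.Println(buildInitArgs(preserved), shouldCleanWorkspace(preserved))
	fmt.Println(buildInitArgs(wiped), shouldCleanWorkspace(wiped))
	fmt.Println(buildInitArgs(plain), shouldCleanWorkspace(plain))
}
```

In this sketch a preserved workdir never receives -reconfigure on apply, which is the condition under which the migration prompt stops appearing.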

Tested end-to-end

Full producer → store → consumer pipeline:

  1. null-label applies with JIT workdir + generate: override
  2. after-terraform-apply hook reads .id output and writes it to Redis (no init re-run, no migration prompt)
  3. consumer reads the value via !store local/redis null-label label_id, injects it into its own generate: template, applies successfully
  4. Repeated applies do not prompt for workspace migration, with or without init_run_reconfigure: true and with or without ttl: "0s"

Reproduction
This worked successfully for the deployment that I was initially having this issue with. Local reproduction below.

cat << 'SCRIPT' > repro.sh
#!/usr/bin/env bash
# ============================================================
# ATMOS REPRO: generate: writes orphaned override to base
#              component directory; hook-triggered terraform fails;
#              consumer reads store value into JIT workdir generate:
#
# Stack name:  demo       (from vars.name + name_template)
# Components:  null-label (producer), consumer (reads from store)
#
# Requires: atmos, tofu, docker
# ============================================================

set -euo pipefail

WORKDIR="$(mktemp -d -t atmos-repro-XXXXXX)"
echo "Working in: ${WORKDIR}"
cd "${WORKDIR}"

echo "== starting redis =="
docker stop atmos-repro-redis 2>/dev/null || true
docker run -d --rm --name atmos-repro-redis -p 6379:6379 redis:7-alpine
trap 'docker stop atmos-repro-redis 2>/dev/null || true' EXIT
sleep 1

cat <<'EOF' > atmos.yaml
base_path: "."

stores:
  local/redis:
    type: redis
    options:
      url: "redis://localhost:6379"

components:
  terraform:
    base_path: "components/terraform"
    command: "tofu"
    workspaces_enabled: true
    apply_auto_approve: false
    deploy_run_init: true
    init_run_reconfigure: true
    auto_generate_backend_file: true
    auto_generate_files: true

stacks:
  name_template: "{{ .vars.name }}"
  base_path: "stacks"
  included_paths:
    - "**/*"
EOF

mkdir -p stacks

cleanup() {
  echo "-- cleanup --"
  atmos terraform workdir clean --all 2>/dev/null || true
  echo "-- cleanup done --"
}

show_dirs() {
  local label="${1:-}"
  echo
  if [[ -n "$label" ]]; then
    echo "-- directories: $label --"
  fi
  echo "components/terraform/null-label"
  ls -la components/terraform/null-label/ 2>/dev/null || echo "(does not exist)"
  echo ".workdir/terraform/demo-null-label"
  ls -la .workdir/terraform/demo-null-label/ 2>/dev/null || echo "(does not exist)"
  echo ".workdir/terraform/demo-consumer"
  ls -la .workdir/terraform/demo-consumer/ 2>/dev/null || echo "(does not exist)"
}

# ============================================================
# SCENARIO 1: JIT + generate, no hook.
# Verifies generate: writes to the workdir only (not the base
# component directory), and that apply succeeds.
# ============================================================
echo
echo "================================================="
echo "SCENARIO 1: init + apply WITHOUT hook (expect success)"
echo "  - generate: must write only to workdir, not base component dir"
echo "================================================="
cleanup

cat <<'EOF' > stacks/demo.yaml
vars:
  name: demo

terraform:
  backend_type: local

components:
  terraform:
    null-label:
      vars:
        namespace: "eg"
        stage: "test"
        name: "demo"
        enabled: true
      source:
        uri: "git::https://github.com/cloudposse/terraform-null-label.git"
        version: "0.25.0"
        ttl: "0s"
      provision:
        workdir:
          enabled: true
      generate:
        locals_override.tf: |
          # override file generated by atmos
          locals {
            name = "THISISANOVERRIDE"
          }
EOF

echo "== init =="
atmos terraform init null-label -s demo

show_dirs "after init"

echo "== apply =="
atmos terraform apply null-label -s demo -- -auto-approve

show_dirs "after apply"

echo
echo "SCENARIO 1: PASSED"

# ============================================================
# SCENARIO 2: JIT + generate + after-apply hook writes to Redis.
# The hook fires after apply, reads terraform output, and stores
# it in Redis. Tests that the hook does not re-run init (which
# would prompt for workspace migration with a closed stdin).
# ============================================================
echo
echo "================================================="
echo "SCENARIO 2: init + apply WITH after-apply store hook (expect success)"
echo "  - hook reads .id output and stores it in Redis"
echo "  - hook must NOT re-run terraform init"
echo "================================================="
cleanup

cat <<'EOF' > stacks/demo.yaml
vars:
  name: demo

terraform:
  backend_type: local

components:
  terraform:
    null-label:
      vars:
        namespace: "eg"
        stage: "test"
        name: "demo"
        enabled: true
      source:
        uri: "git::https://github.com/cloudposse/terraform-null-label.git"
        version: "0.25.0"
        ttl: "0s"
      provision:
        workdir:
          enabled: true
      generate:
        locals_override.tf: |
          # override file generated by atmos
          locals {
            name = "THISISANOVERRIDE"
          }
      hooks:
        store-outputs:
          events:
            - after-terraform-apply
          command: store
          name: local/redis
          outputs:
            label_id: .id
EOF

echo "== init =="
atmos terraform init null-label -s demo

echo "== apply =="
atmos terraform apply null-label -s demo -- -auto-approve

show_dirs "after apply"

echo
echo "== verifying Redis contains label_id =="
STORED=$(docker exec atmos-repro-redis redis-cli KEYS "*label_id*")
if [[ -z "$STORED" ]]; then
  echo "SCENARIO 2: FAILED - no label_id key found in Redis"
  exit 1
fi
echo "Redis keys: $STORED"
echo
echo "SCENARIO 2: PASSED"

# ============================================================
# SCENARIO 3: Consumer reads label_id from Redis via !store,
# injects it into a generate: template inside its own JIT workdir.
# Tests the full producer → store → consumer pipeline.
# ============================================================
echo
echo "================================================="
echo "SCENARIO 3: consumer reads store value into JIT workdir generate:"
echo "  - consumer.vars.label_id: !store local/redis null-label label_id"
echo "  - generate: uses {{ .vars.label_id }} in a locals override"
echo "  - both components use JIT workdir with ttl: 0s"
echo "================================================="

cat <<'EOF' > stacks/demo.yaml
vars:
  name: demo

terraform:
  backend_type: local

components:
  terraform:
    null-label:
      vars:
        namespace: "eg"
        stage: "test"
        name: "demo"
        enabled: true
      source:
        uri: "git::https://github.com/cloudposse/terraform-null-label.git"
        version: "0.25.0"
        ttl: "0s"
      provision:
        workdir:
          enabled: true
      generate:
        locals_override.tf: |
          # override file generated by atmos
          locals {
            name = "THISISANOVERRIDE"
          }
      hooks:
        store-outputs:
          events:
            - after-terraform-apply
          command: store
          name: local/redis
          outputs:
            label_id: .id

    consumer:
      vars:
        namespace: "eg"
        stage: "test"
        enabled: true
        label_id: !store local/redis null-label label_id
      source:
        uri: "git::https://github.com/cloudposse/terraform-null-label.git"
        version: "0.25.0"
        ttl: "0s"
      provision:
        workdir:
          enabled: true
      generate:
        name_override.tf: |
          # override file generated by atmos - value comes from Redis via !store
          locals {
            name = "{{ .vars.label_id }}-derpderpderp"
          }
EOF

echo "== apply consumer =="
atmos terraform apply consumer -s demo -- -auto-approve

show_dirs "after consumer apply"

echo
echo "== verifying consumer output contains the store value =="
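# "|| true" keeps "set -euo pipefail" from aborting here when grep finds nothing,
# so the FAILED branch below can report the problem.
CONSUMER_ID=$(atmos terraform output consumer -s demo 2>/dev/null | grep "id =" | head -1 || true)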
echo "Consumer id output line: $CONSUMER_ID"

if echo "$CONSUMER_ID" | grep -q "derpderpderp"; then
  echo
  echo "SCENARIO 3: PASSED - consumer label contains store-derived value"
else
  echo
  echo "SCENARIO 3: FAILED - consumer output does not contain expected suffix"
  echo "  Expected 'derpderpderp' in id output"
  exit 1
fi

echo
echo "================================================="
echo "ALL SCENARIOS PASSED"
echo "Working directory preserved at: ${WORKDIR}"
echo "================================================="
SCRIPT
bash repro.sh 2>&1 | tee repro.log

Test plan

  • TestHook_MatchesEvent: hyphen/dot formats, no match, nil/empty events (backward compat), multiple events
  • TestRunAll_EventFiltering: store called/skipped based on event matching
  • TestExecutor_GetOutputWithOptions_SkipInit: terraform init NOT called when SkipInit: true
  • TestBuildInitArgs_ReconfigureWhenWorkdirReprovisioned: -reconfigure added when workdir wiped
  • TestBuildInitArgs_NoReconfigureWhenWorkdirPreserved: -reconfigure NOT added for preserved workdir
  • TestBuildInitArgs_NoReconfigureWhenWorkdirPreserved_InitRunReconfigureIgnored: global InitRunReconfigure: true does not override the preserved-workdir guard
  • TestBuildInitArgs_ReconfigureForNonWorkdir_InitRunReconfigure: InitRunReconfigure still works for non-workdir components
  • TestPrepareInitExecution_SkipsCleanWorkspaceForWorkdir: .terraform/environment preserved for workdir components
  • TestPrepareInitExecution_CleansWorkspaceForNonWorkdir: .terraform/environment still cleaned for non-workdir components
  • TestIsWorkdirEnabled / TestExtractComponentPath/workdir_enabled_*: workdir path resolution
  • Full pkg/hooks, pkg/terraform/output, internal/exec test suites pass

Closes #2308
Closes #2307

Summary by CodeRabbit

  • New Features

    • Workdir-aware provisioning that targets JIT workdirs and signals reprovisioning
    • Hook event normalization, post-execution detection, and event-matching/filtering
    • New output APIs including skip-init retrieval and advanced output options
  • Improvements

    • Smarter terraform init (-reconfigure) behavior for workdir flows
    • Preserve workspace files for workdir components to avoid unintended deletions
    • More robust output caching and clearer CLI success/error messaging
  • Tests

    • Expanded coverage for workdirs, init args, hooks, output paths, and store commands
