github cloudposse/atmos v1.220.0-rc.1

pre-release · 2 hours ago

🚀 Enhancements

fix(stacks): honour component-level list_merge_strategy in settings @thejrose1984 (#2480)

what

  • Fixes settings.list_merge_strategy set at the component level being silently ignored during stack processing
  • Adds effectiveAtmosConfig() helper that scans the component's settings layers (GlobalSettings → BaseComponentSettings → ComponentSettings → ComponentOverridesSettings) before any merge and returns a shallow config copy with the winning strategy
  • mergeComponentConfigurations now uses this resolved config for all m.Merge / m.MergeWithDeferred / m.ApplyDeferredMerges calls — covering vars, settings, env, auth, providers, hooks, generate, dependencies, locals, source, and provision
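
The resolution step described above can be sketched roughly as follows. This is a minimal illustration, not the actual effectiveAtmosConfig implementation: the `Config` type, the `effectiveConfig` name, the map-based layers, and the set of accepted strategy names are all assumptions made for the example.

```go
package main

import "fmt"

// Config is a hypothetical stand-in for the Atmos configuration struct.
type Config struct {
	BasePath          string
	ListMergeStrategy string
}

// validStrategies lists the strategy names assumed for this sketch.
var validStrategies = map[string]bool{"replace": true, "append": true, "merge": true}

// effectiveConfig scans the settings layers from lowest to highest precedence
// and returns a shallow copy of cfg carrying the last strategy it finds.
// When no layer sets a strategy, cfg itself is returned unchanged.
func effectiveConfig(cfg *Config, layers ...map[string]any) (*Config, error) {
	winner := ""
	for _, layer := range layers {
		raw, ok := layer["list_merge_strategy"]
		if !ok {
			continue
		}
		s, isString := raw.(string)
		if !isString || s == "" {
			continue // an empty or non-string value never overrides the global setting
		}
		if !validStrategies[s] {
			return nil, fmt.Errorf("invalid list_merge_strategy %q", s)
		}
		winner = s
	}
	if winner == "" {
		return cfg, nil
	}
	copied := *cfg // shallow copy: the global config is left untouched
	copied.ListMergeStrategy = winner
	return &copied, nil
}

func main() {
	global := &Config{BasePath: ".", ListMergeStrategy: "replace"}
	globalSettings := map[string]any{}
	baseSettings := map[string]any{"list_merge_strategy": "append"}
	componentSettings := map[string]any{"list_merge_strategy": "merge"}
	overrides := map[string]any{}

	eff, err := effectiveConfig(global, globalSettings, baseSettings, componentSettings, overrides)
	if err != nil {
		panic(err)
	}
	fmt.Println("effective:", eff.ListMergeStrategy, "global:", global.ListMergeStrategy)
}
```

Returning the original pointer when nothing overrides the strategy keeps unchanged configs free of needless copies, while the shallow copy keeps one component's choice from leaking into the next.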

why

mergeComponentConfigurations passed the global atmosConfig to every merge call. pkg/merge reads atmosConfig.Settings.ListMergeStrategy on every call. The component's settings.list_merge_strategy lived inside the data being merged, not the config doing the merging — so it was always ignored. The value appeared correctly in atmos describe component output (giving false confidence), but the actual list merging behavior was always governed by the global atmos.yaml setting or ATMOS_SETTINGS_LIST_MERGE_STRATEGY env var.
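
To illustrate why the setting was ignored, here is a small sketch of a merger that reads its strategy from the config argument rather than from the data being merged. `Config` and `mergeLists` are hypothetical names, and only the "replace" and "append" behaviours are modelled.

```go
package main

import "fmt"

// Config is a hypothetical stand-in; only the field the merger reads is modelled.
type Config struct{ ListMergeStrategy string }

// mergeLists combines two lists according to the strategy on cfg.
// Only "replace" and "append" are modelled in this sketch.
func mergeLists(cfg *Config, base, override []string) []string {
	if cfg.ListMergeStrategy == "append" {
		out := append([]string{}, base...)
		return append(out, override...)
	}
	// "replace" (and anything this sketch does not model): the override wins.
	return append([]string{}, override...)
}

func main() {
	global := &Config{ListMergeStrategy: "replace"}
	base := []string{"a"}
	override := []string{"b"}

	// Passing the global config ignores whatever strategy the component data declares.
	fmt.Println(mergeLists(global, base, override))

	// Passing a resolved per-component copy applies the component's choice.
	resolved := *global
	resolved.ListMergeStrategy = "append"
	fmt.Println(mergeLists(&resolved, base, override))
}
```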

references

Closes #2396

Summary by CodeRabbit

  • New Features

    • Component-level list merge strategy overrides are now computed and applied consistently across configuration assembly, honoring inheritance and isolating unchanged configs.
  • Tests

    • Added integration tests and fixtures covering precedence, inheritance, copy isolation, prevention of empty overrides, and error handling for invalid strategy values.

fix(ci): fire CI hooks per-component in apply --all mode (#2475) @thejrose1984 (#2477)

Extends the per-component CI hook pattern from PR #2430 (plan --all) to apply --all, so each component produces its own CI summary entry instead of a single misattributed entry for the last component.

what

  • Update apply --all to fire per-component CI hooks.
  • Preserve per-component CI reporting semantics used by plan --all.

why

  • Prevent CI summaries from being misattributed to only the last component.
  • Ensure each component has its own hook and status entry in CI pipelines.
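
The per-component pattern can be pictured with a short sketch. The `HookFunc` type and the `applyAll` function are hypothetical and stand in for the real hook machinery, which the notes do not show.

```go
package main

import (
	"errors"
	"fmt"
)

// HookFunc is a hypothetical callback that records one CI summary entry.
type HookFunc func(component string, err error)

// applyAll runs apply for each component and fires the hook once per component,
// immediately after that component finishes, instead of once at the end of the run.
func applyAll(components []string, apply func(string) error, hook HookFunc) {
	for _, c := range components {
		err := apply(c)
		hook(c, err)
	}
}

func main() {
	var entries []string
	hook := func(component string, err error) {
		status := "ok"
		if err != nil {
			status = "failed"
		}
		entries = append(entries, component+": "+status)
	}
	apply := func(component string) error {
		if component == "rds" {
			return errors.New("apply failed")
		}
		return nil
	}

	applyAll([]string{"vpc", "eks", "rds"}, apply, hook)
	fmt.Println(entries)
}
```

Firing the hook inside the loop is what ties each status to the component that produced it; a single hook after the loop would only see the state left by the last component.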

references

  • Fixes behavior introduced in PR #2430 for plan --all.
  • Addresses CI reporting bug for apply --all mode.

Summary by CodeRabbit

  • Bug Fixes

    • Prevented duplicate CI hook firing during multi-component Terraform apply runs.
    • Reset per-run state at apply start so deferred and post-run hooks observe consistent values.
    • Suppressed post-run hook execution for multi-component apply to avoid double execution.
  • Tests

    • Added tests covering CI hook handling and post-run suppression in multi-component apply scenarios.

fix(auth): honor --identity=false in describe affected and dependents @osterman (#2471)

what

  • Honor --identity=false (and aliases off/0/no) in atmos describe affected so per-component auth resolution is skipped, not just the top-level AuthManager creation.
  • Thread a new DescribeAffectedCmdArgs.AuthDisabled / DescribeDependentsArgs.AuthDisabled flag from the cmd layer through executeDescribeAffectedWith{TargetRepoPath,TargetRefClone,TargetRefCheckout}, executeDescribeAffected, addDependentsToAffected, and ExecuteDescribeDependents, routing inner stack resolution through ExecuteDescribeStacksWithAuthDisabled.
  • Also wired through terraform_affected.go, terraform_affected_graph.go, pkg/list/list_affected.go, pkg/ai/tools/atmos/describe_affected.go, and atlantis_generate_repo_config.go so every caller of the public helpers passes the signal.
  • Extracted pkg/list/list_affected.go::executeAffectedLogic into three per-mode helpers to stay under the 60-line function-length limit after the extra parameter.
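
As a rough sketch of the idea, the disabled state can be recorded as an explicit field on the argument struct rather than inferred later from a missing auth manager. The `Args` type and both function names are hypothetical, and the case-insensitive, whitespace-trimmed matching is an assumption of this example rather than documented Atmos behaviour.

```go
package main

import (
	"fmt"
	"strings"
)

// Args is a hypothetical stand-in for the command argument structs.
type Args struct {
	Identity     string
	AuthDisabled bool
}

// identityDisabled reports whether the flag value is one of the disabling spellings.
// Case-insensitive matching and whitespace trimming are assumptions of this sketch.
func identityDisabled(value string) bool {
	switch strings.ToLower(strings.TrimSpace(value)) {
	case "false", "off", "0", "no":
		return true
	}
	return false
}

// newArgs records the disabled state explicitly so downstream code
// does not have to guess it from a nil auth manager.
func newArgs(identityFlag string) Args {
	if identityDisabled(identityFlag) {
		return Args{AuthDisabled: true}
	}
	return Args{Identity: identityFlag}
}

func main() {
	for _, v := range []string{"false", "off", "0", "no", "prod-admin", ""} {
		fmt.Printf("%q -> %+v\n", v, newArgs(v))
	}
}
```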

why

  • A user disabled all auth on a describe affected --upload --process-functions=false --identity=false run in cloudposse/infra-live CI (failing run) and still got STS AssumeRoleWithWebIdentity 403 AccessDenied for component tfstate-plat.
  • The 1.219 fix (#2412) normalized --identity=false to __DISABLED__ at the parser layer and made CreateAuthManagerFromIdentity* short-circuit to nil, but it only wired the disabled signal all the way down through list instances. In describe affected, the top-level AuthManager correctly became nil, but a nil AuthManager was indistinguishable from "no identity specified" downstream. With --process-templates=true (the default), shouldResolvePerComponentAuth(processTemplates, processYamlFunctions) still returned true, so the per-component resolver called createComponentAuthManager, which built a fresh AuthManager from atmosConfig.Auth and tried the assume-role call the user thought they had disabled.
  • This change makes --identity=false actually mean "no auth, anywhere" in describe affected, matching the contract that already works for list instances.
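
The gate described above might look something like the following once the disabled signal reaches it. The three-argument `shouldResolve` function is hypothetical, and the OR of the two processing flags is inferred from the behaviour described in these notes, not taken from the Atmos source.

```go
package main

import "fmt"

// shouldResolve is a hypothetical variant of the per-component auth gate.
// An explicit authDisabled signal wins over everything else; otherwise
// resolution happens when templates or YAML functions are processed
// (that OR is an assumption of this sketch).
func shouldResolve(processTemplates, processYamlFunctions, authDisabled bool) bool {
	if authDisabled {
		return false
	}
	return processTemplates || processYamlFunctions
}

func main() {
	// The regression case from the notes: templates on, YAML functions off, auth disabled.
	fmt.Println(shouldResolve(true, false, true))

	// Without the explicit signal the same flags still trigger resolution.
	fmt.Println(shouldResolve(true, false, false))
}
```

Carrying a dedicated boolean avoids overloading a nil auth manager with two meanings, "disabled" and "not specified".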

Tests:

  • cmd/describe_affected_test.go::TestDescribeAffectedSetsAuthDisabled covers false/off/0/no env-var spellings and asserts AuthDisabled=true and AuthManager=nil.
  • internal/exec/describe_affected_authdisabled_test.go verifies Execute() forwards AuthDisabled to all three helper paths and to addDependentsToAffected.
  • internal/exec/describe_stacks_component_processor_auth_test.go adds the exact (processTemplates=true, processYamlFunctions=false, authDisabled=true) regression case from the infra-live CI failure to the existing table.

references

Summary by CodeRabbit

  • Bug Fixes

    • describe affected and describe dependents now explicitly record when authentication is disabled (e.g., --identity=false, off, 0, no), ensuring downstream discovery and dependency resolution skip per-component auth and avoid unintended auth attempts.
  • Tests

    • Added unit and integration tests verifying the auth-disabled signal is propagated throughout affected-component discovery and dependent-resolution paths.

fix(auth): nil-check process-cached credentials for standalone `ambient` identity @aknysh (#2479)

what

  • Fix a hard SIGSEGV triggered the second time a standalone generic ambient identity (kind: ambient) is authenticated in the same process. The first authentication succeeded and silently cached nil credentials in the process-level credential cache; the next lookup invoked isCredentialValid("process-cache", nil), which dereferenced a nil types.ICredentials interface in GetExpiration().
  • The crash is latent in atmos auth login / atmos auth whoami (one authentication per process) but fatal in commands that resolve per-component auth many times — most notably atmos describe affected --upload, where internal/exec/describe_stacks_component_processor.processComponentEntry walks every component and calls resolveComponentAuthManager → createComponentAuthManager → Authenticate → authenticateChain per component.
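
The failure mode itself is plain Go behaviour: calling a method on a nil interface value panics at run time. A minimal reproduction, using a hypothetical `Credentials` interface whose method signature is assumed for illustration, with `recover` added only so the program can report the panic instead of dying:

```go
package main

import (
	"fmt"
	"time"
)

// Credentials is a hypothetical stand-in for the credentials interface.
type Credentials interface {
	GetExpiration() (*time.Time, error)
}

// expirationOf calls GetExpiration and converts a panic into an error
// so the nil-interface failure can be observed without crashing.
func expirationOf(c Credentials) (exp *time.Time, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered: %v", r)
		}
	}()
	return c.GetExpiration()
}

func main() {
	var cached Credentials // nil interface, like the cached ambient credentials
	_, err := expirationOf(cached)
	fmt.Println(err)
}
```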

Fix (two layers in pkg/auth/manager_chain.go)

  1. authenticateChain — don't cache nil credentials.

    if creds != nil {
        processCredentialCache.Store(cacheKey, &processCachedCreds{
            credentials: creds,
        })
    }

    The generic ambient kind is a cloud-agnostic passthrough whose Authenticate() returns (nil, nil) by design — credentials are resolved by the cloud SDK at subprocess runtime, not by Atmos. Storing nil violates the cache invariant that every entry is a usable credential object. Skipping costs nothing because ambient re-authentication is itself a no-op.

  2. isCredentialValid — short-circuit on nil input.

    if cachedCreds == nil {
        log.Debug("Cached credentials are nil; treating as invalid", logKeyIdentity, identityName)
        return false, nil
    }

    Defense-in-depth mirror of the same nil-check pattern adopted by buildWhoamiInfo in the predecessor 2026-04-17 ambient fix. If any future caller stores nil in the cache (or another path passes nil into the validator), the worst case is a redundant re-authentication, not a panic.

Either guard alone closes the panic; both together make the contract explicit at both the read and write sites.

Tests (new pkg/auth/manager_chain_ambient_test.go)

  • TestManager_isCredentialValid_NilCreds — direct unit reproducer for the panic site. Before the fix this test panicked at manager_chain.go:164 with runtime error: invalid memory address or nil pointer dereference while calling cachedCreds.GetExpiration(). Asserts (false, nil) on nil credentials.
  • TestManager_Authenticate_AmbientStandalone_RepeatedCallsNoPanic — end-to-end via real NewAuthManager + two back-to-back Authenticate() calls on a standalone kind: ambient identity. Before the fix the second call panicked on the process-cache hit. Asserts both calls return cleanly with WhoamiInfo.Credentials == nil.
  • TestAuthenticateChain_AmbientStandalone_DoesNotCacheNil — locks in the authenticateChain-side fix by direct cache inspection: the cache key must be absent after a standalone ambient authentication. Prevents a regression where caching nil silently returns.

All three new tests pass alongside the existing ambient regression tests (TestManager_buildWhoamiInfo_NilCredentials, TestManager_Authenticate_Ambient_Standalone) and the existing TestProcessCredentialCache_* suite.

Coverage

Both patched functions remain at 100% statement coverage; both branches of each new guard are exercised:

| Function | File:Line | Coverage |
|---|---|---|
| authenticateChain | pkg/auth/manager_chain.go:51 | 100.0% |
| isCredentialValid | pkg/auth/manager_chain.go:173 | 100.0% |
  • isCredentialValid nil-true branch: TestManager_isCredentialValid_NilCreds. Nil-false branch: existing TestProcessCredentialCache_* tests.
  • authenticateChain skip-cache branch: the two ambient tests above. Cache-write branch: existing TestProcessCredentialCache_AvoidsDuplicateAuth and friends.

Validation

  • go test ./pkg/auth/... -count=1 — all 28 subpackages green.
  • go vet ./pkg/auth/... — clean.
  • go build ./... — succeeds.

why

  • The (nil, nil) return from the generic ambient kind is the documented contract (docs/prd/ambient-identity.md) — credentials are resolved by the cloud SDK at subprocess runtime, not by Atmos. The cache code on the other side of that boundary failed to honor the contract, and a recent change that made per-component auth resolver failures fatal turned this latent panic into a hard command termination.
  • The predecessor 2026-04-17 ambient fix (#2334) addressed the buildWhoamiInfo path but did not touch the process credential cache path in authenticateChain / isCredentialValid. That cache is dormant during single-authentication commands like atmos auth login / atmos auth whoami (where #2334's reproducer lived) but hot during multi-component flows like atmos describe affected, so the bug only surfaced after both #2334 shipped and per-component auth resolution became fatal. This PR extends the same nil-credential contract to the credential-cache layer.
  • Without this fix, any consumer of a standalone kind: ambient identity who exercises atmos describe affected --upload (the canonical Atmos Pro flow) hits a hard crash on every run, with no workaround short of avoiding the identity kind entirely — which defeats the reason the kind exists.

references

  • docs/fixes/2026-05-21-ambient-identity-process-cache-panic.md — fix write-up: root cause, code path, two-layer fix, test matrix, coverage notes, and the interaction with the predecessor #2334 fix that made this surface now.
  • docs/fixes/2026-04-17-ambient-identity-nil-credentials.md — predecessor fix. Same (nil, nil) ambient contract, different layer (buildWhoamiInfo). This PR extends the same defense to the process credential cache.
  • docs/prd/ambient-identity.md — feature PRD. Specifies that ambient.Authenticate() returns (nil, nil) and ambient identities do not store credentials.
  • pkg/auth/identities/ambient/ambient.go:66-71 — the intentional return nil, nil in ambientIdentity.Authenticate().
  • pkg/auth/identities/ambient/ambient.go:144-162 (AuthenticateStandaloneAmbient) documents and propagates the nil-credentials contract.
  • pkg/auth/identities/aws/ambient.go:Authenticate — AWS-specific counterpart that returns real *AWSCredentials and therefore never triggers this bug.
  • internal/exec/describe_stacks_component_processor.go:150-174 — per-component auth resolver whose recent change made this latent panic fatal in the atmos describe affected --upload flow.

Summary by CodeRabbit

  • Bug Fixes

    • Fixed a crash that occurred when authenticating a standalone ambient identity multiple times within the same process.
  • Tests

    • Added regression tests to prevent this issue from reoccurring.
  • Documentation

    • Added documentation describing the fix and root cause analysis.
