feat: describe affected evaluates all provisioned component sections @osterman (#2573)
what
- Fix `atmos describe affected` so it detects changes in every provisioned component section, not just `vars`/`env`/`settings`/`metadata`/`source`/`provision`.
- Newly evaluated sections: `providers`, `required_providers` (provider versions), `required_version`, `hooks`, `generate`, `backend`, `backend_type`, `remote_state_backend`, `remote_state_backend_type`, `auth`, `command`, and `dependencies`, including scalar sections (previously only map sections were compared).
- Add a configurable `describe.affected.sections` setting in `atmos.yaml` that fully replaces the evaluated set (e.g. to track a custom section or narrow the list); `metadata`/`settings` are always evaluated.
- Refactor the three component processors to a single table-driven comparison, add a documented "Evaluated sections" list, tests, a changelog blog post, and a roadmap milestone.
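As a rough illustration of the configurable setting, the fragment below sketches how `describe.affected.sections` might be written in `atmos.yaml`. The nesting follows the dotted key named above; the list-of-strings value shape and the `my_custom_section` name are assumptions made for illustration, not taken from the Atmos schema.

```yaml
# atmos.yaml (sketch; value shape is assumed)
describe:
  affected:
    # Replaces the default evaluated set entirely.
    # metadata and settings are always evaluated, so they are not listed here.
    sections:
      - vars
      - env
      - providers
      - backend
      - my_custom_section   # hypothetical custom section
```

Because the list replaces the defaults rather than extending them, any default section left out of it would no longer be compared.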
why
- The comparison ran against a hand-maintained allow-list that had drifted out of sync with what Atmos actually merges into a component, so changes to `providers`, `hooks`, provider versions, `backend`, etc. were silently missed: a false negative that could let CI pipelines skip components that genuinely changed.
- The table is now tied (via comments) to the sections written in `stack_processor_merge.go`, and the new config setting gives users an escape hatch so the bug class can't quietly return.
- `locals`, `overrides`, `inheritance`, and `retry` are intentionally excluded (they either fold into other sections or are execution-time only).
references
- Docs: Evaluated sections and `describe.affected.sections`
Summary by CodeRabbit
- New Features
  - Describe now evaluates and reports changes across a comprehensive set of top-level component sections (including scalar sections) with per-section reasons; first changed section becomes the headline reason.
  - Added configurable describe.affected.sections to fully replace the default evaluated set (metadata/settings remain always evaluated).
- Documentation
  - Blog and CLI/config docs updated with evaluated-sections details, output reason entries, and configuration examples.
- Tests
  - Added tests for section evaluation, equality behavior, remote-locator logic, and override/no-false-positive cases.
- Chores
  - Updated snapshots, roadmap, CI workflow pins, link-checker exclusions, and changelog guidance.
feat(hooks): terraform init lifecycle hooks + --skip-hooks before-* fix @osterman (#2574)
what
- Fix `--skip-hooks` for `before-*` hooks. Previously it only skipped `after-*` hooks; `before-terraform-plan/apply/deploy` hooks ran regardless. Now `--skip-hooks` (skip all) and `--skip-hooks=name1,name2` (skip by name) are honored symmetrically for before and after events.
- Add `before-terraform-init` and `after-terraform-init` lifecycle hooks for the `atmos terraform init` command. `after-terraform-init` is new; `before-terraform-init` was documented but never dispatched to user hooks, and now it fires. They run through the same `runHooks`/`RunAll` path, so the skip fix applies to them too.
- Add tests (real parsed Cobra flag, not `viper.Set`), strengthen hook-inheritance coverage with a fixture proving top-level `terraform.hooks:` is inherited by every component (and `components.terraform.hooks:` is not), update the Hooks docs, blog post, and roadmap.
why
- `--skip-hooks` is a global flag bound to Viper inside `RunE`, but `before-*` hooks run earlier in `PreRunE`, so `viper.GetString("skip-hooks")` never saw the CLI value and before-hooks fired anyway. The flag is now resolved directly from the parsed command (Viper/`ATMOS_SKIP_HOOKS` fallback), mirroring how `--ci` and `--verbose` are read in `PreRunE`.
- Init had no user-hook surface at all: `init.go` wired no hooks and the `BeforeTerraformInit` event was never dispatched. Wiring `PreRunE`/`PostRunE` on the init command (like plan/apply/deploy) closes the lifecycle gap so teams can validate tooling, vendor sources, or notify systems around `terraform init` declaratively.
- The previous skip tests injected via `viper.Set` with a `nil` command, sidestepping the exact flag-binding lifecycle that was broken, which is how the bug shipped; the new tests fail against the old implementation.
references
- Hooks documentation: `/stacks/hooks`
- Note: `before-`/`after-terraform-init` fire on the explicit `atmos terraform init`, not the implicit init that plan/apply run.
Summary by CodeRabbit
- New Features
  - Added Terraform init lifecycle event: after-terraform-init (alongside before-terraform-init); Terraform-scoped default hooks can be inherited by Terraform components.
- Bug Fixes
  - Fixed --skip-hooks precedence so CLI flag reliably overrides env/config and consistently skips before/after hook phases.
  - Clarified hook scope handling so misplaced hook keys aren’t incorrectly applied.
- Documentation
  - Blog, docs, and roadmap updated to describe init hook events and skip-hooks behavior.
- Tests
  - Expanded coverage for hook inheritance, scope, init wiring, event filtering, and skip-hooks CLI behavior.
- Chores
  - CI Codecov step made non-fatal for transient upload errors.
feat(auth): share single OIDC session across aws/iam-identity-center providers @Benbentwo (#2553)
what
- Refactors the `aws/iam-identity-center` (AWS SSO) provider so that multiple providers pointing at the same SSO portal (identical `start_url` + `region`) share a single OIDC token: one browser flow now unlocks every provider instead of one flow per provider.
- Adds silent refresh-token renewal via `ssooidc:CreateToken` with `grant_type=refresh_token`, so a single browser interaction holds for the full portal session (~8h) rather than re-prompting every hour.
- Introduces an in-process `sessionTokenStore` (keyed by `sha1(start_url|region)`) with per-session mutexes that single-flight concurrent device-auth flows; re-keys the on-disk cache from `aws-sso/<provider>/token.json` to `aws-sso/sessions/<sha1>.json` in the AWS SDK `ssocreds`-compatible format.
- Adds the design PRD (`docs/prd/aws-sso-session-support.md`), a changelog blog post, and a shipped roadmap milestone under the Unified Authentication initiative.
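To make the sharing condition concrete, here is a minimal sketch of two providers that would resolve to the same session because their `start_url` and `region` match. The provider names and portal URL are placeholders, and the `auth.providers` nesting is assumed from the existing auth configuration rather than stated in this PR.

```yaml
# atmos.yaml (sketch; names and URL are placeholders)
auth:
  providers:
    dev-sso:
      kind: aws/iam-identity-center
      start_url: https://example.awsapps.com/start
      region: us-east-1
    prod-sso:
      kind: aws/iam-identity-center
      start_url: https://example.awsapps.com/start   # same portal as dev-sso
      region: us-east-1                              # same region as dev-sso
```

Under the keying described above, both entries would hash to the same `sha1(start_url|region)` value, so one browser flow should cover both. A provider with a different `start_url` or `region` would get its own session.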
why
- A common setup has one provider per environment (dev/staging/prod) all backed by the same corporate SSO portal; previously `atmos auth login` launched the browser flow once per provider, contradicting AWS's own "credentials have been shared successfully" single-sign-in experience.
- The legacy flow re-ran the full browser interaction on every ~1h access-token expiry and keyed its cache by provider name, so renaming a provider silently invalidated a still-valid token. Both are eliminated here with zero `atmos.yaml` config changes.
references
- PRD: `docs/prd/aws-sso-session-support.md`
- AWS CLI token provider docs: https://docs.aws.amazon.com/cli/latest/userguide/sso-configure-profile-token.html
- AWS SDK for Go v2 `ssocreds`: https://pkg.go.dev/github.com/aws/aws-sdk-go-v2/credentials/ssocreds
Summary by CodeRabbit
- New Features
  - Shared AWS SSO sessions across providers for the same portal (start URL + region), reducing duplicate logins and browser prompts.
  - Silent refresh via refresh tokens to renew credentials without a browser; per-session locking prevents concurrent device-auth flows.
  - Session-keyed on-disk cache (compatible with AWS SDK patterns); logout clears shared session data; added session telemetry.
- Documentation
  - Product spec and blog post describing session sharing, cache format, refresh behavior, and rollout plan.
- Tests
  - Added/updated tests validating session sharing, cache semantics, isolation, refresh logic, and concurrency.
feat: implement !append YAML function for list concatenation @osterman (#1513)
what
- Implements the `!append` YAML function that allows fine-grained control over list merging behavior in Atmos stack configurations
- Lists tagged with `!append` will be concatenated with base values instead of replaced
- Adds comprehensive unit tests and integration test fixtures
why
- Resolves the ongoing challenge of needing to concatenate lists on a case-by-case basis
- Currently, users have to fall back to using maps instead of lists when they need append behavior
- This is particularly important for fields like `depends_on` where appending is often the desired behavior rather than replacement
- The `!append` tag provides opt-in, per-field control that works alongside the global `list_merge_strategy` setting
Key Features
- Opt-in behavior: Only lists explicitly tagged with `!append` use append mode
- Works alongside global settings: The `!append` tag works independently of the global `list_merge_strategy` setting
- Nested support: Works with deeply nested configurations
- Backward compatible: No impact on existing configurations without the tag
Example Usage
```yaml
# base.yaml
components:
  terraform:
    eks:
      settings:
        depends_on:
          - vpc
          - iam-role

# override.yaml
components:
  terraform:
    eks:
      settings:
        depends_on: !append # This tag indicates append mode
          - rds
          - elasticache

# Result: depends_on = [vpc, iam-role, rds, elasticache]
```
Testing
- ✅ All unit tests pass
- ✅ Build succeeds without errors
- ✅ Linting passes with no issues
- ✅ Code follows Atmos conventions and patterns
references
- Linear issue: DEV-2980
- Documentation: `!append` YAML function
- Changelog: blog post `append-yaml-function`; roadmap milestone updated (Extensibility initiative)
Summary by CodeRabbit
- New Features
  - Added a !append YAML function to append items to lists during configuration merging (per-field, preserves order, supports nested lists/maps, interacts with global list-merge strategies).
- Tests
  - Added comprehensive unit and integration tests covering append-tag helpers, parsing, merging, and end-to-end scenarios.
- Documentation
  - Added docs, examples, blog post, and index updates explaining !append usage and behavior.
- Chores
  - Updated website roadmap/metadata and package config; added a sentinel error alias.
feat: add !unset YAML function to delete keys from configuration @osterman (#1521)
what
- Add new `!unset` YAML function that completely removes keys from configuration during inheritance and merging
- Implement processing in both stack merging (`yaml_func_utils.go`) and config loading (`process_yaml.go`)
- Add comprehensive unit tests for all functionality
- Create documentation with examples and use cases
- Update YAML functions index documentation
why
- Users need a way to explicitly remove inherited configuration values, not just override them with `null`
- Current workarounds require physically removing or commenting out keys in parent configurations
- This addresses GitHub issue #227: "A YAML way of undefining a value without removing the key"
- Provides fine-grained control over configuration inheritance in complex stack hierarchies
Key Features
- Complete removal: Unlike setting to `null`, `!unset` completely removes the key from configuration
- Inheritance control: Child configurations can remove values inherited from parents
- Works everywhere: Functions in all Atmos configuration sections (vars, settings, env, metadata, etc.)
- Type-safe: Operates after YAML parsing, ensuring no syntax breakage
- Respects skip list: Can be disabled via skip list if needed
Examples
Basic Usage
```yaml
# parent.yaml
components:
  terraform:
    vpc:
      vars:
        enable_nat_gateway: true
        enable_vpn_gateway: true

# child.yaml
import:
  - parent

components:
  terraform:
    vpc:
      vars:
        enable_vpn_gateway: !unset # Completely removes this key
```
Removing Nested Values
```yaml
config:
  database:
    host: "prod.db.example.com"
    backup_enabled: true

# Override:
config:
  database:
    backup_enabled: !unset # Remove backup config
    host: "dev.db.example.com"
```
Testing
All tests pass:
- ✅ Unit tests for config processing
- ✅ Unit tests for stack processing
- ✅ Integration tests with other YAML functions
- ✅ Skip list functionality tests
- ✅ Inheritance scenario tests
references
Summary by CodeRabbit
- New Features
  - Added a YAML !unset function to remove keys or list items during config processing and inheritance. Works at any depth, supports multiple unsets, and coexists with other YAML functions.
- Tests
  - Introduced comprehensive tests covering flat and nested structures, arrays, multiple/nested unsets, inheritance scenarios, and edge cases.
- Documentation
  - Added dedicated docs and examples for !unset, including usage in stack manifests, nested removals, list handling, and guidance on expected behavior.
feat(imports): cache remote stack-import clones (dedup + opt-in TTL) @osterman (#2571)
what
- Clone each remote (Git) stack-import source repository at most once per Atmos invocation instead of once per import: all subdir imports of the same repo now resolve from a single shared clone (within-run dedup, spanning both `describe affected` passes).
- Add an opt-in `ttl` to reuse the cloned source across runs until it expires: per-import (`ttl:` in the import map form) and a global `imports.ttl` default in `atmos.yaml`. With no `ttl`, the source refreshes once per run so mutable refs like `?ref=main` stay fresh.
- Wire the default git-subdir resolve path through the existing `ensureSourceDir`, add per-session fetch tracking + TTL freshness (timestamp persisted in the `.atmos-source-ready` marker), and extract a shared `duration.IsExpired`/`IsZeroTTL` that the source provisioner now reuses.
- Update JSON schemas, add unit tests, document "Caching Remote Imports" in `stacks/imports.mdx`, add a changelog blog post, and add a roadmap milestone.
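The two `ttl` levels could be sketched roughly as follows. The repository URL is a hypothetical placeholder, and the duration syntax (`1h`, `24h`) is an assumption; only the `imports.ttl` key and the per-import `ttl:` in the import map form come from the description above.

```yaml
# atmos.yaml: global default for remote stack imports (sketch)
imports:
  ttl: 1h   # assumed duration syntax

---
# Stack manifest: per-import ttl using the import map form (sketch)
import:
  - path: "github.com/acme/infra-catalog//stacks/catalog/vpc?ref=main"   # hypothetical repo
    ttl: 24h
```

Leaving `ttl` out keeps the default behavior described above: the source is refreshed once per run, so a mutable ref such as `?ref=main` does not go stale.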
why
- For hub-and-spoke repos pulling a shared catalog via remote imports, `atmos describe affected` was re-cloning the hub repo once per import (~68–87×/run, ~7–11 min total), and a warm `actions/cache` of `~/.cache/atmos/stack-imports/` was ignored because the subdir path re-cloned unconditionally.
- Within-run dedup collapses those clones to one per repo (the ~80% win, no staleness risk); the opt-in `ttl` lets CI reuse the clone across runs (warm cache skips the clone entirely) while keeping mutable refs fresh by default. Shallow clones (`depth=1`) were already in use; the win is not re-cloning the same repo repeatedly.
references
- Cached sources live under the XDG cache dir (`~/.cache/atmos/stack-imports/`, honoring `XDG_CACHE_HOME`).
- Builds on the source-provisioning TTL mechanism (`pkg/duration`, `pkg/provisioner/source`).
- Changelog: `website/blog/2026-06-05-faster-remote-stack-imports.mdx`
Summary by CodeRabbit
- New Features
  - Per-import `ttl` and global `imports.ttl` for optional cross-run caching of remote stack imports.
  - Each unique remote Git source is cloned at most once per invocation and shared across nested imports.
  - Improved cache freshness semantics, including explicit zero-ttl behavior.
- Documentation
  - Added caching guide, TTL examples, XDG cache guidance, and a blog post.
- Tests
  - Added tests for TTL parsing/expiration and remote import caching behavior.
[codex] consolidate terraform bulk execution on scheduler @shirkevich (#2466)
Summary
- route Terraform `--all`, `--components`, and `--query` through the scheduler-backed Terraform adapter
- build Terraform dependency graphs from `dependencies.components` first, with `settings.depends_on` fallback
- preserve query-path auth manager setup, store resolver bridging, YAML function processing, and per-component CI hook capture
- includes #2348 identity/auth fixes in this stack so local `--identity terraform` testing works
- include the credential-store concurrency-safety prerequisite discovered by concurrency validation
- keep effective scheduler concurrency fixed at `1` for this PR
Stacking
This PR is stacked on PR 2 and targets codex/dag-scheduler-core.
PR 4 is #2468 and is stacked on this branch to introduce plan-only --max-concurrency wiring.
Supersedes the earlier fork-headed draft #2462 now that the stack branches exist in cloudposse/atmos.
Draft note
This branch is back to the intended PR 3 review shape: Terraform --all, --components, and --query share the graph-backed scheduler path, but execution remains sequential.
The temporary ATMOS_EXPERIMENTAL_DAG_MAX_CONCURRENCY validation hook has been removed. User-visible plan concurrency now belongs to PR 4.
This branch retains the narrow credential-store concurrency-safety prerequisite discovered during validation:
- credential-store initialization no longer mutates global Viper env bindings per component and preserves `ATMOS_KEYRING_TYPE` precedence
Validation
- `go test ./pkg/scheduler ./pkg/scheduler/adapters ./internal/exec -run TestExecuteTerraformQuery|TestExecuteTerraformQueryNoMatches|TestBuildTerraformDependencyGraph|TestExecuteTerraformAllUsesGraphBackedSequentialOrder|TestExecuteTerraformComponentsUsesGraphBackedSequentialOrder|TestExecuteTerraformQueryUsesGraphBackedSequentialOrder|TestExecuteTerraformKeepsIndependentComponentsSequential|TestBuildTerraformGraph`
- `go test ./pkg/auth/credentials`
- `go test -race ./pkg/auth/credentials -run TestNewCredentialStoreWithConfig_ConcurrentInitialization`
- `go test ./pkg/auth ./internal/exec -run TestCreateAndAuthenticateManagerWithAtmosConfig|TestSetupTerraformAuth|TestProcessComponentConfig_PropagatesAuthManager|TestProcessComponentConfig_AuthManagerGuardBranches`
- built `build/atmos` and live-tested against a downstream stack with `terraform plan --all` and an explicit identity
Validation findings carried forward
- The first concurrency-4 validation run exposed an auth race: per-component credential-store initialization called global `viper.BindEnv`, causing `fatal error: concurrent map writes`. This PR fixes that narrowly in `pkg/auth/credentials`.
- Higher-concurrency validation also showed local Terraform working-directory contention when multiple logical aliases share one physical Terraform component directory. PR 4 keeps path-based locking while introducing plan concurrency.
Follow-up discussion
The longer-term way to unlock true parallelism for aliases sharing one physical Terraform folder would be per-node isolated workdirs plus isolated TF_DATA_DIR and generated files. That needs repo-owner discussion because it changes the operator debugging model: Atmos would need to decide whether and how to retain those per-node copies for inspection, how atmos terraform shell maps to them, and how cleanup/debug artifacts are managed.
Summary by CodeRabbit
- New Features
  - Graph-backed Terraform scheduler with deterministic dependency order, reversed destroy order, per-resource serialization, concurrency control, per-component output capture/hooks, and signal-aware cancellation.
  - New Terraform run options: --failure-mode, --max-concurrency, log-order, hide (including no-changes), and execution-summary file.
  - Line-prefixing writer for prefixed log output.
- Bug Fixes
  - Credential keyring type now respects ATMOS_KEYRING_TYPE and is safe for concurrent init.
  - Workdir sync/hash skips Terraform/OpenTofu runtime dirs.
  - More tolerant Git repo opening for worktrees.
- Tests
  - Large expansion of tests covering scheduler behavior, CLI options, concurrency, logging, auth, and new utilities.
feat: install Atmos from a branch or tag with --use-version=ref: @osterman (#2569)
what
- Add a `ref:<name>` version spec to `--use-version` (and `version.use` in `atmos.yaml` / `ATMOS_USE_VERSION`) that installs Atmos from the latest commit of a branch or tag, e.g. `atmos --use-version=ref:main version`.
- Accepts branch names, tag names, and slash-qualified refs for disambiguation: `ref:main`, `ref:release/v1.199`, `ref:v1.199.0`, `ref:heads/main`, `ref:tags/v1.199.0`.
- Resolves the ref to its full commit SHA via the GitHub API, then reuses the existing `sha:` install/cache path unchanged; ref versions always re-execute and fail hard on resolution errors.
- Docs (`version/use.mdx`), a `minor` blog post, and a roadmap milestone.
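For the config-file form, a minimal sketch of pinning to a branch via `version.use` might look like the fragment below. The nesting follows the dotted key named above; quoting the value is a defensive choice here, not a documented requirement.

```yaml
# atmos.yaml (sketch)
version:
  use: "ref:main"   # slash-qualified forms such as "ref:tags/v1.199.0" disambiguate a tag from a branch
```

The same spec is accepted on the command line (`--use-version=ref:main`) and through `ATMOS_USE_VERSION`, per the description above.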
why
- Previously `--use-version` only accepted PR numbers (`pr:1234`), commit SHAs (`sha:ceb7526`), and releases; a branch name like `main` was rejected, even though branch/tag pushes already publish the same `build-artifacts-*` from the `Tests` workflow.
- `ref:` lets you pin a moving target once (`ref:main`) instead of chasing a new `sha:` after every merge, making it trivial to test unreleased fixes on a branch.
- The ref is re-resolved on every run so a mutable branch always tracks the latest build, while the SHA-keyed cache avoids reinstalling when the ref hasn't moved. Resolving to the full SHA also sidesteps GitHub's `head_sha` filter, which only matches full (not short) SHAs.
references
- Docs: Version Pinning
- Changelog: `website/blog/2026-06-04-use-version-ref.mdx`
Summary by CodeRabbit
- New Features
  - Support for git branches/tags via --use-version=ref: (resolves refs to commit SHAs and uses existing artifact download/cache).
- Behavior Changes
  - CI artifact selection now prefers the newest workflow run that contains the platform artifact (may pick in-progress or failed runs if they include the artifact).
  - Re-exec/version switching treats ref: like immutable versions (resolve → install/cache).
- Bug Fixes
  - Clearer, user-friendly error when a ref does not exist (with actionable hints).
- Documentation
  - Added CLI docs, blog post, and roadmap entry describing ref: usage and caching.
feat: Add custom component types for custom commands @osterman (#1904)
Summary
- Implement shell completion for semantic-typed flags and arguments (component/stack types)
- Add interactive prompting for missing required semantic-typed values
- Support custom component types in shell completions
What Changed
- New custom component type provider system (`pkg/component/custom`)
- Shell completion for semantic-typed arguments and flags in custom commands
- Interactive prompting for missing required semantic-typed values
- Extended command schema to support semantic types and components
- Comprehensive test coverage for completion and prompting functionality
Why This Matters
This feature enables custom commands to provide a better developer experience through:
- Tab completion for component and stack arguments/flags
- Interactive prompts for required semantic-typed values
- Support for custom component types beyond built-in types
References
Summary by CodeRabbit
- New Features
  - Custom component types with registry support, CLI integration, and template access to resolved component data.
  - Enhanced CLI semantic completion and interactive prompting for selecting component and stack values.
  - Aggregated component listing across stacks for discovery and completion.
- Documentation
  - New guides, examples, and blog post demonstrating custom component types and workflows.
  - Schema updates to validate custom component manifests.
- Tests
  - Broad test coverage for completion, providers, processing, and stack handling.
docs(gists): add Atmos + Packer + GitHub Actions AMI pipeline gist @aknysh (#2560)
what
- Add a new gist at `gists/aws-ami-packer-github-actions/` demonstrating an end-to-end AWS AMI pipeline with Atmos + Packer + GitHub Actions:
  - Build a hardened Amazon Linux 2023 AMI with Packer, orchestrated by Atmos.
  - Validate it on a live test instance, optionally scan it, and gate promotion behind a manual approval.
  - Tag the approved image `ScanStatus=approved` and share it across AWS accounts.
- Drive the whole build from stack configuration (no hardcoded HCL) and operate the result through a tree of `atmos ami` custom commands (get-ami-id, tag, share, launch/terminate test instances, …).
- Include reference IAM/OIDC policies and an org SCP that enforces "launch only approved AMIs".
- Wire the gist into the docs-site file browser (tags + related-docs links) and announce it with a blog post.
why
- "How do I use Atmos + Packer to build AMIs, and automate the build → approve → share process?" is a frequent community question. This gist is a vendor-neutral, copy-and-adapt reference recipe that combines several Atmos features into one production-shaped workflow.
- Like all gists, it's shared as-is (not part of the CI-tested examples), so users adapt it to their environment and Atmos version.
references
- Gist: `gists/aws-ami-packer-github-actions/`
- Blog post: `website/blog/2026-06-01-gist-aws-ami-packer-github-actions.mdx`
Summary by CodeRabbit
- New Features
  - Added a complete gist showing an end-to-end AMI build/validate/approve/share pipeline using Atmos + Packer + GitHub Actions, with reusable setup and tool-install steps, approval gate, optional vulnerability scan, and cross-account sharing.
- Documentation
  - Added detailed README, customization checklist, policy templates, and a blog post documenting setup, governance (OIDC, IAM, SCP), local execution, and cleanup guidance.
feat: add !git.* repository YAML functions and atmos.Resolve template func @osterman (#2558)
what
- Add five new `!git.*` YAML functions that expose Git repository metadata from the `origin` remote: `!git.repository` (the `<owner>/<repo>` slug, e.g. `cloudposse/atmos`), `!git.owner`, `!git.name`, `!git.host`, and `!git.url`.
- Add the `atmos.Resolve` template function, which evaluates any Atmos YAML-function string (`!git.*`, `!exec`, `!store`, `!terraform.output`, …) at template-render time so its result can be composed with other strings and template variables in a single value.
- The new YAML functions are parsed generically (GitHub/GitLab/Bitbucket/Azure DevOps), support a fallback value, and work in both stack/component processing and `atmos.yaml` config preprocessing.
- Includes unit tests, per-function docs, two changelog posts, a roadmap update, and a follow-up PRD.
why
- Users needed the repository slug (and its parts) for tagging resources and building backend paths, previously only achievable by shelling out via `!exec echo ${GITHUB_REPOSITORY:-$(git remote get-url origin | sed …)}`.
- A bare YAML tag owns the entire scalar and Atmos renders Go templates before YAML functions, so composing a function result with extra text (e.g. prefixing `workspace_key_prefix` with the repo slug) was impossible without `!exec`; `atmos.Resolve` makes that composition native: `workspace_key_prefix: '{{ atmos.Resolve .settings.context.repo }}/{{ or .metadata.name .metadata.component }}'`
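The composition above can be put in context with a short stack fragment. This is a sketch under assumptions: it supposes `settings.context.repo` is declared with the bare `!git.repository` tag, that the component is named `vpc`, and that it uses an S3 backend; only the `workspace_key_prefix` template itself is taken from the PR text.

```yaml
# Stack manifest (sketch; component name and backend type are illustrative)
settings:
  context:
    # Assumed declaration: a bare YAML function that atmos.Resolve evaluates at template-render time.
    repo: !git.repository

components:
  terraform:
    vpc:
      backend:
        s3:
          # Repo slug + component name, composed in a single value.
          workspace_key_prefix: '{{ atmos.Resolve .settings.context.repo }}/{{ or .metadata.name .metadata.component }}'
```

For a repository whose `origin` is `cloudposse/atmos`, the intent is a prefix of the form `cloudposse/atmos/<component name>`.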
references
- Extends the existing Git YAML function family from the Git YAML Functions changelog.
- Docs: `/functions/yaml/git.repository`, `/functions/template/atmos.Resolve`.
- Follow-up: `docs/prd/lazy-yaml-function-template-values.md` (lazy-Stringer auto-deref so `{{ .settings.context.repo }}` evaluates without `atmos.Resolve`).
Summary by CodeRabbit
- New Features
  - Added Git repository metadata YAML functions (!git.repository, !git.owner, !git.name, !git.host, !git.url).
  - Added atmos.Resolve template function to evaluate YAML functions during template rendering for inline composition.
- Documentation
  - Added PRD, docs pages, blog posts, and roadmap entries describing the new YAML functions and atmos.Resolve.
- Tests
  - Added tests covering Git YAML tag resolution and the new template Resolve behavior.
- Chores
  - Updated link-checker configuration to exclude slow/intermittent targets.
feat(stacks): template variables in import paths from earlier imports @osterman (#2554)
what
- Render Go templates in stack `import:` paths (local paths and a remote import's Git `?ref=`) against the `settings`/`vars`/`env` accumulated from imports listed earlier in the same manifest, plus the import's own `context`.
- A single variable (e.g. `settings.context.deployment_repo_version`, set once in a `_defaults`) can now pin both a remote catalog import's ref and the component `source.version`.
- Only the import path string is rendered; imported file content templating and its deferral are unchanged. Missing values are a hard error (with hints) unless `ignore_missing_template_values` is set; `skip_templates_processing` or a disabled templating engine leaves the path literal.
- Adds the `ErrImportPathTemplate` sentinel, a fixture scenario + unit tests, docs ("Referencing Earlier Imports in Import Paths"), a changelog blog post, and a roadmap milestone.
why
- Keep `dev` and `prod` in one repo while isolating prod from dev changes: dev uses local catalogs/sources, prod imports a versioned catalog and pins the component source to an immutable ref, both driven by one variable.
- Previously the component `source.version` template worked (resolved late, at component processing) but the import `?ref=` had to be hard-coded, because imports are resolved before that context exists. This closes that gap so both come from the same variable.
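The ordering dependency can be illustrated with a small manifest sketch. The file paths and repository URL are hypothetical; the fragment assumes the first import sets `settings.context.deployment_repo_version`, as in the `_defaults` example mentioned above.

```yaml
# prod stack manifest (sketch; paths and repo are placeholders)
import:
  # Listed first: assumed to define settings.context.deployment_repo_version
  - orgs/acme/prod/_defaults
  # Listed later: the ?ref= is rendered against settings accumulated from the import above
  - "github.com/acme/infra-catalog//stacks/catalog/vpc?ref={{ .settings.context.deployment_repo_version }}"
```

Order matters here: only values from imports listed earlier in the same manifest are available when a path is rendered, so swapping the two entries would leave the variable undefined and, by default, produce a hard error.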
references
- Docs: `/stacks/imports#referencing-earlier-imports-in-import-paths`
- Builds on remote stack imports (#2528) and the git context YAML functions (#2537)
Summary by CodeRabbit
- New Features
  - Import paths now support Go-template rendering, letting paths reference settings, vars, and env from earlier imports in the same manifest.
- Bug Fixes
  - Templating failures in import paths now surface a clear error; options added to ignore or skip unresolved import templates.
- Documentation
  - Added docs and a blog post with examples and operational guidance for templated import paths.
Add ECR Public authentication: `aws/ecr-public` integration and `atmos aws ecr login --public` @osterman (#2231)
what
Add ECR Public authentication to Atmos for authenticated access to public.ecr.aws, solving Docker rate limiting on public ECR images. Two entry points:
- `atmos aws ecr login --public`: direct, zero-config login using ambient AWS credentials (the AWS SDK default chain: env, shared config/profile, SSO, IMDS/IRSA/ECS), or `--public --identity <name>` to use a specific identity. Ideal for CI.
- `aws/ecr-public` integration kind: for automatic login on `atmos auth login` and identity linking.
Key changes:
- Command (`cmd/aws/ecr/login.go`): new `--public` flag on `atmos aws ecr login`; ambient-credential and identity-based ECR Public login paths; mutually exclusive with a positional integration argument and `--registry`.
- Cloud layer (`pkg/auth/cloud/aws/ecr_public.go`): `GetPublicAuthorizationToken()` calls `ecrpublic:GetAuthorizationToken`, always in us-east-1.
- Integration layer (`pkg/auth/integrations/aws/ecr_public.go`): `ECRPublicIntegration` factory registering the `aws/ecr-public` kind, with region validation at config time. Implements the full `Integration` interface including `Cleanup()` (docker logout) and `Environment()` (`DOCKER_CONFIG`).
- Region validation: rejects unsupported regions (only us-east-1 and us-west-2 have service endpoints; auth is us-east-1 only).
- Tests: cloud-layer and integration-layer unit tests (token retrieval, region validation, cleanup, error handling) with a generated mock ECR Public client; command tests for the `--public` flag and mode validation.
- Documentation: `atmos aws ecr login` command reference (added `--public` flag), ECR authentication tutorial, and a PRD (`docs/prd/ecr-public-authentication.md`).
- Blog post + roadmap: announcement and a shipped milestone linking to the changelog.

Note: this branch has been merged up to `main`. Following #2144 (`atmos auth ecr-login` → `atmos aws ecr login`), ECR login lives under the `aws` namespace, and the integration was adapted to `main`'s evolved `Integration` interface (exported `BuildAWSConfigFromCreds`, new `Cleanup`/`Environment` methods).
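For the integration entry point, a minimal configuration sketch might look like the fragment below. Only the `aws/ecr-public` kind comes from this PR; the `auth.integrations` nesting, the `via.identity` binding, and both names are assumptions for illustration and should be checked against the ECR authentication tutorial.

```yaml
# atmos.yaml (sketch; names and nesting are assumptions)
auth:
  integrations:
    ecr-public:               # hypothetical integration name
      kind: aws/ecr-public
      via:
        identity: ci-deployer   # hypothetical identity defined elsewhere in the auth config
```

In CI, where ambient credentials are already present, the direct `atmos aws ecr login --public` path needs no configuration at all.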
why
Docker pulls from public.ecr.aws hit rate limits when unauthenticated. This blocks CI workflows, especially those using cloudposse/github-action-docker-build-push which pulls BuildKit/binfmt images on every run. Authenticated pulls have significantly higher (or no) rate limits.
Because public.ecr.aws is global, any valid AWS credentials unlock authenticated pulls — so --public with ambient credentials "just works" in CI with zero configuration. ECR Public otherwise differs from private ECR: it uses the ecrpublic SDK service, a bearer token instead of SigV4, a hardcoded us-east-1 auth region, and a fixed public.ecr.aws registry URL. It requires ecr-public:GetAuthorizationToken and sts:GetServiceBearerToken IAM permissions.
references
- ECR Public Authentication Tutorial: configuration examples, multi-environment setup.
- `atmos aws ecr login` Command Reference: command usage, `--public` flag, integration configuration.
- ECR Public Blog Post: announcement and use cases.
- PRD: `docs/prd/ecr-public-authentication.md`.
- AWS Docs: ECR Public APIs.
Summary by CodeRabbit
- New Features
  - ECR Public authentication (aws/ecr-public) with atmos aws ecr login --public, identity-driven auto-provisioning, and enforced us-east-1 auth.
- Documentation
  - Tutorials, blog post, and roadmap updated with ECR Public examples, permissions, CI guidance, and troubleshooting.
- Bug Fixes
  - Improved identity selection UX (confirmation message) and safer CLI behavior for non‑TTY identity selection.
- Tests
  - Extensive unit and integration tests covering ECR Public flows and CLI routing.
- Chores
  - NOTICE/dependencies updated and minor .gitignore tweak.
feat(auth): Atmos Pro STS — JIT GitHub token broker for CI @osterman (#2546)
what
- Add a new auth provider kind: atmos/pro that authenticates the Atmos CLI to Atmos Pro by federating the GitHub Actions runner's OIDC token into an Atmos Pro session JWT (v1 is OIDC-only).
- Add a new auth integration kind: github/sts, a just-in-time GitHub token broker for CI. On login it mints short-lived, scoped GitHub App installation tokens via POST /api/v1/sts, materializes them as per-owner GIT_CONFIG_* URL rewrites (env or file mode), and revokes them at command-end (via atmos auth exec in CI) and on atmos auth logout.
- Add a passthrough kind: atmos/pro identity, a keyring-registered ProCredentials type, realm scoping for integration state, and via.provider binding for integrations (in addition to via.identity).
- Add ATMOS_PRO_GITHUB_TOKEN, preferred by Atmos-native git operations (vendoring, source: provisioning, go-getter) ahead of ATMOS_GITHUB_TOKEN/GITHUB_TOKEN.
- Add the PRD (docs/prd/atmos-pro-sts.md), a changelog blog post, a shipped roadmap milestone, and configuration docs; full unit-test coverage for the provider, identity, integration, keyring round-trip, via.provider matching, revoke gating, and token precedence.
why
- Fetching private Terraform modules, Atmos source: components, and vendored artifacts in CI today requires a long-lived, over-privileged GitHub credential (PAT, machine user, or deploy key) sitting in a CI secret, which is a standing breach risk that can't be scoped per-run.
- Atmos Pro STS replaces that with least-privilege, deny-by-default, short-lived tokens minted at the start of a run and revoked at the end, with zero .tf changes (the injected GIT_CONFIG_* rewrites are honored by both go-getter and Terraform's native git), and multi-org support because tokens are minted per (installation, permission-set).
- Built into Atmos CLI core (CI-native, OIDC-aware) rather than as a GitHub Action, modeled on the existing aws/ecr / aws/eks integrations; the workflow only needs permissions: id-token: write.
references
- PRD: docs/prd/atmos-pro-sts.md (includes deferred Future Work: moving Pro connection config under auth, unifying pkg/pro onto auth-issued sessions, and broadening command-end revoke beyond atmos auth exec)
- Changelog: website/blog/2026-05-29-atmos-pro-github-sts.mdx
Summary by CodeRabbit
- New Features
  - Atmos Pro GitHub token broker: new atmos/pro provider + github/sts integration for just-in-time GitHub tokens (env or git-config modes) with realm-scoped state and optional token export.
  - ATMOS_PRO_GITHUB_TOKEN added as preferred GitHub token source.
  - CI-gated automatic token revocation on command exit/logout.
  - Ambient credential broker registry to auto-provision env vars for remote reads.
- Documentation
  - PRD, docs, and blog post for Atmos Pro STS and usage guidance.
docs: re-date custom commands step types blog post to 2026-05-30 @osterman (#2550)
what
- Re-dated the "25+ Interactive Step Types" blog post from 2026-01-03 to 2026-05-30.
- Renamed the file prefix (git mv, history preserved) and added a matching date: 2026-05-30 frontmatter field.
why
- Aligns the post's publish date with its actual release timing so it surfaces correctly in the changelog feed.
- Adds the explicit date: field to match the repo convention (e.g. 2026-05-28-git-yaml-functions.mdx).
- The slug is unchanged, so the published URL stays the same.
references
- N/A — docs-only date adjustment, no user-facing code change.
Summary by CodeRabbit
- Documentation
  - Published comprehensive guide to custom commands and workflow step types, featuring 25+ interactive step types with usage examples, including input collection, output formatting, and variable passing conventions for enhanced automation capabilities.
fix(website): use consistent brand-blue announcement bar @osterman (#2551)
what
- Removed the per-announcement backgroundColor/textColor overrides from website/src/data/announcements.js so every announcement bar entry inherits the brand-blue (#3578e5) / white-text defaults from the --announcement-bar-* CSS variables.
- Documented the convention in the file header so future announcements don't reintroduce per-entry colors.
why
- The announcement bar cycled through a rainbow of saturated Tailwind-600 colors (emerald green, violet, cyan, amber, red, indigo, teal...) that looked like "crayola" against the site's dark, near-black theme.
- A single restrained, on-brand color reads as sophisticated and consistent with the rest of the dark site, and matches the bar's original styling.
references
- N/A (website cosmetic change)
Summary by CodeRabbit
- Refactor
  - Standardized announcement bar styling configuration to use shared CSS variables instead of per-announcement color settings, improving consistency across announcements.
feat: Implement workflow step types with registry pattern (DEV-263, DEV-2969) @osterman (#1899)
what
- Add 20+ step types across 4 categories (Interactive, Output, UI, Command) with extensible registry pattern
- Support Go template variable passing between steps (e.g., {{ .steps.step1.value }})
- Implement per-step output modes: viewport (pager), raw (passthrough), log (grouped), none (silent)
- Interactive handlers with TTY detection and clear error messages in CI environments
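The variable passing described above can be illustrated with Go's text/template package. This is a minimal sketch rather than the Atmos implementation: renderStep and the steps, step name, field data layout are assumptions chosen to match the {{ .steps.step1.value }} syntax.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// renderStep expands a step's template against values captured from earlier
// steps. The data layout (steps -> step name -> field) is an assumption that
// mirrors the {{ .steps.step1.value }} syntax; it is not taken from Atmos.
func renderStep(text string, steps map[string]map[string]string) (string, error) {
	// missingkey=error turns a reference to an unknown step into an error
	// instead of rendering a placeholder.
	tmpl, err := template.New("step").Option("missingkey=error").Parse(text)
	if err != nil {
		return "", err
	}
	var out strings.Builder
	if err := tmpl.Execute(&out, map[string]any{"steps": steps}); err != nil {
		return "", err
	}
	return out.String(), nil
}

func main() {
	steps := map[string]map[string]string{
		"step1": {"value": "prod"},
	}
	cmd, err := renderStep("deploy --env {{ .steps.step1.value }}", steps)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(cmd)
}
```

Failing on an unknown step name is a design choice of this sketch; it surfaces a typo in a step reference early instead of passing an empty value to a later command.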
why
Addresses DEV-263 (add input type to workflows) and DEV-2969 (add viewport support). Enables users to build complex multi-step workflows with user interaction, conditional execution, and flexible result display.
references
Summary by CodeRabbit
Release Notes
- New Features
  - Added 25+ interactive step types for workflows and custom commands (input, confirm, choose, filter, file, write, markdown, spin, table, style, and more).
  - Support for configurable output modes (viewport, raw, log, none) and step-level display options.
  - Workflow progress rendering and status indicators.
- Documentation
  - Comprehensive guides for interactive workflows and custom commands with step type reference.
  - New examples demonstrating interactive deployments, credentials collection, and multi-step flows.
- Bug Fixes
  - Improved error messaging for workflow step validation and execution failures.
Add process and I/O execution foundation @shirkevich (#2464)
Summary
This is PR 1 for the DAG concurrent execution rollout. It introduces the reusable process and stream-isolation foundation without enabling scheduler behavior or changing Terraform bulk routing.
Changes:
- Add pkg/process with Runner, TaskSpec, Streams, Result, default os/exec runner, context-aware execution, cancellation reporting, and exit-code preservation.
- Extend pkg/io with prefixed per-node stream composition for terminal, file, and capture sinks.
- Refactor internal/exec.ExecuteShellCommand() into a backward-compatible wrapper over pkg/process while preserving CI stdout/stderr capture options.
- Replace the runTerraformShow() global os.Stdout swap with injected stdout capture.
Scope
No scheduler, CLI routing consolidation, concurrency flags, or Terraform adapter behavior is enabled in this PR.
Stacking
This PR is the bottom of the DAG rollout stack and targets main.
Supersedes the earlier fork-headed draft #2459 now that the stack branches exist in cloudposse/atmos.
Validation
rtk env GOCACHE=/private/tmp/atmos-gocache GOMODCACHE=/private/tmp/atmos-gomodcache go test ./pkg/process ./pkg/io ./internal/exec ./cmd/terraform
Next PR
PR 2 branches from codex/dag-process-io-foundation and adds the generic pkg/scheduler core with ready-queue scheduling, bounded workers, deterministic aggregate results, and isolated unit tests only.
Summary by CodeRabbit
Release Notes
- New Features
  - Configurable subprocess execution with optional contexts and injectable streams
  - Composable, scope-scoped output writers with per-line prefixing and masking
- Bug Fixes
  - More accurate subprocess exit/error reporting and improved stream-redirection behavior
- Tests
  - Expanded unit tests for subprocess execution, stream injection/capture, and output utilities
- Documentation
  - Updated concurrent execution docs to reflect stream-based output handling
Add core git YAML functions @osterman (#2537)
what
- Add core Git YAML functions: !git.root, !git.sha, !git.branch, and !git.ref.
- Resolve Git metadata through pkg/git, with pkg/utils limited to compatibility shims and YAML tag registration.
- Wire Git tag resolution through config preprocessing, stack/component YAML processing, and function registry metadata.
- Add a changelog post and roadmap milestone for the new Git YAML functions.
why
- Allow dev stack/component source versions to pin to the current Git SHA via !git.ref.
- Keep prod pins explicit while giving dev environments PR-aware source refs.
- Avoid expanding pkg/utils by placing Git behavior in the self-contained Git package.
references
- n/a
Summary by CodeRabbit
- New Features
  - Added Git YAML tags (!git.root / !repo-root, !git.sha, !git.ref, !git.branch) to resolve repo root, commit SHA/ref, and branch in configs and stacks; !git.ref can pin source versions.
- Refactor
  - Centralized git tag resolution for consistent behavior, alias support, unified fallbacks, and clearer error handling.
- Tests
  - Expanded coverage for tag resolution, fallbacks, detached-HEAD behavior, and real-repo scenarios.
- Documentation
  - Updated blog post and roadmap with examples and usage notes.
🚀 Enhancements
fix(stacks): honor component list_merge_strategy in metadata.inherits… @JaseKoonce (#2565)
what
- settings.list_merge_strategy set on a component now applies when merging lists via metadata.inherits
- Adds tests covering append, replace, and merge strategies across single and multi-level inheritance chains
why
- Component-level list_merge_strategy was only honored on the import/stack merge path (fixed in #2480). The metadata.inherits resolution path always used the global atmosConfig, so per-component overrides were silently ignored
- A component with list_merge_strategy: append inheriting two bases would get last-wins ([from_b]) instead of the expected accumulated result ([from_a, from_b])
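The difference between the strategies can be sketched with plain string lists. mergeLists below is a hypothetical helper, not Atmos code, and its merge case is a simplification: it overrides scalars by index, whereas a real merge would presumably deep-merge structured items at each index.

```go
package main

import "fmt"

// mergeLists combines the list values contributed by each base, in
// inheritance order, according to the given strategy. Hypothetical helper.
func mergeLists(strategy string, lists ...[]string) []string {
	var out []string
	for _, l := range lists {
		switch strategy {
		case "append":
			// Accumulate: later bases are added after earlier ones.
			out = append(out, l...)
		case "merge":
			// Merge by index: a later value overrides the same position.
			for i, v := range l {
				if i < len(out) {
					out[i] = v
				} else {
					out = append(out, v)
				}
			}
		default:
			// "replace": the last list wins outright.
			out = append([]string(nil), l...)
		}
	}
	return out
}

func main() {
	a, b := []string{"from_a"}, []string{"from_b"}
	fmt.Println(mergeLists("append", a, b))
	fmt.Println(mergeLists("replace", a, b))
}
```

With the two bases from the example above, the append strategy yields the accumulated list and the replace strategy yields only the last base's list, which is the behavior the component previously got regardless of its setting.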
references
Summary by CodeRabbit
- Improvements
  - Component inheritance now applies per-component list merge strategies during metadata-based inheritance so inherited lists are accumulated, replaced, or merged by index according to the inheriting component’s settings across multi-level chains.
- Tests
  - Added integration tests and fixture scenarios validating append, replace, multi-level append, and merge-by-index behaviors for metadata inheritance.
fix(auth): unwrap Atmos Pro envelope in github/sts mint @osterman (#2568)
what
- Fix the github/sts auth integration ignoring a successfully minted Atmos Pro STS token because mint() decoded the response with a flat struct instead of the canonical API envelope.
- Add a shared, reusable primitive, dtos.Envelope[T] + pro.DecodeEnvelope[T], and route mint() through it so every Atmos Pro response unwraps the nested data payload through one sanctioned path.
- Fix the bug-masking test fixture (the simulated broker now emits the real envelope shape) and add a regression test asserting mint() persists 1 token, not 0, plus decoder unit tests including a canary that a flat payload decodes to empty data.
why
- Every Atmos Pro API route returns { "success": true, "status": 200, "data": { "tokens": [...], "excluded": [...] } }, but mint() decoded straight into the flat stsResponse (top-level tokens), so it always read 0 tokens. The CLI logged GitHub STS: no tokens granted, never wrote the git insteadOf config, and cross-repo import: calls fell back to the ambient GITHUB_TOKEN and failed with remote: Repository not found, even though the server had minted a valid token (HTTP 200, so no error surfaced).
- The existing e2e test passed only because its simulated broker returned the unwrapped {tokens,excluded} shape the real server never sends; matching the fixture to the real envelope and adding the regression/canary tests prevents this whole class of "decoded a Pro response without the envelope" bug from recurring.
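The failure mode is easy to reproduce with encoding/json. The sketch below uses local stand-in types; Envelope and decodeEnvelope here are assumptions modeled on the response shape quoted above, not the actual dtos.Envelope[T] or pro.DecodeEnvelope[T] definitions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Envelope is a local stand-in for the canonical response wrapper.
type Envelope[T any] struct {
	Success bool `json:"success"`
	Status  int  `json:"status"`
	Data    T    `json:"data"`
}

// stsData models only the two payload fields named above; the token
// contents are left opaque because their schema is not given here.
type stsData struct {
	Tokens   []json.RawMessage `json:"tokens"`
	Excluded []json.RawMessage `json:"excluded"`
}

// decodeEnvelope unmarshals the wrapper and returns the nested data payload.
func decodeEnvelope[T any](body []byte) (T, error) {
	var env Envelope[T]
	if err := json.Unmarshal(body, &env); err != nil {
		var zero T
		return zero, err
	}
	return env.Data, nil
}

func main() {
	wrapped := []byte(`{"success":true,"status":200,"data":{"tokens":[{}],"excluded":[]}}`)
	flat := []byte(`{"tokens":[{}],"excluded":[]}`)

	got, _ := decodeEnvelope[stsData](wrapped)
	fmt.Println("wrapped tokens:", len(got.Tokens))

	// A flat payload has no "data" key, so it decodes to an empty payload
	// without any error. This is the silent zero-token case.
	canary, err := decodeEnvelope[stsData](flat)
	fmt.Println("flat tokens:", len(canary.Tokens), "err:", err)
}
```

Because JSON decoding ignores unknown and missing keys by default, a shape mismatch produces an empty result rather than an error, which is why the bug stayed hidden behind an HTTP 200.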
references
- mint() was the only Pro call bypassing the shared AtmosApiResponse envelope that ExchangeOIDCToken/LockStack already use.
Summary by CodeRabbit
- Bug Fixes
  - Clearer STS error messages and correct unwrapping of canonical API envelopes.
  - Prevent ambient tokens from being baked into Git URLs by honoring insteadOf rewrites (including file-mode).
  - Avoid invalid git checkout/fetch for empty refs by fetching default branch and skipping bad checkouts.
  - Warn when component source is misplaced under metadata and accept simple-form source strings.
- New Features
  - Provision credential brokers before Git source detection so token rewrites apply.
- Tests
  - Expanded tests covering envelope decoding, STS handling, broker provisioning, git insteadOf, and default-ref behavior.
- Documentation
  - Added fix notes on STS envelope/token-shadowing and updated PRD guidance for source.
fix(pro): respect metadata.enabled when uploading instances for drift @osterman (#2563)
what
- atmos list instances --upload now collapses the Atmos Pro enabled hierarchy (metadata.enabled > settings.pro.enabled > settings.pro.drift_detection.enabled) before uploading, so the values Atmos Pro persists already reflect any outer disable.
- A shared effectiveEnabledState helper is the single source of truth for both the upload payload (extractProSettings) and the success-toast counts, so they can no longer diverge.
- Disabled components are still uploaded (as pro.enabled: false) rather than omitted, so Atmos Pro shows them disabled instead of orphaning them.
- Reference docs corrected (settings/pro.mdx gains a settings.pro.enabled entry + precedence note; list/list-instances.mdx drops the now-false "preserved verbatim" / "drift is independent of pro.enabled" claims), plus a docs/fixes/ write-up.
why
- Components disabled upstream via metadata.enabled: false kept failing scheduled drift detection (dispatchError: "missing_plan_result", drift_status: error): the CLI skips planning them, but the upload serialized the raw settings.pro block and never sent metadata.enabled, so Atmos Pro (whose ingestion contract has no metadata field) persisted them as enabled:true, drift_enabled:true and legitimately dispatched drift.
- Fixing it in the CLI keeps the determination where it is already resolved and needs no Atmos Pro change: the stuck error rows self-heal to disabled on the next upload, with no data migration.
- pro.enabled defaults to true (matching the Pro server-side default) so the collapse only ever turns things off when an outer level is explicitly disabled; it never regresses default-enabled components.
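The collapse of the enabled hierarchy amounts to a chain of logical ANDs. The sketch below is an illustration under stated assumptions: collapseEnabled and proState are hypothetical names, not the effectiveEnabledState helper itself, and the caller is assumed to have already applied defaults to the three inputs.

```go
package main

import "fmt"

// proState is the pair of flags that would be uploaded. Hypothetical type.
type proState struct {
	Enabled      bool // effective pro.enabled
	DriftEnabled bool // effective drift_detection.enabled
}

// collapseEnabled applies the precedence
// metadata.enabled > settings.pro.enabled > settings.pro.drift_detection.enabled:
// an outer false forces every inner level to false.
func collapseEnabled(metadataEnabled, proEnabled, driftEnabled bool) proState {
	enabled := metadataEnabled && proEnabled
	return proState{
		Enabled:      enabled,
		DriftEnabled: enabled && driftEnabled,
	}
}

func main() {
	// Disabled upstream via metadata.enabled: false, raw Pro settings still true.
	fmt.Printf("%+v\n", collapseEnabled(false, true, true))
	// Everything enabled: nothing is turned off.
	fmt.Printf("%+v\n", collapseEnabled(true, true, true))
}
```

Since each level can only be narrowed by the levels above it, the collapse can turn flags off but never on, which matches the guarantee that default-enabled components are not regressed.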
references
- docs/fixes/2026-06-03-drift-dispatch-ignores-metadata-enabled.md (root-cause analysis, Neon instances evidence, verification steps)
- Source of truth for the disabled determination: internal/exec/component_utils.go (isComponentEnabled)
Summary by CodeRabbit
- Bug Fixes
  - Resolve upload so component enablement honors metadata.enabled, preventing metadata-disabled components from remaining scheduled for drift and correcting counts; disabled components are uploaded as disabled rather than omitted.
- Documentation
  - Clarify enablement precedence (metadata.enabled > settings.pro.enabled > drift_detection.enabled), upload behavior, and how effective Pro/drift state is reflected in UI counts.
- Tests
  - Add unit and end-to-end tests validating effective enablement resolution, drift counting, and uploaded payloads.
fix(auth): deduplicate ECR, ECR Public, and EKS integrations to once per process @MrZablah (#2564)
What
Adds a process-level execution cache to triggerIntegrations so that
auto-provisioned integrations (aws/ecr, aws/ecr-public, aws/eks)
fire at most once per atmos invocation, regardless of how many times
Authenticate is called or how many AuthManager instances are created.
The cache key is the integration's canonical target endpoint rather than
its config entry name:
- aws/ecr → "aws/ecr:<account_id>:<region>"
- aws/ecr-public → "aws/ecr-public" (single global registry)
- aws/eks → "aws/eks:<cluster_name>:<region>"
- everything else → integration name (no behaviour change)
This means two config entries that point at the same registry — e.g. one
from global atmos.yaml and one from a component stack file — are
collapsed to a single execution.
Why
atmos terraform plan calls Authenticate from at least three internal
paths: setupTerraformAuth, TerraformPreHook, and one call per YAML
function (!store.get, !terraform.state). With a 6-tool .tool-versions
this produced 6 ECR logins per command. Switching to a name-keyed cache
reduced it to 2 because merged configs can carry two integration entries
with different names for the same registry. Keying by target endpoint
reduces this to exactly 1.
Changes
- pkg/auth/manager_integrations.go: adds processIntegrationCache sync.Map, resetProcessIntegrationCache() (test helper), integrationTargetKey() (canonical key helper covering aws/ecr, aws/ecr-public, aws/eks); updates triggerIntegrations to use LoadOrStore on the target key.
- pkg/auth/manager_integrations_test.go: adds TestIntegrationTargetKey (table-driven tests for all key variants including ECR Public) and TestIntegrationTargetKey_Deduplication (verifies that two same-registry entries produce one cache hit).
Notes
aws/ecr-public was added to upstream/main in #2231 after this branch
diverged; coverage for it was added here to keep deduplication consistent
across all three AWS integration kinds.
references
ECR / ECR Public Login Executes Multiple Times Per atmos terraform Invocation
#2562
Summary by CodeRabbit
- New Features
  - Added process-level deduplication for auto-provisioned integrations to prevent redundant provisioning of the same target within a single process.
  - Failed provisioning attempts are evicted from the dedupe cache so retries can proceed.
- Tests
  - Added unit tests validating cache key behavior and deduplication scenarios to ensure consistent provisioning outcomes.
fix(auth): make github/sts compose with default GitHub token injection @osterman (#2557)
what
- Stop Atmos's go-getter token injection from silently shadowing github/sts-minted GitHub tokens: CustomGitDetector now skips URL token injection when a live GIT_CONFIG_* insteadOf rewrite already matches the URL's host/owner, so git's rewrite (carrying the correct least-privilege token) wins.
- Make the ATMOS_PRO_GITHUB_TOKEN bridge consistent: resolveToken falls back to the live env var (which the broker sets after startup), mirroring pkg/http/client.go.
- Default token_env to ATMOS_PRO_GITHUB_TOKEN (was empty) so a single-owner mint reaches gh/REST and Atmos's in-process git path automatically.
- Replace the ad-hoc {owner} placeholder with Atmos's standard Go-template syntax ({{ .owner }}, plus .host); update docs, PRD, and add a docs/fixes/ write-up.
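Git reads environment-provided configuration from GIT_CONFIG_COUNT plus indexed GIT_CONFIG_KEY_n / GIT_CONFIG_VALUE_n pairs, and a url.<base>.insteadOf entry rewrites any URL that starts with its value. The sketch below shows one way to detect such a live rewrite. hasInsteadOfRewrite is a hypothetical helper that uses plain prefix matching; the host/owner matching in CustomGitDetector may work differently.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// hasInsteadOfRewrite reports whether an environment-provided git config
// entry of the form url.<base>.insteadOf would rewrite the given URL.
// Hypothetical helper; getenv is injected so the check is easy to test.
func hasInsteadOfRewrite(url string, getenv func(string) string) bool {
	count, err := strconv.Atoi(getenv("GIT_CONFIG_COUNT"))
	if err != nil {
		return false
	}
	for i := 0; i < count; i++ {
		// Section and variable names are case-insensitive in git config.
		key := strings.ToLower(getenv(fmt.Sprintf("GIT_CONFIG_KEY_%d", i)))
		val := getenv(fmt.Sprintf("GIT_CONFIG_VALUE_%d", i))
		if !strings.HasPrefix(key, "url.") || !strings.HasSuffix(key, ".insteadof") {
			continue
		}
		if val != "" && strings.HasPrefix(url, val) {
			return true
		}
	}
	return false
}

func main() {
	env := map[string]string{
		"GIT_CONFIG_COUNT":   "1",
		"GIT_CONFIG_KEY_0":   "url.https://x-access-token:REDACTED@github.com/acme/.insteadOf",
		"GIT_CONFIG_VALUE_0": "https://github.com/acme/",
	}
	getenv := func(k string) string { return env[k] }

	// When a rewrite already covers the URL, ambient token injection is skipped.
	fmt.Println(hasInsteadOfRewrite("https://github.com/acme/infra.git", getenv))
	fmt.Println(hasInsteadOfRewrite("https://github.com/other/repo.git", getenv))
}
```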
why
- A real CI job resolving a remote import: from a second private repo failed with remote: Repository not found. The minted token was correct, but the ambient GITHUB_TOKEN was being injected into the URL ahead of it, defeating git's insteadOf rewrite. The only fix was the settings.inject_github_token: false workaround.
- These changes make github/sts (introduced in #2546) compose with the default settings.inject_github_token: true, so it "just works" with no workaround. Reproduced first with a simulated-broker e2e test, then fixed.
references
- Fixes the github/sts feature shipped in #2546
- docs/fixes/2026-06-01-github-sts-token-injection-shadowing.md (root cause, fix, and why this is a fix doc rather than a changelog entry)
- docs/prd/atmos-pro-sts.md
Summary by CodeRabbit
- Bug Fixes
  - Prevented minted GitHub tokens from being silently overridden by detecting broker-provided git URL rewrites and skipping ambient token injection.
- New Features
  - token_env accepts Go-template names (e.g., GH_TOKEN_{{ .owner }}) and defaults to ATMOS_PRO_GITHUB_TOKEN when appropriate.
  - Token resolution prefers a live exported broker token before falling back to configured values; minted tokens are not logged.
- Documentation
  - Clarified github/sts token_env semantics, templating, multi-owner behavior, and URL-rewrite interactions.
- Tests
  - Added/expanded tests for token-env defaults, templating, precedence, and insteadOf handling.
- Chores
  - Made license NOTICE generation produce deterministic URLs.
fix(auth): report missing exec binary instead of "atmos requires a subcommand" @osterman (#2559)
what
- Fix atmos auth exec -- <command> reporting the misleading "The command atmos requires a subcommand" when the executable after -- (e.g. uvx) is not found on PATH.
- The missing executable is now reported clearly via the error builder: the command name, the underlying cause, a PATH hint, and exit code 127.
- Internally, Cobra's "unknown command" conversion now uses the ErrUnknownSubcommand sentinel, and the root handler intercepts that (via a new testable unknownSubcommand helper) instead of the overloaded ErrCommandNotFound.
why
- auth exec and the registry executor both wrapped the shared ErrCommandNotFound sentinel, so a missing user binary was indistinguishable from an unknown Atmos subcommand and got masked as root usage output, hiding the real cause.
- Separating the two sentinels gives accurate errors for both cases (genuine unknown subcommands still show root usage with suggestions; missing executables now say "command not found" with a hint), and also fixes the same latent masking for pkg/hooks command lookups.
references
- Regression from the atmos auth → command-registry migration (#1919) combined with the registry executor's Cobra-error conversion (#1643).
Summary by CodeRabbit
- Bug Fixes
  - Clearer "command not found" errors with install guidance and enforced exit code 127.
  - Distinguish missing external executables from unknown subcommands so help is shown only for genuine unknown subcommands.
- Tests
  - Added/updated tests to guard error-classification behaviors and prevent regressions.
- Documentation
  - Adjusted BSD dependency listing to mark the URL as Unknown.
fix: allow --use-version artifact downloads without GitHub token @osterman (#2212)
what
- Allow unauthenticated artifact downloads for public repositories via the --use-version flag
- Metadata fetching (PR info, workflow runs, artifact listing) and artifact downloads now work without authentication on public repos per GitHub API docs
- Replace upfront GetGitHubTokenOrError() gate with optional GetGitHubToken() in InstallFromPR() and InstallFromSHA()
- Skip Authorization header when token is unavailable in downloadPRArtifact()
- Add smart HTTP error handling with buildDownloadHTTPError() to distinguish auth failures from rate limiting
why
- Users without GitHub token environment variables couldn't install PR artifacts, even for public repositories
- Rate limit errors (429) were reported generically as "HTTP 429" with no actionable context
- Need to properly surface rate limit information (60/hr for unauthenticated, 5,000/hr for authenticated) to guide users
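One way to separate the two failure classes is sketched below. classifyDownloadFailure and hint are hypothetical and do not reproduce buildDownloadHTTPError; the treatment of a 403 with an exhausted X-RateLimit-Remaining header as a rate limit is an assumption based on how the GitHub REST API reports primary rate limits.

```go
package main

import (
	"fmt"
	"net/http"
)

type failureKind string

const (
	failureAuth      failureKind = "auth"
	failureRateLimit failureKind = "rate-limit"
	failureOther     failureKind = "other"
)

// classifyDownloadFailure maps an HTTP status (and the value of the
// X-RateLimit-Remaining header) to a failure class. Hypothetical helper.
func classifyDownloadFailure(status int, remaining string) failureKind {
	switch {
	case status == http.StatusTooManyRequests:
		return failureRateLimit
	case status == http.StatusForbidden && remaining == "0":
		// A 403 with no remaining quota is a rate limit, not bad credentials.
		return failureRateLimit
	case status == http.StatusUnauthorized, status == http.StatusForbidden:
		return failureAuth
	default:
		return failureOther
	}
}

// hint turns the class into an actionable message using the limits quoted above.
func hint(kind failureKind, authenticated bool) string {
	switch kind {
	case failureRateLimit:
		if authenticated {
			return "rate limited (5,000 requests/hour when authenticated); retry later"
		}
		return "rate limited (60 requests/hour unauthenticated); set a GitHub token to raise the limit"
	case failureAuth:
		return "authentication failed; check the GitHub token"
	default:
		return "download failed"
	}
}

func main() {
	kind := classifyDownloadFailure(http.StatusTooManyRequests, "")
	fmt.Println(kind, "->", hint(kind, false))
}
```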
references
- Fixes the issue where atmos --use-version=2129 fails with "authentication failed" when no GITHUB_TOKEN is set
- GitHub API documentation confirms artifact downloads work without authentication for public repositories
Summary by CodeRabbit
- New Features
  - Added optional unauthenticated access for public GitHub artifacts (subject to rate limits)
  - New ATMOS_GITHUB_CLI env var to control/disable CLI-based token retrieval
- Bug Fixes
  - Clearer handling and messaging for auth vs rate-limit errors, with improved hints and retry info
  - GitHub token is now optional for artifact operations (falls back to anonymous when available)
- Tests
  - Expanded tests for artifact downloads and HTTP auth/rate-limit scenarios
- Documentation
  - Documented ATMOS_GITHUB_CLI usage and behavior
fix(version): honor ATMOS_USE_VERSION env var for version re-exec @osterman (#2556)
what
- Honor the documented ATMOS_USE_VERSION environment variable so Atmos actually switches to (and downloads, if needed) the requested version during early re-exec.
- resolveRequestedVersion now reads ATMOS_USE_VERSION, with precedence ATMOS_VERSION_USE > ATMOS_USE_VERSION > ATMOS_VERSION > version.use.
- cmd/root.go also honors ATMOS_USE_VERSION from the environment so version-management commands (e.g. atmos version) re-exec on it just like the --use-version flag.
- Add a table case and a precedence test covering the new behavior.
why
- ATMOS_USE_VERSION is advertised as the primary env var (docs at website/docs/cli/environment-variables.mdx and the flag binding WithEnvVars("use-version", "ATMOS_USE_VERSION")), but the re-exec resolver never read it. It only checked the internal ATMOS_VERSION_USE (set solely by the CLI flag), the ATMOS_VERSION alias, and version.use config.
- An env-populated flag is not marked Changed() and maps to viper key use-version rather than version.use, so ATMOS_USE_VERSION fell through every code path; setting it was a complete no-op.
- This surfaced in CI where ATMOS_USE_VERSION was set for atmos describe affected --upload but Atmos ran the already-installed version instead of switching. This brings the code in line with the existing documentation.
references
- Docs already describe the intended behavior: website/docs/cli/environment-variables.mdx
Summary by CodeRabbit
- New Features
  - Added support for the ATMOS_USE_VERSION environment variable as an alternative to the --use-version CLI flag.
  - Updated version selection precedence to consider environment variables in the defined order.
- Tests
  - Extended test coverage for environment-variable-driven version selection scenarios.
- Chores
  - Updated NOTICE entry for a dependency license URL.
fix(auth): honor keyring.type config and send DPoP proof on AWS webflow @osterman (#2545)
what
- Honor auth.keyring.type from atmos.yaml across all auth-manager entrypoints by threading authConfig into credentials.NewCredentialStoreWithConfig(...) (was silently dropped via the no-arg NewCredentialStore()), and inject the manager's config-aware store into AWS user identities via a new optional SetCredentialStore interface.
- Add an RFC 9449 DPoP proof (EC P-256 / ES256, stdlib-only) to the AWS browser webflow token requests; generate the key per session, persist it in the refresh-token cache, and reuse it on refresh (a cache without a key falls back to the browser flow).
- Add AuthManager.CredentialStoreType() for observability/testability, mark the no-arg NewCredentialStore() constructor Deprecated, and add unit tests for both fixes (keyring backend selection, DPoP proof structure/signature, key round-trip, header presence).
why
- #2544: with auth.keyring.type: memory set, Atmos still selected the default system keyring and hung indefinitely on hosts where the keyring service is present but unusable (e.g. a locked gnome-keyring-daemon). The config value was read and then thrown away before backend selection; only ATMOS_KEYRING_TYPE worked. Now the configured backend is honored everywhere an auth manager is built.
- #2542: AWS sign-in's /v1/token endpoint now rejects requests without a DPoP proof (HTTP 400 INVALID_REQUEST), so browser-based authentication for aws/user identities failed at the code-exchange step. Sending the proof restores the flow; because the public-client refresh token is bound to the DPoP key, the key is persisted and reused on refresh.
references
- closes #2542
- closes #2544
- RFC 9449 (DPoP): https://datatracker.ietf.org/doc/html/rfc9449
Summary by CodeRabbit
- New Features
  - Added RFC 9449 DPoP support for AWS OAuth token exchanges to strengthen token binding.
  - Auth now respects configured keyring backend across authentication flows.
- Bug Fixes
  - Fixed AWS token parsing to match real-world snake_case responses.
- Improvements
  - Auth manager exposes credential store backend type for easier diagnostics.
fix(yaml-functions): honor init.pass_vars when resolving !terraform.output (#1412) @thejrose1984 (#2548)
what
When components.terraform.init.pass_vars: true is set, forward the component's vars to the internal terraform init that runs while resolving !terraform.output, via TF_VAR_* environment variables.
- ComponentConfig gains PassVars + Vars, populated in ExtractComponentConfig.
- SetupEnvironment injects TF_VAR_* for each var when PassVars is true (strings verbatim, other types JSON-encoded).
- Regression tests cover the enabled path (string/number/bool/list), the disabled default, and env-section precedence.
why
Closes #1412.
The main terraform path honors pass_vars by passing -var-file to init (terraform_execute_helpers.go), so modules with init-time variable dependencies (e.g. a module version/source bound to var.aks_version) can initialize. But the init that runs while resolving !terraform.output goes through pkg/terraform/output, which uses the terraform-exec library and never honored pass_vars:
- runInit only set Upgrade(false) + optional Reconfigure(true).
- ComponentConfig had no PassVars/vars plumbing.
- terraform-exec's initConfig has no var-file field, so it structurally cannot pass -var-file to init.
So atmos tofu init/plan -s <stack> failed with Unable to compute static value / module.aks.version depends on var.aks_version which is not available whenever an init-time var came from a component resolved via !terraform.output.
Why TF_VAR_* rather than a var-file
terraform-exec can't attach a var-file to init, and an auto-loaded *.auto.tfvars.json on disk would risk cross-stack contamination when components are resolved concurrently. TF_VAR_* is process/runner-scoped, reaches init transparently through the existing SetEnv call, and Terraform/OpenTofu accept these values for the matching variable types (JSON encodings of lists/maps are valid HCL2). Gated behind pass_vars (default false), so it's a no-op unless opted in; an explicit TF_VAR_* in the component env section still wins.
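The encoding rule (strings verbatim, other types JSON-encoded, explicit env entries win) can be sketched as follows. tfVarEnv is a hypothetical helper written for illustration; it is not the SetupEnvironment implementation.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// tfVarEnv converts component vars into TF_VAR_* entries. Strings are passed
// verbatim; every other type is JSON-encoded. Entries in explicit (for
// example, from the component env section) override the generated ones.
func tfVarEnv(vars map[string]any, explicit map[string]string) (map[string]string, error) {
	env := make(map[string]string, len(vars)+len(explicit))
	for name, value := range vars {
		key := "TF_VAR_" + name
		if s, ok := value.(string); ok {
			env[key] = s
			continue
		}
		encoded, err := json.Marshal(value)
		if err != nil {
			return nil, fmt.Errorf("encoding var %q: %w", name, err)
		}
		env[key] = string(encoded)
	}
	for k, v := range explicit {
		env[k] = v // an explicit value always wins
	}
	return env, nil
}

func main() {
	env, err := tfVarEnv(
		map[string]any{"aks_version": "1.29", "zones": []string{"1", "2"}},
		map[string]string{"TF_VAR_aks_version": "1.30"},
	)
	if err != nil {
		panic(err)
	}
	fmt.Println(env["TF_VAR_aks_version"], env["TF_VAR_zones"])
}
```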
references
- Closes #1412
test plan
go test ./pkg/terraform/output/...
New tests:
- TestDefaultEnvironmentSetup_PassVars: vars exported as TF_VAR_* with correct encoding.
- TestDefaultEnvironmentSetup_PassVarsDisabled: no TF_VAR_* when pass_vars is off.
- TestDefaultEnvironmentSetup_PassVarsEnvSectionWins: explicit env-section TF_VAR_* wins.
Validation note: verified at the unit level (init env now carries the component vars when pass_vars is set; previously the init invocation was unchanged whether pass_vars was on or off). I don't have terraform/tofu in this environment to re-run the reporter's full tofu init/plan end-to-end, so a maintainer check against a real init-time-dependent module would be worthwhile before release.
Summary by CodeRabbit
- New Features
  - Added an option to forward component variables as TF_VAR_* environment entries during Terraform/OpenTofu init; existing TF_VAR_* values are preserved and non-string values are JSON-encoded.
- Tests
  - Added tests for enabled/disabled forwarding, JSON encoding of non-strings, precedence of explicit env values, and end-to-end propagation to the runner env.
- Documentation
  - Docs updated to note init.pass_vars also applies to implicit init runs and how forwarded vars are presented as TF_VAR_*.
test(yaml-functions): regression test for mixed state/output circular dependency (#2005) @thejrose1984 (#2547)
what
- Add a regression test and fixture for a cross-component circular dependency that mixes !terraform.state and !terraform.output (component-a → !terraform.state component-b; component-b → !terraform.output component-a).
- New fixture: tests/fixtures/scenarios/yaml-functions-circular-deps-mixed.
- New test: TestYAMLFunctionsCrossComponentCycleMixed.
why
This is the exact scenario from #2005. It was the same root cause as #2457 and was fixed by #2533 (making ProcessCustomYamlTags reuse the goroutine-local ResolutionContext so the Visited map survives nested walks). That fix covers both state↔state and the mixed state↔output path, but only the state↔state case had a regression test — so #2005 could silently regress while the existing test stayed green.
Verified the mixed cycle hangs (infinite recursion / goroutine stack overflow) on the commit before #2533, and returns a clean ErrCircularDependency on current main.
references
test plan
go test ./tests -run TestYAMLFunctionsCrossComponentCycle -v
Both TestYAMLFunctionsCrossComponentCycle (state↔state) and TestYAMLFunctionsCrossComponentCycleMixed (state↔output) pass. The mixed test asserts ErrCircularDependency is returned and that the MaxResolutionDepth safety net is not what fired (which would indicate the primary cycle detector regressed).
fix: defer custom-command/built-in collision warning to invocation time @thejrose1984 (#2549)
what
Scope is intentionally narrow: change only when the existing collision warning fires — defer it from command-registration time to the moment the conflicting command is actually invoked.
- No change to collision behavior: the built-in still wins and custom steps are still ignored.
- No override:/invoke: work; that opt-in design is tracked separately in the custom-command-builtin-override PRD.
- Implemented by wrapping the conflicting built-in command's PreRunE in processCustomCommands (preserving any existing PreRunE/PreRun and honoring Cobra's precedence of PreRunE over PreRun).
- Adds a regression test asserting the warning is absent at registration and present (exactly once) on invocation.
why
Today the warning (introduced in #2191) is emitted from processCustomCommands, which runs during root init on every Atmos invocation. So a single colliding custom command makes every command — atmos list stacks, atmos terraform ..., etc. — print a warning about, say, a plan collision it never touched. The result is worse than noisy:
- It's misleading — the warning points at a command the user didn't run.
- It breaks scripting/CI that reads stderr, since every command (except version) emits it.
Deferring the warning to invocation makes it accurate and actionable: it appears exactly once, only when you run the command the warning is actually about, and stderr stays clean for every other command. Same information, delivered at the moment it's relevant instead of on every unrelated call.
Behavior
| Invocation | Before | After |
|---|---|---|
| atmos list stacks (with a colliding custom plan) | ⚠ warning printed | no warning |
| atmos <colliding command> | ⚠ warning printed (and also for every other command) | ⚠ warning printed once, here only |
references
test
go test ./cmd/ -run 'TestCustomCommand_.*Collision|TestCustomCommand_StepsConflictWarning|TestCustomCommand_NamespaceMerge|TestCustomCommand_DeepNesting'
Verified the new test fails against the previous (emit-at-registration) behavior and passes with the fix.
Summary by CodeRabbit
- Bug Fixes
  - Collision warnings for custom commands that overlap built-in leaf commands are now deferred until the conflicting command is invoked, reducing startup noise and preserving existing pre-run error behavior.
- Tests
  - Added regression tests to verify deferred warnings are emitted exactly once on invocation and that existing pre-run behavior and error propagation remain intact; tests skip on Windows where stderr capture is unreliable.
fix(flags): register --settings-list-merge-strategy as a global flag (#2398) @thejrose1984 (#2540)
what
- Register --settings-list-merge-strategy as a global persistent flag on RootCmd, with env binding to ATMOS_SETTINGS_LIST_MERGE_STRATEGY.
- Add a Cobra-direct fallback in ProcessCommandLineArgs so the value reaches ConfigAndStacksInfo even when Cobra strips the flag from RunE's args.
- In setSettingsConfig, scan os.Args (mirroring setLogConfig's parseFlags() pattern) so command paths that call InitCliConfig directly with a zero-value ConfigAndStacksInfo (e.g. describe config) still honor the flag.
- Unit test the registration, inheritance, defaults, CLI value, and env-var path.
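As a rough picture of what scanning the raw argument list for a flag involves, the sketch below handles both the `--flag=value` and `--flag value` forms. `findFlagValue` is a hypothetical helper written for this note; it is not the parseFlags() function referenced above and makes no claim about how Atmos handles edge cases.

```go
package main

import (
	"fmt"
	"strings"
)

// findFlagValue is a hypothetical helper. It returns the value of a
// long-form flag from a raw argument list such as os.Args[1:].
func findFlagValue(args []string, name string) (string, bool) {
	for i, arg := range args {
		if arg == "--" {
			break // everything after "--" is positional
		}
		if strings.HasPrefix(arg, name+"=") {
			return strings.TrimPrefix(arg, name+"="), true
		}
		if arg == name && i+1 < len(args) {
			return args[i+1], true
		}
	}
	return "", false
}

func main() {
	const flag = "--settings-list-merge-strategy"
	// A fixed slice keeps the demo deterministic; real code would pass os.Args[1:].
	invocations := [][]string{
		{"--settings-list-merge-strategy=append", "describe", "config"},
		{"describe", "config", "--settings-list-merge-strategy", "merge"},
		{"describe", "config"},
	}
	for _, args := range invocations {
		value, ok := findFlagValue(args, flag)
		fmt.Printf("%v -> %q (found: %t)\n", args, value, ok)
	}
}
```

Because the scan does not depend on where the flag sits relative to the subcommand, it works for code paths that never receive a populated ConfigAndStacksInfo.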
why
The flag is advertised in two places:
- atmos.yaml:344: "Can also be set using 'ATMOS_SETTINGS_LIST_MERGE_STRATEGY' environment variable, or '--settings-list-merge-strategy' command-line argument"
- website/docs/cli/configuration/settings/settings.mdx:54
And Atmos's internal arg/flag layer already expects it:
- pkg/config/const.go:147: SettingsListMergeStrategyFlag = "--settings-list-merge-strategy"
- internal/exec/cli_utils.go:72: listed in commonFlags
- internal/exec/cli_utils.go:495: string-flag handler that writes info.SettingsListMergeStrategy
- pkg/config/utils.go:726: applies it onto atmosConfig.Settings.ListMergeStrategy
But it was never registered with Cobra at the global level. Subcommands that don't whitelist unknown flags (e.g. terraform plan, which has no FParseErrWhitelist) rejected the flag before the legacy commonFlags post-processing ever ran:
$ atmos --settings-list-merge-strategy=append terraform plan vpc -s test
Error: unknown flag --settings-list-merge-strategy for command atmos terraform plan
references
- Closes #2398
test plan
Unit tests added in pkg/flags/global_registry_test.go:
- flag is registered on RootCmd as persistent
- defaults to empty string
- CLI flag value flows through Viper
- ATMOS_SETTINGS_LIST_MERGE_STRATEGY env var flows through Viper
- subcommand inherits the persistent flag
End-to-end verification on a minimal project (atmos.yaml has settings.list_merge_strategy: replace):
| Invocation | list_merge_strategy |
|---|---|
| atmos describe config | replace (baseline from atmos.yaml) |
| atmos --settings-list-merge-strategy=append describe config | append |
| atmos describe config --settings-list-merge-strategy=merge | merge |
| ATMOS_SETTINGS_LIST_MERGE_STRATEGY=append atmos describe config | append |
atmos --help now lists --settings-list-merge-strategy.
Full test suites pass for the touched packages:
ok github.com/cloudposse/atmos/pkg/flags
ok github.com/cloudposse/atmos/pkg/flags/global
ok github.com/cloudposse/atmos/pkg/config
ok github.com/cloudposse/atmos/internal/exec
Summary by CodeRabbit
- New Features
  - Added --settings-list-merge-strategy CLI flag (replace, append, merge) and ATMOS_SETTINGS_LIST_MERGE_STRATEGY env var to override list-merge behavior for an invocation
- Documentation
  - Documented the new flag and environment variable with usage and defaults
- Tests
  - Updated CLI help snapshots to include the new flag and refreshed help text formatting across commands
Fix templated store hook execution @osterman (#2539)
what
- Render hook execution fields only after a hook matches the current event and skip filters.
- Preserve static hook discovery/preflight while supporting !template and bare Go templates in store hook names, output keys, and output values.
- Add regression tests for templated store hooks and non-matching hooks with invalid execution-only templates.
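The "match first, render second" ordering can be sketched with Go's text/template package. The `hook` struct, the event names, and the data shape below are assumptions chosen for illustration; they are not the Atmos hook schema.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// hook is a hypothetical, trimmed-down store hook: Events is static and safe
// to inspect early, while Name may contain a Go template.
type hook struct {
	Events []string
	Name   string
}

func matches(h hook, event string) bool {
	for _, e := range h.Events {
		if e == event {
			return true
		}
	}
	return false
}

// render evaluates a Go template string against the component context.
func render(text string, data any) (string, error) {
	tmpl, err := template.New("hook").Option("missingkey=error").Parse(text)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := tmpl.Execute(&sb, data); err != nil {
		return "", err
	}
	return sb.String(), nil
}

// storeNamesFor renders the name of every hook that matches event.
// Hooks for other events are skipped before any template is parsed.
func storeNamesFor(event string, hooks []hook, data any) ([]string, error) {
	var names []string
	for _, h := range hooks {
		if !matches(h, event) {
			continue
		}
		name, err := render(h.Name, data)
		if err != nil {
			return nil, fmt.Errorf("render hook name %q: %w", h.Name, err)
		}
		names = append(names, name)
	}
	return names, nil
}

func main() {
	data := map[string]any{"vars": map[string]any{"stage": "prod"}}
	hooks := []hook{
		{Events: []string{"after-terraform-apply"}, Name: "{{ .vars.stage }}-store"},
		// Invalid template, but it belongs to a different event.
		{Events: []string{"before-terraform-plan"}, Name: "{{ .broken"},
	}
	names, err := storeNamesFor("after-terraform-apply", hooks, data)
	fmt.Println(names, err)
}
```

The second hook's malformed template never causes an error here because rendering is only attempted for hooks that match the current event.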
why
- Fixes a regression where templated store-outputs.name values were used literally, causing store lookup failures.
- Keeps pre-auth hook discovery safe while allowing execution-time hook fields to use the fully available component context.
- Prevents future regressions for both YAML function and bare Go template forms.
references
- Closes #2538
Summary by CodeRabbit
- New Features
  - Hooks now resolve execution-time templates and custom YAML functions, supporting nested templating, rendering into hook execution fields, stronger type validation, and clearer hook-specific error messages.
- Tests
  - Added tests for template rendering, YAML-function evaluation, nested value processing, error cases, and store-hook execution behavior.
fix(auth): normalize override keys to uppercase in filterAtmosOverrides (#2349) @thejrose1984 (#2541)
what
- Uppercase the override key before the prefix check (and in the returned map) inside pkg/auth/manager_env_overrides.go:filterAtmosOverrides.
- Add regression test cases in TestFilterAtmosOverrides covering Viper-lowercased keys, mixed-case keys, and mixed atmos/non-atmos casings.
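The normalization amounts to uppercasing each key before the prefix check and storing the uppercased key. The function below is a minimal sketch of that idea under a different name (`filterAtmosOverridesSketch`); it is not copied from pkg/auth and the real function may differ in details.

```go
package main

import (
	"fmt"
	"strings"
)

// filterAtmosOverridesSketch is an illustrative stand-in, not the Atmos
// source. It keeps only ATMOS_* keys, matching case-insensitively, and
// returns them with uppercased names.
func filterAtmosOverridesSketch(env map[string]string) map[string]string {
	out := make(map[string]string)
	for k, v := range env {
		upper := strings.ToUpper(k) // undo Viper's key lowercasing
		if strings.HasPrefix(upper, "ATMOS_") {
			out[upper] = v
		}
	}
	return out
}

func main() {
	// Keys as they would arrive after a YAML env: block is loaded with
	// lowercased map keys.
	env := map[string]string{
		"atmos_profile": "managers",
		"aws_region":    "us-east-1",
	}
	fmt.Println(filterAtmosOverridesSketch(env))
}
```

With a case-sensitive prefix check the same input would produce an empty map, which is the silent drop described in the next section.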
why
filterAtmosOverrides did a case-sensitive strings.HasPrefix(k, "ATMOS_"). The function's documented contract was "only keys with the ATMOS_* prefix", but in production the only realistic source of its input map is an MCP server env: block in atmos.yaml / .atmos.d/mcp.yaml, which Viper loads with all map keys lowercased.
This is the same Viper-lowercasing pitfall already documented and handled on a sibling code path by pkg/mcp/client/mcpconfig.go:copyEnv (the CLI-provider pass-through that writes config files for Claude Code / Codex / Gemini). That fix wasn't applied to the auth code path, so an authored config such as:
mcp:
  servers:
    atmos:
      command: atmos
      args: ["mcp", "start"]
      env:
        ATMOS_PROFILE: managers
      identity: core-root/terraform
reached filterAtmosOverrides as {"atmos_profile": "managers"}, was silently dropped, and the auth manager was rebuilt against the default profile. Identity resolution then surfaced as:
✗ Server failed to start
Error: MCP server failed to start: atmos: auth setup failed for "atmos": identity not found: core-root/terraform
I confirmed Viper's lowercasing end-to-end against the actual schema.MCPServerConfig shape (Env map[string]string):
env key="atmos_profile" value="managers"
env key="aws_region" value="us-east-1"
So the authored ATMOS_PROFILE is gone by the time the filter runs.
scope of behavior change
- Already-uppercase callers (ATMOS_PROFILE): unchanged.
- Previously-dropped lowercase/mixed-case callers (atmos_profile, Atmos_Profile): now honored, and those are exactly the users hitting the documented bug.
- Non-ATMOS_* keys: still dropped, regardless of case (aws_profile, FOO, foo).
- Existing TestFilterAtmosOverrides cases still pass unchanged.
- Existing TestCreateAndAuthenticateManagerWithEnvOverrides_* tests still pass unchanged.
alternatives considered
I weighed three fix locations on the original issue:
1. Uppercase inside filterAtmosOverrides (this PR). Smallest possible surface, single source of truth for the auth path, doesn't touch the MCP layer.
2. copyEnv (or equivalent) inside ScopedAuthProvider.ForServer. Localizes to the MCP adapter; downside is a future non-MCP consumer of CreateAndAuthenticateManagerWithEnvOverrides that loads its env map from YAML would hit the same trap.
3. Uppercase at ParseConfig time. Widest reach: it would also affect subprocess env propagation. A real (if narrow) behavior change for users who deliberately set unconventionally-cased env vars in env: and expected those passed to the spawned MCP server verbatim.
Option 1 fixes the documented case without altering any other code path's behavior or risking the subprocess-env corner case in Option 3.
references
- Closes #2349
- Related context: pkg/mcp/client/mcpconfig.go:128 (copyEnv), the parallel fix on the CLI-provider pass-through path that documents the Viper-lowercasing trap.
test plan
go test ./pkg/auth -run 'TestFilterAtmosOverrides|TestCreateAndAuthenticateManagerWithEnvOverrides' -v
go test ./pkg/auth ./pkg/mcp/client/...
Both pass. New regression subtests:
- viper-lowercased atmos key is normalized to uppercase
- mixed-case atmos key is normalized to uppercase
- viper-lowercased non-atmos key is dropped
- mixed casings across atmos and non-atmos keys
Summary by CodeRabbit
Bug Fixes
- Fixed an issue where environment configuration overrides specified in lowercase format (from YAML configuration files) were incorrectly dropped during processing. Environment override keys are now properly normalized to ensure consistent handling regardless of the input format used.