github cloudposse/atmos v1.222.0-rc.0

pre-release, 3 hours ago

feat(terraform): registry cache, RC management, and multi-platform mirror @osterman (#2582)

what

  • Add a transparent Terraform/OpenTofu registry cache: an ephemeral local HTTPS network-mirror proxy (pkg/http/proxy, pkg/terraform/{cache,registry}) that caches providers and modules in the canonical filesystem_mirror layout, enabled with components.terraform.cache.enabled: true.
  • Add the atmos terraform cache command group — list, stats, prune, delete, plus mirror (alias warm) for eager multi-platform pre-seeding and trust/untrust for the proxy certificate.
  • Add declarative Terraform CLI-config (.terraformrc) management via components.terraform.rc, exposed to the subprocess through TF_CLI_CONFIG_FILE/TOFU_CLI_CONFIG_FILE.
  • Add a first-class components.terraform.platforms setting (target <os>_<arch> list) that drives both eager atmos terraform cache mirror pre-seeding (--all/--components/--query/-s, package-manager-style TUI, --format json|yaml) and automatic completion of .terraform.lock.hcl.
  • Keep .terraform.lock.hcl complete across platforms: a built-in after.terraform.init provisioner runs terraform/tofu providers lock -platform=… for the declared platforms whenever a customized provider installation method (the default plugin cache, or the registry cache) is active. Because it runs after init, it sees the fully JIT-vendored and code-generated working directory, so the generated provider set (including stack-config provider versions) is what gets locked — and committed lock files install cleanly on every platform in a fleet.
  • Generate and cache a self-signed loopback certificate so the proxy can serve HTTPS (required by Terraform/OpenTofu network mirrors); trusted automatically via SSL_CERT_FILE on Linux/CI and via a one-time atmos terraform cache trust on macOS/Windows.
  • Add examples/caching (auto-installs OpenTofu via the toolchain), PRDs, command + configuration docs, blog posts, and a roadmap update.
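
To make the "canonical filesystem_mirror layout" concrete, here is a minimal Python sketch of the two directory layouts that Terraform documents for filesystem mirrors. The function names are hypothetical and this is not Atmos code; the notes do not say which of the two layouts the cache writes.

```python
from pathlib import PurePosixPath


def packed_mirror_path(hostname, namespace, type_, version, target):
    """Relative path of a provider archive in the packed mirror layout:
    HOSTNAME/NAMESPACE/TYPE/terraform-provider-TYPE_VERSION_TARGET.zip
    """
    archive = f"terraform-provider-{type_}_{version}_{target}.zip"
    return PurePosixPath(hostname, namespace, type_, archive)


def unpacked_mirror_dir(hostname, namespace, type_, version, target):
    """Relative directory of an extracted provider in the unpacked layout:
    HOSTNAME/NAMESPACE/TYPE/VERSION/TARGET
    """
    return PurePosixPath(hostname, namespace, type_, version, target)


if __name__ == "__main__":
    # Example values only; target uses the <os>_<arch> form mentioned above.
    print(packed_mirror_path(
        "registry.terraform.io", "hashicorp", "aws", "5.0.0", "linux_amd64"))
    print(unpacked_mirror_dir(
        "registry.terraform.io", "hashicorp", "aws", "5.0.0", "linux_amd64"))
```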

why

  • Repeated runs and CI runs re-download the same providers and modules; the cache eliminates that, keeps runs working through registry outages, and preserves the exact versions a deployment used.
  • Atmos enables a provider plugin cache (TF_PLUGIN_CACHE_DIR) by default, and network mirrors behave the same way: with either one active, Terraform can no longer record the registry's signed cross-platform checksums, so init writes a .terraform.lock.hcl with hashes for only the current platform and prints the "Incomplete lock file information for providers" warning. Declaring components.terraform.platforms lets Atmos complete the lock automatically for every target platform.
  • The lazy proxy only caches the host platform, so mixed CI/developer fleets and air-gapped reproducible builds need declarative multi-platform pre-seeding — components.terraform.platforms + cache mirror provide it.
  • Declarative rc lets teams manage provider mirrors, credentials, and other CLI-config directives from atmos.yaml instead of per-machine .terraformrc files.
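
The lock-completion step described above amounts to running providers lock with one -platform flag per declared platform. The following Python sketch only builds that argument list; the function name and the validation rule are assumptions for illustration, and how Atmos actually invokes the command is not shown in these notes.

```python
import re

# Assumed shape of a platform entry: <os>_<arch>, e.g. linux_amd64.
_PLATFORM_RE = re.compile(r"[a-z0-9]+_[a-z0-9]+")


def providers_lock_argv(binary, platforms):
    """Build the argv for `<binary> providers lock -platform=...`.

    `binary` is "terraform" or "tofu"; `platforms` is the declared list.
    """
    if not platforms:
        raise ValueError("at least one platform is required")
    for platform in platforms:
        if not _PLATFORM_RE.fullmatch(platform):
            raise ValueError(f"not an <os>_<arch> value: {platform!r}")
    return [binary, "providers", "lock",
            *[f"-platform={p}" for p in platforms]]


if __name__ == "__main__":
    print(" ".join(providers_lock_argv("tofu", ["linux_amd64", "darwin_arm64"])))
```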

references

  • Closes #2150
  • docs/prd/terraform-registry-cache.md, docs/prd/terraform-rc-management.md, docs/prd/terraform-registry-cache-tls.md

Summary by CodeRabbit

Release Notes

  • New Features

    • Added an experimental Terraform/OpenTofu registry cache with disk-backed mirroring, metadata freshness controls (TTL + stale-while-revalidate), per-key locking, and a savings report.
    • Added atmos terraform cache subcommands: list/stats (table/JSON/YAML output), prune (--older-than, --dry-run, --all), delete <key>, mirror (warm alias, optional eager pre-seeding).
    • Added trust/untrust for HTTPS certificate trust on macOS/Windows (actionable when untrusted).
  • Documentation

    • Added/updated guides and CLI references for registry cache, CLI RC management (components.terraform.rc), and all cache subcommands.
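
The metadata freshness controls (TTL plus stale-while-revalidate) follow a common caching pattern. As a generic sketch of that pattern, not the cache's actual logic, an entry can be classified by its age; the function name, parameter names, and the three state labels are assumptions.

```python
def freshness(age_seconds, ttl_seconds, swr_seconds):
    """Classify a cached metadata entry by age.

    fresh   -> serve from cache, no upstream request
    stale   -> serve from cache now, refresh from upstream in the background
    expired -> fetch from upstream before serving
    """
    if age_seconds <= ttl_seconds:
        return "fresh"
    if age_seconds <= ttl_seconds + swr_seconds:
        return "stale"
    return "expired"


if __name__ == "__main__":
    # With a 300 s TTL and a 600 s stale window:
    for age in (10, 400, 1000):
        print(age, freshness(age, ttl_seconds=300, swr_seconds=600))
```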

feat: Atmos Git — foundational capability for GitOps enablement @osterman (#2597)

what

Atmos Git: Git becomes a foundational platform capability, on par with Toolchain, Auth, and Hooks — the enablement layer for GitOps workflows where Atmos commits generated artifacts to a source-of-truth repository. PRD: docs/prd/git-ops.md.

  • Top-level git config: git.repositories.<name> declares managed repositories (uri, branch, remote, clone depth/filter/single-branch/submodules, auth.identity, commit.signing/commit.author, push.retries), git.hooks declares local Git hooks, and git.list configures list output. Workdirs default to automatic XDG cache locations ($XDG_CACHE_HOME/atmos/git/repositories/<name>) so the native CI cache captures and restores managed clones.
  • pkg/git service — provider registry (registry pattern) with the cli provider in v1 (chosen because GitHub STS materializes credentials as GIT_CONFIG_* env vars, which subprocess git honors and go-git ignores). Clone is defined as reconcile (clone-if-absent, else fetch + checkout + ff-only) so stale CI-cache restores are just faster clones. Safety rules: ff-only pull, no force push ever, push retry-with-rebase on non-fast-forward rejection, path-scoped commits that fail on unrelated dirty files, worktree path-traversal validation, per-invocation commit author injection (CI runners need no user.name), provenance trailers (Atmos-Stack, Atmos-Component, Atmos-Source-SHA).
  • atmos git command group: clone, pull, status, diff, commit, push, list, clean, plus git hooks install|uninstall|run, registered under the Git help group. --all bulk operations (bounded concurrency, attempt-all with errors.Join). Clone accepts configured names, plain URLs, and go-getter git::...?ref=&depth= URIs. No-arg clone in native CI (ci.enabled: true) infers the current repository from CI metadata and clones into the workspace, serving as an actions/checkout replacement. atmos list git-repositories alias registered.
  • git hook kind — publishes generated artifacts on lifecycle events (after.terraform.apply, ...) to the current repository by default or a named managed repository, with templated commit messages, trailers, clean no-ops, and push-after-commit with retry. Inherits --skip-hooks and on_failure.
  • Local Git hook shims: atmos git hooks install writes worktree-aware .git/hooks/* shims (marker-protected, --force to overwrite, warns when core.hooksPath is set); atmos git hooks run dispatches configured commands with stdin forwarding and exit-code propagation.
  • Error handling — new sentinels (ErrGitRepositoryNotFound, ErrGitAuthFailed, ErrGitPushRejected, ErrGitDirtyUnmanagedFiles, ErrGitPathEscapesWorktree, ErrGitHookNotConfigured, ErrGitRepositoryRequired, ErrGitProviderNotFound) with error-builder hints and exit-code mapping. Git stderr streams to the masked writer and is never embedded in error chains.
  • Docs & example — command pages under website/docs/cli/commands/git/, git configuration reference, hook kind docs, changelog blog post (atmos-gitops), roadmap milestone (CI/CD Simplification initiative), and a GitOps publishing demo at examples/gitops (reconcile → review → publish against a managed deployment repo via custom commands).
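
The "clone is reconcile" rule above (clone if absent, otherwise fetch, checkout, and fast-forward only) can be sketched as a function that plans the git commands to run. This is an illustration under assumptions: the function name is hypothetical and the exact commands the cli provider issues are not given in these notes.

```python
def reconcile_commands(uri, workdir, branch, remote="origin", workdir_exists=False):
    """Plan the git commands that bring `workdir` to the tip of `branch`.

    A missing workdir is cloned; an existing one (for example, a stale
    CI-cache restore) is updated with a fast-forward-only merge, so local
    history is never rewritten.
    """
    if not workdir_exists:
        return [["git", "clone", "--branch", branch, "--origin", remote, uri, workdir]]
    return [
        ["git", "-C", workdir, "fetch", remote, branch],
        ["git", "-C", workdir, "checkout", branch],
        ["git", "-C", workdir, "merge", "--ff-only", f"{remote}/{branch}"],
    ]


if __name__ == "__main__":
    # Hypothetical repository and workdir, for illustration only.
    for cmd in reconcile_commands(
            "https://example.com/org/deploy.git", "/tmp/deploy", "main",
            workdir_exists=True):
        print(" ".join(cmd))
```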

What this is — and isn't

Atmos owns the publishing side of GitOps: render → diff → commit → push, with centralized safety rules. Reconciliation stays with the consumer — Argo CD or Flux pulls from the repository, or CI applies on merge. There are no agents and no drift-correction loop in Atmos itself (explicit non-goal in the PRD); Atmos is the producer feeding the reconciler. This also isn't a replacement for the existing GitHub Actions plan/apply integration — it's the Git plumbing those pipelines use.

why

GitOps workflows have always needed glue: ad hoc scripts to render manifests into deployment repos, commit them, survive push races, and wire credentials. Atmos already owns rendering, lifecycle events, toolchain, and credentials (GitHub STS) — this PR gives it the Git operations between them, with centralized safety rules instead of per-pipeline shell scripts. It is the foundation for Kubernetes deployment-repository provisioning (Argo CD / Flux rendered-manifest publishing, on the kubernetes component branch) and a future github provider for pull-request-based publishing to protected branches.

references

  • PRD: docs/prd/git-ops.md (in this PR)
  • Coverage: pkg/git 86%, pkg/git/providers/cli 88%, pkg/hooks/kinds/git 94%, cmd/git 81%
  • Related: native CI cache (XDG-root archiving) and the Kubernetes component branch (consumes provision.git next)

Summary by CodeRabbit

Release Notes

  • New Features

    • Added the experimental atmos git command group: clone, pull (fast-forward-only), status, diff, commit, push (retries on contention), clean, list, init, plus hooks (install, run, uninstall).
    • Introduced managed Git repositories in atmos.yaml (git.repositories) with deterministic workdirs, git:: URI query params (ref, depth), CI no-arg checkout, and concurrent --all operations.
    • atmos git list now supports configurable columns/formatting and optional status probing.
  • Documentation

    • Added/updated CLI, configuration, GitOps, and hook documentation for the new Git surfaces.
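
The git:: URI form with ref and depth query parameters can be illustrated with a small parser. This Python sketch uses a hypothetical function name and a made-up repository URL; it shows one plausible way to split such a URI and is not how Atmos or go-getter parses it.

```python
from urllib.parse import parse_qs, urlsplit, urlunsplit


def parse_git_uri(uri):
    """Split a `git::<url>?ref=<ref>&depth=<n>` URI into (clone_url, ref, depth).

    Missing or empty parameters come back as None.
    """
    prefix = "git::"
    if not uri.startswith(prefix):
        raise ValueError("expected a git:: URI")
    parts = urlsplit(uri[len(prefix):])
    query = parse_qs(parts.query)
    ref = query.get("ref", [None])[0]
    depth_raw = query.get("depth", [None])[0]
    depth = int(depth_raw) if depth_raw is not None else None
    # Drop the query string so the remainder is a plain clone URL.
    clone_url = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    return clone_url, ref, depth


if __name__ == "__main__":
    print(parse_git_uri("git::https://example.com/org/repo.git?ref=main&depth=1"))
```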

feat: support dotenv files in !include @osterman (#1930)

Summary

Adds explicit dotenv file support to the existing !include YAML function. Dotenv files now resolve to maps, so they can be used directly in CLI and stack env sections and with YAML merge keys.

env:
  <<: !include .env
  AWS_REGION: us-east-2

Dotenv files can also be layered with YAML merge sequences. This uses YAML's << merge-key syntax, the same YAML mechanism commonly used with anchors and aliases:

env:
  <<:
    - !include .env.local
    - !include .env
  AWS_REGION: us-east-2

In a YAML merge sequence, earlier items win, and inline keys under env override all merged values.
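
The precedence rule can be checked with a small model of the merge, treating each included dotenv file as a dictionary. This is a sketch of the rule as stated here, with a hypothetical function name, not Atmos's implementation.

```python
def merge_env(merge_sequence, inline):
    """Resolve `<<: [a, b, ...]` plus inline keys into one mapping.

    Earlier items in the sequence win over later ones, and inline keys
    override everything that was merged.
    """
    result = {}
    # Apply later items first so that earlier items overwrite them.
    for mapping in reversed(merge_sequence):
        result.update(mapping)
    result.update(inline)
    return result


if __name__ == "__main__":
    env_local = {"LOG_LEVEL": "debug", "AWS_REGION": "us-west-1"}  # .env.local
    env_base = {"LOG_LEVEL": "info", "APP_NAME": "demo"}           # .env
    print(merge_env([env_local, env_base], {"AWS_REGION": "us-east-2"}))
```

In this example LOG_LEVEL comes from .env.local (the earlier item), APP_NAME from .env, and AWS_REGION from the inline key.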

What Changed

  • Parse .env, .env.*, and exact *.env filenames as dotenv files when used with !include
  • Support env: !include .env and env: { <<: !include .env } / block merge forms in stack config
  • Support dotenv !include in atmos.yaml env, including merge sequences for layered dotenv files
  • Preserve !include.raw behavior for raw file contents
  • Keep .envrc and foo.env.local unsupported/raw; Atmos does not auto-load or execute dotenv files
  • Preserve YAML custom tags during schema validation so env: !include .env satisfies stack manifest schema rules
  • Update the stack manifest JSON schema description for env to document the !include string form
  • Document dotenv includes in both CLI env and stack env docs, including YAML merge-key behavior, include path resolution, and layered files
  • Add a short blog post for explicit dotenv inclusion
  • Add a roadmap milestone entry for the shipped dotenv !include support
  • Add coverage-focused tests for dotenv merge-key retry handling, include path helpers, case-preservation helpers, and YAML custom-tag conversion
  • Harden the LocalStack demo provider config to use the local edge endpoint directly, path-style S3, and skip AWS account-ID discovery so Terraform does not hang before reaching LocalStack in CI
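
The filename rules in the list above (.env, .env.*, and *.env are parsed as dotenv; .envrc and foo.env.local are not) can be expressed as a short predicate. This Python sketch is an interpretation of those rules with a hypothetical function name; details such as case sensitivity are assumptions.

```python
import os


def is_dotenv_filename(path):
    """Return True when the file name matches .env, .env.*, or *.env."""
    name = os.path.basename(path)
    if name == ".env":
        return True
    # .env.<suffix>, e.g. .env.local
    if name.startswith(".env.") and len(name) > len(".env."):
        return True
    # <anything>.env, e.g. prod.env
    return name.endswith(".env")


if __name__ == "__main__":
    for candidate in (".env", ".env.local", "prod.env", ".envrc", "foo.env.local"):
        print(candidate, is_dotenv_filename(candidate))
```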

Tests

  • cd examples/demo-localstack && ATMOS_IDENTITY=false go run ../.. describe component demo -s dev --format json --logs-level Off | jq '.providers.aws'
  • cd examples/demo-localstack && ATMOS_IDENTITY=false go run ../.. validate stacks --logs-level Off
  • go test ./pkg/config ./pkg/validator ./pkg/filetype
  • go test ./internal/exec -run 'TestGenerateProviderOverrides|TestGenerateProviderOverridesForAliases|TestProcessStackConfigProviderSection'
  • go test ./pkg/config ./pkg/validator -coverprofile=.context/dotenv-include-coverage.out
  • go test ./pkg/utils -run 'TestInclude(Dotenv|ExtensionBased|RawFunction|WithNoExtension)'
  • node -e "import('./website/src/data/roadmap.js').then(() => console.log('roadmap import ok'))"
  • git diff --check
  • Real stack manifest schema regression: env: !include .env validates against tests/fixtures/schemas/atmos/atmos-manifest/1.0/atmos-manifest.json
  • Commit hooks passed: go-fumpt, Go build, go mod tidy, golangci-lint, whitespace/EOF/large-file checks

Closes DEV-2990

feat(ci): GitHub Actions build cache (atmos ci cache) @osterman (#2579)

what

  • Add a CI build cache that restores the well-known Atmos cache root (~/.cache/atmos — toolchain binaries, vendored components, remote import clones, provider/plugin caches) at startup and saves it at exit, using the same store actions/cache uses (GitHub Actions Cache Service v2).
  • New atmos ci cache subcommands: restore, save, list, delete — so the lifecycle can run in one invocation or be spread across CI steps.
  • New ci.cache configuration block (enabled, auto: off|restore|save|both, root, paths, key, restore_keys, compression) with ATMOS_CI_CACHE_* env overrides.
  • Model it as a CI-provider capability (provider.CacheProvider + ci.DetectCache()) with a backend registry (pkg/ci/cache) and a GitHub Actions implementation (pkg/ci/cache/github), mirroring the existing artifact subsystem; outside a runner it's a clean no-op.
  • Consolidate the default toolchain install path under the XDG cache root (~/.cache/atmos/toolchain) so a single cache captures it; add a PRD, command/config docs, blog post, and roadmap entry.

why

  • In CI, every job re-downloads the toolchain, providers, and modules from upstream, wasting time and bandwidth and exposing runs to transient and rate-limit failures. Persisting the cache root across jobs makes executions faster and more reliable, and reduces supply-chain exposure.
  • Teams otherwise hand-wire an actions/cache step and own the key/path logic themselves; Atmos already knows its cache root and can derive a stable key from toolchain.lock.yaml + OS/arch, so it's two settings to enable.
  • Cache entries are write-once; a per-run state marker makes automatic and manual usage idempotent (an exact-key hit on restore skips the save), so the same operations work whether triggered automatically or via the subcommands.
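
The two ideas above, a stable key derived from toolchain.lock.yaml plus OS/arch and skipping the save after an exact-key restore hit, can be sketched as follows. The key format, the hash truncation, and the function names are all assumptions for illustration; the real key template is not given in these notes.

```python
import hashlib


def cache_key(lockfile_bytes, os_name, arch, prefix="atmos"):
    """Derive a deterministic key from the lock file contents and platform.

    The `<prefix>-<os>-<arch>-<digest>` shape is hypothetical.
    """
    digest = hashlib.sha256(lockfile_bytes).hexdigest()[:16]
    return f"{prefix}-{os_name}-{arch}-{digest}"


def should_save(restored_key, current_key):
    """Entries are write-once: an exact-key hit on restore means the entry
    already exists, so saving again would be pointless. A miss (None) or a
    fallback hit on an older key still warrants a save.
    """
    return restored_key != current_key


if __name__ == "__main__":
    key = cache_key(b"tools:\n  opentofu: 1.9.0\n", "linux", "amd64")
    print(key)
    print(should_save(restored_key=key, current_key=key))   # exact hit
    print(should_save(restored_key=None, current_key=key))  # miss
```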

references

  • PRD: docs/prd/native-ci/framework/ci-cache.md
  • Docs: /cli/commands/ci/cache and /cli/configuration/ci/cache
  • GitHub Actions Cache Service v2 (the store actions/cache uses)

Summary by CodeRabbit

  • New Features
    • Added native CI build caching: atmos ci cache group with paths, restore, save, list, and delete, including GitHub Actions-backed caching, admin list/delete, and template-based key/restore-key generation.
    • Automatic restore-on-start and save-on-exit when enabled and cache-capable; provider capability is respected outside supported CI.
  • Documentation
    • New/updated docs for CLI commands, ci.cache configuration, PRD/blog, and supporting GitHub Actions.
  • Tests
    • Expanded unit/integration coverage for archive safety, key/config resolution, backend behavior, manager lifecycle, and CLI output.
  • Chores
    • Updated acceptance caching, snapshots/docs, and aligned toolchain default install path with XDG cache.

🚀 Enhancements

fix(flags): scope --skip-hooks to the terraform command subtree @osterman (#2578)

what

  • Scope --skip-hooks to the terraform command subtree. The flag (and ATMOS_SKIP_HOOKS) moved off the global flag set onto atmos terraform and its subcommands, so it no longer appears in the help of unrelated commands (auth, helmfile, atlantis, toolchain, about, secret, …). Lifecycle hooks only ever run on terraform plan/apply/deploy.
  • Stop tracking native-ci CI scratch output. tests/fixtures/scenarios/native-ci/{github-output,github-step-summary}.txt are runtime artifacts; gitignored and untracked (matching the newer native-ci-gha-plan scenario).
  • Standardize the CLI test suite on OpenTofu. The suite forces ATMOS_COMPONENTS_TERRAFORM_COMMAND=tofu via a single test-harness default, gates every binary-invoking test on a precondition so a missing binary skips cleanly (instead of baking "executable file not found" into goldens), and sanitizes the harness-injected env var out of debug snapshots. A small parity set (terraform -help/-version passthrough) opts back into terraform.
  • Provision test tooling via the Atmos toolchain (dogfooding). TestMain installs any missing pinned binary (terraform/tofu/packer/helmfile/helm) through the Atmos toolchain itself and prepends it to PATH — "install as necessary", so CI (which supplies them via setup-* actions) downloads nothing while local runs become self-contained. No host binaries (brew, etc.) required.
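
The precondition gating described above (skip cleanly when a binary is missing instead of recording "executable file not found" in a golden file) is a general testing pattern. Here is a minimal sketch of it using Python's unittest; the Atmos suite itself is written in Go, so the helper and test class below are purely illustrative.

```python
import shutil
import subprocess
import unittest


def has_binary(name):
    """True when `name` resolves to an executable on PATH."""
    return shutil.which(name) is not None


@unittest.skipUnless(has_binary("tofu"), "tofu not found on PATH")
class TofuVersionTest(unittest.TestCase):
    """Runs only when the binary exists; otherwise it is reported as skipped."""

    def test_version_exits_cleanly(self):
        result = subprocess.run(
            ["tofu", "-version"], capture_output=True, text=True)
        self.assertEqual(result.returncode, 0)


if __name__ == "__main__":
    unittest.main()
```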

why

  • --skip-hooks on every command was misleading — hooks only run on terraform. Mirrors the existing --github-token/toolchain scoping precedent.
  • The native-ci scratch files were tracked, so every local run without terraform dirtied them. They're CI artifacts, not fixtures.
  • Test runs depended on whatever terraform/tofu binary was on the host; a missing binary silently corrupted golden snapshots and tracked fixtures. Standardizing on a single, license-clean (MPL) OpenTofu — with explicit preconditions — makes the suite deterministic and host-independent. The product runtime default stays terraform; only tests change.
  • Provisioning tools through the toolchain dogfoods the feature and removes the dependency on host-installed binaries, so the suite runs the same way everywhere.

references

  • Follows the --github-token/toolchain flag-scoping precedent in pkg/flags/global_builder.go.

Summary by CodeRabbit

  • New Features

    • Added --skill flag for AI context features across CLI commands (requires --ai).
  • Changes

    • Moved --skip-hooks from global flags to the atmos terraform command flags.
    • --skip-hooks applies to Terraform subcommands (plan/apply/deploy) and supports both no-value usage and comma-separated hook-name selection.
  • Documentation

    • Added/updated --skip-hooks documentation under Terraform command usage.
    • Removed --skip-hooks and ATMOS_SKIP_HOOKS from core global flag/environment variable references; updated hooks documentation accordingly.
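
The two usages of --skip-hooks noted above (no value, or a comma-separated list of hook names) could be modeled as below. This sketch assumes that the bare flag means "skip all hooks" and that an absent flag is represented as None; both are assumptions, as are the function names.

```python
def parse_skip_hooks(value):
    """Parse the flag value into (skip_all, names).

    value is None  -> flag not given: skip nothing
    value is ""    -> bare flag (assumed to mean: skip every hook)
    otherwise      -> comma-separated hook names to skip
    """
    if value is None:
        return False, frozenset()
    names = frozenset(part.strip() for part in value.split(",") if part.strip())
    if not names:
        return True, frozenset()
    return False, names


def should_skip(hook_name, parsed):
    """Decide whether a single hook is skipped for a parsed flag value."""
    skip_all, names = parsed
    return skip_all or hook_name in names


if __name__ == "__main__":
    # "store-outputs" and "notify" are hypothetical hook names.
    parsed = parse_skip_hooks("store-outputs, notify")
    print(should_skip("notify", parsed), should_skip("other", parsed))
```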

fix(toolchain): retry cosign verification on transport-level network errors @osterman (#2604)

what

  • Add a transportFlakeMarkers allowlist to the cosign retry classifier (pkg/toolchain/verification/signature_rekor.go) so transport-level network errors are retried like other transient Sigstore Rekor flakes:
    • stream error: stream ID (Go net/http2 stream errors — covers all HTTP/2 error codes and both send/recv variants)
    • connection reset by peer
    • TLS handshake timeout
    • i/o timeout
    • unexpected EOF
  • Extend TestClassifyCosignError with the exact error observed in CI plus one case per new marker, and add TestRunCosignWithRetry_RecoversFromTransportFlake covering end-to-end retry recovery.
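
The allowlist idea can be illustrated with a substring check over the five markers listed above. This Python sketch is not the Go classifier in signature_rekor.go; the function name and the plain case-sensitive substring match are assumptions.

```python
# Marker strings copied from the list above.
TRANSPORT_FLAKE_MARKERS = (
    "stream error: stream ID",
    "connection reset by peer",
    "TLS handshake timeout",
    "i/o timeout",
    "unexpected EOF",
)


def is_transport_flake(error_text):
    """True when the error text contains any allowlisted transport marker.

    Anything not on the allowlist (for example, a signature mismatch) is
    treated as a real verdict and is not retried.
    """
    return any(marker in error_text for marker in TRANSPORT_FLAKE_MARKERS)


if __name__ == "__main__":
    observed = ("searching log query: stream error: stream ID 1; "
                "INTERNAL_ERROR; received from peer")
    print(is_transport_flake(observed))
    print(is_transport_flake("signature verification failed"))
```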

why

CI failed on TestToolchainCustomCommands_InstallAllTools/Install_tofu while toolchain install opentofu/opentofu@1.9.0 was verifying the download signature. Cosign's query to the Sigstore Rekor transparency log died with:

searching log query: stream error: stream ID 1; INTERNAL_ERROR; received from peer

Atmos already retries cosign flakes (runCosignWithRetry, 5 attempts with exponential backoff), but the retryable classification is a deliberate allowlist that only recognized Rekor HTTP response flakes (searchLogQueryBadRequest, the IEEE_P1363 decode error, and 5xx scoped to the tlog retrieve endpoint). An HTTP/2 transport error matched none of the markers, so it surfaced on the first attempt with no retry.

Broadening to transport-level failures is safe within the allowlist's design rule: the allowlist exists so a real signature verdict (tampering, identity mismatch, expired cert) is never silently retried away. A transport failure means the request never completed and no verdict was rendered, so retrying it categorically cannot mask tampering. Existing negative tests (tampered artifact, identity mismatch, generic failure) continue to assert those still fail on the first attempt.
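
The design rule can be sketched as a retry loop that consults the classifier before every retry, so a non-retryable error fails on the first attempt. The five attempts come from the description above; the base delay, the doubling schedule, and all names are assumptions, and this is not the Go runCosignWithRetry function.

```python
import time


def run_with_retry(operation, is_retryable, attempts=5, base_delay=1.0,
                   sleep=time.sleep):
    """Call `operation` until it succeeds or a non-retryable error occurs.

    `is_retryable` receives the error message; only allowlisted failures
    are retried, with exponentially growing delays between attempts.
    """
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except Exception as err:
            # Real verdicts and the final attempt are surfaced immediately.
            if attempt == attempts or not is_retryable(str(err)):
                raise
            sleep(base_delay * 2 ** (attempt - 1))


if __name__ == "__main__":
    state = {"calls": 0}

    def flaky_verify():
        state["calls"] += 1
        if state["calls"] < 3:
            raise RuntimeError("unexpected EOF")
        return "verified"

    # Use a no-op sleep so the demo finishes instantly.
    print(run_with_retry(flaky_verify, lambda msg: "unexpected EOF" in msg,
                         sleep=lambda _: None))
```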

references

  • Observed failure: Acceptance Tests (linux), TestToolchainCustomCommands_InstallAllTools/Install_tofu

Summary by CodeRabbit

  • Bug Fixes
    • Signature verification now automatically retries on transient network/transport failures (e.g., HTTP/2 stream errors, connection resets, TLS handshake timeouts, I/O timeouts, unexpected EOF), improving reliability during temporary infrastructure disruptions.
  • Tests
    • Added tests that validate retry behavior and recovery from transport-layer flakes.
