fix: Add missing doc redirects for old core-concepts URLs @osterman (#2287)
## what
- Adds 25 new client-side redirects for old `/core-concepts/` URLs that are still indexed by Google and cached by LLMs, causing 404 errors
- Fixes 2 existing redirects that had invalid trailing slashes on `/vendor/component-manifest/` targets (was causing Docusaurus build validation errors)
New redirect categories:
- 4 screenshot-confirmed 404s (vendoring, component-management, provisioning, schemas)
- 7 project section redirects (`/core-concepts/projects/*` → `/projects/` and `/cli/configuration/`)
- 7 stacks sub-pages (define-components, settings, components, backend, vars, env, providers)
- 2 share-data / remote-state redirects
- 2 vendor sub-pages (component-manifest, vendor-manifest)
- 1 describe page redirect
- 2 component sub-pages (packer, ansible)
## why
- Old `/core-concepts/` URLs are still indexed by Google and widely cached in LLM training data
- LLMs frequently generate links to these old URLs when helping users with Atmos, leading to broken links and poor developer experience
- Each broken URL was verified by live-fetching the page and confirming a 404 response
- Each redirect target was cross-referenced against `llms.txt` to ensure validity
## references
- Verified via `site:atmos.tools/core-concepts` Google searches
- All redirect targets validated against the Docusaurus build (`npm run build` passes)
Summary by CodeRabbit
- Bug Fixes
  - Fixed numerous broken documentation links and improved navigation by adding and updating redirect rules across Projects, Stacks, Components, Vendor, and related pages (including removal of trailing-slash redirect mismatches) so users are directed to correct docs URLs.
- Chores
  - Updated CI workflow runner constraints to refine automated job scheduling.
🚀 Enhancements
Fix multi-region provider aliases generating incorrect Terraform JSON format @[copilot-swe-agent[bot]](https://github.com/apps/copilot-swe-agent) (#2210)
When configuring providers with dot-notation aliases (e.g., `aws.use1`), the generated `providers_override.tf.json` emitted an invalid structure: separate top-level keys instead of the array-of-objects format Terraform's JSON syntax requires for multiple provider instances.
Changes
- `pkg/terraform/output/backend.go`: Added exported `ProcessProviderAliases`, which detects dot-notation provider keys, groups all configurations for the same provider type into an array (base config first, aliases sorted), and leaves non-aliased providers unchanged
- `internal/exec/utils.go`: Updated `generateComponentProviderOverrides` to delegate to `tfoutput.ProcessProviderAliases`, eliminating duplicated logic
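The grouping rule described in the change list can be sketched in a few lines of Go. This is a minimal illustration under stated assumptions, not the code that shipped: `groupProviderAliases` and `renderProviders` are hypothetical names, and the sketch does not cover alias auto-derivation.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strings"
)

// groupProviderAliases is a hypothetical stand-in for ProcessProviderAliases.
// Keys of the form "type.alias" are grouped under "type" as an array:
// the base config first, then the aliased configs sorted by key.
func groupProviderAliases(providers map[string]any) map[string]any {
	aliasKeys := map[string][]string{} // provider type -> dotted keys
	for key := range providers {
		if base, _, dotted := strings.Cut(key, "."); dotted {
			aliasKeys[base] = append(aliasKeys[base], key)
		}
	}

	out := make(map[string]any, len(providers))
	for key, cfg := range providers {
		if strings.Contains(key, ".") {
			continue // handled in the grouping pass below
		}
		if _, hasAliases := aliasKeys[key]; !hasAliases {
			out[key] = cfg // non-aliased providers pass through unchanged
		}
	}

	for base, keys := range aliasKeys {
		sort.Strings(keys)
		var group []any
		if baseCfg, ok := providers[base]; ok {
			group = append(group, baseCfg)
		}
		for _, key := range keys {
			group = append(group, providers[key])
		}
		out[base] = group
	}
	return out
}

// renderProviders wraps the grouped result in a top-level "provider" key
// and returns it as compact JSON (empty string on marshal failure).
func renderProviders(providers map[string]any) string {
	b, err := json.Marshal(map[string]any{"provider": groupProviderAliases(providers)})
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	fmt.Println(renderProviders(map[string]any{
		"aws":      map[string]any{"region": "us-east-2"},
		"aws.use1": map[string]any{"region": "us-east-1", "alias": "use1"},
	}))
}
```

Sorting the alias keys keeps the generated file stable across runs, since Go map iteration order is not deterministic.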
Example
Given stack config:
providers:
  aws:
    region: us-east-2
  aws.use1:
    region: us-east-1
    alias: use1
Before:
{ "provider": { "aws": { "region": "us-east-2" }, "aws.use1": { "alias": "use1", "region": "us-east-1" } } }
After:
{
"provider": {
"aws": [
{ "region": "us-east-2" },
{ "alias": "use1", "region": "us-east-1" }
]
}
}
Original prompt
<issue_title>Multi-Region with Provider Aliases example is not working</issue_title>
<issue_description>### Describe the Bug
The example at https://atmos.tools/stacks/providers#multi-region-with-provider-aliases is not working: the actual generated file is different from the example.
Expected Behavior
The generated file is the same as the example.
Steps to Reproduce
With the following atmos component config:
components:
  terraform:
    eip:
      providers:
        aws:
          region: us-east-2
        aws.use1:
          region: us-east-1
          alias: use1
      metadata:
        component: eip
Run atmos command and check the output of providers_override.tf.json
Screenshots
The content of the generated providers_override.tf.json
{ "provider": { "aws": { "region": "us-east-2" }, "aws.use1": { "alias": "use1", "region": "us-east-1" } } }
Would expect it to be:
{ "provider": { "aws": [ { "region": "us-east-2" }, { "alias": "use1", "region": "us-east-1" } ] } }
Environment
- OS: OSX
- Version: 1.209.0
- Terraform version: v1.14.7
Additional Context
No response</issue_description>
- Fixes #2208
Summary by CodeRabbit
- New Features
  - Added support for provider aliases, both explicit and auto-derived from dot-notation provider keys (e.g., `aws.use1`).
  - Providers are now properly grouped into arrays in generated Terraform provider override files.
- Tests
  - Added integration tests for provider alias scenarios.
- Documentation
  - Updated provider documentation to clarify alias auto-derivation behavior.
fix(list): gate `list instances --upload` on `settings.pro.enabled` @osterman (#2330)
## what
- Change `atmos list instances --upload` to filter instances by `settings.pro.enabled == true` (strict boolean) instead of `settings.pro.drift_detection.enabled == true`.
- Rename `isProDriftDetectionEnabled` → `isProEnabled` and simplify the check to a single lookup on `settings.pro.enabled`; `drift_detection.enabled` is no longer consulted.
- Update all unit, integration, comprehensive, cmd, and benchmark tests to the new fixture shape; add an explicit case proving `pro.enabled: true` with `drift_detection.enabled: false` is now enabled.
- Update `website/docs/cli/commands/list/list-instances.mdx` to document the filter criterion under `--upload`, in the examples section, and in the `:::tip` block (noting it must be a boolean, not the string `"true"`).
## why
- Users with `settings.pro.enabled: true` configured on their components were hitting `No Atmos Pro-enabled instances found; nothing to upload.` even when Pro was clearly enabled, because the filter required the narrower `drift_detection.enabled` sub-key.
- `settings.pro.enabled` is the correct top-level enablement flag for Pro; drift detection is one feature among several and shouldn't gate the whole upload.
- The docs previously described `--upload` without specifying what made an instance eligible, so the failure mode was invisible to users.
Behavior change (callout)
Components that previously qualified via only `settings.pro.drift_detection.enabled: true` (without `pro.enabled: true`) will now be excluded from `--upload`. Users in that shape must add `settings.pro.enabled: true`.
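The strict-boolean rule is easy to get subtly wrong, so here is a minimal sketch of what such a check might look like. It assumes the component's `settings` section has already been decoded into a `map[string]any`; the function body is illustrative and is not copied from the Atmos source.

```go
package main

import "fmt"

// isProEnabled reports whether settings.pro.enabled is the boolean true.
// Illustrative sketch: missing keys, non-map values, and non-boolean values
// (such as the string "true") are all treated as disabled.
func isProEnabled(settings map[string]any) bool {
	pro, ok := settings["pro"].(map[string]any)
	if !ok {
		return false
	}
	enabled, ok := pro["enabled"].(bool)
	return ok && enabled
}

func main() {
	boolean := map[string]any{"pro": map[string]any{"enabled": true}}
	quoted := map[string]any{"pro": map[string]any{"enabled": "true"}}
	driftOnly := map[string]any{"pro": map[string]any{
		"drift_detection": map[string]any{"enabled": true},
	}}

	fmt.Println(isProEnabled(boolean))   // boolean true qualifies
	fmt.Println(isProEnabled(quoted))    // the string "true" does not
	fmt.Println(isProEnabled(driftOnly)) // drift detection alone no longer qualifies
}
```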
## references
- `--upload` was introduced in #2322
Summary by CodeRabbit
- Bug Fixes
  - Pro detection simplified: only an explicit boolean `settings.pro.enabled=true` marks an instance as Pro; missing/non-boolean values are treated as disabled.
  - Upload behavior: all collected instances are uploaded; post-upload summary shows total uploaded plus enabled/disabled and drift-enabled counts.
  - Improved Pro authentication hints for GitHub Actions and workspace ID.
- Documentation
  - CLI docs updated to reflect new upload semantics, payload shape, and the "No instances found; nothing to upload." message.
- Tests
  - Tests updated/added to cover the new Pro flag shape, counting, and upload behavior.
Fix: Identity names with dots incorrectly parsed by Viper @[copilot-swe-agent[bot]](https://github.com/apps/copilot-swe-agent) (#2129)
- [x] Initial plan for fixing identity names with dots
- [x] Add `fixAuthIdentities()` to re-parse identities from raw YAML
- [x] Extract shared decode hooks into `getAtmosDecodeHookFunc()`
- [x] Apply fix in `LoadConfig()` and `loadConfigFromCLIArgs()`
- [x] Add test case `TestIdentityNamesWithDots`
- [x] Use atmosConfig in perf.Track for consistency
- [x] Remove debug log message that caused test snapshot failures
- [x] Add error handling test cases to increase coverage to 84.6%
Original prompt
<issue_title>Zero-Configuration AWS SSO Identity Management: identity containing dots break it.</issue_title>
<issue_description>### Describe the Bug
Testing
auth:
  providers:
    sso-prod:
      kind: aws/iam-identity-center
      start_url: https://my-org.awsapps.com/start
      region: us-east-1
      auto_provision_identities: true # One line to enable
I do get a list of identities in `~/.cache/atmos/auth/sso-prod/provisioned-identities.yaml`.
Some of them contain dots, e.g.
product.usa/ReadOnlyAccess: # <=== The "." here breaks it
  kind: aws/permission-set
  provider: sso-prod
  via:
    provider: sso-prod
  principal:
    account:
      id: "000000000000"
      name: product.usa
    name: ReadOnlyAccess
Which atmos does not support:
    $ atmos auth list
    Initialize Identities Error: invalid identity kind
    ## Explanation
    unsupported identity kind:
    Initialize Identities Error: failed to initialize identities: invalid identity config: identity=product: invalid identity kind: unsupported identity kind:
    Error
    Error: invalid auth config: failed to create auth manager: failed to initialize identities: invalid identity config: identity=product: invalid identity kind: unsupported identity kind:
Expected Behavior
it works :-)
Steps to Reproduce
Cf. bug description
Screenshots
No response
Environment
atmos 1.207.0
Additional Context
No response</issue_description>
- Fixes #2128
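The error above names the identity as `product`, which is what one would expect if the key `product.usa/ReadOnlyAccess` were split on the dot. The sketch below is a simplified model of delimiter-based key handling, not Viper's actual code; `setByPath` is a hypothetical helper written only to illustrate the failure mode and why re-parsing identities from raw YAML (where keys stay literal) avoids it.

```go
package main

import (
	"fmt"
	"strings"
)

// setByPath stores value under key, treating delim as a nesting separator.
// This models how a dotted key can be turned into nested maps; it is an
// illustration, not the behavior of any specific library function.
func setByPath(root map[string]any, key, delim string, value any) {
	parts := strings.Split(key, delim)
	node := root
	for _, part := range parts[:len(parts)-1] {
		child, ok := node[part].(map[string]any)
		if !ok {
			child = map[string]any{}
			node[part] = child
		}
		node = child
	}
	node[parts[len(parts)-1]] = value
}

func main() {
	const identity = "product.usa/ReadOnlyAccess"

	// Delimiter-aware storage: the identity name is split at the dot.
	split := map[string]any{}
	setByPath(split, identity, ".", "aws/permission-set")
	_, hasProduct := split["product"]
	fmt.Println("top-level key \"product\" exists:", hasProduct)

	// Literal storage, as when decoding the raw YAML mapping directly.
	literal := map[string]any{identity: "aws/permission-set"}
	_, hasFull := literal[identity]
	fmt.Println("full identity name exists:", hasFull)
}
```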
fix(toolchain): resolve aliases in `toolchain exec` / `toolchain which` lookups @osterman (#2332)
## what
- Route `findBinaryPath` (used by `atmos toolchain exec` and `atmos toolchain which`) through the existing alias-aware `LookupToolVersion` helper instead of a raw `toolVersions.Tools[name]` map lookup.
- Derive `owner/repo` from the resolved canonical key so the computed install path matches what the write side persisted.
- Add a regression test that reproduces the bug: `.tool-versions` storing `helm/helm 3.20.2` + an alias `helm → helm/helm` now resolves via `WhichExec("helm")`.
## why
- Symptom: `atmos toolchain install helm@3.20.2` succeeds, but `atmos toolchain exec -- helm …` then errors with `tool 'helm' not configured in .tool-versions` and tries to re-install.
- Root cause: the write side already canonicalizes via the resolver (`wouldCreateDuplicate` → `aliasConflictsWithFullName`), so entries land under the `owner/repo` key. The read side did a raw map lookup with no resolver, so an alias query missed the canonical entry: the classic write/read asymmetry.
- Fix keeps the read side symmetric with the write side by reusing the helper that already exists for exactly this purpose.
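The write/read asymmetry can be reproduced with two small maps. The following is a minimal sketch under assumptions: the `toolVersions` shape, the alias table, and `lookupToolVersion` are simplified, hypothetical stand-ins for the real types and for `LookupToolVersion`.

```go
package main

import "fmt"

// toolVersions is a simplified model of a parsed .tool-versions file:
// canonical "owner/repo" key -> configured versions.
type toolVersions struct {
	Tools map[string][]string
}

// lookupToolVersion resolves name directly first, then through the alias
// table, and returns the canonical key together with the first version.
func lookupToolVersion(tv toolVersions, aliases map[string]string, name string) (key, version string, found bool) {
	if versions, ok := tv.Tools[name]; ok && len(versions) > 0 {
		return name, versions[0], true
	}
	if canonical, ok := aliases[name]; ok {
		if versions, ok := tv.Tools[canonical]; ok && len(versions) > 0 {
			return canonical, versions[0], true
		}
	}
	return "", "", false
}

func main() {
	tv := toolVersions{Tools: map[string][]string{"helm/helm": {"3.20.2"}}}
	aliases := map[string]string{"helm": "helm/helm"}

	// Raw map lookup: the alias misses the canonical entry.
	_, rawHit := tv.Tools["helm"]
	fmt.Println("raw lookup finds helm:", rawHit)

	// Alias-aware lookup: resolves to the canonical key and version.
	key, version, found := lookupToolVersion(tv, aliases, "helm")
	fmt.Println(key, version, found)
}
```

Returning the canonical key matters because the install path is derived from `owner/repo`, not from the alias the user typed.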
## references
- Out of scope, tracked separately: `RunInstall` persisting the literal string `latest` to `.tool-versions` when installing without an explicit version, and wiring `pkg/toolchain/filemanager` / `pkg/toolchain/lockfile` into install/uninstall/set/exec.
Summary by CodeRabbit
- Bug Fixes
  - Fixed tool alias resolution to correctly locate binary paths when requesting tools by their registered alias names instead of canonical identifiers. The system now properly maps aliases to their resolved canonical entries before checking availability.
fix: resolve JIT workdir path for !terraform.state, !terraform.output, and atmos.Component @zack-is-cool (#2328)
## What
Bug fix PR. Makes `!terraform.state`, `!terraform.output`, and `atmos.Component` work correctly for JIT workdir components (`provision.workdir.enabled: true`). All three were silently broken in ways that only surfaced at runtime.
Four fixes:
1. `!terraform.state` path resolution: resolves the state path from `.workdir/terraform/<stack>-<component>/` instead of the static source directory, which JIT components never write state to.
2. `!terraform.output` / `atmos.Component` auto-provision: provisions the JIT workdir before `terraform init` so output references work on any machine, not just ones with a pre-existing workdir from a prior apply.
3. Source-provisioned JIT workdir support: Fix 2 only handled local-copy provisioning. For `source.uri` components, `!terraform.output` now hydrates from the source URI before init. Also fixes the `extractComponentName` fallback and a go-getter `FileGetter` dst-must-not-exist invariant.
4. Provisioner output interleaving: `ui.ClearLine()` before status writes prevents the bubbletea spinner from leaving leading whitespace on provisioner messages.
Correctness & security fixes:
- TOCTOU race: `sync.Map.Load` + `Store` replaced with `LoadOrStore` inside the singleflight closure, eliminating the window where two goroutines could both enter `Provision`.
- Context cancellation: switched to `singleflight.DoChan` + `select` so waiters with cancelled contexts exit immediately. Added `context.WithoutCancel` so leader cancellation doesn't abort shared provisioning work.
- Path traversal guard: `extractComponentPath` verifies the derived workdir path stays within `filepath.Abs(basePath)` before returning it; escaping paths fall back to `componentPath`. Mirrors the existing guard in `terraform_backend_local.go`.
- Actionable error hint: `ErrWorkdirProvision` now includes the full YAML path and env var to disable auto-provisioning.
- `loadConfigFromCLIArgs` env var bug: `setEnv(v)` was missing on the `--config` / `--config-path` code path, silently ignoring all `ATMOS_*` overrides when config was loaded from CLI args.
- Documentation: `auto_provision_workdir_for_outputs` and `ATMOS_COMPONENTS_TERRAFORM_AUTO_PROVISION_WORKDIR_FOR_OUTPUTS` added to the config/env var reference docs.
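For readers unfamiliar with the path traversal guard in the list above, a containment check of this kind might look like the sketch below. `containedPath` is a hypothetical helper, the paths are made up, and the example assumes a Unix-style filesystem; the actual `extractComponentPath` logic may differ.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// containedPath returns candidate when it resolves to a location inside
// basePath, and fallback otherwise. Both paths are made absolute (which also
// cleans ".." segments) before the relative path between them is inspected.
func containedPath(basePath, candidate, fallback string) string {
	absBase, err := filepath.Abs(basePath)
	if err != nil {
		return fallback
	}
	absCandidate, err := filepath.Abs(candidate)
	if err != nil {
		return fallback
	}
	rel, err := filepath.Rel(absBase, absCandidate)
	if err != nil || rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
		return fallback // candidate escapes the base directory
	}
	return candidate
}

func main() {
	fallback := "/repo/components/terraform/vpc"
	fmt.Println(containedPath("/repo", "/repo/.workdir/terraform/dev-vpc", fallback))
	fmt.Println(containedPath("/repo", "/repo/.workdir/../../etc", fallback))
}
```

Comparing via `filepath.Rel` rather than a plain string prefix avoids treating `/repo-other` as if it were inside `/repo`.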
## Why
JIT workdir components write their Terraform files to `.workdir/terraform/<stack>-<component>/` via a `before.terraform.init` hook, but that hook only fires during direct `atmos terraform` commands, not YAML function evaluation. Three distinct silent failures resulted:
1. `!terraform.state` looked in the source directory where JIT components have no state: unconditional failure.
2. `!terraform.output` computed the correct workdir path but never populated the directory before calling `terraform init`: fails with "no such file or directory" on any cold machine.
3. `!terraform.output` + `source.uri`: even with Fix 2, `ProvisionWorkdir` only copies local files. Source-provisioned components need `AutoProvisionSource` first, which only fires in the hook system the output executor never reaches.
Note on Fix 3 (source.uri components)
!terraform.output against a source-provisioned component with a cold workdir will fetch from source.uri — the same credentials already needed for atmos terraform apply. The fetch is cached per (stack, component) pair per process.
Set auto_provision_workdir_for_outputs: false (or ATMOS_COMPONENTS_TERRAFORM_AUTO_PROVISION_WORKDIR_FOR_OUTPUTS=false) to disable Fixes 2 and 3.
For state-only reads, prefer !terraform.state — no init, no source fetch, no terraform binary required.
Migration
No breaking changes. Previously-failing commands now work.
# Before (runs terraform init + output on every eval):
vpc_id: '{{ (atmos.Component "vpc" .stack).outputs.vpc_id }}'
# After (reads state file directly, no init):
vpc_id: !terraform.state vpc {{ .stack }} vpc_id
Resolves #2167
Summary by CodeRabbit
- New Features
  - Auto-provision JIT working directories before Terraform output evaluation (configurable, enabled by default).
  - Template/YAML functions can resolve state/outputs from JIT-provisioned and source-backed components.
- Security / Bug Fixes
  - Containment checks to prevent path traversal outside configured base path.
  - Safer fallbacks and debug logging when workdir/state resolution fails.
- Documentation
  - Docs and env var added for the new auto-provision setting.
- Tests
  - Extensive unit/integration tests covering JIT provisioning, resolution, caching, concurrency, and inheritance.
fix(auth): crash on standalone `ambient` identity; add global panic handler @aknysh (#2334)
## what
- Fix a hard `SIGSEGV` when Atmos authenticates a standalone `ambient` identity (`kind: ambient`). Every `atmos auth login` / `atmos auth whoami` / `atmos terraform ...` against such an identity crashed the process with a Go stack trace.
- Add a process-wide panic handler (`pkg/panics`) so any future uncaught panic renders a short, actionable crash message via `pkg/ui` instead of a raw Go goroutine dump, while preserving the full stack trace in a crash-report file for bug reports.
- Update `github.com/mikefarah/yq/v4` (4.52.5 → 4.53.2) and migrate Atmos's yq logger setup to the new slog-based API.
1. Ambient identity crash (primary fix)
Background: the generic ambient identity kind (docs/prd/ambient-identity.md) is a cloud-agnostic passthrough — Authenticate() returns (nil, nil) by design because credentials are resolved by the cloud SDK at subprocess runtime (IRSA / IMDS / ECS task role / environment), not by Atmos.
Bug: the auth manager forwarded those nil credentials straight to buildWhoamiInfo, which unconditionally invoked a method on the credential interface, producing a nil-interface dereference on the main goroutine.
Scope: standalone generic ambient identities. The AWS-specific aws/ambient was not affected because its Authenticate() resolves via the AWS SDK default chain and always returns real credentials.
Fix: buildWhoamiInfo now short-circuits safely when creds == nil and still returns a populated WhoamiInfo (realm, provider, identity, environment, timestamp). Environment is populated unconditionally so atmos auth whoami continues to report the expected surface for pure-passthrough ambient identities. Keystore cache, reference handle, BuildWhoamiInfo, and GetExpiration branches are skipped — there is nothing to cache for an identity that does not own credentials.
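The shape of that nil-credentials short-circuit can be illustrated with a small sketch. The `Credentials` interface, the `WhoamiInfo` fields, and `buildWhoami` below are deliberately simplified, hypothetical versions of the real types; only the guard pattern is the point.

```go
package main

import (
	"fmt"
	"time"
)

// Credentials is a simplified, hypothetical credential interface.
type Credentials interface {
	Expiration() time.Time
}

// staticCreds is a trivial implementation used for the demonstration.
type staticCreds struct{ exp time.Time }

func (c staticCreds) Expiration() time.Time { return c.exp }

// WhoamiInfo is a reduced, hypothetical version of the real struct.
type WhoamiInfo struct {
	Provider   string
	Identity   string
	Expiration *time.Time
}

// buildWhoami fills in identity metadata unconditionally and only touches
// the credential interface when one was actually returned. Calling a method
// on a nil interface value would panic, which is the crash described above.
func buildWhoami(provider, identity string, creds Credentials) WhoamiInfo {
	info := WhoamiInfo{Provider: provider, Identity: identity}
	if creds == nil {
		return info // passthrough identity: nothing to read or cache
	}
	exp := creds.Expiration()
	info.Expiration = &exp
	return info
}

func main() {
	ambient := buildWhoami("", "my-ambient", nil)
	fmt.Println(ambient.Identity, ambient.Expiration == nil)

	regular := buildWhoami("sso-prod", "dev", staticCreds{exp: time.Now().Add(time.Hour)})
	fmt.Println(regular.Identity, regular.Expiration != nil)
}
```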
Tests:
- `TestManager_buildWhoamiInfo_NilCredentials`: unit coverage of the nil-creds branch. Before the fix, this test panicked at `manager_whoami.go:25`.
- `TestManager_Authenticate_Ambient_Standalone`: end-to-end via real `NewAuthManager` + `Authenticate()`. Before the fix, this path panicked in the same location through `manager.go:294`.
Both pass post-fix alongside the existing whoami tests.
Full write-up: docs/fixes/2026-04-17-ambient-identity-nil-credentials.md.
2. Global panic handler
Motivation: the ambient crash surfaced as a wall of Go runtime output that was useless to end users. Any future bug of the same shape would produce the same bad experience. The handler is defensive infrastructure, not a workaround for the ambient fix — both ship together so a regression cannot reintroduce a raw crash.
Behavior:
- One deferred `panics.Recover(&exitCode)` at the top of `main.run()` covers every code path reachable synchronously from `cmd.Execute()`: every command, the `internal/exec/` pipeline, `pkg/auth/`, `pkg/stack/`, etc. Installed before `defer cmd.Cleanup()` so Cleanup runs normally on clean exit and Recover also catches anything that escapes Cleanup itself.
- User-facing output uses `pkg/ui` exclusively (per CLAUDE.md I/O/UI rules): red ✗ `Atmos crashed unexpectedly` headline, Markdown-rendered body with panic summary, version, OS/arch, Go build toolchain, command-line, crash-report path, and an issue-tracker link.
- Full stack is shown inline only when `ATMOS_LOGS_LEVEL=Debug` or `=Trace` (case-insensitive). Otherwise it is written to a `0o600` crash report at `$TMPDIR/atmos-crash-<UTC>-<pid>.txt` whose path appears in the friendly message.
- The panic is wrapped via `cockroachdb/errors.WithStack` and forwarded to `errUtils.CaptureError`, so Sentry (when configured) gets a proper event with breadcrumbs through the existing error pipeline.
- Exit code 1 matches the existing error-exit convention, so there is no CI/pre-commit behavior change.
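The deferred-recover pattern behind this behavior can be sketched with the standard library alone. This is an illustration under assumptions, not the `pkg/panics` implementation: `recoverToExitCode` and `run` are hypothetical names, the message goes straight to stderr instead of through `pkg/ui`, and the crash file name comes from `os.CreateTemp` (which creates the file with mode `0o600`) rather than the UTC/pid scheme described above.

```go
package main

import (
	"fmt"
	"os"
	"runtime/debug"
)

// recoverToExitCode must be installed with defer. It calls recover directly,
// writes the panic value and stack to a temp file, prints a short message,
// and sets the exit code to 1 when the pointer is non-nil.
func recoverToExitCode(exitCode *int) {
	r := recover()
	if r == nil {
		return // no panic in flight
	}
	stack := debug.Stack()
	reportPath := "(crash report could not be written)"
	if f, err := os.CreateTemp("", "crash-*.txt"); err == nil {
		fmt.Fprintf(f, "panic: %v\n\n%s", r, stack)
		f.Close()
		reportPath = f.Name()
	}
	fmt.Fprintf(os.Stderr, "crashed unexpectedly: %v\ncrash report: %s\n", r, reportPath)
	if exitCode != nil {
		*exitCode = 1
	}
}

// run executes work and converts a panic into exit code 1. The named result
// lets the deferred handler overwrite the value after a panic unwinds.
func run(work func()) (exitCode int) {
	defer recoverToExitCode(&exitCode)
	work()
	return 0
}

func main() {
	code := run(func() {
		var m map[string]int
		m["x"] = 1 // assignment to a nil map panics
	})
	fmt.Println("exit code:", code)
}
```

As the out-of-scope note in this section says, a handler like this only covers the goroutine it is deferred on; panics on spawned goroutines would need their own deferred call.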
Out of scope (tracked as follow-up): panics on spawned goroutines (signal handler, telemetry flushes, async work) — those need their own deferred Recover at each entry point.
Tests: 14 unit cases covering string / error / runtime.Error panic values, debug-mode on/off, crash-file write success and graceful failure, option defaults, env-gate matrix (canonical / lower / upper / whitespace / non-debug levels), and Recover with nil and non-nil exit-code pointers.
Manual verification: injected a nil-pointer dereference into the version command, ran ./build/atmos version in both default and ATMOS_LOGS_LEVEL=Debug modes. Exact output is reproduced in the fix doc for PR/release-note reuse.
Full write-up: docs/fixes/2026-04-17-global-panic-handler.md.
3. yq bump + logger API migration
github.com/mikefarah/yq/v4 is bumped from 4.52.5 → 4.53.2. The 4.53 line replaces yqlib's internal logger — previously built on op/go-logging.v1 — with one built on Go's standard log/slog. The old yqlib.GetLogger().SetBackend(backend logging.Backend) entry point is gone; the new API exposes SetLevel(slog.Level) and SetSlogger(*slog.Logger).
Atmos's pkg/utils/yq_utils.go used SetBackend with a no-op logBackend struct to silence yq's internal chatter unless Logs.Level == Trace. Without migration, atmos fails to build against the new yq with logger.SetBackend undefined.
Migration:
- Removed the `logBackend` type and its four methods (`Log`, `GetLevel`, `SetLevel`, `IsEnabledFor`) along with the `gopkg.in/op/go-logging.v1` import.
- Rewrote `configureYqLogger` to install an `io.Discard` slog handler via `yqlib.GetLogger().SetSlogger(...)` when the Atmos log level is not Trace. Semantics are preserved: yq's internal diagnostics are suppressed by default and only surface at Trace level.
- Deleted `TestLogBackend` from `pkg/utils/yq_utils_test.go` (tested a type that no longer exists). `TestConfigureYqLogger` and all `EvaluateYqExpression` tests still pass.
No behavior change for end users: templates and YAML-function calls that route through yq produce the same output with the same suppression of yq's internal logs.
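The discard-handler idea from the migration uses only the standard `log/slog` package, so it can be shown without importing yq. In the sketch below, `newLibraryLogger` and `captureOutput` are hypothetical helpers; in Atmos the resulting logger would presumably be handed to `yqlib.GetLogger().SetSlogger(...)` as described in the migration notes, which this example does not do.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"log/slog"
)

// newLibraryLogger returns a logger for a chatty dependency. Unless trace is
// set, every record is written to io.Discard and therefore suppressed.
func newLibraryLogger(trace bool, w io.Writer) *slog.Logger {
	if !trace {
		w = io.Discard
	}
	return slog.New(slog.NewTextHandler(w, &slog.HandlerOptions{Level: slog.LevelDebug}))
}

// captureOutput logs one diagnostic line and returns whatever reached the
// writer, which makes the suppression observable.
func captureOutput(trace bool) string {
	var buf bytes.Buffer
	newLibraryLogger(trace, &buf).Debug("internal diagnostic")
	return buf.String()
}

func main() {
	fmt.Printf("default: %q\n", captureOutput(false))
	fmt.Printf("trace has output: %v\n", captureOutput(true) != "")
}
```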
Also
- Bump `ATMOS_VERSION=1.216.0` in `examples/quick-start-advanced/Dockerfile` and two test fixtures that referenced the old version.
## why
- Ambient identity crash is a complete blocker. Any user running `atmos auth login` against a generic `ambient` identity (the canonical pattern for IRSA / IMDS / ECS task roles / cloud-agnostic passthrough) hits a hard SIGSEGV on every invocation. There is no workaround short of not using the identity kind, which defeats the reason the kind exists.
- The panic handler is defensive UX. Cloud-credential code paths are full of nil-interface boundaries; the ambient crash is proof that a similar bug could slip in again. Intercepting panics at the main-goroutine entry point turns any future incident of the same shape into a crisp bug-report loop (one friendly line + one file path to attach) instead of a wall of goroutine output, with the full stack one env var away for contributors.
- The yq bump is required to stay on a maintained yqlib. 4.53 is the current minor line; staying on 4.52 leaves us one release behind on upstream fixes and drifts further from the slog-based logger API that the rest of the Go ecosystem is converging on. The migration is a one-file change with identical user-visible behavior.
## references
- `docs/fixes/2026-04-17-ambient-identity-nil-credentials.md`: ambient crash fix: root cause, scope, tests, and why the fix belongs at the manager layer rather than synthesizing a credential stub in the identity.
- `docs/fixes/2026-04-17-global-panic-handler.md`: panic handler design, sample output (default + debug mode + crash report), test matrix, and follow-up items.
- `docs/prd/ambient-identity.md`: the ambient-identity PRD. The `(nil, nil)` return contract from `ambient.Authenticate()` is intentional for the generic kind; the bug was the manager failing to honor it.
- `.claude/agents/tui-expert.md`: `pkg/ui` output-channel rules the panic handler follows (stderr UI channel via `ui.Error` / `ui.MarkdownMessage`; never `fmt.Fprintf(os.Stderr, ...)`).
- `github.com/mikefarah/yq` v4.53.0 release notes: upstream changelog for the logger migration.
Summary by CodeRabbit
- New Features
  - Global panic recovery with user-friendly crash reports and automatic crash-file generation.
- Bug Fixes
  - Prevented crash when authenticating with generic ambient identities that return nil credentials; authentication now returns stable identity info without panicking.
- Documentation
  - Added detailed fix write-ups for panic recovery and nil-credential behavior.
- Tests
  - Added unit and integration tests covering panic handling and nil-credential authentication paths.
- Chores
  - Updated dependencies, bumped example default version to 1.216.0, adjusted logger handling, and refreshed NOTICE entries.