cloudposse/atmos v1.196.0 on GitHub

Support `!terraform.state` on GCS Backends @shirkevich (#1393)

# Add GCS backend support to `!terraform.state` YAML function

what

Add Google Cloud Storage (GCS) backend support to !terraform.state Atmos YAML function
Implement performance optimizations (client caching, retry logic, extended timeouts)
Create unified Google Cloud authentication system for consistency across GCP services
Update documentation with GCS backend usage examples and authentication methods

why

The !terraform.state YAML function allows reading the outputs (remote state) of components in Atmos stack manifests directly from the configured Terraform/OpenTofu backends.

Previously, the !terraform.state YAML function only supported:

local (Terraform and OpenTofu)
s3 (Terraform and OpenTofu)

This PR adds support for:

gcs (Google Cloud Storage - Terraform and OpenTofu)

With GCS backend support, users can now leverage the high-performance !terraform.state function instead of the slower !terraform.output or !store functions when using Google Cloud Storage for Terraform state storage.

Implementation Details

GCS Backend Features

Full Authentication Support: JSON credentials, service account file paths, and Google Application Default Credentials (ADC)
Service Account Impersonation: Support for impersonate_service_account configuration
Performance Optimizations:
- Client caching to avoid recreating GCS clients for repeated operations
- Retry logic with exponential backoff (up to 3 attempts) for transient failures
- Extended timeouts (30 seconds) to match S3 backend performance
Robust Error Handling: Graceful handling of missing state files and detailed error context
Resource Management: Proper cleanup and explicit resource management

Usage

The GCS backend works seamlessly with existing !terraform.state syntax:

# Get the `output` of the `component` in the current stack
subnet_id: !terraform.state vpc private_subnet_id

# Get the `output` of the `component` in the provided `stack` 
vpc_id: !terraform.state vpc dev-us-east-1 vpc_id

# Get complex outputs using YQ expressions
first_subnet: !terraform.state vpc .private_subnet_ids[0]

GCS Backend Configuration

The GCS backend supports all standard Terraform GCS backend configurations:

# atmos.yaml
components:
  terraform:
    backend_type: gcs
    backend:
      gcs:
        bucket: "my-terraform-state-bucket"
        prefix: "terraform/state"
        
        # Authentication options (choose one):
        
        # Option 1: JSON credentials content
        credentials: |
          {
            "type": "service_account",
            "project_id": "my-project",
            ...
          }
          
        # Option 2: Service account file path (use instead of Option 1)
        # credentials: "/path/to/service-account.json"
        
        # Option 3: Use Application Default Credentials (ADC)
        # (no credentials field needed - uses environment/metadata)
        
        # Optional: Service account impersonation
        impersonate_service_account: "terraform@my-project.iam.gserviceaccount.com"

Performance Benefits

Compared to !terraform.output, the !terraform.state function with GCS backend:

✅ No Terraform execution - Reads state directly from GCS
✅ No provider initialization - Skips all module and provider setup
✅ No varfile generation - Bypasses Terraform configuration preparation
✅ Cached clients - Reuses GCS clients for multiple operations
✅ Parallel execution - Multiple state reads can happen concurrently
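
The "cached clients" point above can be illustrated with a small mutex-guarded cache. This is only a sketch under assumptions: `client` is a hypothetical stand-in for a storage client, and the real implementation may key and store clients differently.

```go
package main

import "sync"

// client stands in for a storage client (hypothetical type, not the GCS SDK client).
type client struct{ key string }

// clientCache hands out one client per cache key and is safe for concurrent use.
type clientCache struct {
	mu      sync.Mutex
	clients map[string]*client
}

func newClientCache() *clientCache {
	return &clientCache{clients: make(map[string]*client)}
}

// get returns the cached client for key, building it with newFn on first use.
// Failed constructions are not cached, so a later call can try again.
func (c *clientCache) get(key string, newFn func() (*client, error)) (*client, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if cl, ok := c.clients[key]; ok {
		return cl, nil
	}
	cl, err := newFn()
	if err != nil {
		return nil, err
	}
	c.clients[key] = cl
	return cl, nil
}
```

Holding the lock while constructing the client keeps the sketch simple; it trades some concurrency for the guarantee that each key is built at most once.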

Testing

Comprehensive Test Suite: 100% test coverage for all new functionality
Mock Implementations: Complete interface-based testing for GCS operations
Authentication Testing: Validates all credential types and authentication flows
Error Scenario Coverage: Tests for missing files, network failures, and invalid configurations
Caching Validation: Ensures client caching works correctly across operations
Retry Logic Testing: Validates exponential backoff and failure recovery

Backward Compatibility

✅ No breaking changes to existing configurations
✅ Existing backends (local, s3) remain unchanged
✅ Same function syntax - no new parameters or options required
✅ Graceful fallbacks - continues to work with !terraform.output and !store functions

Files Changed

Core Implementation

internal/terraform_backend/terraform_backend_gcs.go - GCS backend implementation
internal/terraform_backend/terraform_backend_gcs_test.go - Comprehensive test suite
internal/terraform_backend/terraform_backend_registry.go - Register GCS backend
internal/terraform_backend/terraform_backend_utils.go - Updated error messages

Unified Authentication System

internal/gcp/auth.go - New unified Google Cloud authentication (created)
internal/gcp/auth_test.go - Authentication tests (created)
pkg/store/google_secret_manager_store.go - Updated to use unified auth
internal/gcp_utils/gcp_utils.go - Removed (replaced by unified auth)

Configuration & Documentation

internal/exec/terraform_generate_backend.go - GCS backend validation
website/docs/core-concepts/stacks/yaml-functions/terraform.state.mdx - Updated documentation
errors/errors.go - Added GCS-specific error types
go.mod - Added GCS storage dependency

Migration Guide

For users currently using !terraform.output or !store with GCS-stored state:

Before (slower)

# Using !terraform.output (requires Terraform execution)
vpc_id: !terraform.output vpc dev-us-east-1 vpc_id

# Using !store (requires separate state management)  
vpc_id: !store google-secret-manager dev/vpc/vpc_id

After (faster)

# Using !terraform.state (direct GCS state access)
vpc_id: !terraform.state vpc dev-us-east-1 vpc_id

Simply update your backend configuration to use gcs and replace function calls - no other changes needed!

Summary by CodeRabbit

New Features
- GCS-backed Terraform state support and unified Google Cloud authentication integration.
Bug Fixes
- Stricter backend config validation with clearer error responses and updated supported-backends messaging.
Tests
- Comprehensive unit tests added for GCS backend behavior and GCP authentication handling.

fix: Improve AWS credential isolation and auth error propagation @osterman (#1712)

## Summary

This PR addresses multiple authentication issues when using Atmos in containerized environments with mounted credential files:

Auth Pre-Hook Error Propagation - Terraform execution now properly aborts when authentication fails (e.g., Ctrl+C during SSO)
AWS Credential Loading Strategy - New LoadAtmosManagedAWSConfig() function provides proper isolation while preserving Atmos-managed profile selection
Noop Keyring Validation - Container auth now properly isolated from external environment variables
Whoami with Noop Keyring - atmos auth whoami now works in containerized environments
Test Coverage - Added test to verify auth errors properly abort execution

Changes

1. Auth Pre-Hook Error Propagation (`internal/exec/terraform.go:236`)

Problem: Errors from auth pre-hook were logged but not returned, causing terraform execution to continue even when authentication failed (e.g., user presses Ctrl+C during SSO)
Fix: Added return err after logging auth pre-hook errors
Impact: Terraform commands now properly abort on auth failures

2. AWS Credential Loading Strategy (`pkg/auth/cloud/aws/env.go`)

Problem: SDK's default config loading allowed IMDS access and was affected by external AWS_PROFILE, causing conflicts in containers
Solution: Created LoadAtmosManagedAWSConfig() function that:
- Clears credential env vars (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN)
- Preserves profile/path vars (AWS_PROFILE, AWS_SHARED_CREDENTIALS_FILE, AWS_CONFIG_FILE)
- Allows SDK to load from Atmos-managed credential files
Impact: Proper isolation while still using Atmos-managed profiles

3. Noop Keyring Credential Validation (`pkg/auth/credentials/keyring_noop.go`)

Problem: Used unrestricted config.LoadDefaultConfig() which allowed IMDS access and was affected by external AWS_PROFILE
Fix: Changed to use LoadAtmosManagedAWSConfig()
Impact: Container auth now properly isolated from external env vars

4. Whoami with Noop Keyring (`pkg/auth/manager.go`)

Problem: Whoami() expected credentials from keyring, but noop keyring returns ErrCredentialsNotFound by design
Fix: Added check for ErrCredentialsNotFound and fallback to buildWhoamiInfoFromEnvironment()
Impact: atmos auth whoami now works in containerized environments

5. Test Coverage (`internal/exec/terraform_test.go`)

Added TestExecuteTerraform_AuthPreHookErrorPropagation to verify auth errors properly abort execution
Test validates that terraform doesn't continue on auth failure
Updated test fixture to include required name_pattern configuration

Technical Details

The key insight is that Atmos sets AWS_PROFILE=identity-name (in pkg/auth/cloud/aws/setup.go:59) but the previous isolation approach cleared ALL AWS env vars including AWS_PROFILE. This caused the SDK to look for a non-existent [default] section.

The new LoadAtmosManagedAWSConfig preserves AWS_PROFILE while still preventing external credential conflicts.

Test Plan

go build . - Build succeeds
go test ./internal/exec -run TestExecuteTerraform - All terraform tests pass
TestExecuteTerraform_AuthPreHookErrorPropagation - New test passes
Verified test fails when fix is removed (terraform continues execution)
Verified test passes when fix is restored (terraform aborts on auth error)

References

Fixes authentication issues in containerized environments with mounted credentials.

🤖 Generated with Claude Code

Summary by CodeRabbit

New Features
- Added --login and cached-credentials-first flows across auth commands; whoami now shows validation and expiry.
- Atmos-managed credentials moved to XDG-compliant locations; improved shell enter/exit messages.
- Geodesic helper script for building/testing in containerized environments.
Bug Fixes
- Terraform pre-hook errors now abort execution.
- Improved propagation of user-abort during authentication.
Documentation
- XDG migration guides and Geodesic/CLI docs updated.
Tests
- Broad expansion of auth, AWS credential, auth-context and output-propagation tests.

fix: Relax stack config requirement for commands that don't operate on stacks @osterman (#1717)

## Summary

Fixes stack configuration requirement for 6 commands that don't actually operate on stack manifests. These commands were incorrectly requiring stacks.base_path and stacks.included_paths to be configured, causing errors like:

Error: failed to initialize atmos config
stack base path must be provided in 'stacks.base_path' config or ATMOS_STACKS_BASE_PATH' ENV variable

What

Updated 6 commands to use processStacks=false in InitCliConfig:

Auth Commands (Commit 1)

atmos auth env - Export cloud credentials as environment variables
atmos auth exec - Execute commands with cloud credentials
atmos auth shell - Launch authenticated shell

List/Docs Commands (Commit 2)

atmos list workflows - List workflows from workflows/ directory
atmos list vendor - List vendor configurations from component.yaml files
atmos docs <component> - Display component README files

Why

These commands only need:

Auth configuration from atmos.yaml
Component base paths (terraform, helmfile, etc.)
Workflow or vendor configurations

They do NOT need:

Stack manifests to exist
stacks.base_path to be configured
stacks.included_paths to be configured

This makes Atmos more flexible for use cases like:

CI/CD pipelines that only need auth or vendor management
Development environments without full stack setup
Documentation browsing without infrastructure configs
Workflow management separate from stack operations

Technical Details

Changes Made

InitCliConfig parameter: Changed processStacks from true to false
- Prevents validation requiring stacks.base_path and stacks.included_paths
- Skips processing of stack manifest files
checkAtmosConfig option (for list vendor only): Added WithStackValidation(false)
- Prevents checking if stacks directory exists
- Required because list vendor calls checkAtmosConfig() with additional validation

Files Changed

cmd/auth_env.go
cmd/auth_exec.go
cmd/auth_shell.go
cmd/list_workflows.go
cmd/list_vendor.go
cmd/docs.go

Commands That Still Require Stacks (Unchanged)

These were NOT modified because they genuinely need stack manifests:

atmos list stacks
atmos list components
atmos list settings
atmos list values
atmos list metadata

Testing

✅ All existing tests pass
✅ Linter passes with 0 issues
✅ Pre-commit hooks pass
✅ Manual testing confirms commands work without stack directories
✅ No regressions in existing functionality

References

Addresses user issue where atmos auth exec -- aws sts get-caller-identity failed with stack configuration error.

Co-Authored-By: Claude noreply@anthropic.com

Summary by CodeRabbit

New Features
- Auth and utility commands (auth env, auth exec, auth shell, list workflows, list vendor, docs) now run without requiring stack configuration, enabling use in CI/CD, vendor management, and documentation workflows.
Documentation
- Added a blog post describing the change, usage examples, migration tips, and CI/CD benefits.

Change runner type in nightly builds workflow @goruha (#1713)

what

Use `large` RunsOn runners for GoReleaser

why

GoReleaser needs more disk space

Summary by CodeRabbit

Chores
- Updated GitHub Actions runner specifications across feature release, nightly build, and test workflows to standardize build infrastructure configuration.

Update nightlybuilds.yml @goruha (#1711)

what

Run GoReleaser on a RunsOn runner

why

Default runners run out of disk space

Summary by CodeRabbit

Chores
- Updated nightly release workflow to change how runner selection is provided: the workflow now accepts a JSON-like array of runner specifications, improving and broadening which runner(s) can be targeted for nightly builds.

Fix Terraform state authentication by passing auth context @osterman (#1695)

what

Add authentication context parameter to Terraform backend operations
Refactor PostAuthenticate interface to use parameter struct
Extract nested logic to reduce complexity
Fix test coverage for backend functions

why

Terraform state operations need proper AWS credentials when accessing S3 backends
Multi-identity scenarios require passing auth context through the call chain
Reduces function parameter count from 6 to 2 (using PostAuthenticateParams struct)
Simplifies nested conditional logic for better maintainability

references

Part of multi-identity authentication context work
Follows established authentication context patterns
Related to docs/prd/auth-context-multi-identity.md

Summary by CodeRabbit

New Features
- Centralized per-command AuthContext enabling multiple concurrent identities (AWS, GitHub, Azure, etc.) and making in-process SDK and Terraform calls use Atmos-managed credentials.
- Console session duration configurable via provider console.session_duration with CLI flag override.
Bug Fixes
- More reliable in-process authentication for SDK and Terraform state reads.
Documentation
- Added design doc, blog post, and CLI docs describing AuthContext and session-duration behavior.
Tests
- Expanded tests for auth flows, AWS config loading, and YAML/Terraform tag auth propagation.

Add circular dependency detection for YAML functions @osterman (#1708)

what

Implement universal circular dependency detection for all Atmos YAML functions (!terraform.state, !terraform.output, atmos.Component)
Add goroutine-local resolution context for cycle tracking
Create comprehensive error messages showing dependency chains
Fix missing perf.Track() calls in Azure backend wrapper methods
Refactor code to meet golangci-lint complexity limits

why

Users experiencing stack overflow panics from circular dependencies in component configurations
Need to detect cycles before they cause panics and provide actionable error messages
Performance tracking required for all public functions per Atmos conventions
Reduce cyclomatic complexity and function length for maintainability

Implementation Details

Architecture

Goroutine-local storage using sync.Map with goroutine IDs to maintain isolated resolution contexts
O(1) cycle detection using visited-set pattern with Push/Pop operations
Call stack tracking for building detailed error messages showing dependency chains
Negligible performance impact (<10 microseconds overhead, <0.001% of total execution time)
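
The visited-set pattern with Push/Pop described above can be sketched like this. It is a single-goroutine illustration with hypothetical names; the goroutine-local storage (sync.Map keyed by goroutine ID) is left out, and the error text is simplified compared with the real message shown later.

```go
package main

import (
	"fmt"
	"strings"
)

// resolutionContext tracks which nodes are currently being resolved.
type resolutionContext struct {
	visited map[string]bool // O(1) membership check
	stack   []string        // call order, used to print the dependency chain
}

func newResolutionContext() *resolutionContext {
	return &resolutionContext{visited: make(map[string]bool)}
}

// Push marks node as in progress. If it is already in progress, a cycle has
// been found and the full chain is reported instead.
func (r *resolutionContext) Push(node string) error {
	if r.visited[node] {
		chain := append(append([]string{}, r.stack...), node)
		return fmt.Errorf("circular dependency detected: %s", strings.Join(chain, " -> "))
	}
	r.visited[node] = true
	r.stack = append(r.stack, node)
	return nil
}

// Pop removes the most recently pushed node once its resolution finishes.
func (r *resolutionContext) Pop() {
	if len(r.stack) == 0 {
		return
	}
	last := r.stack[len(r.stack)-1]
	r.stack = r.stack[:len(r.stack)-1]
	delete(r.visited, last)
}
```

A resolver would call Push before evaluating a YAML function for a component and Pop (typically via defer) when it returns, so only nodes on the current call path count as visited.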

Test Coverage

27 comprehensive tests across 4 test files
100% coverage on core resolution context logic
~75-80% overall coverage (excluding benchmark and integration tests)
Benchmark tests proving negligible performance impact
Integration tests for real-world scenarios (currently skipped - require state backends)

Performance

Push operation: ~266 nanoseconds
Pop operation: ~70 nanoseconds
GetGoroutineID: ~2,434 nanoseconds
Total overhead: <10 microseconds (<0.001% of execution time)

Error Messages

Before (stack overflow panic):

runtime: goroutine stack exceeds 1000000000-byte limit
fatal error: stack overflow

After (actionable error with dependency chain):

circular dependency detected

Dependency chain:
  1. Component 'vpc' in stack 'core'
     → !terraform.state transit-gateway core transit_gateway_id
  2. Component 'transit-gateway' in stack 'core'
     → !terraform.state vpc core vpc_id
  3. Component 'vpc' in stack 'core' (cycle detected)
     → !terraform.state transit-gateway core transit_gateway_id

To fix this issue:
  - Review your component dependencies and break the circular reference
  - Consider using Terraform data sources or direct remote state instead
  - Ensure dependencies flow in one direction only

references

Fixes community-reported stack overflow issue in YAML function processing
See docs/prd/circular-dependency-detection.md for complete architecture and design decisions
See docs/circular-dependency-detection.md for user documentation and troubleshooting
See CIRCULAR_DEPENDENCY_DETECTION_SUMMARY.md for implementation summary

Files Changed

Core implementation: internal/exec/yaml_func_resolution_context.go (161 lines)
Tests: 4 test files (1,093 lines total)
Modified: yaml_func_utils.go, yaml_func_terraform_state.go, yaml_func_terraform_output.go
Documentation: PRD, user docs, summary
Test fixtures: 7 YAML files + 2 Terraform components
Additional: Fixed Azure backend perf.Track() issues

Summary by CodeRabbit

New Features
- Automatic circular dependency detection for YAML functions including terraform.state, terraform.output, and custom component functions. The system detects cycles early before runtime failures occur, providing comprehensive error messages that display the full dependency chain and component relationships. Users receive actionable remediation guidance and suggested fixes to resolve circular dependencies in their infrastructure configurations.

fix: Remove exclude directive to enable go install @osterman (#1709)

what

Removed `exclude` directive from go.mod that was blocking `go install github.com/cloudposse/atmos@main`
Updated go install compatibility test to check for both `replace` and `exclude` directives

why

The exclude directive in go.mod prevents users from installing Atmos via go install
go install pkg@version refuses to install from a module whose go.mod contains replace or exclude directives (by design)
This breaks a documented installation method and creates user friction
The excluded version (godbus/dbus v0.0.0-20190726142602-4481cbc300e2) is already superseded by explicitly required versions (v4.1.0 and v5.1.0)

references

Related Go issues: golang/go#44840, golang/go#69762, golang/go#50698
The test now prevents future regressions for both replace and exclude directives

fix: Upgrade to Go 1.25 and make test logging respect -v flag @osterman (#1706)

what

Upgraded Go version from 1.24.8 to 1.25.0
Configured Atmos logger in tests to respect `testing.Verbose()` flag
Tests are now quiet by default, verbose with `-v` flag
Added missing `perf.Track()` calls to Azure backend wrapper methods

why

Go 1.24.8 had a runtime panic bug in unique_runtime_registerUniqueMapCleanup on macOS ARM64 (golang/go#69729)
This caused TestGetAffectedComponents to panic during cleanup on macOS CI
Test output was always verbose because logger was set to InfoLevel unconditionally
Go 1.25.0 fixes the runtime panic bug
Linter enforcement requires perf.Track() on all public functions

changes

go.mod: Upgraded from go 1.24.8 to go 1.25.0
tests/cli_test.go:
- Moved logger level configuration from init() to TestMain()
- Logger now respects -v flag using switch statement:
  - ATMOS_TEST_DEBUG=1: DebugLevel (everything)
  - -v flag: InfoLevel (info, warnings, errors)
  - Default: WarnLevel (only warnings and errors)
- Removed debug pattern logging loop (was spam)
- All helpful t.Logf() messages preserved (work correctly with -v)
internal/terraform_backend/terraform_backend_azurerm.go:
- Added perf.Track() to GetBody() wrapper method
- Added perf.Track() to DownloadStream() wrapper method
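
The logger-level selection described under tests/cli_test.go can be sketched as a small switch. The `logLevel` type and function name are hypothetical; only the three-way precedence (ATMOS_TEST_DEBUG=1, then -v, then the default) comes from the list above.

```go
package main

// logLevel is a hypothetical stand-in for the logger's level type.
type logLevel int

const (
	debugLevel logLevel = iota // everything
	infoLevel                  // info, warnings, errors
	warnLevel                  // only warnings and errors
)

// testLogLevel picks the level for a test run. debugEnv is the value of
// ATMOS_TEST_DEBUG and verbose is the result of testing.Verbose().
func testLogLevel(debugEnv string, verbose bool) logLevel {
	switch {
	case debugEnv == "1":
		return debugLevel
	case verbose:
		return infoLevel
	default:
		return warnLevel
	}
}
```

In a `TestMain`, this would be called as `testLogLevel(os.Getenv("ATMOS_TEST_DEBUG"), testing.Verbose())` after `flag.Parse()`, since test flags are not parsed yet when `TestMain` starts.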

testing

go test ./tests → Quiet (no logger output)
go test ./tests -v → Verbose (shows INFO logs)
go test ./internal/exec -run TestGetAffectedComponents → Passes without panic

references

Fixes the macOS panic from https://github.com/cloudposse/atmos/actions/runs/18656461566/job/53187085704
Related Go issue: golang/go#69729

Add Azure Blob Storage (azurerm) backend support for !terraform.state function @jamengual (#1610)

what

Implemented Azure Blob Storage backend support for the `!terraform.state` YAML function
Added comprehensive unit tests with 100% coverage for the new backend
Updated error definitions, registry, and documentation

why

The !terraform.state function previously only supported local and s3 backends
Azure users needed native azurerm backend support to read Terraform state directly from Azure Blob Storage
This provides the fastest way to retrieve Terraform outputs without Terraform initialization overhead

changes

New Implementation: internal/terraform_backend/terraform_backend_azurerm.go
- Implements azurerm backend reader following S3 backend patterns
- Uses Azure SDK with DefaultAzureCredential for authentication (Managed Identity, Service Principal, Azure CLI, etc.)
- Supports workspace-based blob paths (env:/{workspace}/{key} for non-default workspaces)
- Includes client caching, retry logic (2 retries with exponential backoff), and proper error handling
- Handles 404 (blob not found) gracefully by returning nil (component not provisioned yet)
- Handles 403 (permission denied) with descriptive error messages
Comprehensive Tests: internal/terraform_backend/terraform_backend_azurerm_test.go
- 8 test functions covering all scenarios with mocked Azure SDK client
- Tests workspace handling (default vs non-default), blob not found, permission denied, network errors, retry logic, and error cases
- All tests pass with no external dependencies required
Error Definitions: errors/errors.go
- Added 7 new Azure-specific static errors following project patterns
- ErrGetBlobFromAzure, ErrReadAzureBlobBody, ErrCreateAzureCredential, ErrCreateAzureClient, ErrAzureContainerRequired, ErrStorageAccountRequired, ErrAzurePermissionDenied
Registry Update: internal/terraform_backend/terraform_backend_registry.go
- Registered ReadTerraformBackendAzurerm in the backend registry
Error Message Update: internal/terraform_backend/terraform_backend_utils.go
- Updated supported backends list to include azurerm
Documentation Update: website/docs/functions/yaml/terraform.state.mdx
- Added azurerm to the list of supported backend types
- Updated warning message to reflect azurerm support
Dependencies: go.mod
- Moved github.com/Azure/azure-sdk-for-go/sdk/storage/azblob from indirect to direct dependency (already present in project)

implementation notes

Follows established patterns from S3 backend implementation
Uses wrapper pattern (AzureBlobAPI interface) to enable testing without actual Azure connectivity
Implements proper workspace path handling matching Azure backend behavior (env:/{workspace}/{key})
All comments end with periods (enforced by golangci-lint)
Imports organized in 3 groups (stdlib, 3rd-party, atmos) as per CLAUDE.md
Performance tracking added with defer perf.Track() on all functions
Cross-platform compatible using Azure SDK (not CLI commands)

test results

=== RUN   TestReadTerraformBackendAzurermInternal_Success
=== RUN   TestReadTerraformBackendAzurermInternal_Success/successful_read_default_workspace
=== RUN   TestReadTerraformBackendAzurermInternal_Success/successful_read_dev_workspace
=== RUN   TestReadTerraformBackendAzurermInternal_Success/successful_read_prod_workspace
=== RUN   TestReadTerraformBackendAzurermInternal_Success/successful_read_empty_workspace
=== RUN   TestReadTerraformBackendAzurermInternal_Success/successful_read_default_key
--- PASS: TestReadTerraformBackendAzurermInternal_Success (0.00s)
=== RUN   TestReadTerraformBackendAzurermInternal_BlobNotFound
--- PASS: TestReadTerraformBackendAzurermInternal_BlobNotFound (0.00s)
=== RUN   TestReadTerraformBackendAzurermInternal_PermissionDenied
--- PASS: TestReadTerraformBackendAzurermInternal_PermissionDenied (0.00s)
=== RUN   TestReadTerraformBackendAzurermInternal_NetworkError
--- PASS: TestReadTerraformBackendAzurermInternal_NetworkError (4.00s)
=== RUN   TestReadTerraformBackendAzurermInternal_RetrySuccess
--- PASS: TestReadTerraformBackendAzurermInternal_RetrySuccess (2.00s)
=== RUN   TestReadTerraformBackendAzurermInternal_MissingContainerName
--- PASS: TestReadTerraformBackendAzurermInternal_MissingContainerName (0.00s)
=== RUN   TestReadTerraformBackendAzurermInternal_ReadBodyError
--- PASS: TestReadTerraformBackendAzurermInternal_ReadBodyError (0.00s)
PASS
ok      github.com/cloudposse/atmos/internal/terraform_backend 7.011s

Summary by CodeRabbit

New Features
- Azure Blob Storage (azurerm) support for reading Terraform state with workspace-aware paths, authentication, retries, and client caching.
Documentation
- Added detailed docs and a blog post covering Azure backend usage, examples, migration guidance, and “Try It Now” steps.
Improvements
- Clearer permission/not-found reporting and added Azure-specific error signals for more precise error handling.
Tests
- Extensive unit and integration tests plus Azure credential precondition checks.
Chores
- Updated .gitignore with developer tool patterns.

test(auth): Increase auth test coverage from 6% to 80% with mock provider @osterman (#1702)

what

Add comprehensive unit and integration tests for Atmos auth system using the existing mock provider
Increase test coverage from **6% to ~80%** (target: 80-90% ✅)
Add regression tests to prevent recurrence of user-reported browser authentication issue
Achieve **100% coverage** for mock provider implementation

why

Current auth test coverage was critically low (6%), making it difficult to catch bugs
User complaint (Bogdan) about browser authentication triggering on every command needed verification and regression protection
Mock provider was implemented but had zero test coverage
Need confidence that auth system works correctly without requiring real cloud credentials

Coverage Improvements

| Package | Before | After | Improvement |
|---|---|---|---|
| pkg/auth | 6.2% | 84.6% | +78.4pp |
| pkg/auth/providers/mock | 0% | 100.0% | +100pp |
| pkg/auth/utils | 0% | 100.0% | +100pp |
| pkg/auth/validation | 0% | 90.0% | +90pp |
| pkg/auth/list | 0% | 89.5% | +89.5pp |
| pkg/auth/cloud/aws | 0% | 79.2% | +79.2pp |
| pkg/auth/providers/github | 0% | 78.3% | +78.3pp |
| pkg/auth/factory | 0% | 77.8% | +77.8pp |
| pkg/auth/credentials | 0% | 75.8% | +75.8pp |
| pkg/auth/providers/aws | 0% | 67.8% | +67.8pp |
| pkg/auth/identities/aws | 2.3% | 62.5% | +60.2pp |

Overall: ~6% → ~80% ✅

Key Additions

1. Mock Provider Unit Tests (100% coverage)

pkg/auth/providers/mock/provider_test.go - 15 comprehensive tests
pkg/auth/providers/mock/identity_test.go - 13 comprehensive tests
Tests cover: authentication, expiration, concurrency, interface compliance

2. Credential Caching Regression Tests

cmd/auth_caching_test.go - 4 test functions with multiple subtests
Verifies credentials are cached after login and reused
Ensures fast execution (< 2s) vs browser auth (5-30s)
Tests multi-identity scenarios

3. Integration Test Scenarios

tests/test-cases/auth-mock.yaml - 20+ test scenarios
Auth login, whoami, env, exec, list, logout commands
Multiple output formats (json, bash, dotenv)
Error handling and edge cases

User Issue: Browser Auth on Every Command

Status: LIKELY FIXED ✅

The issue where browser authentication was triggered on every command appears to have been resolved by recent PRs (#1655, #1653, #1640). This PR adds comprehensive regression tests to:

Verify credentials are cached after authentication
Ensure subsequent commands use cached credentials
Confirm fast execution without browser prompts
Prevent regression of this issue

Testing

# Run mock provider tests
$ go test ./pkg/auth/providers/mock/... -v
=== RUN   TestNewProvider
=== RUN   TestProvider_Authenticate
=== RUN   TestProvider_Concurrency
... 28 tests PASS
coverage: 100.0% of statements

# Run auth package tests
$ go test -cover ./pkg/auth/...
pkg/auth: 84.6% coverage ✅
pkg/auth/providers/mock: 100% coverage ✅
pkg/auth/utils: 100% coverage ✅
... all passing

Benefits

No cloud credentials needed for auth testing
Fast test execution (milliseconds vs seconds)
Deterministic results (fixed expiration dates)
CI/CD ready (no secrets required)
Regression protection for caching issue
80% coverage meets industry standards

references

User complaint: Bogdan reported browser auth on every command
Related PRs: #1655, #1653, #1640 (auth improvements)
Mock provider enables testing without cloud credentials

Add auth console command for web console access @osterman (#1684)

what

Add `atmos auth console` command to open cloud provider web consoles using authenticated credentials
Implement AWS console access via federation endpoint (similar to aws-vault login)
Add 100+ AWS service destination aliases for convenient access
Create dedicated `pkg/http` package for HTTP client utilities
Add pretty formatted output using lipgloss with Atmos theme colors
Consolidate browser opening functionality to existing `OpenUrl` helper

why

Provides convenient browser access to cloud consoles without manually copying credentials
Eliminates context switching between terminal and browser for console access
Uses provider-native federation endpoints for secure temporary access
Extensible interface pattern supports future Azure/GCP implementations

features

Service Aliases: Use shorthand like s3, ec2, lambda instead of full console URLs
Autocomplete: Shell completion for destination and identity flags
Session Control: Configurable duration (up to 12 hours for AWS) with expiration display
Clean Output: URL only shown on error or with --no-open flag
Scriptable: --print-only flag for piping URLs to other tools
Provider-Agnostic: Interface design ready for multi-cloud support

implementation

Created ConsoleAccessProvider interface in pkg/auth/types/interfaces.go
Implemented ConsoleURLGenerator for AWS using federation endpoint
Added ResolveDestination() with case-insensitive alias lookup
Moved HTTP utilities from pkg/utils to dedicated pkg/http package
Used existing OpenUrl() function for cross-platform browser opening
Added comprehensive tests achieving 85.9% coverage
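
Case-insensitive alias resolution of the kind ResolveDestination() performs could look roughly like this. The two-entry alias table, the URLs, and the error text are illustrative assumptions; the real table has 100+ entries and may resolve to different URLs.

```go
package main

import (
	"fmt"
	"strings"
)

// destinationAliases is a tiny illustrative table (assumed URLs, not the real list).
var destinationAliases = map[string]string{
	"s3":  "https://console.aws.amazon.com/s3",
	"ec2": "https://console.aws.amazon.com/ec2",
}

// resolveDestination passes full URLs through unchanged and looks up anything
// else as an alias, ignoring case and surrounding whitespace.
func resolveDestination(dest string) (string, error) {
	trimmed := strings.TrimSpace(dest)
	if strings.HasPrefix(trimmed, "https://") {
		return trimmed, nil
	}
	if url, ok := destinationAliases[strings.ToLower(trimmed)]; ok {
		return url, nil
	}
	return "", fmt.Errorf("unknown destination alias %q", dest)
}
```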

testing

Unit tests for console URL generation (15 test cases)
Unit tests for destination alias resolution (100+ aliases tested)
Mock HTTP client for testing without network calls
Table-driven tests with edge case coverage

documentation

CLI reference: website/docs/cli/commands/auth/console.mdx
Blog post: website/blog/2025-10-20-auth-console-web-access.md
Proposal document: docs/proposals/auth-web-console.md
Embedded markdown usage examples

references

Similar to aws-vault's console login feature
AWS Federation Endpoint: https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_providers_enable-console-custom-url.html
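
For readers unfamiliar with the federation endpoint, the two URLs involved can be sketched as below. The query parameters (`Action`, `SessionDuration`, `Session`, `Issuer`, `Destination`, `SigninToken`) come from the AWS documentation linked above; the function and type names are hypothetical and are not taken from the Atmos implementation, and the HTTP call that exchanges credentials for the token is left out.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
	"strconv"
)

const federationEndpoint = "https://signin.aws.amazon.com/federation"

// sessionCredentials mirrors the JSON keys the federation endpoint expects.
type sessionCredentials struct {
	SessionID    string `json:"sessionId"`
	SessionKey   string `json:"sessionKey"`
	SessionToken string `json:"sessionToken"`
}

// signinTokenURL builds the request that exchanges temporary credentials
// for a sign-in token. AWS accepts durations from 900 to 43200 seconds.
func signinTokenURL(creds sessionCredentials, durationSeconds int) (string, error) {
	if durationSeconds < 900 || durationSeconds > 43200 {
		return "", fmt.Errorf("session duration %ds is outside 900-43200", durationSeconds)
	}
	session, err := json.Marshal(creds)
	if err != nil {
		return "", err
	}
	q := url.Values{}
	q.Set("Action", "getSigninToken")
	q.Set("SessionDuration", strconv.Itoa(durationSeconds))
	q.Set("Session", string(session))
	return federationEndpoint + "?" + q.Encode(), nil
}

// loginURL builds the browser URL from the token returned by the first request.
func loginURL(signinToken, issuer, destination string) string {
	q := url.Values{}
	q.Set("Action", "login")
	q.Set("Issuer", issuer)
	q.Set("Destination", destination)
	q.Set("SigninToken", signinToken)
	return federationEndpoint + "?" + q.Encode()
}

func main() {
	// Placeholder values only; real credentials come from the authenticated identity.
	tokenURL, err := signinTokenURL(sessionCredentials{SessionID: "id", SessionKey: "key", SessionToken: "tok"}, 3600)
	fmt.Println(tokenURL, err)
	fmt.Println(loginURL("example-token", "atmos", "https://console.aws.amazon.com/s3"))
}
```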

Summary by CodeRabbit

New Features
- Added atmos auth console: opens cloud provider web consoles via temporary sign-in URLs (AWS supported now; Azure/GCP planned).
- Supports service aliases (s3, ec2, etc.), full destination URLs, session duration (AWS up to 12h), issuer, --print-only, --no-open and identity selection/completion.
Documentation
- New CLI docs, usage guide, PRD and blog post with examples and troubleshooting.
Tests
- Expanded tests and CI snapshots for the new command and destination resolution.

fix: Only log verbose test output on failure @osterman (#1704)

## what

Replace unconditional `t.Log()` calls with `t.Cleanup()` handlers that only output verbose YAML/data when tests fail
Eliminate noisy stderr output during successful test runs while preserving debug information when tests fail
Add fallback to raw data output (`%+v`) when YAML conversion produces empty strings

why

CI test runs were showing verbose YAML dumps to stderr even when tests passed
This cluttered test output and made it difficult to identify actual issues
Debug information is still valuable when tests fail, but shouldn't appear during successful runs
Go prints t.Log() output for passing tests whenever tests run in verbose mode (-v), regardless of test success/failure
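
The pattern can be sketched as a small helper. This is an illustration under assumptions rather than the code from the PR: the helper name `logOnFailure` and the `testingT` interface are hypothetical, and `fakeT` exists only so the sketch runs outside `go test`. A real `*testing.T` satisfies the same interface.

```go
package main

import "fmt"

// testingT is the subset of *testing.T the helper needs.
type testingT interface {
	Cleanup(func())
	Failed() bool
	Logf(format string, args ...any)
}

// logOnFailure registers a cleanup handler that emits the verbose dump
// only when the test has failed by the time cleanups run.
func logOnFailure(t testingT, label, yamlText string, raw any) {
	t.Cleanup(func() {
		if !t.Failed() {
			return
		}
		if yamlText != "" {
			t.Logf("%s:\n%s", label, yamlText)
			return
		}
		// Fall back to the raw value when YAML conversion produced nothing.
		t.Logf("%s (raw): %+v", label, raw)
	})
}

// fakeT is a stand-in that records cleanups and log lines.
type fakeT struct {
	failed   bool
	cleanups []func()
	logs     []string
}

func (f *fakeT) Cleanup(fn func()) { f.cleanups = append(f.cleanups, fn) }
func (f *fakeT) Failed() bool      { return f.failed }
func (f *fakeT) Logf(format string, args ...any) {
	f.logs = append(f.logs, fmt.Sprintf(format, args...))
}

// finish runs the registered cleanups in reverse order, as the testing package does.
func (f *fakeT) finish() {
	for i := len(f.cleanups) - 1; i >= 0; i-- {
		f.cleanups[i]()
	}
}

func main() {
	pass := &fakeT{}
	logOnFailure(pass, "stacks", "a: 1", nil)
	pass.finish()

	fail := &fakeT{failed: true}
	logOnFailure(fail, "stacks", "", map[string]int{"a": 1})
	fail.finish()

	fmt.Println("passing test log lines:", len(pass.logs))
	fmt.Println("failing test log lines:", len(fail.logs))
}
```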

demo

Finally clean output!

go mod download
Running tests with subprocess coverage collection
ok  	github.com/cloudposse/atmos	7.020s	coverage: 14.8% of statements in ./...
ok  	github.com/cloudposse/atmos/cmd	7.581s	coverage: 20.7% of statements in ./...
ok  	github.com/cloudposse/atmos/cmd/about	0.134s	coverage: 0.1% of statements in ./...
ok  	github.com/cloudposse/atmos/cmd/internal	0.099s	coverage: 0.1% of statements in ./...
?   	github.com/cloudposse/atmos/cmd/markdown	[no test files]
ok  	github.com/cloudposse/atmos/cmd/version	1.802s	coverage: 1.4% of statements in ./...
ok  	github.com/cloudposse/atmos/errors	0.213s	coverage: 0.4% of statements in ./...
ok  	github.com/cloudposse/atmos/internal/aws_utils	0.120s	coverage: 0.1% of statements in ./...
ok  	github.com/cloudposse/atmos/internal/exec	84.175s	coverage: 32.9% of statements in ./...
ok  	github.com/cloudposse/atmos/internal/terraform_backend	32.223s	coverage: 0.9% of statements in ./...
	github.com/cloudposse/atmos/internal/tui/atmos		coverage: 0.0% of statements
	github.com/cloudposse/atmos/internal/tui/components/code_view		coverage: 0.0% of statements
ok  	github.com/cloudposse/atmos/internal/tui/templates	0.125s	coverage: 0.5% of statements in ./...
	github.com/cloudposse/atmos/internal/tui/templates/term		coverage: 0.0% of statements
ok  	github.com/cloudposse/atmos/internal/tui/utils	0.218s	coverage: 0.2% of statements in ./...
	github.com/cloudposse/atmos/internal/tui/workflow		coverage: 0.0% of statements
ok  	github.com/cloudposse/atmos/pkg/atlantis	1.434s	coverage: 10.3% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/auth	0.141s	coverage: 2.1% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/auth/cloud/aws	0.113s	coverage: 0.8% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/auth/credentials	0.316s	coverage: 0.9% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/auth/factory	0.141s	coverage: 0.2% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/auth/identities/aws	0.139s	coverage: 1.7% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/auth/list	0.138s	coverage: 1.5% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/auth/providers/aws	0.098s	coverage: 1.6% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/auth/providers/github	0.072s	coverage: 0.3% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/auth/providers/mock	0.133s	coverage: 0.1% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/auth/types	0.075s	coverage: 0.2% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/auth/utils	0.099s	coverage: 0.0% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/auth/validation	0.150s	coverage: 0.7% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/aws	0.199s	coverage: 2.4% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/component	0.898s	coverage: 10.1% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/component/mock	0.178s	coverage: 0.4% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/config	3.247s	coverage: 5.7% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/config/homedir	0.073s	coverage: 0.2% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/convert	0.048s	coverage: 0.0% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/datafetcher	0.228s	coverage: 0.2% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/describe	29.214s	coverage: 13.2% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/downloader	1.115s	coverage: 1.6% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/filematch	0.135s	coverage: 0.3% of statements in ./...
	github.com/cloudposse/atmos/pkg/filesystem		coverage: 0.0% of statements
ok  	github.com/cloudposse/atmos/pkg/filetype	0.078s	coverage: 0.4% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/generate	0.685s	coverage: 7.7% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/git	0.164s	coverage: 0.3% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/github	2.462s	coverage: 0.4% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/hooks	0.264s	coverage: 7.5% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/list	2.193s	coverage: 12.0% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/list/errors	0.073s	coverage: 0.1% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/list/flags	0.072s	coverage: 0.1% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/list/format	0.119s	coverage: 0.6% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/list/utils	0.187s	coverage: 0.2% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/logger	0.161s	coverage: 0.3% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/merge	0.227s	coverage: 1.7% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/pager	0.076s	coverage: 0.9% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/perf	1.238s	coverage: 0.5% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/pro	0.177s	coverage: 0.8% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/pro/dtos	0.051s	coverage: 0.0% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/profiler	1.861s	coverage: 0.4% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/provenance	0.130s	coverage: 1.8% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/retry	0.176s	coverage: 0.2% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/schema	0.070s	coverage: 0.3% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/spacelift	0.787s	coverage: 8.4% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/stack	0.346s	coverage: 4.3% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/store	0.139s	coverage: 1.7% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/telemetry	0.518s	coverage: 2.7% of statements in ./...
	github.com/cloudposse/atmos/pkg/telemetry/mock		coverage: 0.0% of statements
ok  	github.com/cloudposse/atmos/pkg/ui/heatmap	0.129s	coverage: 0.9% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/ui/markdown	0.138s	coverage: 0.4% of statements in ./...
?   	github.com/cloudposse/atmos/pkg/ui/theme	[no test files]
ok  	github.com/cloudposse/atmos/pkg/utils	0.743s	coverage: 4.8% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/validate	1.354s	coverage: 14.5% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/validator	0.116s	coverage: 0.2% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/vender	3.308s	coverage: 3.9% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/version	0.069s	coverage: 0.0% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/xdg	0.046s	coverage: 0.1% of statements in ./...
ok  	github.com/cloudposse/atmos/tests	174.022s	coverage: 14.3% of statements in ./...
ok  	github.com/cloudposse/atmos/tests/testhelpers	90.419s	coverage: 1.1% of statements in ./...
Coverage report generated: coverage.out

references

Affects 9 test files with 29 cleanup handlers added
Modified files:
- pkg/component/component_processor_test.go
- pkg/describe/describe_affected_test.go
- pkg/describe/describe_component_test.go
- pkg/describe/describe_dependents_test.go
- pkg/describe/describe_stacks_test.go
- pkg/list/list_components_test.go
- pkg/merge/merge_test.go
- pkg/spacelift/spacelift_stack_processor_test.go
- pkg/stack/stack_processor_test.go

🤖 Generated with Claude Code

Add linter rule for missing defer perf.Track() calls @osterman (#1698)

## what

Added new `perf-track` linter rule to catch missing `defer perf.Track()` calls
Enabled by default with explicit package and type exclusions
Integrated with existing lintroller custom linter framework

why

Enforces coding guidelines requiring performance tracking on all public functions
Catches violations early in development before code review
Prevents missing perf tracking that would be tedious to find manually
Uses explicit exclusions for infrastructure code (logger, profiler, perf, store, ui, tui)

references

Follows coding guidelines in CLAUDE.md for mandatory defer perf.Track() usage
Addresses hundreds of potential violations by catching them at lint time
Exclusions prevent infinite recursion and avoid tracking overhead in low-level code
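
To make the rule concrete, here is a rough standalone sketch of such a check using only the standard library's `go/ast` and `go/parser`. It is not the lintroller implementation. The function names are hypothetical, the `perf.Track(nil, "demo.Tracked")` call in the sample is an assumed shape, and the sketch assumes the defer must be the first statement and ignores the package and type exclusions.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// sample is the source code being checked; it is parsed but never compiled.
const sample = `package demo

func Tracked() {
	defer perf.Track(nil, "demo.Tracked")()
}

func Untracked() {}

func helper() {}
`

// isPerfTrackDefer reports whether stmt is `defer perf.Track(...)`
// or `defer perf.Track(...)()`.
func isPerfTrackDefer(stmt ast.Stmt) bool {
	d, ok := stmt.(*ast.DeferStmt)
	if !ok {
		return false
	}
	call := d.Call
	// Unwrap the outer call when the deferred value is the function returned by Track.
	if inner, ok := call.Fun.(*ast.CallExpr); ok {
		call = inner
	}
	sel, ok := call.Fun.(*ast.SelectorExpr)
	if !ok {
		return false
	}
	pkg, ok := sel.X.(*ast.Ident)
	return ok && pkg.Name == "perf" && sel.Sel.Name == "Track"
}

// missingPerfTrack returns the exported functions and methods whose first
// statement is not the perf.Track defer.
func missingPerfTrack(src string) ([]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "sample.go", src, 0)
	if err != nil {
		return nil, err
	}
	var missing []string
	for _, decl := range file.Decls {
		fn, ok := decl.(*ast.FuncDecl)
		if !ok || fn.Body == nil || !fn.Name.IsExported() {
			continue
		}
		if len(fn.Body.List) == 0 || !isPerfTrackDefer(fn.Body.List[0]) {
			missing = append(missing, fn.Name.Name)
		}
	}
	return missing, nil
}

func main() {
	missing, err := missingPerfTrack(sample)
	fmt.Println(missing, err)
}
```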

🤖 Generated with Claude Code

Summary by CodeRabbit

New Features
- Added a lint rule that enforces a defer-based performance-tracking call at the start of exported functions/methods; enabled by default with a config toggle to disable.
Tests
- Added unit tests and example cases demonstrating compliant and non-compliant exported functions/methods for the new rule.
Documentation
- Updated lint configuration docs to mention the new performance-tracking check and its settings.

Add condition to skip Docker build for prerelease @goruha (#1700)

## what

Add condition to skip Docker build for prerelease

why

Exclude prerelease versions from Homebrew workflows

Summary by CodeRabbit

Chores
- Build workflow updated so Docker image build/push steps are skipped for prerelease releases.
- Dependency review job runner specification changed to a composite runner configuration with additional runner attributes.

feat: Add `atmos auth shell` command @osterman (#1640)

## what

Add `atmos auth shell` command to launch an interactive shell with authentication environment variables pre-configured
Implement shell detection that respects `$SHELL` environment variable with fallbacks to bash/sh
Add `--shell` flag with viper binding to `ATMOS_SHELL` and `SHELL` environment variables
Support `--` separator for passing custom shell arguments to the launched shell
Track shell nesting level with `ATMOS_SHLVL` environment variable
Propagate shell exit codes back to Atmos process
Set `ATMOS_IDENTITY` environment variable in the shell session
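
The detection order and nesting counter described above might look roughly like this. It is a sketch under assumptions, not the PR's code: the function names are hypothetical, the `override` parameter stands in for the value of `--shell` / `ATMOS_SHELL`, the final `/bin/sh` fallback is a guess, and Windows handling is omitted.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strconv"
)

// detectShell picks the shell to launch: an explicit override first,
// then $SHELL, then bash, then sh.
func detectShell(override string, getenv func(string) string, lookPath func(string) (string, error)) string {
	if override != "" {
		return override
	}
	if sh := getenv("SHELL"); sh != "" {
		return sh
	}
	for _, candidate := range []string{"bash", "sh"} {
		if path, err := lookPath(candidate); err == nil {
			return path
		}
	}
	// Assumption: last-resort default when nothing is found on PATH.
	return "/bin/sh"
}

// nextShellLevel increments the ATMOS_SHLVL value, treating a missing
// or invalid value as level 0.
func nextShellLevel(current string) int {
	n, err := strconv.Atoi(current)
	if err != nil || n < 0 {
		n = 0
	}
	return n + 1
}

func main() {
	shell := detectShell("", os.Getenv, exec.LookPath)
	level := nextShellLevel(os.Getenv("ATMOS_SHLVL"))
	fmt.Printf("would launch %s with ATMOS_SHLVL=%d\n", shell, level)
}
```

Passing `getenv` and `lookPath` as parameters keeps the detection logic testable without touching the real environment.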

why

Users need an easy way to work interactively with cloud credentials without manually managing environment variables
Similar to atmos terraform shell, this provides a consistent experience for authenticated sessions
Allows running multiple commands in a single authenticated session without re-authenticating
Supports custom shell configurations and arguments for flexibility

references

Similar to existing atmos terraform shell command implementation
Follows authentication patterns from atmos auth exec and atmos auth env

testing

Comprehensive unit tests with 80-100% coverage on testable functions
25 passing tests covering:
- Shell detection and fallback logic (100% coverage)
- Environment variable management (100% coverage)
- Shell nesting level tracking (83-100% coverage)
- Exit code propagation (tested with codes 0, 1, 42)
- Flag parsing and viper integration
- Cross-platform support (Unix and Windows)
All linting checks passing (0 issues)
Pre-commit hooks passing

documentation

Added website/docs/cli/commands/auth/auth-shell.mdx with full command documentation
Created cmd/markdown/atmos_auth_shell_usage.md with usage examples
Includes purpose note, usage patterns, examples, and environment variable reference

Summary by CodeRabbit

New Features
- Interactive authenticated shell with shell selection, argument passthrough, nested-shell tracking, and identity selection.
- Pluggable credential storage: system, file (path/password) and memory backends selectable via config/env.
- Deterministic mock auth provider for testing.
Documentation
- New auth-shell docs, usage examples, blog posts, keyring-backends guide, XDG docs, and PRD.
Tests
- Expanded unit/integration coverage for shell flows, keyring backends, XDG, and credential stores.
Chores
- Added keyring-related dependencies, CI/workflow and tooling adjustments.

Improve auth login with identity selection @osterman (#1655)

## what

Modified the auth login command to automatically prompt for an identity when no --identity flag is provided.
This leverages the existing authManager.GetDefaultIdentity() which handles interactive selection and fallback logic.
Updated documentation to reflect this new behavior.

why

Users were prompted to manually select an identity in interactive sessions when no default was set.
This change simplifies the login process by automatically invoking the interactive selector or using the default identity when available, improving user experience and reducing manual input.

references

No specific issue linked - this is a user experience enhancement.

Replace deny-licenses with allow-licenses and remove redundant workflow @osterman (#1692)

## what

Delete redundant `.github/workflows/dependabot.yml` workflow file
Update `dependency-review.yml` to use `allow-licenses` instead of deprecated `deny-licenses` parameter
Maintain PR commenting functionality with `comment-summary-in-pr: always`
Allow only permissive licenses: MIT, Apache-2.0, BSD-2-Clause, BSD-3-Clause, ISC, MPL-2.0, 0BSD, Unlicense, CC0-1.0

why

GitHub deprecated the deny-licenses parameter in favor of allow-licenses for better security posture
The dependabot.yml workflow was redundant - we already have dependency-review.yml that provides more comprehensive dependency review
Using an allow-list approach is more secure than a deny-list approach
Consolidating to a single dependency review workflow reduces maintenance overhead

references

https://github.com/actions/dependency-review-action

Summary by CodeRabbit

Chores
- Implemented a 2-week minimum age requirement for automated dependency updates
- Updated dependency review workflow to enforce permissive open-source licenses only
- Consolidated dependency management configurations

Compress CLAUDE.md and add size limit enforcement @osterman (#1693)

## what

Compressed CLAUDE.md from 40.3k chars to 6.3k chars (84% reduction)
Added GitHub action to enforce 40k character limit on CLAUDE.md
Refactored into reusable composite action pattern

why

Large CLAUDE.md files impact performance and token usage
Need automated enforcement to prevent file bloat
Reusable action pattern improves maintainability

Compression Details

Metrics:

Size: 40,300 chars → 6,301 chars (84.4% reduction)
Lines: 1,183 → 165 (86.0% reduction)
Current usage: 15% of 40k limit

Techniques Applied:

Removed verbose explanations, kept terse requirements
Consolidated redundant examples
Merged related sections
Preserved all MANDATORY rules

What's Preserved:
✅ All MANDATORY requirements
✅ Code patterns and conventions
✅ Error handling strategies
✅ Testing requirements
✅ CLI command structure
✅ Development workflows
✅ Cross-platform compatibility rules
✅ Git and PR guidelines

GitHub Action Structure

.github/
├── actions/
│   └── check-claude-md-size/
│       ├── action.yml          # Composite action with all logic
│       └── README.md            # Action documentation
└── workflows/
    └── claude.yml               # Simple 16-line workflow

Action Features:

Validates file size on PR changes
Posts/updates intelligent PR comments
Fails CI if limit exceeded
Configurable file path and size limit
Provides outputs: size, exceeds-limit, usage-percent
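
The three outputs can be illustrated with a short calculation. The real action is a composite GitHub action, so this Go sketch only mirrors the arithmetic; the type and function names are hypothetical, and it assumes size is counted in characters and the percentage is rounded down.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// sizeReport mirrors the action outputs: size, exceeds-limit, usage-percent.
type sizeReport struct {
	Size         int
	ExceedsLimit bool
	UsagePercent int
}

// checkSize counts characters (not bytes) and compares them against the limit.
func checkSize(content string, limit int) sizeReport {
	size := utf8.RuneCountInString(content)
	report := sizeReport{Size: size, ExceedsLimit: size > limit}
	if limit > 0 {
		// Integer division rounds down.
		report.UsagePercent = size * 100 / limit
	}
	return report
}

func main() {
	// A 6,301-character placeholder stands in for the compressed CLAUDE.md.
	content := string(make([]byte, 6301))
	fmt.Printf("%+v\n", checkSize(content, 40000))
}
```

With the figures quoted above (6,301 characters against a 40,000 limit) this yields a usage of 15%, matching the reported number.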

Triggers:

Pull requests modifying CLAUDE.md
Changes to workflow or action files

references

Follows composite action best practices
Pattern similar to existing actions in the ecosystem
Maintains consistency with project's CI/CD approach

Summary by CodeRabbit

New Features
- Automated CLAUDE.md size validation with configurable limits; posts and updates PR comments when limits are exceeded or resolved.
Documentation
- Reworked CLAUDE.md to emphasize architecture and mandatory design patterns instead of granular step-by-step procedures.
- Added user-facing documentation for the CLAUDE.md size-check action and its usage.

Add auth logout command @osterman (#1656)

## what

This pull request introduces the atmos auth logout command, enabling users to securely remove locally cached credentials. The command supports:

Identity-specific logout: Removes credentials for a given identity and its entire authentication chain.
Provider-specific logout: Removes all credentials associated with a particular provider.
Interactive mode: Prompts the user to select what to logout when no arguments are provided.
Dry-run mode: Previews what would be removed without making changes.
Comprehensive cleanup: Deletes credentials from the system keyring and provider-specific files (e.g., AWS credentials).
Best-effort error handling: Continues cleanup even if individual steps fail, reporting all encountered errors.
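
The best-effort and dry-run behavior can be sketched as a loop that never stops at the first failure. This is an illustration, not the PR's code: the names `cleanupStep` and `runBestEffort`, the sample error, and the use of `errors.Join` (Go 1.20+) to aggregate failures are assumptions.

```go
package main

import (
	"errors"
	"fmt"
)

// errKeyringLocked is a sample failure used by the demonstration below.
var errKeyringLocked = errors.New("keyring locked")

// cleanupStep is one unit of credential removal, such as a keyring entry or a file.
type cleanupStep struct {
	Name string
	Run  func() error
}

// runBestEffort runs every step even if earlier ones fail and returns
// the steps that succeeded plus all errors joined together.
// In dry-run mode nothing is executed; the steps are only listed.
func runBestEffort(steps []cleanupStep, dryRun bool) ([]string, error) {
	var removed []string
	var errs []error
	for _, step := range steps {
		if dryRun {
			removed = append(removed, step.Name)
			continue
		}
		if err := step.Run(); err != nil {
			errs = append(errs, fmt.Errorf("%s: %w", step.Name, err))
			continue
		}
		removed = append(removed, step.Name)
	}
	// errors.Join returns nil when there are no errors.
	return removed, errors.Join(errs...)
}

func main() {
	steps := []cleanupStep{
		{Name: "keyring", Run: func() error { return errKeyringLocked }},
		{Name: "aws-files", Run: func() error { return nil }},
	}
	removed, err := runBestEffort(steps, false)
	fmt.Println("removed:", removed)
	fmt.Println("errors:", err)
}
```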

why

This feature addresses several key pain points:

Security: Allows users to securely remove stale credentials, reducing the risk of unauthorized access.
Developer Experience: Simplifies switching between different identities or environments by providing a clean way to remove existing credentials.
Compliance: Enables auditing of credential removal and ensures adherence to security policies.
Troubleshooting: Provides a straightforward method to clear authentication caches when debugging.

The implementation uses native Go operations for file system cleanup and integrates with go-keyring for cross-platform credential store access. It leverages Charmbracelet libraries for a polished interactive user experience and styled output.

references

closes #735

Summary by CodeRabbit

Release Notes

New Features
- Added atmos auth logout CLI command to remove stored credentials
- Supports logout by identity, by provider, or all identities at once
- Interactive mode to select which credentials to remove
- Dry-run mode to preview credential removals without executing
- Browser session warning displayed after successful logout
Documentation
- Added guides and reference documentation for logout workflows and usage

Replace custom license-check with GitHub dependency-review-action @osterman (#1690)

## what

Replaced custom license-check action (308 lines) with GitHub's native dependency-review-action
Simplified workflow from 44 lines to 18 lines with better functionality
Added automated NOTICE file generation and validation to CI
Workflow now:
- Validates licenses using GitHub's dependency graph
- Blocks PRs with forbidden licenses (GPL, AGPL, etc.)
- Generates NOTICE file using go-licenses
- Fails CI if NOTICE file is out of date

why

Reduce maintenance burden: GitHub's native action requires zero maintenance vs custom bash fighting go-licenses bugs
Better reliability: Native GitHub solution works across all ecosystems, not just Go
Automated NOTICE updates: Ensures NOTICE file stays in sync with dependencies automatically
Clearer error messages: Developers get actionable feedback when NOTICE file needs updating
Industry standard: Uses same tooling as thousands of other repositories

references

GitHub dependency-review-action
google/go-licenses - Still used for NOTICE generation
Replaces .github/actions/license-check/ (264 lines) and custom workflow (44 lines)

Troubleshooting Notes

autofix.ci Artifact Upload Errors (RESOLVED)

Error encountered:

Attempt 4 of 5 failed with error: Unexpected token 'O', "Original A"... is not valid JSON
Error: Failed to CreateArtifact: Failed to make request after 5 attempts

Root Cause:
When using RunsOn self-hosted runners with extras=s3-cache, the runs-on/action@v2 step is required for artifact uploads to work. Without it, the artifact API receives HTML error pages instead of JSON responses.

Fix Applied:

Added runs-on/action@v2 as first step in autofix.yml (required for S3 cache compatibility)
Added permissions: { contents: read, actions: write } (was empty {} which grants NO permissions)
Upgraded autofix-ci/action from v1.3.1 to v1.3.2

Reference:

RunsOn S3 Cache Documentation
Key quote: "If you have enabled the s3-cache extra and are using the actions/upload-artifact@v4 action in your workflows, you must ensure that you have also included the runs-on/action@v2 action in your jobs."

Time saved for future developers: ~2 hours of debugging 🎯

Summary by CodeRabbit

New Features
- Added automatic dependency license review to flag restricted licenses (GPL, LGPL, AGPL) on pull requests.
- Added vulnerability severity checks to the dependency review process.
- Introduced comprehensive NOTICE file documenting all third-party dependencies and their licenses.
Documentation
- Added documentation for license generation utilities and scripts.

Add Component Registry Pattern and Mock Component @osterman (#1648)

## what

This Pull Request introduces the Component Registry Pattern to Atmos, enabling extensible support for various component types. It lays the foundation for adding new infrastructure tools as plugins in the future.

Key changes include:

ComponentProvider Interface: A new interface defining the contract for all component providers.
Component Registry: A thread-safe global registry to manage component providers.
Mock Component Provider: A proof-of-concept implementation for testing the registry and component lifecycle without external dependencies. It demonstrates inheritance, merging, and cross-component dependencies.
Hybrid Configuration Schema: pkg/schema/schema.go is updated to support both statically defined built-in component types (Terraform, Helmfile, Packer) and dynamically registered plugin types via the Plugins map.
Sentinel Errors: New sentinel errors related to component providers and configurations are added to errors/errors.go.
JSON Schema Updates: Schemas in pkg/datafetcher/schema/ are modified to allow additional properties for component types, accommodating the hybrid configuration.
Developer Guide: A new markdown file docs/developing-component-plugins.md is added, detailing how to create new component plugins.
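
As a rough picture of the pattern, a provider interface plus a mutex-guarded registry could look like the sketch below. The method names (`GetType`, `GetAvailableCommands`), the `registry` type, and the sentinel error are hypothetical stand-ins; the actual interface in the PR may differ.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
	"sync"
)

// ComponentProvider is a hypothetical, reduced version of the provider contract.
type ComponentProvider interface {
	GetType() string
	GetAvailableCommands() []string
}

// errProviderNotFound is a sentinel error in the style the notes describe.
var errProviderNotFound = errors.New("component provider not found")

// registry stores providers by type and is safe for concurrent use.
type registry struct {
	mu        sync.RWMutex
	providers map[string]ComponentProvider
}

func newRegistry() *registry {
	return &registry{providers: make(map[string]ComponentProvider)}
}

// Register adds or replaces the provider for its component type.
func (r *registry) Register(p ComponentProvider) error {
	if p == nil || p.GetType() == "" {
		return errors.New("component provider must have a non-empty type")
	}
	r.mu.Lock()
	defer r.mu.Unlock()
	r.providers[p.GetType()] = p
	return nil
}

// Get looks up a provider by component type.
func (r *registry) Get(componentType string) (ComponentProvider, error) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	p, ok := r.providers[componentType]
	if !ok {
		return nil, fmt.Errorf("%w: %s", errProviderNotFound, componentType)
	}
	return p, nil
}

// Types returns the registered component types in sorted order.
func (r *registry) Types() []string {
	r.mu.RLock()
	defer r.mu.RUnlock()
	types := make([]string, 0, len(r.providers))
	for name := range r.providers {
		types = append(types, name)
	}
	sort.Strings(types)
	return types
}

// mockProvider is a dependency-free provider for exercising the registry.
type mockProvider struct{}

func (mockProvider) GetType() string                { return "mock" }
func (mockProvider) GetAvailableCommands() []string { return []string{"plan", "apply"} }

func main() {
	reg := newRegistry()
	if err := reg.Register(mockProvider{}); err != nil {
		fmt.Println("register failed:", err)
		return
	}
	p, err := reg.Get("mock")
	fmt.Println(reg.Types(), p.GetAvailableCommands(), err)
}
```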

why

The existing hardcoded approach for component types (Terraform, Helmfile, Packer) limits extensibility and maintainability. This PR introduces a more robust and flexible pattern:

Extensibility: Allows easy addition of new component types (e.g., Pulumi, CDK, CloudFormation) without modifying core Atmos code.
Plugin Support: Paves the way for external component plugins in future phases.
Testability: The mock component enables thorough testing of the registry pattern, configuration inheritance, and dependency resolution without requiring external tools or cloud provider access.
Consistency: Adopts a pattern similar to the existing command registry, promoting a unified architectural approach.
Maintainability: Centralizes component logic within providers, reducing code duplication and improving clarity.
Backward Compatibility: Existing configurations and functionality remain unaffected. The hybrid schema ensures existing component types continue to work seamlessly while introducing the new pattern.
Enhanced Testing: Introduces specific test coverage requirements (90%+) for the registry and mock component, including thread-safety and edge-case testing.

references

closes #589
closes #600
closes #601

Summary by CodeRabbit

New Features
- Adds a component registry, plugin-style component support, and a mock provider for testing; components can now be discovered at runtime and report available commands.
- Component configuration now accepts dynamic plugin entries (new Plugins field) for greater flexibility.
Documentation
- New developer guide for building component plugins, a registry migration pattern, and expanded development requirements and best practices.
Tests
- Comprehensive registry and mock-provider test suites and updated CLI snapshot to show Plugins field.

Fix blog post ordering and add explicit dates @osterman (#1689)

## what

Add explicit `date:` field to all blog post frontmatter for consistent ordering
Fix welcome post date to 2025-10-12 so it appears first in the changelog
Fix chdir post filename and date to 2025-10-19 (actual PR merge date)
Add `<!--truncate-->` markers to chdir and pager posts for proper summaries
Remove duplicate `index.md` that was causing routing conflicts

why

Blog posts were displaying in incorrect chronological order
Some posts were missing truncate markers, causing warnings during build
Welcome post should appear first as it introduces the changelog
Duplicate index.md was causing Docusaurus routing conflicts

references

Fixes blog post ordering issues identified by user

Summary by CodeRabbit

Documentation
- Added new blog posts covering Atmos authentication, provenance tracking, command registry patterns, AWS SSO verification, version list commands, and authentication tutorials.
- Updated blog post on pager default behavior with migration guidance and configuration instructions.
- Enhanced blog content metadata and organization.

Add license check workflow @osterman (#1680)

## what

Added a GitHub Actions workflow (.github/workflows/license-check.yml) to automatically audit Go project dependencies for license compliance.
This workflow triggers on pull request events (opened, synchronize, reopened) that affect go.mod, go.sum, or the workflow file itself.
It also includes scheduled runs (weekly on Mondays) and manual dispatch for flexibility.
A new script (scripts/check-licenses.sh) was introduced to perform the actual license check using go-licenses.
The script checks for "forbidden" license types and generates a summary report.
The generated CSV report from go-licenses report is now uploaded as a GitHub Actions artifact.

why

To proactively identify and prevent the introduction of dependencies with problematic licenses (e.g., GPL, AGPL) into the project.
Automates the license auditing process, reducing manual effort and the risk of oversight.
Ensures compliance with licensing requirements, especially important for open-source and commercial projects.
The CI integration provides immediate feedback on PRs affecting dependencies.
Uploading the report as an artifact allows for easy review of detailed license information.

references

closes #123 (Assuming #123 is the issue related to license auditing)

Summary by CodeRabbit

Chores
- Added automated license compliance checks that run on pull requests, weekly, and on demand, producing a downloadable CSV license report retained for 30 days.
- Added a license-audit workflow and scanning script that installs/checks the scanner as needed, handles known edge cases, summarizes license distribution, and emits clear pass/fail results.

Add atmos auth list command with multiple output formats @osterman (#1645)

## what

Add new `atmos auth list` command to list all configured authentication providers and identities
Support multiple output formats: table (default), tree, JSON, YAML, Graphviz, Mermaid, and Markdown
Implement filtering by providers or identities with optional name filtering
Add comprehensive documentation and usage examples

why

Users need visibility into their authentication configuration to understand providers, identities, and their relationships
Multiple output formats enable different use cases: interactive CLI (table/tree), automation (JSON/YAML), and documentation (Graphviz/Mermaid)
Visual formats help understand complex authentication chains where identities assume roles through providers or other identities

references

Implements feature request for authentication configuration visibility
Follows existing Atmos patterns for command structure and output formatting

Summary by CodeRabbit

New Features
- Added an auth list command to view providers and identities with flexible filtering and multiple output formats (table, tree, JSON, YAML, Graphviz, Mermaid, Markdown)
- Added chain visualization outputs (graph/mermaid/markdown) for easier relationship tracing
Bug Fixes
- Support expanded tilde (~) paths for the CLI chdir flag
Documentation
- Comprehensive CLI docs, usage guide, and blog post added
Tests
- Extensive unit tests and format/diagram validation added

Update mockgen to go.uber.org/mock @osterman (#1681)

## what

Replaced the usage of the archived github.com/golang/mock with go.uber.org/mock.
Updated all import paths from github.com/golang/mock/gomock to go.uber.org/mock/gomock.
Updated all //go:generate mockgen directives to use go run go.uber.org/mock/mockgen@v0.6.0 (pinned version for reproducible builds).
Regenerated all mock files with the pinned version.
Added a lint rule in .golangci.yml to disallow usage of github.com/golang/mock.
Configured .golangci.yml to exclude generated mock files (mock_*.go) from godot linter checks.

why

github.com/golang/mock is an archived repository and should no longer be used.
go.uber.org/mock is the maintained successor.
Pinning to @v0.6.0 ensures reproducible builds across different environments.
This change ensures the project uses actively maintained dependencies and prevents accidental use of the deprecated library through a new lint rule.

references

closes #123

Fix go install compatibility by removing replace directive @osterman (#1685)

## what

Remove `replace` directive from `go.mod` that breaks `go install github.com/cloudposse/atmos@latest`
Update Atmos internal code to import from `pkg/config/homedir` directly instead of via replaced module path
Remove `go.mod` from `pkg/config/homedir` (no longer needed as separate module)
Add regression test `TestGoModNoReplaceDirectives` to prevent future breakage of `go install` compatibility

why

The replace directive introduced in v1.195.0 (PR #1631) breaks a documented installation method
go install cmd@version intentionally does not support modules with replace or exclude directives
This is a fundamental design decision in Go (golang/go#44840, #69762, #50698) that won't be changed
Users attempting go install github.com/cloudposse/atmos@latest get errors and cannot install
Breaking this installation path creates user friction and support burden

tradeoffs

What we're giving up

The replace directive was added to ensure all transitive dependencies (16+ packages) use Atmos's improved fork of the deprecated mitchellh/go-homedir package instead of the archived original.

Unfortunately, we must accept that transitive dependencies will use the deprecated package because:

There's no way to force transitive dependencies to use our fork without replace
We can't publish our fork as github.com/mitchellh/go-homedir (we don't own that domain)
Requiring all 16+ transitive dependencies to update their imports is not feasible

What we're keeping

Atmos's own code still uses the improved pkg/config/homedir implementation with better error handling, refactoring, and security annotations
The deprecated mitchellh/go-homedir package has no known security vulnerabilities (verified via Snyk)
The package is stable (last commit 2019, archived July 2024 as feature-complete, not broken)

The decision

Restoring go install compatibility is more important than forcing transitive dependencies to use our improved fork. The deprecated package works fine, and Atmos's direct usage still benefits from our improvements.

testing

Added TestGoModNoReplaceDirectives to catch future regressions
Verified go build succeeds
Verified all existing tests pass
Verified binary runs correctly with ./atmos version
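
A regression check of this kind could be sketched as follows. This is not the actual `TestGoModNoReplaceDirectives`; `findReplaceDirectives` is a hypothetical helper that scans the text of a `go.mod` file for single-line and block `replace` directives using only the standard library. A real test would presumably read `go.mod` from disk and fail if the returned slice is non-empty.

```go
package main

import (
	"fmt"
	"strings"
)

// findReplaceDirectives returns every replace directive found in the text of
// a go.mod file. It handles the single-line form ("replace a => b") and the
// block form ("replace ( ... )"). Hypothetical helper, not the Atmos code.
func findReplaceDirectives(goMod string) []string {
	var found []string
	inBlock := false
	for _, line := range strings.Split(goMod, "\n") {
		// Drop trailing comments before tokenizing.
		if i := strings.Index(line, "//"); i >= 0 {
			line = line[:i]
		}
		fields := strings.Fields(line)
		switch {
		case len(fields) == 0:
			continue
		case inBlock && fields[0] == ")":
			inBlock = false
		case inBlock:
			found = append(found, strings.Join(fields, " "))
		case fields[0] == "replace" && len(fields) == 2 && fields[1] == "(":
			inBlock = true
		case fields[0] == "replace":
			found = append(found, strings.Join(fields[1:], " "))
		}
	}
	return found
}

func main() {
	goMod := "module example.com/app\n\ngo 1.24\n\nreplace example.com/old => ./local/old\n"
	fmt.Println(findReplaceDirectives(goMod))
}
```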

references

Original PR that introduced the replace directive: #1631
User report: Slack thread from Jonathan Rose
Go issues on replace directive limitation: golang/go#44840, golang/go#69762, golang/go#50698

Replace mitchellh/mapstructure with go-viper/mapstructure @osterman (#1678)

## what

Replaced direct usage of the archived github.com/mitchellh/mapstructure with github.com/go-viper/mapstructure/v2.
Added a replace directive in go.mod to force all transitive dependencies that use github.com/mitchellh/mapstructure to instead use the maintained github.com/go-viper/mapstructure fork (v1.6.0).

why

The mitchellh/mapstructure library has been archived, meaning it will no longer receive updates or security patches.
github.com/go-viper/mapstructure/v2 is the actively maintained and recommended fork, ensuring continued support and bug fixes.
Using the replace directive ensures that even indirect dependencies use the supported fork, eliminating reliance on the archived library.

references

closes #123

Summary by CodeRabbit

Chores
- Updated internal dependency management to use go-viper/mapstructure v2 instead of the previous mapstructure implementation across the codebase for improved compatibility and maintenance.

Add spinner and TTY dialog for AWS SSO auth @osterman (#1653)

## what

Enhances the AWS SSO authentication flow by introducing a visually appealing, interactive terminal dialog using the charmbracelet library.
Displays a colored, bordered dialog box in TTY environments showing the AWS SSO verification code and instructions.
Integrates an animated spinner to indicate when the system is waiting for authentication.
Gracefully degrades to plain text output in non-TTY environments (e.g., CI pipelines) to ensure compatibility.
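
The degradation logic can be illustrated with a minimal standard-library sketch. It does not use the charmbracelet library and is not the Atmos implementation: `isTerminal` relies on a common character-device heuristic, and `renderVerification` is a hypothetical function that draws a plain ASCII box for TTYs and a single line otherwise.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// isTerminal reports whether f looks like a character device. This is a
// common stdlib heuristic assumed for illustration; it is not the check
// Atmos itself performs.
func isTerminal(f *os.File) bool {
	info, err := f.Stat()
	if err != nil {
		return false
	}
	return info.Mode()&os.ModeCharDevice != 0
}

// renderVerification returns a bordered box when tty is true and a single
// plain line otherwise (for example, in CI logs).
func renderVerification(code string, tty bool) string {
	msg := "Verification code: " + code
	if !tty {
		return msg
	}
	border := "+" + strings.Repeat("-", len(msg)+2) + "+"
	return border + "\n| " + msg + " |\n" + border
}

func main() {
	fmt.Println(renderVerification("ABCD-EFGH", isTerminal(os.Stdout)))
}
```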

why

Improved User Experience: The charmbracelet dialog provides a more engaging and informative user experience during the AWS SSO authentication process, making it easier to understand and follow.
Clearer Verification: The prominent display of the verification code with styling helps users visually confirm the code against what is shown in their browser.
Real-time Feedback: The spinner provides immediate visual feedback that the system is actively waiting for authentication, reducing user uncertainty.
Universal Compatibility: The graceful degradation ensures that the authentication flow remains functional and usable across all environments, including those without TTY capabilities.
Enhanced Readability: Color-coded elements and clear messaging improve the readability of important information, especially the verification code and URLs.

references

closes #123 (Assuming this is the issue being addressed)
Further context on AWS SSO device authorization flow: AWS SSO Documentation

Summary by CodeRabbit

New Features
- Styled verification dialog with automated browser opening, animated spinner during SSO device authorization, and Ctrl+C cancellation.
- Unified display for authentication results with human-friendly expiration durations and visual expiring indicators.
Documentation
- Added detailed AWS IAM Identity Center / device-authorization flow docs and clarified device codes vs. MFA tokens.
Improvements
- Graceful degradation for non-TTY/CI environments and consistent UX across auth commands.

Fix segfault in TestGetAffectedComponents when error pointer is corrupted @osterman (#1677)

## what

Fix segmentation violation in TestGetAffectedComponents at line 247
Safely convert error to string before passing to `t.Skipf()`

why

On macOS ARM64, when gomonkey patches fail, the real function gets called with invalid test data
This can result in a corrupted error pointer being returned (observed address: 0x646e657065646b73)
fmt.Sprintf with %v tries to dereference the corrupt pointer, causing a segfault
Converting error to string first using err.Error() avoids dereferencing the corrupt pointer

references

Fixes GitHub Actions failure: https://github.com/cloudposse/atmos/actions/runs/18656461566/job/53187085704
Stack trace showed fault at terraform_affected_test.go:247

testing

Verified test now passes without segfault on macOS ARM64
Test gracefully skips when gomonkey mocking fails

Fix os.Args in tests with SetArgs @osterman (#1675)

## what

This PR refactors various test files to replace direct manipulation of os.Args with Cobra's recommended RootCmd.SetArgs() method. This change standardizes how command-line arguments are tested across the codebase and improves test reliability by preventing global state pollution.

Specific changes include:

cmd/ package:
- Replaced os.Args assignments with RootCmd.SetArgs() in cmd/root_test.go, cmd/auth_login_test.go.
- Removed unnecessary manual save/restore of os.Args in cmd/root_test.go.
- Documented legitimate usage of os.Args in cmd/cmd_utils_test.go where the function under test directly reads os.Args.
pkg/config/ package:
- Refactored pkg/config/config.go to expose parseFlagsFromArgs(args []string) for direct testing of flag parsing logic.
- Updated pkg/config/config_test.go to use parseFlagsFromArgs() where possible, reducing os.Args manipulation.
- Documented the necessity of os.Args manipulation for integration tests within pkg/config/config_test.go that call functions like setLogConfig().
tests/ package:
- Replaced os.Args assignments with cmd.RootCmd.SetArgs() in tests/cli_describe_component_test.go, tests/describe_test.go, and tests/validate_schema_test.go.

why

Directly manipulating os.Args in tests is an anti-pattern because:

Global State Pollution: os.Args is global and can cause test leakage, leading to unpredictable failures, especially in parallel test runs.
Not the Cobra Way: Cobra provides SetArgs() as the idiomatic and safe way to test command execution, managing its own state.
Manual Cleanup Required: Each os.Args manipulation requires manual defer statements for restoration, adding boilerplate and potential for error.

By adopting RootCmd.SetArgs():

Tests become more reliable and predictable.
Boilerplate for argument setup and cleanup is removed.
The codebase adheres to Cobra's best practices for testing.
For legitimate uses of os.Args (e.g., testing subprocesses that call os.Exit() or integration tests of the main() function), comments have been added to clarify why this approach is necessary.
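
The benefit of a function that takes its arguments explicitly can be shown with a small sketch. It uses the standard `flag` package rather than Cobra, and although the name `parseFlagsFromArgs` comes from the PR, the body and the flag names below are assumptions made for illustration only.

```go
package main

import (
	"flag"
	"fmt"
	"io"
)

// parseFlagsFromArgs parses flags from an explicit slice instead of reading
// the global os.Args, so tests can call it without touching process-wide
// state. The flag names are illustrative assumptions, not the Atmos flag set.
func parseFlagsFromArgs(args []string) (map[string]string, error) {
	fs := flag.NewFlagSet("sketch", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep parse errors out of test output
	logsLevel := fs.String("logs-level", "", "log level")
	basePath := fs.String("base-path", "", "base path")
	if err := fs.Parse(args); err != nil {
		return nil, err
	}
	flags := map[string]string{}
	if *logsLevel != "" {
		flags["logs-level"] = *logsLevel
	}
	if *basePath != "" {
		flags["base-path"] = *basePath
	}
	return flags, nil
}

func main() {
	flags, err := parseFlagsFromArgs([]string{"--logs-level", "Debug", "--base-path=/tmp/stacks"})
	fmt.Println(flags, err)
}
```

Because each call builds its own `FlagSet`, two tests can run in parallel with different arguments and no save/restore of `os.Args`.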

references

closes #XYZ (if applicable)

Add step to get dependencies in Go setup workflow @goruha (#1679)

## what

Add step to get dependencies in Go setup workflow

why

To cache actual dependencies

Summary by CodeRabbit

Chores
- CI workflow updated to run dependency fetching during build setup, ensuring dependencies are retrieved earlier and improving build preparation reliability.

Use run-os for setup-go @goruha (#1667)

## what

Use run-os for setup-go

why

Reduce GitHub Actions cache usage

references

https://linear.app/cloudposse/issue/DEV-3628/review-and-update-github-actions-cache-usage-before-october-15-2025

Summary by CodeRabbit

Chores
- CI runner selection switched to dynamic, configuration-driven runner entries across workflows; build/test job names now include target/flavor context and include conditional Linux-specific steps.
- Pre-commit, lint, autofix and other CI workflows updated to use the new runner configuration.
New Features
- Added a scheduled/manual workflow to warm up Go cache and prepare Go tooling.
- Added a workflow to clear PR-related caches on closed pull requests.
Tests
- CI exercises OS/target combinations using the new dynamic runner configuration; Acceptance Tests now depend on the build job.

Add Changelog link and remove old file @osterman (#1676)

## what

Added a "Changelog" link to the top navigation bar in website/docusaurus.config.js. This link points to the /blog route, making the blog more accessible to users.
Removed the old, unmaintained CHANGELOG.md file from the root of the repository. This file contained outdated release notes and is no longer necessary as changelogs are now managed as blog posts.

why

The "Changelog" link was added to the navigation bar as per user request to improve discoverability of blog content, which serves as the current changelog.
The CHANGELOG.md file was removed because it was obsolete and unmaintained, with changelogs now being published as blog posts. This cleans up the repository and avoids confusion.

references

closes #123 (This is a placeholder, assuming the user implicitly wants to close an issue related to navigation and cleanup.)
Link to blog: https://atmos.tools/blog/

Summary by CodeRabbit

Documentation
- Removed historical version entries from the changelog.
Chores
- Added "Changelog" navigation link to the website header for easier access to release information.

`auth` Leapp Migration Guide @Benbentwo (#1633)

This pull request adds documentation to help users migrate from Leapp to Atmos Auth for AWS IAM Identity Center authentication. The main changes introduce a new migration guide and organize authentication documentation under a dedicated category.

Documentation improvements:

Added a comprehensive migration guide (migrating-from-leapp.mdx) that explains how to convert Leapp sessions and providers to Atmos Auth YAML configuration, including field mappings, step-by-step instructions, troubleshooting tips, and a comparison table.

Documentation structure:

Created a new _category_.json file to group authentication documentation under "Authentication (atmos auth)" in the sidebar for improved discoverability.

Summary by CodeRabbit

Documentation
- Removed the legacy Atmos Auth User Guide.
- Added a "Migrating from Leapp" tutorial with migration steps, field mappings, and verification commands.
- Added a Geodesic configuration tutorial for Atmos Auth integration.
- Introduced an Auth “Tutorials” category and two new blog posts introducing Atmos Auth and tutorials.
- Reorganized Auth CLI docs: updated ordering, labels, slugs, subcommand links, and sidebar positions.
- Expanded the Auth usage guide with AWS Permission Set account specification guidance and examples.

Update homedir README with fork details @osterman (#1673)

## what

Appended a detailed section to pkg/config/homedir/README.md describing the "Atmos Fork Enhancements".
This new section explains the fork's prioritization of environment variables for test compatibility with t.Setenv().
It also details cache management strategies, including disabling caching (homedir.DisableCache = true) and resetting the cache (homedir.Reset()).
Provides code examples for using these features in Go tests.

why

To clearly document the specific enhancements made in Atmos's vendored fork of the mitchellh/go-homedir package.
To provide users, particularly those writing Go tests, with clear instructions on how to leverage the improved environment variable support and cache management for better testability.
The original mitchellh/go-homedir package is deprecated, and this fork is maintained to support these specific testing requirements.

references

closes #279

🚀 Enhancements

chore: Update Pro Instances API @milldr (#1721)

## what

Update endpoint format to include query params for stack & component

why

We've updated the API for Atmos Pro so that we can support slashes in component names
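
As a rough illustration of why query parameters help here, the sketch below builds a URL with `net/url`. The base URL, the path, and the parameter names `stack` and `component` are hypothetical placeholders, not the actual Atmos Pro endpoint.

```go
package main

import (
	"fmt"
	"net/url"
)

// buildInstanceURL places stack and component in the query string, where
// url.Values.Encode percent-encodes "/" so a component name cannot be
// mistaken for extra path segments. The endpoint shape is an assumption.
func buildInstanceURL(base, stack, component string) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	q := url.Values{}
	q.Set("stack", stack)
	q.Set("component", component)
	u.RawQuery = q.Encode()
	return u.String(), nil
}

func main() {
	s, err := buildInstanceURL("https://example.invalid/api/instances", "plat-ue1-dev", "network/vpc")
	fmt.Println(s, err)
}
```

With a component named `network/vpc`, the slash is emitted as `%2F` inside the query string instead of splitting the path.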

references

https://github.com/cloudposse-corp/apps/pull/451

Summary by CodeRabbit

Chore
- Pro Instances API client now sends stack and component as query parameters for more reliable encoding and consistency.
Documentation
- Added a blog post explaining the endpoint format change, impact, and that no configuration or workflow changes are required.
Bug Fixes
- Cleaned up authentication output spacing for more compact, consistent display.

fix: Consolidate credential retrieval logic to fix terraform auth @osterman (#1720)

## Summary

This PR fixes a critical bug where atmos terraform plan and other Terraform commands failed to use file-based credentials, while atmos auth whoami and similar commands worked correctly.

The root cause was duplicate credential retrieval code across three methods with inconsistent fallback behavior. Two methods had keyring → identity storage fallback logic, but one (retrieveCachedCredentials) did not, causing Terraform commands to fail when credentials were in files instead of the keyring.

Root Cause Analysis

Three separate code paths retrieved credentials:

GetCachedCredentials - Had fallback ✓
findFirstValidCachedCredentials - Had fallback ✓
retrieveCachedCredentials - NO fallback ✗ (used by Terraform execution)

When users authenticated via AWS SSO, credentials were written to files, not cached in the keyring. Terraform commands would fail because the retrieveCachedCredentials path didn't check identity storage.

Solution

Extracted a shared retrieveCredentialWithFallback method as the single source of truth for credential retrieval:

Fast path: Try keyring cache first (immediate)
Slow path: Fall back to identity storage if not in keyring (AWS files, etc.)
All three code paths now delegate to this single method
Ensures consistent behavior across all operations
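
The fast-path/slow-path flow can be sketched as below. Only the method name `retrieveCredentialWithFallback` comes from the PR; the `credentialSource` interface, the `mapSource` type, and the choice to fall back only on a "not found" error are assumptions for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound signals that a store has no credentials for an identity.
var errNotFound = errors.New("credentials not found")

// credentialSource is a hypothetical abstraction over the keyring cache and
// identity storage (for example, AWS credential files).
type credentialSource interface {
	Load(identity string) (string, error)
}

// mapSource is an in-memory credentialSource used for illustration.
type mapSource map[string]string

func (m mapSource) Load(identity string) (string, error) {
	if v, ok := m[identity]; ok {
		return v, nil
	}
	return "", errNotFound
}

// retrieveCredentialWithFallback tries the keyring first and falls back to
// identity storage only when the keyring reports a miss (an assumption).
func retrieveCredentialWithFallback(keyring, storage credentialSource, identity string) (string, error) {
	creds, err := keyring.Load(identity)
	if err == nil {
		return creds, nil // fast path: keyring hit
	}
	if !errors.Is(err, errNotFound) {
		return "", fmt.Errorf("keyring lookup for %q: %w", identity, err)
	}
	creds, err = storage.Load(identity) // slow path: identity storage
	if err != nil {
		return "", fmt.Errorf("identity storage lookup for %q: %w", identity, err)
	}
	return creds, nil
}

func main() {
	keyring := mapSource{}
	storage := mapSource{"dev-admin": "file-based-credentials"}
	fmt.Println(retrieveCredentialWithFallback(keyring, storage, "dev-admin"))
}
```

Routing every caller through one function like this is what removes the possibility of one code path silently lacking the fallback.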

Changes

Added retrieveCredentialWithFallback() method (38 lines)
Refactored GetCachedCredentials() - 40% code reduction
Refactored findFirstValidCachedCredentials() - 57% code reduction
Refactored retrieveCachedCredentials() - Now uses shared method
Fixed TestManager_GetCachedCredentials_Paths to use proper test data
Added regression test TestManager_retrieveCachedCredentials_TerraformFlow_Regression
Added integration test TestRetrieveCachedCredentials_KeyringMiss_IdentityStorageFallback
Show active identities

Testing

✅ All auth tests pass (12/12 test suites)
✅ Regression test reproduces original bug, passes with fix
✅ Integration tests verify fallback behavior works
✅ Code compiles successfully

Impact

✅ Terraform commands now work with file-based credentials
✅ ~110 lines of duplicate code eliminated
✅ Single source of truth for credential retrieval
✅ Impossible to have divergent fallback behavior in future
✅ Consistent behavior across all auth operations

References

This PR addresses the issue where valid authenticated sessions would fail during Terraform execution with "credentials not found" error, even though atmos auth whoami showed valid credentials.

See docs/prd/credential-retrieval-consolidation.md for detailed architectural analysis.

🤖 Generated with Claude Code

Summary by CodeRabbit

New Features
- Interactive identity selection when using --identity without a value (CLI and Terraform).
- New auth logout --all to sign out all identities.
- ATMOS_IDENTITY env var honored; CLI env outputs add AWS region defaults.
- Identity list now shows authentication status and credential expiry.
Bug Fixes
- More reliable credential retrieval with keyring → identity-storage fallback.
- Safer, clearer logout behavior and plain-text summaries.
- Default files display path updated to ~/.config/atmos.
Documentation
- Help pages and docs updated for interactive identity modes, logout options, and examples.

fix: Restore PATH inheritance in workflow shell commands @osterman (#1719)

## what

Refactored `ExecuteShell()` to **always** merge custom env vars with the parent environment
Fixes workflow shell commands failing with "executable file not found in $PATH"
Adds comprehensive unit and integration tests demonstrating the bug and verifying the fix

why

After commit 9fd7d15 (PR #1543), workflow shell commands lost access to PATH environment variable
Users reported workflows that worked in v1.189.0 failed in v1.195.0 with commands like env, ls, grep not found
This is a critical regression affecting any workflow using external executables
Original fix conditionally replaced environment, which was inconsistent with executeCustomCommand behavior

Root Cause

The bug occurred in ExecuteShell() function in internal/exec/shell_utils.go:

Workflow commands call ExecuteShell with empty env slice: []string{}
ExecuteShell appends ATMOS_SHLVL to the slice: []string{"ATMOS_SHLVL=1"}
ShellRunner receives a non-empty env, so it doesn't fall back to os.Environ()
Shell command runs with ONLY ATMOS_SHLVL set, losing PATH and all other environment variables

Solution

Refactored ExecuteShell() to always merge custom env vars with parent environment:

// Always start with parent environment
mergedEnv := os.Environ()

// Merge custom env vars (overriding duplicates)
for _, envVar := range env {
    // Split each "KEY=value" entry; entries without "=" are skipped here (an assumption)
    if key, value, ok := strings.Cut(envVar, "="); ok {
        mergedEnv = u.UpdateEnvVar(mergedEnv, key, value)
    }
}

// Add ATMOS_SHLVL
mergedEnv = append(mergedEnv, fmt.Sprintf("ATMOS_SHLVL=%d", newShellLevel))

This ensures:

✅ Empty env (workflows): Full parent environment including PATH
✅ Custom env (commands): Custom vars override parent, but PATH is preserved
✅ Consistent behavior: Matches executeCustomCommand pattern (line 393 in cmd_utils.go)
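
The merge semantics can be exercised in isolation with a self-contained sketch. `mergeEnv` is a hypothetical stand-in for the `u.UpdateEnvVar` loop, written against the standard library only, so its details may differ from the Atmos code.

```go
package main

import (
	"fmt"
	"strings"
)

// mergeEnv returns parent with each "KEY=value" entry from custom applied on
// top: existing keys are overridden in place and new keys are appended.
// Hypothetical helper for illustration, not the Atmos implementation.
func mergeEnv(parent, custom []string) []string {
	merged := append([]string(nil), parent...) // copy so parent is not mutated
	index := make(map[string]int, len(merged))
	for i, kv := range merged {
		if key, _, ok := strings.Cut(kv, "="); ok {
			index[key] = i
		}
	}
	for _, kv := range custom {
		key, _, ok := strings.Cut(kv, "=")
		if !ok {
			continue // skip malformed entries
		}
		if i, exists := index[key]; exists {
			merged[i] = kv
		} else {
			index[key] = len(merged)
			merged = append(merged, kv)
		}
	}
	return merged
}

func main() {
	parent := []string{"PATH=/usr/bin:/bin", "HOME=/home/dev"}
	fmt.Println(mergeEnv(parent, nil))                        // workflow case: empty custom env
	fmt.Println(mergeEnv(parent, []string{"HOME=/tmp/home"})) // custom var overrides parent
}
```

In both calls `PATH` survives, which is the property the regression had broken.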

Testing

Unit Tests (internal/exec/shell_utils_test.go):

TestExecuteShell/empty_env_should_inherit_PATH_from_parent_process - Verifies env command works
TestExecuteShell/empty_env_should_inherit_PATH_for_common_commands - Tests ls, env, pwd, echo
TestExecuteShell/custom_env_vars_override_parent_env - Verifies custom vars properly override parent

Integration Test (tests/test-cases/workflows.yaml):

atmos workflow shell command with PATH - Full end-to-end workflow test using env | grep PATH

All tests pass, including existing workflow tests.

references

Closes #1718
Resolves DEV-3725
Regression introduced in: 9fd7d15 (PR #1543)

Summary by CodeRabbit

Bug Fixes
- Shell commands now correctly inherit environment variables (including PATH) from the parent process, with custom env vars properly overriding parent values.
Tests
- Added tests covering environment inheritance for commands that require PATH, shell builtins, and custom env var overrides.
Workflows / Snapshots
- Added a workflow demonstrating PATH-dependent shell commands and updated related test snapshots and test cases.

test: Improve test coverage for keyring fallback to 78.4% @osterman (#1705)

## what

Add comprehensive unit tests for no-op keyring and system keyring functionality
Improve test coverage from 71.2% to 78.4% (+7.2 percentage points)
Add Validate() method to test credential types to satisfy ICredentials interface

why

Ensure critical business logic is properly tested (cache management, expiration checking, error handling)
Meet 80% test coverage target for new features
Prevent regressions in keyring fallback behavior introduced in bde37e334

references

Related to commit bde37e334 which introduced graceful keychain fallback for containerized environments
Implements test requirements from docs/prd/keyring-fallback-containerized-environments.md

Test Coverage Improvements

Starting Coverage: 71.2%
Final Coverage: 78.4%
Improvement: +7.2 percentage points

Tests Added (8 new test functions):

TestNoopKeyringStore_ValidCache - Tests cache hit with valid credentials
TestNoopKeyringStore_ExpiredInCache - Tests cache hit with expired credentials
TestNoopKeyringStore_StoreWithMockCredentials - Tests storing mock credentials
TestNoopKeyringStore_ExpirationWarning - Tests expiration warning logic
TestSystemKeyringStore_GetAny - Tests retrieving arbitrary data from system keyring
TestSystemKeyringStore_GetAny_NotFound - Tests GetAny error handling
TestSystemKeyringStore_SetAny - Tests storing arbitrary data types
TestNewKeyringAuthStore - Tests deprecated backward-compatible function

Coverage by Function:

| File | Function | Before | After | Improvement |
| --- | --- | --- | --- | --- |
| `keyring_noop.go` | `Retrieve()` | 36.8% | 57.9% | +21.1% |
| `keyring_system.go` | `GetAny()` | 0% | 85.7% | +85.7% |
| `keyring_system.go` | `SetAny()` | 0% | 71.4% | +71.4% |
| `store.go` | `NewKeyringAuthStore()` | 0% | 100% | +100% |

Remaining Uncovered (1.6% to reach 80%):

The uncovered code paths require real AWS credentials and live AWS STS API calls:

AWS credential validation success paths (lines 79-95 in Retrieve())
AWS STS GetCallerIdentity success (lines 152-168 in validateAWSCredentials())

These are integration-level scenarios better suited for E2E tests with real AWS infrastructure rather than unit tests.

What We Test

✅ Validation failure path - AWS SDK without credentials
✅ Cache behavior - Hits, misses, expiration, staleness
✅ Error handling - Expired/missing credentials
✅ Storage operations - Store, Retrieve, Delete, List
✅ GetAny/SetAny - Arbitrary data storage for all keyring types
✅ Backward compatibility - Deprecated functions

Fix `atmos describe affected --include-dependents --stack <stack>` command to correctly process the dependents only from the provided stack @aknysh (#1703)

## Problem

When executing atmos describe affected --include-dependents --stack <stack>, the command was incorrectly processing dependent components from ALL stacks instead of only from the specified stack. This caused:

Performance issues: YAML functions (!terraform.output, !terraform.state, !env) were executed for components in all stacks, not just the filtered stack
Incorrect behavior: Dependents from other stacks were being included in the output
Test gaps: Tests didn't catch this issue because fixtures lacked YAML functions that would fail when processed incorrectly

Root Cause

In internal/exec/describe_dependents.go, the ExecuteDescribeDependents function was calling ExecuteDescribeStacks with an empty string for the stack filter instead of passing the onlyInStack parameter. This caused all stacks to be loaded and processed.

Solution

1. Fixed Stack Filtering

Added OnlyInStack parameter to DescribeDependentsArgs struct
Updated ExecuteDescribeDependents to pass the stack filter through to ExecuteDescribeStacks
Ensured dependents are correctly filtered to only the specified stack

2. Refactored to Options Pattern

Created DescribeDependentsArgs struct to replace 8 individual parameters
Improved code readability and maintainability
Follows the Options Pattern from CLAUDE.md
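
A toy version of the two changes above (an args struct plus a stack filter that is passed through to the loader) might look like this. Only `DescribeDependentsArgs` and `OnlyInStack` are named in the PR; the other fields, `loadStacks`, `stacksToProcess`, and the in-memory manifests are assumptions for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// DescribeDependentsArgs groups the inputs in one struct. Only OnlyInStack is
// named in the PR text; the other fields are illustrative assumptions.
type DescribeDependentsArgs struct {
	Component   string
	Stack       string
	OnlyInStack string // when set, restrict processing to this stack
}

// manifests is a toy stand-in for stack manifests: stack name -> components.
var manifests = map[string][]string{
	"ue1-network": {"vpc", "tgw"},
	"uw2-network": {"vpc"},
}

// loadStacks returns only the named stack when filter is non-empty. Passing
// an empty filter here is the kind of mistake that loads every stack.
func loadStacks(filter string) map[string][]string {
	if filter == "" {
		return manifests
	}
	out := map[string][]string{}
	if comps, ok := manifests[filter]; ok {
		out[filter] = comps
	}
	return out
}

// stacksToProcess forwards args.OnlyInStack to the loader and returns the
// sorted names of the stacks that would be processed.
func stacksToProcess(args *DescribeDependentsArgs) []string {
	var names []string
	for name := range loadStacks(args.OnlyInStack) {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

func main() {
	fmt.Println(stacksToProcess(&DescribeDependentsArgs{Component: "vpc", OnlyInStack: "ue1-network"}))
	fmt.Println(stacksToProcess(&DescribeDependentsArgs{Component: "vpc"}))
}
```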

3. Enhanced Test Coverage

Added YAML functions (!env) to test fixtures to detect the bug
Created new test TestDescribeAffectedWithDependentsStackFilterYamlFunctions to verify:
- YAML functions are only executed for components in the specified stack
- Dependents are correctly filtered by stack
- Environment variables are not accessed for components in other stacks

4. Lintroller Improvements

Added comprehensive exclusions to the custom linter:

29 packages excluded from perf.Track() checks (one-time operations)
7 utility files excluded (not in hot paths)
15 hot-path functions instrumented with perf.Track()
os.Args linter exclusions for legitimate test patterns

Testing

Manual Testing

# Test that dependents are filtered by stack
atmos describe affected --include-dependents --stack ue1-network

# Verify YAML functions only execute for the specified stack
ATMOS_TEST_VPC_UE1=test atmos describe affected --include-dependents --stack ue1-network

Automated Testing

go test ./internal/exec -v -run TestDescribeAffectedWithDependentsStackFilterYamlFunctions
go test ./pkg/describe -v

Changes

Core Functionality

internal/exec/describe_dependents.go: Added DescribeDependentsArgs struct, fixed stack filtering
internal/exec/describe_affected_utils_2.go: Updated to use new struct pattern
internal/exec/atmos.go: Updated TUI integration
pkg/describe/describe_dependents_test.go: Updated integration tests

Test Fixtures

Added !env YAML functions to test fixtures in 4 files:
- tests/fixtures/scenarios/atmos-describe-affected-with-dependents-and-locked/stacks/deploy/network/us-east-1.yaml
- tests/fixtures/scenarios/atmos-describe-affected-with-dependents-and-locked/stacks/deploy/network/us-west-2.yaml
- And their stacks-affected versions

Tests

internal/exec/describe_affected_test.go: Added TestDescribeAffectedWithDependentsStackFilterYamlFunctions
Updated all mock functions to use new struct signature

Performance Tracking

Added perf.Track() to hot-path functions:

Stack processing: ProcessYAMLConfigFiles, ProcessYAMLConfigFile, ProcessStackConfig
Component processing: ProcessComponentInStack, ProcessComponentFromContext
Describe operations: ExecuteDescribeStacks, ExecuteDescribeComponent
Core execution: FilterEmptySections, IsComponentAbstract, FilterComputedFields
Template functions: AtmosFuncs.Component, AtmosFuncs.GomplateDatasource

Lintroller

tools/lintroller/rule_perf_track.go: Added exclusions for non-hot-path packages and files
tools/lintroller/rule_os_args.go: Added exclusions for legitimate os.Args usage in tests

Impact

✅ Performance: YAML functions no longer execute for components outside the filtered stack
✅ Correctness: Dependents are now correctly limited to the specified stack
✅ Test Coverage: New tests prevent regression
✅ Code Quality: Improved readability with Options Pattern
✅ Linter: All custom linter checks pass

Summary by CodeRabbit

New Features
- Stack-specific filtering for dependent discovery (OnlyInStack).
- New template helpers under the "atmos" namespace: component, datasource, store.
Bug Fixes
- YAML function execution now respects stack filtering in describe-affected with dependents.
Performance
- Added runtime performance tracking across various describe and processing commands.
Chores
- Updated Atmos version and PostHog dependency; docs updated.
Tests
- Added/updated tests and fixtures for stack-filtering and Terraform-state YAML scenarios.
