github cloudposse/atmos v1.196.0

latest release: v1.197.0-rc.0
10 hours ago
Support `!terraform.state` on GCS Backends @shirkevich (#1393)

Add GCS backend support to `!terraform.state` YAML function

what

  • Add Google Cloud Storage (GCS) backend support to !terraform.state Atmos YAML function
  • Implement performance optimizations (client caching, retry logic, extended timeouts)
  • Create unified Google Cloud authentication system for consistency across GCP services
  • Update documentation with GCS backend usage examples and authentication methods

why

The !terraform.state YAML function allows reading the outputs (remote state) of components in Atmos stack manifests directly from the configured Terraform/OpenTofu backends.

Previously, the !terraform.state YAML function only supported:

  • local (Terraform and OpenTofu)
  • s3 (Terraform and OpenTofu)

This PR adds support for:

  • gcs (Google Cloud Storage - Terraform and OpenTofu)

With GCS backend support, users can now leverage the high-performance !terraform.state function instead of the slower !terraform.output or !store functions when using Google Cloud Storage for Terraform state storage.

Implementation Details

GCS Backend Features

  • Full Authentication Support: JSON credentials, service account file paths, and Google Application Default Credentials (ADC)
  • Service Account Impersonation: Support for impersonate_service_account configuration
  • Performance Optimizations:
    • Client caching to avoid recreating GCS clients for repeated operations
    • Retry logic with exponential backoff (up to 3 attempts) for transient failures
    • Extended timeouts (30 seconds) to match S3 backend performance
  • Robust Error Handling: Graceful handling of missing state files and detailed error context
  • Resource Management: Proper cleanup and explicit resource management
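The retry behaviour listed above (up to 3 attempts with exponential backoff for transient failures) can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `retryWithBackoff` and `errTransient` are hypothetical names, and the base delay is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errTransient marks failures worth retrying (hypothetical sentinel for this sketch).
var errTransient = errors.New("transient failure")

// retryWithBackoff calls op up to maxAttempts times and doubles the delay
// after each transient failure. Non-transient errors are returned immediately.
func retryWithBackoff(maxAttempts int, baseDelay time.Duration, op func() error) error {
	var err error
	delay := baseDelay
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		err = op()
		if err == nil {
			return nil
		}
		if !errors.Is(err, errTransient) {
			return err
		}
		// Sleep only if another attempt will follow.
		if attempt < maxAttempts {
			time.Sleep(delay)
			delay *= 2
		}
	}
	return fmt.Errorf("giving up after %d attempts: %w", maxAttempts, err)
}

func main() {
	calls := 0
	err := retryWithBackoff(3, 10*time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return errTransient // fail twice, then succeed
		}
		return nil
	})
	fmt.Println("calls:", calls, "err:", err)
}
```

Returning non-transient errors without retrying keeps permanent failures, such as invalid credentials, from being delayed by the backoff.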

Usage

The GCS backend works seamlessly with existing !terraform.state syntax:

# Get the `output` of the `component` in the current stack
subnet_id: !terraform.state vpc private_subnet_id

# Get the `output` of the `component` in the provided `stack` 
vpc_id: !terraform.state vpc dev-us-east-1 vpc_id

# Get complex outputs using YQ expressions
first_subnet: !terraform.state vpc .private_subnet_ids[0]

GCS Backend Configuration

The GCS backend supports all standard Terraform GCS backend configurations:

# atmos.yaml
components:
  terraform:
    backend_type: gcs
    backend:
      gcs:
        bucket: "my-terraform-state-bucket"
        prefix: "terraform/state"
        
        # Authentication options (choose one):
        
        # Option 1: JSON credentials content
        credentials: |
          {
            "type": "service_account",
            "project_id": "my-project",
            ...
          }
          
        # Option 2: Service account file path
        # (commented out here because a YAML mapping cannot repeat the `credentials` key)
        # credentials: "/path/to/service-account.json"
        
        # Option 3: Use Application Default Credentials (ADC)
        # (no credentials field needed - uses environment/metadata)
        
        # Optional: Service account impersonation
        impersonate_service_account: "terraform@my-project.iam.gserviceaccount.com"

Performance Benefits

Compared to !terraform.output, the !terraform.state function with GCS backend:

  • No Terraform execution - Reads state directly from GCS
  • No provider initialization - Skips all module and provider setup
  • No varfile generation - Bypasses Terraform configuration preparation
  • Cached clients - Reuses GCS clients for multiple operations
  • Parallel execution - Multiple state reads can happen concurrently
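The "cached clients" and "parallel execution" points combine naturally: a cache that is safe for concurrent use lets many state reads share one client. The sketch below shows one way to do that with a mutex-protected map. The `stateClient` and `clientCache` types are hypothetical stand-ins, not the types used in the PR.

```go
package main

import (
	"fmt"
	"sync"
)

// stateClient stands in for a real storage client (hypothetical type).
type stateClient struct {
	bucket string
}

// clientCache hands out one client per cache key and is safe for concurrent use.
type clientCache struct {
	mu      sync.Mutex
	clients map[string]*stateClient
}

func newClientCache() *clientCache {
	return &clientCache{clients: make(map[string]*stateClient)}
}

// get returns the cached client for key, calling create only on a cache miss.
func (c *clientCache) get(key string, create func() (*stateClient, error)) (*stateClient, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if cl, ok := c.clients[key]; ok {
		return cl, nil
	}
	cl, err := create()
	if err != nil {
		return nil, err
	}
	c.clients[key] = cl
	return cl, nil
}

func main() {
	cache := newClientCache()
	created := 0
	newClient := func() (*stateClient, error) {
		created++ // runs while the cache lock is held, so no data race
		return &stateClient{bucket: "my-terraform-state-bucket"}, nil
	}

	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_, _ = cache.get("my-terraform-state-bucket", newClient)
		}()
	}
	wg.Wait()
	fmt.Println("clients created:", created)
}
```

Holding the lock while creating the client is the simplest correct choice. A real implementation might prefer per-key locking so that slow client creation for one bucket does not block reads from another.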

Testing

  • Comprehensive Test Suite: 100% test coverage for all new functionality
  • Mock Implementations: Complete interface-based testing for GCS operations
  • Authentication Testing: Validates all credential types and authentication flows
  • Error Scenario Coverage: Tests for missing files, network failures, and invalid configurations
  • Caching Validation: Ensures client caching works correctly across operations
  • Retry Logic Testing: Validates exponential backoff and failure recovery

Backward Compatibility

  • No breaking changes to existing configurations
  • Existing backends (local, s3) remain unchanged
  • Same function syntax - no new parameters or options required
  • Graceful fallbacks - continues to work with !terraform.output and !store functions

Files Changed

Core Implementation

  • internal/terraform_backend/terraform_backend_gcs.go - GCS backend implementation
  • internal/terraform_backend/terraform_backend_gcs_test.go - Comprehensive test suite
  • internal/terraform_backend/terraform_backend_registry.go - Register GCS backend
  • internal/terraform_backend/terraform_backend_utils.go - Updated error messages

Unified Authentication System

  • internal/gcp/auth.go - New unified Google Cloud authentication (created)
  • internal/gcp/auth_test.go - Authentication tests (created)
  • pkg/store/google_secret_manager_store.go - Updated to use unified auth
  • internal/gcp_utils/gcp_utils.go - Removed (replaced by unified auth)

Configuration & Documentation

  • internal/exec/terraform_generate_backend.go - GCS backend validation
  • website/docs/core-concepts/stacks/yaml-functions/terraform.state.mdx - Updated documentation
  • errors/errors.go - Added GCS-specific error types
  • go.mod - Added GCS storage dependency

Migration Guide

For users currently using !terraform.output or !store with GCS-stored state:

Before (slower)

# Using !terraform.output (requires Terraform execution)
vpc_id: !terraform.output vpc dev-us-east-1 vpc_id

# Using !store (requires separate state management)  
vpc_id: !store google-secret-manager dev/vpc/vpc_id

After (faster)

# Using !terraform.state (direct GCS state access)
vpc_id: !terraform.state vpc dev-us-east-1 vpc_id

Simply update your backend configuration to use gcs and replace function calls - no other changes needed!

Summary by CodeRabbit

  • New Features

    • GCS-backed Terraform state support and unified Google Cloud authentication integration.
  • Bug Fixes

    • Stricter backend config validation with clearer error responses and updated supported-backends messaging.
  • Tests

    • Comprehensive unit tests added for GCS backend behavior and GCP authentication handling.
fix: Improve AWS credential isolation and auth error propagation @osterman (#1712)

Summary

This PR addresses multiple authentication issues when using Atmos in containerized environments with mounted credential files:

  1. Auth Pre-Hook Error Propagation - Terraform execution now properly aborts when authentication fails (e.g., Ctrl+C during SSO)
  2. AWS Credential Loading Strategy - New LoadAtmosManagedAWSConfig() function provides proper isolation while preserving Atmos-managed profile selection
  3. Noop Keyring Validation - Container auth now properly isolated from external environment variables
  4. Whoami with Noop Keyring - atmos auth whoami now works in containerized environments
  5. Test Coverage - Added test to verify auth errors properly abort execution

Changes

1. Auth Pre-Hook Error Propagation (internal/exec/terraform.go:236)

  • Problem: Errors from auth pre-hook were logged but not returned, causing terraform execution to continue even when authentication failed (e.g., user presses Ctrl+C during SSO)
  • Fix: Added return err after logging auth pre-hook errors
  • Impact: Terraform commands now properly abort on auth failures
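The shape of this fix can be illustrated with a small sketch: the pre-hook error is logged and then returned, so the command never reaches execution. The function and variable names below (`runWithPreHook`, `errAuthAborted`) are hypothetical and are not taken from internal/exec/terraform.go.

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

// errAuthAborted simulates a user cancelling SSO (hypothetical sentinel).
var errAuthAborted = errors.New("authentication aborted by user")

// runWithPreHook runs execute only if preHook succeeds. Before a fix of this
// kind, the error would be logged and execution would continue regardless.
func runWithPreHook(preHook, execute func() error) error {
	if err := preHook(); err != nil {
		log.Printf("auth pre-hook failed: %v", err)
		return fmt.Errorf("auth pre-hook: %w", err)
	}
	return execute()
}

func main() {
	executed := false
	err := runWithPreHook(
		func() error { return errAuthAborted },
		func() error { executed = true; return nil },
	)
	fmt.Println("executed:", executed, "err:", err)
}
```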

2. AWS Credential Loading Strategy (pkg/auth/cloud/aws/env.go)

  • Problem: SDK's default config loading allowed IMDS access and was affected by external AWS_PROFILE, causing conflicts in containers
  • Solution: Created LoadAtmosManagedAWSConfig() function that:
    • Clears credential env vars (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN)
    • Preserves profile/path vars (AWS_PROFILE, AWS_SHARED_CREDENTIALS_FILE, AWS_CONFIG_FILE)
    • Allows SDK to load from Atmos-managed credential files
  • Impact: Proper isolation while still using Atmos-managed profiles

3. Noop Keyring Credential Validation (pkg/auth/credentials/keyring_noop.go)

  • Problem: Used unrestricted config.LoadDefaultConfig() which allowed IMDS access and was affected by external AWS_PROFILE
  • Fix: Changed to use LoadAtmosManagedAWSConfig()
  • Impact: Container auth now properly isolated from external env vars

4. Whoami with Noop Keyring (pkg/auth/manager.go)

  • Problem: Whoami() expected credentials from keyring, but noop keyring returns ErrCredentialsNotFound by design
  • Fix: Added check for ErrCredentialsNotFound and fallback to buildWhoamiInfoFromEnvironment()
  • Impact: atmos auth whoami now works in containerized environments
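A rough sketch of the fallback described above: when the store reports `ErrCredentialsNotFound`, identity information is built from the environment instead of failing. Only the `ErrCredentialsNotFound` name comes from the PR text. The interface, the `noopStore` type, and the function signatures are assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrCredentialsNotFound mirrors the sentinel named in the PR description.
var ErrCredentialsNotFound = errors.New("credentials not found")

// credentialStore is a hypothetical, simplified keyring interface.
type credentialStore interface {
	Retrieve(identity string) (string, error)
}

// noopStore never stores anything, so every lookup reports "not found".
type noopStore struct{}

func (noopStore) Retrieve(string) (string, error) {
	return "", ErrCredentialsNotFound
}

// whoami prefers stored credentials and falls back to fromEnv only when the
// store reports that nothing is stored. Other errors are returned as-is.
func whoami(store credentialStore, identity string, fromEnv func(string) (string, error)) (string, error) {
	creds, err := store.Retrieve(identity)
	if err == nil {
		return creds, nil
	}
	if errors.Is(err, ErrCredentialsNotFound) {
		return fromEnv(identity)
	}
	return "", fmt.Errorf("retrieve credentials for %q: %w", identity, err)
}

func main() {
	info, err := whoami(noopStore{}, "dev-admin", func(id string) (string, error) {
		return "from environment: " + id, nil
	})
	fmt.Println(info, err)
}
```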

5. Test Coverage (internal/exec/terraform_test.go)

  • Added TestExecuteTerraform_AuthPreHookErrorPropagation to verify auth errors properly abort execution
  • Test validates that terraform doesn't continue on auth failure
  • Updated test fixture to include required name_pattern configuration

Technical Details

The key insight is that Atmos sets AWS_PROFILE=identity-name (in pkg/auth/cloud/aws/setup.go:59) but the previous isolation approach cleared ALL AWS env vars including AWS_PROFILE. This caused the SDK to look for a non-existent [default] section.

The new LoadAtmosManagedAWSConfig preserves AWS_PROFILE while still preventing external credential conflicts.
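The environment handling described here amounts to a filter: drop the three credential variables and keep everything else, including `AWS_PROFILE`. The sketch below shows only that filtering step on a list of `KEY=VALUE` strings. `isolatedEnv` is a hypothetical helper, and how the real `LoadAtmosManagedAWSConfig()` applies the result to the SDK is not shown.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// credentialEnvVars lists the variables the PR describes as cleared.
var credentialEnvVars = []string{
	"AWS_ACCESS_KEY_ID",
	"AWS_SECRET_ACCESS_KEY",
	"AWS_SESSION_TOKEN",
}

// isolatedEnv returns a copy of env (KEY=VALUE entries) without the credential
// variables. Profile and path variables such as AWS_PROFILE are preserved.
func isolatedEnv(env []string) []string {
	cleared := make(map[string]bool, len(credentialEnvVars))
	for _, name := range credentialEnvVars {
		cleared[name] = true
	}
	out := make([]string, 0, len(env))
	for _, kv := range env {
		name, _, _ := strings.Cut(kv, "=")
		if cleared[name] {
			continue
		}
		out = append(out, kv)
	}
	return out
}

func main() {
	// Print the AWS-related variables that would survive isolation.
	for _, kv := range isolatedEnv(os.Environ()) {
		if strings.HasPrefix(kv, "AWS_") {
			fmt.Println(kv)
		}
	}
}
```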

Test Plan

  • go build . - Build succeeds
  • go test ./internal/exec -run TestExecuteTerraform - All terraform tests pass
  • TestExecuteTerraform_AuthPreHookErrorPropagation - New test passes
  • Verified test fails when fix is removed (terraform continues execution)
  • Verified test passes when fix is restored (terraform aborts on auth error)

References

Fixes authentication issues in containerized environments with mounted credentials.

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Added --login and cached-credentials-first flows across auth commands; whoami now shows validation and expiry.
    • Atmos-managed credentials moved to XDG-compliant locations; improved shell enter/exit messages.
    • Geodesic helper script for building/testing in containerized environments.
  • Bug Fixes

    • Terraform pre-hook errors now abort execution.
    • Improved propagation of user-abort during authentication.
  • Documentation

    • XDG migration guides and Geodesic/CLI docs updated.
  • Tests

    • Broad expansion of auth, AWS credential, auth-context and output-propagation tests.
fix: Relax stack config requirement for commands that don't operate on stacks @osterman (#1717)

Summary

Fixes stack configuration requirement for 6 commands that don't actually operate on stack manifests. These commands were incorrectly requiring stacks.base_path and stacks.included_paths to be configured, causing errors like:

Error: failed to initialize atmos config
stack base path must be provided in 'stacks.base_path' config or ATMOS_STACKS_BASE_PATH' ENV variable

What

Updated 6 commands to use processStacks=false in InitCliConfig:

Auth Commands (Commit 1)

  • atmos auth env - Export cloud credentials as environment variables
  • atmos auth exec - Execute commands with cloud credentials
  • atmos auth shell - Launch authenticated shell

List/Docs Commands (Commit 2)

  • atmos list workflows - List workflows from workflows/ directory
  • atmos list vendor - List vendor configurations from component.yaml files
  • atmos docs <component> - Display component README files

Why

These commands only need:

  • Auth configuration from atmos.yaml
  • Component base paths (terraform, helmfile, etc.)
  • Workflow or vendor configurations

They do NOT need:

  • Stack manifests to exist
  • stacks.base_path to be configured
  • stacks.included_paths to be configured

This makes Atmos more flexible for use cases like:

  • CI/CD pipelines that only need auth or vendor management
  • Development environments without full stack setup
  • Documentation browsing without infrastructure configs
  • Workflow management separate from stack operations

Technical Details

Changes Made

  1. InitCliConfig parameter: Changed processStacks from true to false

    • Prevents validation requiring stacks.base_path and stacks.included_paths
    • Skips processing of stack manifest files
  2. checkAtmosConfig option (for list vendor only): Added WithStackValidation(false)

    • Prevents checking if stacks directory exists
    • Required because list vendor calls checkAtmosConfig() with additional validation

Files Changed

  • cmd/auth_env.go
  • cmd/auth_exec.go
  • cmd/auth_shell.go
  • cmd/list_workflows.go
  • cmd/list_vendor.go
  • cmd/docs.go

Commands That Still Require Stacks (Unchanged)

These were NOT modified because they genuinely need stack manifests:

  • atmos list stacks
  • atmos list components
  • atmos list settings
  • atmos list values
  • atmos list metadata

Testing

✅ All existing tests pass
✅ Linter passes with 0 issues
✅ Pre-commit hooks pass
✅ Manual testing confirms commands work without stack directories
✅ No regressions in existing functionality

References

Addresses user issue where atmos auth exec -- aws sts get-caller-identity failed with stack configuration error.


Co-Authored-By: Claude noreply@anthropic.com

Summary by CodeRabbit

  • New Features

    • Auth and utility commands (auth env, auth exec, auth shell, list workflows, list vendor, docs) now run without requiring stack configuration, enabling use in CI/CD, vendor management, and documentation workflows.
  • Documentation

    • Added a blog post describing the change, usage examples, migration tips, and CI/CD benefits.
Change runner type in nightly builds workflow @goruha (#1713)

what

  • Use `large` RunsOn runners for GoReleaser

why

  • GoReleaser needs more disk space

Summary by CodeRabbit

  • Chores
    • Updated GitHub Actions runner specifications across feature release, nightly build, and test workflows to standardize build infrastructure configuration.
Update nightlybuilds.yml @goruha (#1711)

what

  • Run GoReleaser on a RunsOn runner

why

  • Default runners run out of disk space

Summary by CodeRabbit

  • Chores
    • Updated nightly release workflow to change how runner selection is provided: the workflow now accepts a JSON-like array of runner specifications, improving and broadening which runner(s) can be targeted for nightly builds.
Fix Terraform state authentication by passing auth context @osterman (#1695)

what

  • Add authentication context parameter to Terraform backend operations
  • Refactor PostAuthenticate interface to use parameter struct
  • Extract nested logic to reduce complexity
  • Fix test coverage for backend functions

why

  • Terraform state operations need proper AWS credentials when accessing S3 backends
  • Multi-identity scenarios require passing auth context through the call chain
  • Reduces function parameter count from 6 to 2 (using PostAuthenticateParams struct)
  • Simplifies nested conditional logic for better maintainability

references

  • Part of multi-identity authentication context work
  • Follows established authentication context patterns
  • Related to docs/prd/auth-context-multi-identity.md

Summary by CodeRabbit

  • New Features

    • Centralized per-command AuthContext enabling multiple concurrent identities (AWS, GitHub, Azure, etc.) and making in-process SDK and Terraform calls use Atmos-managed credentials.
    • Console session duration configurable via provider console.session_duration with CLI flag override.
  • Bug Fixes

    • More reliable in-process authentication for SDK and Terraform state reads.
  • Documentation

    • Added design doc, blog post, and CLI docs describing AuthContext and session-duration behavior.
  • Tests

    • Expanded tests for auth flows, AWS config loading, and YAML/Terraform tag auth propagation.
Add circular dependency detection for YAML functions @osterman (#1708)

what

  • Implement universal circular dependency detection for all Atmos YAML functions (!terraform.state, !terraform.output, atmos.Component)
  • Add goroutine-local resolution context for cycle tracking
  • Create comprehensive error messages showing dependency chains
  • Fix missing perf.Track() calls in Azure backend wrapper methods
  • Refactor code to meet golangci-lint complexity limits

why

  • Users experiencing stack overflow panics from circular dependencies in component configurations
  • Need to detect cycles before they cause panics and provide actionable error messages
  • Performance tracking required for all public functions per Atmos conventions
  • Reduce cyclomatic complexity and function length for maintainability

Implementation Details

Architecture

  • Goroutine-local storage using sync.Map with goroutine IDs to maintain isolated resolution contexts
  • O(1) cycle detection using visited-set pattern with Push/Pop operations
  • Call stack tracking for building detailed error messages showing dependency chains
  • Negligible performance impact (<10 microseconds overhead, <0.001% of total execution time)
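The visited-set pattern with Push/Pop can be sketched as below. This is a simplified, single-goroutine illustration: the goroutine-local storage keyed by goroutine ID is omitted, and the `resolutionContext` type and its error format are assumptions rather than the code in internal/exec/yaml_func_resolution_context.go.

```go
package main

import (
	"fmt"
	"strings"
)

// resolutionContext tracks the nodes currently being resolved.
// visited gives O(1) membership checks; stack preserves order for error messages.
type resolutionContext struct {
	visited map[string]bool
	stack   []string
}

func newResolutionContext() *resolutionContext {
	return &resolutionContext{visited: make(map[string]bool)}
}

// Push records that node is being resolved. If node is already in progress,
// a cycle exists and an error describing the dependency chain is returned.
func (c *resolutionContext) Push(node string) error {
	if c.visited[node] {
		chain := append(append([]string{}, c.stack...), node)
		return fmt.Errorf("circular dependency detected: %s", strings.Join(chain, " -> "))
	}
	c.visited[node] = true
	c.stack = append(c.stack, node)
	return nil
}

// Pop marks the most recently pushed node as fully resolved.
func (c *resolutionContext) Pop() {
	if len(c.stack) == 0 {
		return
	}
	last := c.stack[len(c.stack)-1]
	c.stack = c.stack[:len(c.stack)-1]
	delete(c.visited, last)
}

func main() {
	ctx := newResolutionContext()
	_ = ctx.Push("core/vpc")
	_ = ctx.Push("core/transit-gateway")
	// Resolving transit-gateway leads back to vpc, which is still in progress.
	fmt.Println(ctx.Push("core/vpc"))
}
```

Each YAML function call would wrap its resolution in a Push before and a Pop after (typically via `defer`), so a node only counts as "in progress" while its own dependencies are being resolved.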

Test Coverage

  • 27 comprehensive tests across 4 test files
  • 100% coverage on core resolution context logic
  • ~75-80% overall coverage (excluding benchmark and integration tests)
  • Benchmark tests proving negligible performance impact
  • Integration tests for real-world scenarios (currently skipped - require state backends)

Performance

  • Push operation: ~266 nanoseconds
  • Pop operation: ~70 nanoseconds
  • GetGoroutineID: ~2,434 nanoseconds
  • Total overhead: <10 microseconds (<0.001% of execution time)

Error Messages

Before (stack overflow panic):

runtime: goroutine stack exceeds 1000000000-byte limit
fatal error: stack overflow

After (actionable error with dependency chain):

circular dependency detected

Dependency chain:
  1. Component 'vpc' in stack 'core'
     → !terraform.state transit-gateway core transit_gateway_id
  2. Component 'transit-gateway' in stack 'core'
     → !terraform.state vpc core vpc_id
  3. Component 'vpc' in stack 'core' (cycle detected)
     → !terraform.state transit-gateway core transit_gateway_id

To fix this issue:
  - Review your component dependencies and break the circular reference
  - Consider using Terraform data sources or direct remote state instead
  - Ensure dependencies flow in one direction only

references

  • Fixes community-reported stack overflow issue in YAML function processing
  • See docs/prd/circular-dependency-detection.md for complete architecture and design decisions
  • See docs/circular-dependency-detection.md for user documentation and troubleshooting
  • See CIRCULAR_DEPENDENCY_DETECTION_SUMMARY.md for implementation summary

Files Changed

  • Core implementation: internal/exec/yaml_func_resolution_context.go (161 lines)
  • Tests: 4 test files (1,093 lines total)
  • Modified: yaml_func_utils.go, yaml_func_terraform_state.go, yaml_func_terraform_output.go
  • Documentation: PRD, user docs, summary
  • Test fixtures: 7 YAML files + 2 Terraform components
  • Additional: Fixed Azure backend perf.Track() issues


Summary by CodeRabbit

  • New Features
    • Automatic circular dependency detection for YAML functions including terraform.state, terraform.output, and custom component functions. The system detects cycles early before runtime failures occur, providing comprehensive error messages that display the full dependency chain and component relationships. Users receive actionable remediation guidance and suggested fixes to resolve circular dependencies in their infrastructure configurations.
fix: Remove exclude directive to enable go install @osterman (#1709)

what

  • Removed `exclude` directive from go.mod that was blocking `go install github.com/cloudposse/atmos@main`
  • Updated go install compatibility test to check for both `replace` and `exclude` directives

why

  • The exclude directive in go.mod prevents users from installing Atmos via go install
  • `go install pkg@version` refuses modules whose go.mod contains `replace` or `exclude` directives (by design)
  • This breaks a documented installation method and creates user friction
  • The excluded version (godbus/dbus v0.0.0-20190726142602-4481cbc300e2) is already superseded by explicitly required versions (v4.1.0 and v5.1.0)


fix: Upgrade to Go 1.25 and make test logging respect -v flag @osterman (#1706)

what

  • Upgraded Go version from 1.24.8 to 1.25.0
  • Configured Atmos logger in tests to respect `testing.Verbose()` flag
  • Tests are now quiet by default, verbose with `-v` flag
  • Added missing `perf.Track()` calls to Azure backend wrapper methods

why

  • Go 1.24.8 had a runtime panic bug in unique_runtime_registerUniqueMapCleanup on macOS ARM64 (golang/go#69729)
  • This caused TestGetAffectedComponents to panic during cleanup on macOS CI
  • Test output was always verbose because logger was set to InfoLevel unconditionally
  • Go 1.25.0 fixes the runtime panic bug
  • Linter enforcement requires perf.Track() on all public functions

changes

  • go.mod: Upgraded from go 1.24.8 to go 1.25.0
  • tests/cli_test.go:
    • Moved logger level configuration from init() to TestMain()
    • Logger now respects -v flag using switch statement:
      • ATMOS_TEST_DEBUG=1: DebugLevel (everything)
      • -v flag: InfoLevel (info, warnings, errors)
      • Default: WarnLevel (only warnings and errors)
    • Removed debug pattern logging loop (was spam)
    • All helpful t.Logf() messages preserved (work correctly with -v)
  • internal/terraform_backend/terraform_backend_azurerm.go:
    • Added perf.Track() to GetBody() wrapper method
    • Added perf.Track() to DownloadStream() wrapper method
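The three-way level selection described for tests/cli_test.go can be sketched as a small pure function. The `logLevel` type and `selectLevel` name are hypothetical. In a real `TestMain`, the `verbose` argument would come from `testing.Verbose()` after flags are parsed.

```go
package main

import (
	"fmt"
	"os"
)

// logLevel is a stand-in for the logger's level type (hypothetical).
type logLevel string

const (
	debugLevel logLevel = "debug"
	infoLevel  logLevel = "info"
	warnLevel  logLevel = "warn"
)

// selectLevel applies the precedence described above:
// ATMOS_TEST_DEBUG=1 wins, then the -v flag, then the quiet default.
func selectLevel(debugEnv string, verbose bool) logLevel {
	switch {
	case debugEnv == "1":
		return debugLevel
	case verbose:
		return infoLevel
	default:
		return warnLevel
	}
}

func main() {
	// verbose is hard-coded here; a test binary would pass testing.Verbose().
	fmt.Println(selectLevel(os.Getenv("ATMOS_TEST_DEBUG"), false))
}
```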

testing

  • go test ./tests → Quiet (no logger output)
  • go test ./tests -v → Verbose (shows INFO logs)
  • go test ./internal/exec -run TestGetAffectedComponents → Passes without panic


Add Azure Blob Storage (azurerm) backend support for !terraform.state function @jamengual (#1610)

what

  • Implemented Azure Blob Storage backend support for the `!terraform.state` YAML function
  • Added comprehensive unit tests with 100% coverage for the new backend
  • Updated error definitions, registry, and documentation

why

  • The !terraform.state function previously only supported local and s3 backends
  • Azure users needed native azurerm backend support to read Terraform state directly from Azure Blob Storage
  • This provides the fastest way to retrieve Terraform outputs without Terraform initialization overhead

changes

  • New Implementation: internal/terraform_backend/terraform_backend_azurerm.go

    • Implements azurerm backend reader following S3 backend patterns
    • Uses Azure SDK with DefaultAzureCredential for authentication (Managed Identity, Service Principal, Azure CLI, etc.)
    • Supports workspace-based blob paths (env:/{workspace}/{key} for non-default workspaces)
    • Includes client caching, retry logic (2 retries with exponential backoff), and proper error handling
    • Handles 404 (blob not found) gracefully by returning nil (component not provisioned yet)
    • Handles 403 (permission denied) with descriptive error messages
  • Comprehensive Tests: internal/terraform_backend/terraform_backend_azurerm_test.go

    • 8 test functions covering all scenarios with mocked Azure SDK client
    • Tests workspace handling (default vs non-default), blob not found, permission denied, network errors, retry logic, and error cases
    • All tests pass with no external dependencies required
  • Error Definitions: errors/errors.go

    • Added 7 new Azure-specific static errors following project patterns
    • ErrGetBlobFromAzure, ErrReadAzureBlobBody, ErrCreateAzureCredential, ErrCreateAzureClient, ErrAzureContainerRequired, ErrStorageAccountRequired, ErrAzurePermissionDenied
  • Registry Update: internal/terraform_backend/terraform_backend_registry.go

    • Registered ReadTerraformBackendAzurerm in the backend registry
  • Error Message Update: internal/terraform_backend/terraform_backend_utils.go

    • Updated supported backends list to include azurerm
  • Documentation Update: website/docs/functions/yaml/terraform.state.mdx

    • Added azurerm to the list of supported backend types
    • Updated warning message to reflect azurerm support
  • Dependencies: go.mod

    • Moved github.com/Azure/azure-sdk-for-go/sdk/storage/azblob from indirect to direct dependency (already present in project)

implementation notes

  • Follows established patterns from S3 backend implementation
  • Uses wrapper pattern (AzureBlobAPI interface) to enable testing without actual Azure connectivity
  • Implements proper workspace path handling matching Azure backend behavior (env:/{workspace}/{key})
  • All comments end with periods (enforced by golangci-lint)
  • Imports organized in 3 groups (stdlib, 3rd-party, atmos) as per CLAUDE.md
  • Performance tracking added with defer perf.Track() on all functions
  • Cross-platform compatible using Azure SDK (not CLI commands)
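The workspace path handling described in these notes (`env:/{workspace}/{key}` for non-default workspaces) can be sketched as a single function. `blobPath` is a hypothetical name, and treating an empty workspace the same as `default` is an assumption suggested by the test names in the PR, not something the notes state.

```go
package main

import "fmt"

// blobPath returns the blob name for a workspace, following the layout
// described in the notes above. The default workspace uses the key as-is.
// Treating "" like "default" is an assumption of this sketch.
func blobPath(workspace, key string) string {
	if workspace == "" || workspace == "default" {
		return key
	}
	return fmt.Sprintf("env:/%s/%s", workspace, key)
}

func main() {
	fmt.Println(blobPath("default", "terraform.tfstate"))
	fmt.Println(blobPath("dev", "terraform.tfstate"))
}
```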

test results

=== RUN   TestReadTerraformBackendAzurermInternal_Success
=== RUN   TestReadTerraformBackendAzurermInternal_Success/successful_read_default_workspace
=== RUN   TestReadTerraformBackendAzurermInternal_Success/successful_read_dev_workspace
=== RUN   TestReadTerraformBackendAzurermInternal_Success/successful_read_prod_workspace
=== RUN   TestReadTerraformBackendAzurermInternal_Success/successful_read_empty_workspace
=== RUN   TestReadTerraformBackendAzurermInternal_Success/successful_read_default_key
--- PASS: TestReadTerraformBackendAzurermInternal_Success (0.00s)
=== RUN   TestReadTerraformBackendAzurermInternal_BlobNotFound
--- PASS: TestReadTerraformBackendAzurermInternal_BlobNotFound (0.00s)
=== RUN   TestReadTerraformBackendAzurermInternal_PermissionDenied
--- PASS: TestReadTerraformBackendAzurermInternal_PermissionDenied (0.00s)
=== RUN   TestReadTerraformBackendAzurermInternal_NetworkError
--- PASS: TestReadTerraformBackendAzurermInternal_NetworkError (4.00s)
=== RUN   TestReadTerraformBackendAzurermInternal_RetrySuccess
--- PASS: TestReadTerraformBackendAzurermInternal_RetrySuccess (2.00s)
=== RUN   TestReadTerraformBackendAzurermInternal_MissingContainerName
--- PASS: TestReadTerraformBackendAzurermInternal_MissingContainerName (0.00s)
=== RUN   TestReadTerraformBackendAzurermInternal_ReadBodyError
--- PASS: TestReadTerraformBackendAzurermInternal_ReadBodyError (0.00s)
PASS
ok      github.com/cloudposse/atmos/internal/terraform_backend 7.011s

Summary by CodeRabbit

  • New Features

    • Azure Blob Storage (azurerm) support for reading Terraform state with workspace-aware paths, authentication, retries, and client caching.
  • Documentation

    • Added detailed docs and a blog post covering Azure backend usage, examples, migration guidance, and “Try It Now” steps.
  • Improvements

    • Clearer permission/not-found reporting and added Azure-specific error signals for more precise error handling.
  • Tests

    • Extensive unit and integration tests plus Azure credential precondition checks.
  • Chores

    • Updated .gitignore with developer tool patterns.
test(auth): Increase auth test coverage from 6% to 80% with mock provider @osterman (#1702)

what

  • Add comprehensive unit and integration tests for Atmos auth system using the existing mock provider
  • Increase test coverage from **6% to ~80%** (target: 80-90% ✅)
  • Add regression tests to prevent recurrence of user-reported browser authentication issue
  • Achieve **100% coverage** for mock provider implementation

why

  • Current auth test coverage was critically low (6%), making it difficult to catch bugs
  • User complaint (Bogdan) about browser authentication triggering on every command needed verification and regression protection
  • Mock provider was implemented but had zero test coverage
  • Need confidence that auth system works correctly without requiring real cloud credentials

Coverage Improvements

| Package | Before | After | Improvement |
|---|---|---|---|
| pkg/auth | 6.2% | 84.6% | +78.4pp |
| pkg/auth/providers/mock | 0% | 100.0% | +100pp |
| pkg/auth/utils | 0% | 100.0% | +100pp |
| pkg/auth/validation | 0% | 90.0% | +90pp |
| pkg/auth/list | 0% | 89.5% | +89.5pp |
| pkg/auth/cloud/aws | 0% | 79.2% | +79.2pp |
| pkg/auth/providers/github | 0% | 78.3% | +78.3pp |
| pkg/auth/factory | 0% | 77.8% | +77.8pp |
| pkg/auth/credentials | 0% | 75.8% | +75.8pp |
| pkg/auth/providers/aws | 0% | 67.8% | +67.8pp |
| pkg/auth/identities/aws | 2.3% | 62.5% | +60.2pp |

Overall: ~6% → ~80%

Key Additions

1. Mock Provider Unit Tests (100% coverage)

  • pkg/auth/providers/mock/provider_test.go - 15 comprehensive tests
  • pkg/auth/providers/mock/identity_test.go - 13 comprehensive tests
  • Tests cover: authentication, expiration, concurrency, interface compliance

2. Credential Caching Regression Tests

  • cmd/auth_caching_test.go - 4 test functions with multiple subtests
  • Verifies credentials are cached after login and reused
  • Ensures fast execution (< 2s) vs browser auth (5-30s)
  • Tests multi-identity scenarios

3. Integration Test Scenarios

  • tests/test-cases/auth-mock.yaml - 20+ test scenarios
  • Auth login, whoami, env, exec, list, logout commands
  • Multiple output formats (json, bash, dotenv)
  • Error handling and edge cases

User Issue: Browser Auth on Every Command

Status: LIKELY FIXED

The issue where browser authentication was triggered on every command appears to have been resolved by recent PRs (#1655, #1653, #1640). This PR adds comprehensive regression tests to:

  1. Verify credentials are cached after authentication
  2. Ensure subsequent commands use cached credentials
  3. Confirm fast execution without browser prompts
  4. Prevent regression of this issue

Testing

# Run mock provider tests
$ go test ./pkg/auth/providers/mock/... -v
=== RUN   TestNewProvider
=== RUN   TestProvider_Authenticate
=== RUN   TestProvider_Concurrency
... 28 tests PASS
coverage: 100.0% of statements

# Run auth package tests
$ go test -cover ./pkg/auth/...
pkg/auth: 84.6% coverage ✅
pkg/auth/providers/mock: 100% coverage ✅
pkg/auth/utils: 100% coverage ✅
... all passing

Benefits

  • No cloud credentials needed for auth testing
  • Fast test execution (milliseconds vs seconds)
  • Deterministic results (fixed expiration dates)
  • CI/CD ready (no secrets required)
  • Regression protection for caching issue
  • 80% coverage meets industry standards

references

  • User complaint: Bogdan reported browser auth on every command
  • Related PRs: #1655, #1653, #1640 (auth improvements)
  • Mock provider enables testing without cloud credentials
Add auth console command for web console access @osterman (#1684)

what

  • Add `atmos auth console` command to open cloud provider web consoles using authenticated credentials
  • Implement AWS console access via federation endpoint (similar to aws-vault login)
  • Add 100+ AWS service destination aliases for convenient access
  • Create dedicated `pkg/http` package for HTTP client utilities
  • Add pretty formatted output using lipgloss with Atmos theme colors
  • Consolidate browser opening functionality to existing `OpenUrl` helper

why

  • Provides convenient browser access to cloud consoles without manually copying credentials
  • Eliminates context switching between terminal and browser for console access
  • Uses provider-native federation endpoints for secure temporary access
  • Extensible interface pattern supports future Azure/GCP implementations

features

  • Service Aliases: Use shorthand like s3, ec2, lambda instead of full console URLs
  • Autocomplete: Shell completion for destination and identity flags
  • Session Control: Configurable duration (up to 12 hours for AWS) with expiration display
  • Clean Output: URL only shown on error or with --no-open flag
  • Scriptable: --print-only flag for piping URLs to other tools
  • Provider-Agnostic: Interface design ready for multi-cloud support

implementation

  • Created ConsoleAccessProvider interface in pkg/auth/types/interfaces.go
  • Implemented ConsoleURLGenerator for AWS using federation endpoint
  • Added ResolveDestination() with case-insensitive alias lookup
  • Moved HTTP utilities from pkg/utils to dedicated pkg/http package
  • Used existing OpenUrl() function for cross-platform browser opening
  • Added comprehensive tests achieving 85.9% coverage
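
The case-insensitive alias lookup mentioned above could look roughly like the following. This is a minimal sketch under assumptions: the alias table is a tiny hypothetical subset, the target URLs are placeholders, and `resolveDestination` is an illustrative stand-in rather than the actual `ResolveDestination()` implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// aliases is a small, hypothetical subset of destination aliases.
var aliases = map[string]string{
	"s3":     "https://console.aws.amazon.com/s3",
	"ec2":    "https://console.aws.amazon.com/ec2",
	"lambda": "https://console.aws.amazon.com/lambda",
}

// resolveDestination returns full URLs unchanged, maps known aliases
// case-insensitively, and reports an error for anything else.
func resolveDestination(dest string) (string, error) {
	trimmed := strings.TrimSpace(dest)
	if strings.HasPrefix(trimmed, "https://") || strings.HasPrefix(trimmed, "http://") {
		return trimmed, nil
	}
	if target, ok := aliases[strings.ToLower(trimmed)]; ok {
		return target, nil
	}
	return "", fmt.Errorf("unknown console destination %q", dest)
}

func main() {
	target, err := resolveDestination("S3")
	fmt.Println(target, err)
}
```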

testing

  • Unit tests for console URL generation (15 test cases)
  • Unit tests for destination alias resolution (100+ aliases tested)
  • Mock HTTP client for testing without network calls
  • Table-driven tests with edge case coverage

documentation

  • CLI reference: website/docs/cli/commands/auth/console.mdx
  • Blog post: website/blog/2025-10-20-auth-console-web-access.md
  • Proposal document: docs/proposals/auth-web-console.md
  • Embedded markdown usage examples

Summary by CodeRabbit

  • New Features

    • Added atmos auth console: opens cloud provider web consoles via temporary sign-in URLs (AWS supported now; Azure/GCP planned).
    • Supports service aliases (s3, ec2, etc.), full destination URLs, session duration (AWS up to 12h), issuer, --print-only, --no-open and identity selection/completion.
  • Documentation

    • New CLI docs, usage guide, PRD and blog post with examples and troubleshooting.
  • Tests

    • Expanded tests and CI snapshots for the new command and destination resolution.
fix: Only log verbose test output on failure @osterman (#1704)

what

  • Replace unconditional `t.Log()` calls with `t.Cleanup()` handlers that only output verbose YAML/data when tests fail
  • Eliminate noisy stderr output during successful test runs while preserving debug information when tests fail
  • Add fallback to raw data output (`%+v`) when YAML conversion produces empty strings

why

  • CI test runs were showing verbose YAML dumps to stderr even when tests passed
  • This cluttered test output and made it difficult to identify actual issues
  • Debug information is still valuable when tests fail, but shouldn't appear during successful runs
  • In verbose mode (go test -v), Go prints t.Log() output regardless of test success/failure
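
The cleanup-handler pattern can be sketched as follows. The helper is written against a small interface that `*testing.T` satisfies, so the example runs outside `go test`. The names `logOnFailure`, `failureAware`, and `fakeT` are hypothetical and exist only for this illustration; the real test helpers in the PR may be shaped differently.

```go
package main

import "fmt"

// failureAware is the subset of *testing.T used here; *testing.T satisfies it.
type failureAware interface {
	Cleanup(func())
	Failed() bool
	Logf(format string, args ...any)
}

// logOnFailure registers a cleanup handler that emits verbose output only if
// the test failed. It falls back to %+v when the rendered text is empty.
func logOnFailure(t failureAware, label, rendered string, raw any) {
	t.Cleanup(func() {
		if !t.Failed() {
			return // passing tests stay quiet
		}
		if rendered == "" {
			t.Logf("%s (raw): %+v", label, raw)
			return
		}
		t.Logf("%s:\n%s", label, rendered)
	})
}

// fakeT is a hypothetical stand-in for *testing.T used to run the sketch.
type fakeT struct {
	failed   bool
	cleanups []func()
	logs     []string
}

func (f *fakeT) Cleanup(fn func()) { f.cleanups = append(f.cleanups, fn) }
func (f *fakeT) Failed() bool      { return f.failed }
func (f *fakeT) Logf(format string, args ...any) {
	f.logs = append(f.logs, fmt.Sprintf(format, args...))
}

// finish runs cleanups in reverse registration order, as the testing package does.
func (f *fakeT) finish() {
	for i := len(f.cleanups) - 1; i >= 0; i-- {
		f.cleanups[i]()
	}
}

func main() {
	passing := &fakeT{}
	logOnFailure(passing, "stacks", "a: 1", nil)
	passing.finish()

	failing := &fakeT{failed: true}
	logOnFailure(failing, "stacks", "a: 1", nil)
	failing.finish()

	fmt.Println(len(passing.logs), len(failing.logs))
}
```

In a real test the call would simply be `logOnFailure(t, "stacks", yamlText, data)` with `t` being the `*testing.T` passed to the test function.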

demo

Finally clean output!

go mod download
Running tests with subprocess coverage collection
ok  	github.com/cloudposse/atmos	7.020s	coverage: 14.8% of statements in ./...
ok  	github.com/cloudposse/atmos/cmd	7.581s	coverage: 20.7% of statements in ./...
ok  	github.com/cloudposse/atmos/cmd/about	0.134s	coverage: 0.1% of statements in ./...
ok  	github.com/cloudposse/atmos/cmd/internal	0.099s	coverage: 0.1% of statements in ./...
?   	github.com/cloudposse/atmos/cmd/markdown	[no test files]
ok  	github.com/cloudposse/atmos/cmd/version	1.802s	coverage: 1.4% of statements in ./...
ok  	github.com/cloudposse/atmos/errors	0.213s	coverage: 0.4% of statements in ./...
ok  	github.com/cloudposse/atmos/internal/aws_utils	0.120s	coverage: 0.1% of statements in ./...
ok  	github.com/cloudposse/atmos/internal/exec	84.175s	coverage: 32.9% of statements in ./...
ok  	github.com/cloudposse/atmos/internal/terraform_backend	32.223s	coverage: 0.9% of statements in ./...
	github.com/cloudposse/atmos/internal/tui/atmos		coverage: 0.0% of statements
	github.com/cloudposse/atmos/internal/tui/components/code_view		coverage: 0.0% of statements
ok  	github.com/cloudposse/atmos/internal/tui/templates	0.125s	coverage: 0.5% of statements in ./...
	github.com/cloudposse/atmos/internal/tui/templates/term		coverage: 0.0% of statements
ok  	github.com/cloudposse/atmos/internal/tui/utils	0.218s	coverage: 0.2% of statements in ./...
	github.com/cloudposse/atmos/internal/tui/workflow		coverage: 0.0% of statements
ok  	github.com/cloudposse/atmos/pkg/atlantis	1.434s	coverage: 10.3% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/auth	0.141s	coverage: 2.1% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/auth/cloud/aws	0.113s	coverage: 0.8% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/auth/credentials	0.316s	coverage: 0.9% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/auth/factory	0.141s	coverage: 0.2% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/auth/identities/aws	0.139s	coverage: 1.7% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/auth/list	0.138s	coverage: 1.5% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/auth/providers/aws	0.098s	coverage: 1.6% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/auth/providers/github	0.072s	coverage: 0.3% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/auth/providers/mock	0.133s	coverage: 0.1% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/auth/types	0.075s	coverage: 0.2% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/auth/utils	0.099s	coverage: 0.0% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/auth/validation	0.150s	coverage: 0.7% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/aws	0.199s	coverage: 2.4% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/component	0.898s	coverage: 10.1% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/component/mock	0.178s	coverage: 0.4% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/config	3.247s	coverage: 5.7% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/config/homedir	0.073s	coverage: 0.2% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/convert	0.048s	coverage: 0.0% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/datafetcher	0.228s	coverage: 0.2% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/describe	29.214s	coverage: 13.2% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/downloader	1.115s	coverage: 1.6% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/filematch	0.135s	coverage: 0.3% of statements in ./...
	github.com/cloudposse/atmos/pkg/filesystem		coverage: 0.0% of statements
ok  	github.com/cloudposse/atmos/pkg/filetype	0.078s	coverage: 0.4% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/generate	0.685s	coverage: 7.7% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/git	0.164s	coverage: 0.3% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/github	2.462s	coverage: 0.4% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/hooks	0.264s	coverage: 7.5% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/list	2.193s	coverage: 12.0% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/list/errors	0.073s	coverage: 0.1% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/list/flags	0.072s	coverage: 0.1% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/list/format	0.119s	coverage: 0.6% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/list/utils	0.187s	coverage: 0.2% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/logger	0.161s	coverage: 0.3% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/merge	0.227s	coverage: 1.7% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/pager	0.076s	coverage: 0.9% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/perf	1.238s	coverage: 0.5% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/pro	0.177s	coverage: 0.8% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/pro/dtos	0.051s	coverage: 0.0% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/profiler	1.861s	coverage: 0.4% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/provenance	0.130s	coverage: 1.8% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/retry	0.176s	coverage: 0.2% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/schema	0.070s	coverage: 0.3% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/spacelift	0.787s	coverage: 8.4% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/stack	0.346s	coverage: 4.3% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/store	0.139s	coverage: 1.7% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/telemetry	0.518s	coverage: 2.7% of statements in ./...
	github.com/cloudposse/atmos/pkg/telemetry/mock		coverage: 0.0% of statements
ok  	github.com/cloudposse/atmos/pkg/ui/heatmap	0.129s	coverage: 0.9% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/ui/markdown	0.138s	coverage: 0.4% of statements in ./...
?   	github.com/cloudposse/atmos/pkg/ui/theme	[no test files]
ok  	github.com/cloudposse/atmos/pkg/utils	0.743s	coverage: 4.8% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/validate	1.354s	coverage: 14.5% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/validator	0.116s	coverage: 0.2% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/vender	3.308s	coverage: 3.9% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/version	0.069s	coverage: 0.0% of statements in ./...
ok  	github.com/cloudposse/atmos/pkg/xdg	0.046s	coverage: 0.1% of statements in ./...
ok  	github.com/cloudposse/atmos/tests	174.022s	coverage: 14.3% of statements in ./...
ok  	github.com/cloudposse/atmos/tests/testhelpers	90.419s	coverage: 1.1% of statements in ./...
Coverage report generated: coverage.out

references

  • Affects 9 test files with 29 cleanup handlers added
  • Modified files:
    • pkg/component/component_processor_test.go
    • pkg/describe/describe_affected_test.go
    • pkg/describe/describe_component_test.go
    • pkg/describe/describe_dependents_test.go
    • pkg/describe/describe_stacks_test.go
    • pkg/list/list_components_test.go
    • pkg/merge/merge_test.go
    • pkg/spacelift/spacelift_stack_processor_test.go
    • pkg/stack/stack_processor_test.go

🤖 Generated with Claude Code

Add linter rule for missing defer perf.Track() calls @osterman (#1698)

what

  • Added new `perf-track` linter rule to catch missing `defer perf.Track()` calls
  • Enabled by default with explicit package and type exclusions
  • Integrated with existing lintroller custom linter framework

why

  • Enforces coding guidelines requiring performance tracking on all public functions
  • Catches violations early in development before code review
  • Prevents missing perf tracking that would be tedious to find manually
  • Uses explicit exclusions for infrastructure code (logger, profiler, perf, store, ui, tui)

references

  • Follows coding guidelines in CLAUDE.md for mandatory defer perf.Track() usage
  • Addresses hundreds of potential violations by catching them at lint time
  • Exclusions prevent infinite recursion and avoid tracking overhead in low-level code
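
To make the idea of the rule concrete, here is a minimal sketch of such a check using the standard `go/ast` package. It assumes the tracked form is `defer perf.Track(...)()` as the first statement of an exported function, and it ignores the package and type exclusions described above. The function names are hypothetical and this is not the lintroller implementation.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// missingPerfTrack returns the names of exported functions and methods in src
// whose first statement is not a deferred perf.Track(...)() call.
func missingPerfTrack(src string) ([]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "example.go", src, 0)
	if err != nil {
		return nil, err
	}
	var missing []string
	for _, decl := range file.Decls {
		fn, ok := decl.(*ast.FuncDecl)
		if !ok || fn.Body == nil || !fn.Name.IsExported() {
			continue
		}
		if len(fn.Body.List) == 0 || !isPerfTrackDefer(fn.Body.List[0]) {
			missing = append(missing, fn.Name.Name)
		}
	}
	return missing, nil
}

// isPerfTrackDefer reports whether stmt has the shape `defer perf.Track(...)()`.
func isPerfTrackDefer(stmt ast.Stmt) bool {
	d, ok := stmt.(*ast.DeferStmt)
	if !ok {
		return false
	}
	// The deferred call invokes the function value returned by perf.Track(...).
	inner, ok := d.Call.Fun.(*ast.CallExpr)
	if !ok {
		return false
	}
	sel, ok := inner.Fun.(*ast.SelectorExpr)
	if !ok {
		return false
	}
	pkg, ok := sel.X.(*ast.Ident)
	return ok && pkg.Name == "perf" && sel.Sel.Name == "Track"
}

const sample = `package demo

func Tracked() {
	defer perf.Track(nil, "demo.Tracked")()
}

func Untracked() {}

func private() {}
`

func main() {
	missing, err := missingPerfTrack(sample)
	fmt.Println(missing, err)
}
```

The check is purely syntactic, so the sample does not need to import or type-check the `perf` package.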

Summary by CodeRabbit

  • New Features

    • Added a lint rule that enforces a defer-based performance-tracking call at the start of exported functions/methods; enabled by default with a config toggle to disable.
  • Tests

    • Added unit tests and example cases demonstrating compliant and non-compliant exported functions/methods for the new rule.
  • Documentation

    • Updated lint configuration docs to mention the new performance-tracking check and its settings.
Add condition to skip Docker build for prerelease @goruha (#1700)

what

  • Add condition to skip Docker build for prerelease

why

  • Exclude prerelease versions from Homebrew workflows

Summary by CodeRabbit

  • Chores
    • Build workflow updated so Docker image build/push steps are skipped for prerelease releases.
    • Dependency review job runner specification changed to a composite runner configuration with additional runner attributes.
feat: Add `atmos auth shell` command @osterman (#1640)

what

  • Add `atmos auth shell` command to launch an interactive shell with authentication environment variables pre-configured
  • Implement shell detection that respects `$SHELL` environment variable with fallbacks to bash/sh
  • Add `--shell` flag with viper binding to `ATMOS_SHELL` and `SHELL` environment variables
  • Support `--` separator for passing custom shell arguments to the launched shell
  • Track shell nesting level with `ATMOS_SHLVL` environment variable
  • Propagate shell exit codes back to Atmos process
  • Set `ATMOS_IDENTITY` environment variable in the shell session

why

  • Users need an easy way to work interactively with cloud credentials without manually managing environment variables
  • Similar to atmos terraform shell, this provides a consistent experience for authenticated sessions
  • Allows running multiple commands in a single authenticated session without re-authenticating
  • Supports custom shell configurations and arguments for flexibility

references

  • Similar to existing atmos terraform shell command implementation
  • Follows authentication patterns from atmos auth exec and atmos auth env

testing

  • Comprehensive unit tests with 80-100% coverage on testable functions
  • 25 passing tests covering:
    • Shell detection and fallback logic (100% coverage)
    • Environment variable management (100% coverage)
    • Shell nesting level tracking (83-100% coverage)
    • Exit code propagation (tested with codes 0, 1, 42)
    • Flag parsing and viper integration
    • Cross-platform support (Unix and Windows)
  • All linting checks passing (0 issues)
  • Pre-commit hooks passing

documentation

  • Added website/docs/cli/commands/auth/auth-shell.mdx with full command documentation
  • Created cmd/markdown/atmos_auth_shell_usage.md with usage examples
  • Includes purpose note, usage patterns, examples, and environment variable reference

Summary by CodeRabbit

  • New Features

    • Interactive authenticated shell with shell selection, argument passthrough, nested-shell tracking, and identity selection.
    • Pluggable credential storage: system, file (path/password) and memory backends selectable via config/env.
    • Deterministic mock auth provider for testing.
  • Documentation

    • New auth-shell docs, usage examples, blog posts, keyring-backends guide, XDG docs, and PRD.
  • Tests

    • Expanded unit/integration coverage for shell flows, keyring backends, XDG, and credential stores.
  • Chores

    • Added keyring-related dependencies, CI/workflow and tooling adjustments.
Improve auth login with identity selection @osterman (#1655)

what

  • Modified the auth login command to automatically prompt for an identity when no --identity flag is provided.
  • This leverages the existing authManager.GetDefaultIdentity() which handles interactive selection and fallback logic.
  • Updated documentation to reflect this new behavior.

why

  • Users were prompted to manually select an identity in interactive sessions when no default was set.
  • This change simplifies the login process by automatically invoking the interactive selector or using the default identity when available, improving user experience and reducing manual input.

references

  • No specific issue linked - this is a user experience enhancement.
Replace deny-licenses with allow-licenses and remove redundant workflow @osterman (#1692)

what

  • Delete redundant `.github/workflows/dependabot.yml` workflow file
  • Update `dependency-review.yml` to use `allow-licenses` instead of deprecated `deny-licenses` parameter
  • Maintain PR commenting functionality with `comment-summary-in-pr: always`
  • Allow only permissive licenses: MIT, Apache-2.0, BSD-2-Clause, BSD-3-Clause, ISC, MPL-2.0, 0BSD, Unlicense, CC0-1.0

why

  • GitHub deprecated the deny-licenses parameter in favor of allow-licenses for better security posture
  • The dependabot.yml workflow was redundant - we already have dependency-review.yml that provides more comprehensive dependency review
  • Using an allow-list approach is more secure than a deny-list approach
  • Consolidating to a single dependency review workflow reduces maintenance overhead
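
The difference between the two approaches can be shown with a small sketch. The allow-list below mirrors the license identifiers named in this PR; the deny-list and the sample licenses are hypothetical and only serve the comparison. This illustrates the policy logic, not how the dependency-review action is implemented.

```go
package main

import "fmt"

// allowed mirrors the permissive SPDX identifiers listed in the PR description.
var allowed = map[string]bool{
	"MIT": true, "Apache-2.0": true, "BSD-2-Clause": true, "BSD-3-Clause": true,
	"ISC": true, "MPL-2.0": true, "0BSD": true, "Unlicense": true, "CC0-1.0": true,
}

// denied is a hypothetical deny-list used only for comparison.
var denied = map[string]bool{"GPL-3.0-only": true, "AGPL-3.0-only": true}

// passesAllowList accepts only licenses that were explicitly approved.
func passesAllowList(license string) bool { return allowed[license] }

// passesDenyList accepts everything that was not explicitly rejected.
func passesDenyList(license string) bool { return !denied[license] }

func main() {
	for _, license := range []string{"MIT", "GPL-3.0-only", "SSPL-1.0"} {
		fmt.Printf("%-14s allow-list=%v deny-list=%v\n",
			license, passesAllowList(license), passesDenyList(license))
	}
}
```

A license that nobody thought to deny (SSPL-1.0 in this example) slips through the deny-list but is rejected by the allow-list, which is the security argument made above.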

Summary by CodeRabbit

  • Chores
    • Implemented a 2-week minimum age requirement for automated dependency updates
    • Updated dependency review workflow to enforce permissive open-source licenses only
    • Consolidated dependency management configurations
Compress CLAUDE.md and add size limit enforcement @osterman (#1693)

what

  • Compressed CLAUDE.md from 40.3k chars to 6.3k chars (84% reduction)
  • Added GitHub action to enforce 40k character limit on CLAUDE.md
  • Refactored into reusable composite action pattern

why

  • Large CLAUDE.md files impact performance and token usage
  • Need automated enforcement to prevent file bloat
  • Reusable action pattern improves maintainability

Compression Details

Metrics:

  • Size: 40,300 chars → 6,301 chars (84.4% reduction)
  • Lines: 1,183 → 165 (86.0% reduction)
  • Current usage: 15% of 40k limit

Techniques Applied:

  • Removed verbose explanations, kept terse requirements
  • Consolidated redundant examples
  • Merged related sections
  • Preserved all MANDATORY rules

What's Preserved:
✅ All MANDATORY requirements
✅ Code patterns and conventions
✅ Error handling strategies
✅ Testing requirements
✅ CLI command structure
✅ Development workflows
✅ Cross-platform compatibility rules
✅ Git and PR guidelines

GitHub Action Structure

.github/
├── actions/
│   └── check-claude-md-size/
│       ├── action.yml          # Composite action with all logic
│       └── README.md            # Action documentation
└── workflows/
    └── claude.yml               # Simple 16-line workflow

Action Features:

  • Validates file size on PR changes
  • Posts/updates intelligent PR comments
  • Fails CI if limit exceeded
  • Configurable file path and size limit
  • Provides outputs: size, exceeds-limit, usage-percent

Triggers:

  • Pull requests modifying CLAUDE.md
  • Changes to workflow or action files

references

  • Follows composite action best practices
  • Pattern similar to existing actions in the ecosystem
  • Maintains consistency with project's CI/CD approach

Summary by CodeRabbit

  • New Features

    • Automated CLAUDE.md size validation with configurable limits; posts and updates PR comments when limits are exceeded or resolved.
  • Documentation

    • Reworked CLAUDE.md to emphasize architecture and mandatory design patterns instead of granular step-by-step procedures.
    • Added user-facing documentation for the CLAUDE.md size-check action and its usage.
Add auth logout command @osterman (#1656)

what

This pull request introduces the atmos auth logout command, enabling users to securely remove locally cached credentials. The command supports:

  • Identity-specific logout: Removes credentials for a given identity and its entire authentication chain.
  • Provider-specific logout: Removes all credentials associated with a particular provider.
  • Interactive mode: Prompts the user to select what to logout when no arguments are provided.
  • Dry-run mode: Previews what would be removed without making changes.
  • Comprehensive cleanup: Deletes credentials from the system keyring and provider-specific files (e.g., AWS credentials).
  • Best-effort error handling: Continues cleanup even if individual steps fail, reporting all encountered errors.

why

This feature addresses several key pain points:

  • Security: Allows users to securely remove stale credentials, reducing the risk of unauthorized access.
  • Developer Experience: Simplifies switching between different identities or environments by providing a clean way to remove existing credentials.
  • Compliance: Enables auditing of credential removal and ensures adherence to security policies.
  • Troubleshooting: Provides a straightforward method to clear authentication caches when debugging.

The implementation uses native Go operations for file system cleanup and integrates with go-keyring for cross-platform credential store access. It leverages Charmbracelet libraries for a polished interactive user experience and styled output.

references

closes #735

Summary by CodeRabbit

Release Notes

  • New Features

    • Added atmos auth logout CLI command to remove stored credentials
    • Supports logout by identity, by provider, or all identities at once
    • Interactive mode to select which credentials to remove
    • Dry-run mode to preview credential removals without executing
    • Browser session warning displayed after successful logout
  • Documentation

    • Added guides and reference documentation for logout workflows and usage
Replace custom license-check with GitHub dependency-review-action @osterman (#1690)

what

  • Replaced custom license-check action (308 lines) with GitHub's native dependency-review-action
  • Simplified workflow from 44 lines to 18 lines with better functionality
  • Added automated NOTICE file generation and validation to CI
  • Workflow now:
    • Validates licenses using GitHub's dependency graph
    • Blocks PRs with forbidden licenses (GPL, AGPL, etc.)
    • Generates NOTICE file using go-licenses
    • Fails CI if NOTICE file is out of date

why

  • Reduce maintenance burden: GitHub's native action requires zero maintenance vs custom bash fighting go-licenses bugs
  • Better reliability: Native GitHub solution works across all ecosystems, not just Go
  • Automated NOTICE updates: Ensures NOTICE file stays in sync with dependencies automatically
  • Clearer error messages: Developers get actionable feedback when NOTICE file needs updating
  • Industry standard: Uses same tooling as thousands of other repositories

Troubleshooting Notes

autofix.ci Artifact Upload Errors (RESOLVED)

Error encountered:

Attempt 4 of 5 failed with error: Unexpected token 'O', "Original A"... is not valid JSON
Error: Failed to CreateArtifact: Failed to make request after 5 attempts

Root Cause:
When using RunsOn self-hosted runners with extras=s3-cache, the runs-on/action@v2 step is required for artifact uploads to work. Without it, the artifact API receives HTML error pages instead of JSON responses.

Fix Applied:

  1. Added runs-on/action@v2 as first step in autofix.yml (required for S3 cache compatibility)
  2. Added permissions: { contents: read, actions: write } (was empty {} which grants NO permissions)
  3. Upgraded autofix-ci/action from v1.3.1 to v1.3.2

Reference:

  • RunsOn S3 Cache Documentation
  • Key quote: "If you have enabled the s3-cache extra and are using the actions/upload-artifact@v4 action in your workflows, you must ensure that you have also included the runs-on/action@v2 action in your jobs."

Time saved for future developers: ~2 hours of debugging 🎯

Summary by CodeRabbit

  • New Features

    • Added automatic dependency license review to flag restricted licenses (GPL, LGPL, AGPL) on pull requests.
    • Added vulnerability severity checks to the dependency review process.
    • Introduced comprehensive NOTICE file documenting all third-party dependencies and their licenses.
  • Documentation

    • Added documentation for license generation utilities and scripts.
Add Component Registry Pattern and Mock Component @osterman (#1648)

what

This Pull Request introduces the Component Registry Pattern to Atmos, enabling extensible support for various component types. It lays the foundation for adding new infrastructure tools as plugins in the future.

Key changes include:

  • ComponentProvider Interface: A new interface defining the contract for all component providers.
  • Component Registry: A thread-safe global registry to manage component providers.
  • Mock Component Provider: A proof-of-concept implementation for testing the registry and component lifecycle without external dependencies. It demonstrates inheritance, merging, and cross-component dependencies.
  • Hybrid Configuration Schema: pkg/schema/schema.go is updated to support both statically defined built-in component types (Terraform, Helmfile, Packer) and dynamically registered plugin types via the Plugins map.
  • Sentinel Errors: New sentinel errors related to component providers and configurations are added to errors/errors.go.
  • JSON Schema Updates: Schemas in pkg/datafetcher/schema/ are modified to allow additional properties for component types, accommodating the hybrid configuration.
  • Developer Guide: A new markdown file docs/developing-component-plugins.md is added, detailing how to create new component plugins.

why

The existing hardcoded approach for component types (Terraform, Helmfile, Packer) limits extensibility and maintainability. This PR introduces a more robust and flexible pattern:

  • Extensibility: Allows easy addition of new component types (e.g., Pulumi, CDK, CloudFormation) without modifying core Atmos code.
  • Plugin Support: Paves the way for external component plugins in future phases.
  • Testability: The mock component enables thorough testing of the registry pattern, configuration inheritance, and dependency resolution without requiring external tools or cloud provider access.
  • Consistency: Adopts a pattern similar to the existing command registry, promoting a unified architectural approach.
  • Maintainability: Centralizes component logic within providers, reducing code duplication and improving clarity.
  • Backward Compatibility: Existing configurations and functionality remain unaffected. The hybrid schema ensures existing component types continue to work seamlessly while introducing the new pattern.
  • Enhanced Testing: Introduces specific test coverage requirements (90%+) for the registry and mock component, including thread-safety and edge-case testing.

references

closes #589
closes #600
closes #601

Summary by CodeRabbit

  • New Features

    • Adds a component registry, plugin-style component support, and a mock provider for testing; components can now be discovered at runtime and report available commands.
    • Component configuration now accepts dynamic plugin entries (new Plugins field) for greater flexibility.
  • Documentation

    • New developer guide for building component plugins, a registry migration pattern, and expanded development requirements and best practices.
  • Tests

    • Comprehensive registry and mock-provider test suites and updated CLI snapshot to show Plugins field.
Fix blog post ordering and add explicit dates @osterman (#1689)

what

  • Add explicit `date:` field to all blog post frontmatter for consistent ordering
  • Fix welcome post date to 2025-10-12 so it appears first in the changelog
  • Fix chdir post filename and date to 2025-10-19 (actual PR merge date)
  • Add `<!--truncate-->` markers to chdir and pager posts for proper summaries
  • Remove duplicate `index.md` that was causing routing conflicts

why

  • Blog posts were displaying in incorrect chronological order
  • Some posts were missing truncate markers, causing warnings during build
  • Welcome post should appear first as it introduces the changelog
  • Duplicate index.md was causing Docusaurus routing conflicts

references

  • Fixes blog post ordering issues identified by user

Summary by CodeRabbit

  • Documentation
    • Added new blog posts covering Atmos authentication, provenance tracking, command registry patterns, AWS SSO verification, version list commands, and authentication tutorials.
    • Updated blog post on pager default behavior with migration guidance and configuration instructions.
    • Enhanced blog content metadata and organization.
Add license check workflow @osterman (#1680)

what

  • Added a GitHub Actions workflow (.github/workflows/license-check.yml) to automatically audit Go project dependencies for license compliance.
  • This workflow triggers on pull request events (opened, synchronize, reopened) that affect go.mod, go.sum, or the workflow file itself.
  • It also includes scheduled runs (weekly on Mondays) and manual dispatch for flexibility.
  • A new script (scripts/check-licenses.sh) was introduced to perform the actual license check using go-licenses.
  • The script checks for "forbidden" license types and generates a summary report.
  • The generated CSV report from go-licenses report is now uploaded as a GitHub Actions artifact.

why

  • To proactively identify and prevent the introduction of dependencies with problematic licenses (e.g., GPL, AGPL) into the project.
  • Automates the license auditing process, reducing manual effort and the risk of oversight.
  • Ensures compliance with licensing requirements, especially important for open-source and commercial projects.
  • The CI integration provides immediate feedback on PRs affecting dependencies.
  • Uploading the report as an artifact allows for easy review of detailed license information.

references

  • closes #123 (Assuming #123 is the issue related to license auditing)

Summary by CodeRabbit

  • Chores
    • Added automated license compliance checks that run on pull requests, weekly, and on demand, producing a downloadable CSV license report retained for 30 days.
    • Added a license-audit workflow and scanning script that installs/checks the scanner as needed, handles known edge cases, summarizes license distribution, and emits clear pass/fail results.
Add atmos auth list command with multiple output formats @osterman (#1645)

what

  • Add new `atmos auth list` command to list all configured authentication providers and identities
  • Support multiple output formats: table (default), tree, JSON, YAML, Graphviz, Mermaid, and Markdown
  • Implement filtering by providers or identities with optional name filtering
  • Add comprehensive documentation and usage examples

why

  • Users need visibility into their authentication configuration to understand providers, identities, and their relationships
  • Multiple output formats enable different use cases: interactive CLI (table/tree), automation (JSON/YAML), and documentation (Graphviz/Mermaid)
  • Visual formats help understand complex authentication chains where identities assume roles through providers or other identities

references

  • Implements feature request for authentication configuration visibility
  • Follows existing Atmos patterns for command structure and output formatting

Summary by CodeRabbit

  • New Features

    • Added an auth list command to view providers and identities with flexible filtering and multiple output formats (table, tree, JSON, YAML, Graphviz, Mermaid, Markdown)
    • Added chain visualization outputs (graph/mermaid/markdown) for easier relationship tracing
  • Bug Fixes

    • Support expanded tilde (~) paths for the CLI chdir flag
  • Documentation

    • Comprehensive CLI docs, usage guide, and blog post added
  • Tests

    • Extensive unit tests and format/diagram validation added
Update mockgen to go.uber.org/mock @osterman (#1681)

what

  • Replaced the usage of the archived github.com/golang/mock with go.uber.org/mock.
  • Updated all import paths from github.com/golang/mock/gomock to go.uber.org/mock/gomock.
  • Updated all //go:generate mockgen directives to use go run go.uber.org/mock/mockgen@v0.6.0 (pinned version for reproducible builds).
  • Regenerated all mock files with the pinned version.
  • Added a lint rule in .golangci.yml to disallow usage of github.com/golang/mock.
  • Configured .golangci.yml to exclude generated mock files (mock_*.go) from godot linter checks.

why

  • github.com/golang/mock is an archived repository and should no longer be used.
  • go.uber.org/mock is the maintained successor.
  • Pinning to @v0.6.0 ensures reproducible builds across different environments.
  • This change ensures the project uses actively maintained dependencies and prevents accidental use of the deprecated library through a new lint rule.

Fix go install compatibility by removing replace directive @osterman (#1685)

what

  • Remove `replace` directive from `go.mod` that breaks `go install github.com/cloudposse/atmos@latest`
  • Update Atmos internal code to import from `pkg/config/homedir` directly instead of via replaced module path
  • Remove `go.mod` from `pkg/config/homedir` (no longer needed as separate module)
  • Add regression test `TestGoModNoReplaceDirectives` to prevent future breakage of `go install` compatibility

why

  • The replace directive introduced in v1.195.0 (PR #1631) breaks a documented installation method
  • go install cmd@version intentionally does not support modules with replace or exclude directives
  • This is a fundamental design decision in Go (golang/go#44840, #69762, #50698) that won't be changed
  • Users attempting go install github.com/cloudposse/atmos@latest get errors and cannot install
  • Breaking this installation path creates user friction and support burden

tradeoffs

What we're giving up

The replace directive was added to ensure all transitive dependencies (16+ packages) use Atmos's improved fork of the deprecated mitchellh/go-homedir package instead of the archived original.

Unfortunately, we must accept that transitive dependencies will use the deprecated package because:

  • There's no way to force transitive dependencies to use our fork without replace
  • We can't publish our fork as github.com/mitchellh/go-homedir (we don't control that module path)
  • Requiring all 16+ transitive dependencies to update their imports is not feasible

What we're keeping

  • Atmos's own code still uses the improved pkg/config/homedir implementation with better error handling, refactoring, and security annotations
  • The deprecated mitchellh/go-homedir package has no known security vulnerabilities (verified via Snyk)
  • The package is stable (last commit 2019, archived July 2024 as feature-complete, not broken)

The decision

Restoring go install compatibility is more important than forcing transitive dependencies to use our improved fork. The deprecated package works fine, and Atmos's direct usage still benefits from our improvements.

testing

  • Added TestGoModNoReplaceDirectives to catch future regressions
  • Verified go build succeeds
  • Verified all existing tests pass
  • Verified binary runs correctly with ./atmos version
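
The source does not show how TestGoModNoReplaceDirectives is written. As a minimal sketch of the idea, assuming a plain line scan is enough, the hypothetical helper `findReplaceDirectives` below collects replace directives from the text of a go.mod file. The module names are made-up examples, and the scan assumes gofmt-style formatting (`replace (` with a space).

```go
package main

import (
	"fmt"
	"strings"
)

// findReplaceDirectives is a hypothetical helper (not taken from the Atmos
// code base). It returns every replace directive found in the text of a
// go.mod file, covering both the single-line form and the block form.
func findReplaceDirectives(goMod string) []string {
	var found []string
	inBlock := false
	for _, raw := range strings.Split(goMod, "\n") {
		// Drop trailing comments, then split the rest into tokens.
		line, _, _ := strings.Cut(raw, "//")
		fields := strings.Fields(line)
		switch {
		case len(fields) == 0:
			continue
		case inBlock && fields[0] == ")":
			inBlock = false
		case inBlock:
			found = append(found, strings.Join(fields, " "))
		case fields[0] == "replace" && len(fields) == 2 && fields[1] == "(":
			inBlock = true
		case fields[0] == "replace":
			found = append(found, strings.Join(fields[1:], " "))
		}
	}
	return found
}

func main() {
	// Example input only; a real test would read the repository's go.mod.
	goMod := "module example.com/app\n\ngo 1.24\n\nreplace example.com/lib => ./lib\n"
	fmt.Println(findReplaceDirectives(goMod))
}
```

A regression test built on this would read go.mod from disk and fail when the returned slice is non-empty.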

references

Replace mitchellh/mapstructure with go-viper/mapstructure @osterman (#1678)

what

  • Replaced direct usage of the archived github.com/mitchellh/mapstructure with github.com/go-viper/mapstructure/v2.
  • Added a replace directive in go.mod to force all transitive dependencies that use github.com/mitchellh/mapstructure to instead use the maintained github.com/go-viper/mapstructure fork (v1.6.0).

why

  • The mitchellh/mapstructure library has been archived, meaning it will no longer receive updates or security patches.
  • github.com/go-viper/mapstructure/v2 is the actively maintained and recommended fork, ensuring continued support and bug fixes.
  • Using the replace directive ensures that even indirect dependencies use the supported fork, eliminating reliance on the archived library.

references

Summary by CodeRabbit

  • Chores
    • Updated internal dependency management to use go-viper/mapstructure v2 instead of the previous mapstructure implementation across the codebase for improved compatibility and maintenance.
Add spinner and TTY dialog for AWS SSO auth @osterman (#1653)

what

  • Enhances the AWS SSO authentication flow by introducing a visually appealing, interactive terminal dialog using the charmbracelet library.
  • Displays a colored, bordered dialog box in TTY environments showing the AWS SSO verification code and instructions.
  • Integrates an animated spinner to indicate when the system is waiting for authentication.
  • Gracefully degrades to plain text output in non-TTY environments (e.g., CI pipelines) to ensure compatibility.
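
The graceful degradation described above can be sketched with the standard library alone. This is an illustration, not the PR's code: the PR uses the charmbracelet libraries for styling, while the hypothetical `renderVerification` below draws a plain ASCII box. The character-device check is a rough heuristic (it also matches devices such as /dev/null), and the URL is a placeholder.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// isTerminal reports whether f looks like a character device.
// This is a rough TTY heuristic; dedicated terminal libraries are stricter.
func isTerminal(f *os.File) bool {
	info, err := f.Stat()
	if err != nil {
		return false
	}
	return info.Mode()&os.ModeCharDevice != 0
}

// renderVerification (hypothetical) returns a bordered box when tty is true
// and plain lines otherwise, so CI logs stay readable.
func renderVerification(tty bool, code, url string) string {
	lines := []string{"Verification code: " + code, "Open: " + url}
	if !tty {
		return strings.Join(lines, "\n")
	}
	width := 0
	for _, l := range lines {
		if len(l) > width {
			width = len(l)
		}
	}
	border := "+" + strings.Repeat("-", width+2) + "+"
	var b strings.Builder
	b.WriteString(border + "\n")
	for _, l := range lines {
		b.WriteString("| " + l + strings.Repeat(" ", width-len(l)) + " |\n")
	}
	b.WriteString(border)
	return b.String()
}

func main() {
	fmt.Println(renderVerification(isTerminal(os.Stdout), "ABCD-EFGH", "https://device.example.invalid/"))
}
```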

why

  • Improved User Experience: The charmbracelet dialog provides a more engaging and informative user experience during the AWS SSO authentication process, making it easier to understand and follow.
  • Clearer Verification: The prominent display of the verification code with styling helps users visually confirm the code against what is shown in their browser.
  • Real-time Feedback: The spinner provides immediate visual feedback that the system is actively waiting for authentication, reducing user uncertainty.
  • Universal Compatibility: The graceful degradation ensures that the authentication flow remains functional and usable across all environments, including those without TTY capabilities.
  • Enhanced Readability: Color-coded elements and clear messaging improve the readability of important information, especially the verification code and URLs.

references

  • Further context on AWS SSO device authorization flow: AWS SSO Documentation

Summary by CodeRabbit

  • New Features

    • Styled verification dialog with automated browser opening, animated spinner during SSO device authorization, and Ctrl+C cancellation.
    • Unified display for authentication results with human-friendly expiration durations and visual expiring indicators.
  • Documentation

    • Added detailed AWS IAM Identity Center / device-authorization flow docs and clarified device codes vs. MFA tokens.
  • Improvements

    • Graceful degradation for non-TTY/CI environments and consistent UX across auth commands.
Fix segfault in TestGetAffectedComponents when error pointer is corrupted @osterman (#1677)

what

  • Fix segmentation violation in TestGetAffectedComponents at line 247
  • Safely convert error to string before passing to `t.Skipf()`

why

  • On macOS ARM64, when gomonkey patches fail, the real function gets called with invalid test data
  • This can result in a corrupted error pointer being returned (observed address: 0x646e657065646b73)
  • fmt.Sprintf with %v tries to dereference the corrupt pointer, causing a segfault
  • Converting error to string first using err.Error() avoids dereferencing the corrupt pointer

references

testing

  • Verified test now passes without segfault on macOS ARM64
  • Test gracefully skips when gomonkey mocking fails
Fix os.Args in tests with SetArgs @osterman (#1675)

what

This PR refactors various test files to replace direct manipulation of os.Args with Cobra's recommended RootCmd.SetArgs() method. This change standardizes how command-line arguments are tested across the codebase and improves test reliability by preventing global state pollution.

Specific changes include:

  • cmd/ package:

    • Replaced os.Args assignments with RootCmd.SetArgs() in cmd/root_test.go, cmd/auth_login_test.go.
    • Removed unnecessary manual save/restore of os.Args in cmd/root_test.go.
    • Documented legitimate usage of os.Args in cmd/cmd_utils_test.go where the function under test directly reads os.Args.
  • pkg/config/ package:

    • Refactored pkg/config/config.go to expose parseFlagsFromArgs(args []string) for direct testing of flag parsing logic.
    • Updated pkg/config/config_test.go to use parseFlagsFromArgs() where possible, reducing os.Args manipulation.
    • Documented the necessity of os.Args manipulation for integration tests within pkg/config/config_test.go that call functions like setLogConfig().
  • tests/ package:

    • Replaced os.Args assignments with cmd.RootCmd.SetArgs() in tests/cli_describe_component_test.go, tests/describe_test.go, and tests/validate_schema_test.go.

why

Directly manipulating os.Args in tests is an anti-pattern because:

  • Global State Pollution: os.Args is global and can cause test leakage, leading to unpredictable failures, especially in parallel test runs.
  • Not the Cobra Way: Cobra provides SetArgs() as the idiomatic and safe way to test command execution, managing its own state.
  • Manual Cleanup Required: Each os.Args manipulation requires manual defer statements for restoration, adding boilerplate and potential for error.

By adopting RootCmd.SetArgs():

  • Tests become more reliable and predictable.
  • Boilerplate for argument setup and cleanup is removed.
  • The codebase adheres to Cobra's best practices for testing.
  • For legitimate uses of os.Args (e.g., testing subprocesses that call os.Exit() or integration tests of the main() function), comments have been added to clarify why this approach is necessary.
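
To illustrate why a function that takes its arguments as a parameter is easier to test than one that reads os.Args, here is a minimal sketch. The name `parseFlagsFromArgs` comes from the notes above, but the body, return type, and flag names are assumptions made for this example and do not reflect the Atmos implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// parseFlagsFromArgs is a hypothetical stand-in: it collects long flags given
// as --name=value, --name value, or a bare --name (recorded as "true").
// It reads only its parameter, never the global os.Args.
func parseFlagsFromArgs(args []string) map[string]string {
	flags := make(map[string]string)
	for i := 0; i < len(args); i++ {
		arg := args[i]
		if !strings.HasPrefix(arg, "--") {
			continue // positional argument or subcommand
		}
		name := strings.TrimPrefix(arg, "--")
		if key, value, ok := strings.Cut(name, "="); ok {
			flags[key] = value
			continue
		}
		if i+1 < len(args) && !strings.HasPrefix(args[i+1], "--") {
			flags[name] = args[i+1]
			i++ // the next token was consumed as this flag's value
			continue
		}
		flags[name] = "true"
	}
	return flags
}

func main() {
	flags := parseFlagsFromArgs([]string{"describe", "config", "--logs-level=Debug", "--base-path", "/tmp/stacks"})
	fmt.Println(flags["logs-level"], flags["base-path"])
}
```

Because each call receives its own slice, tests can run in parallel without saving and restoring global state.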

references


Add step to get dependencies in Go setup workflow @goruha (#1679)

what

  • Add step to get dependencies in Go setup workflow

why

  • To cache actual dependencies

Summary by CodeRabbit

  • Chores
    • CI workflow updated to run dependency fetching during build setup, ensuring dependencies are retrieved earlier and improving build preparation reliability.
Use run-os for setup-go @goruha (#1667)

what

  • Use run-os for setup-go

why

  • Reduce cache

references

Summary by CodeRabbit

  • Chores

    • CI runner selection switched to dynamic, configuration-driven runner entries across workflows; build/test job names now include target/flavor context and include conditional Linux-specific steps.
    • Pre-commit, lint, autofix and other CI workflows updated to use the new runner configuration.
  • New Features

    • Added a scheduled/manual workflow to warm up Go cache and prepare Go tooling.
    • Added a workflow to clear PR-related caches on closed pull requests.
  • Tests

    • CI exercises OS/target combinations using the new dynamic runner configuration; Acceptance Tests now depend on the build job.
Add Changelog link and remove old file @osterman (#1676)

what

  • Added a "Changelog" link to the top navigation bar in website/docusaurus.config.js. This link points to the /blog route, making the blog more accessible to users.
  • Removed the old, unmaintained CHANGELOG.md file from the root of the repository. This file contained outdated release notes and is no longer necessary as changelogs are now managed as blog posts.

why

  • The "Changelog" link was added to the navigation bar as per user request to improve discoverability of blog content, which serves as the current changelog.
  • The CHANGELOG.md file was removed because it was obsolete and unmaintained, with changelogs now being published as blog posts. This cleans up the repository and avoids confusion.

references

  • Link to blog: https://atmos.tools/blog/

Summary by CodeRabbit

  • Documentation

    • Removed historical version entries from the changelog.
  • Chores

    • Added "Changelog" navigation link to the website header for easier access to release information.
`auth` Leapp Migration Guide @Benbentwo (#1633)

This pull request adds documentation to help users migrate from Leapp to Atmos Auth for AWS IAM Identity Center authentication. The main changes introduce a new migration guide and organize authentication documentation under a dedicated category.

Documentation improvements:

  • Added a comprehensive migration guide (migrating-from-leapp.mdx) that explains how to convert Leapp sessions and providers to Atmos Auth YAML configuration, including field mappings, step-by-step instructions, troubleshooting tips, and a comparison table.

Documentation structure:

  • Created a new _category_.json file to group authentication documentation under "Authentication (atmos auth)" in the sidebar for improved discoverability.

Summary by CodeRabbit

  • Documentation
    • Removed the legacy Atmos Auth User Guide.
    • Added a "Migrating from Leapp" tutorial with migration steps, field mappings, and verification commands.
    • Added a Geodesic configuration tutorial for Atmos Auth integration.
    • Introduced an Auth “Tutorials” category and two new blog posts introducing Atmos Auth and tutorials.
    • Reorganized Auth CLI docs: updated ordering, labels, slugs, subcommand links, and sidebar positions.
    • Expanded the Auth usage guide with AWS Permission Set account specification guidance and examples.
Update homedir README with fork details @osterman (#1673)

what

  • Appended a detailed section to pkg/config/homedir/README.md describing the "Atmos Fork Enhancements".
  • This new section explains the fork's prioritization of environment variables for test compatibility with t.Setenv().
  • It also details cache management strategies, including disabling caching (homedir.DisableCache = true) and resetting the cache (homedir.Reset()).
  • Provides code examples for using these features in Go tests.
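
The README's own examples are not reproduced in these notes. As a rough sketch of the behavior described (environment first, optional caching, explicit reset), the standalone program below reimplements the idea with the standard library. Only the names `DisableCache` and `Reset` come from the notes; `Dir`, `lookupEnv`, and the internals are assumptions for illustration, not the fork's code.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"sync"
)

var (
	// DisableCache turns caching off, mirroring homedir.DisableCache.
	DisableCache bool

	// lookupEnv is a hypothetical seam so tests can substitute the environment.
	lookupEnv = os.Getenv

	mu        sync.Mutex
	cachedDir string
)

// Dir returns the home directory, consulting the HOME environment variable
// and caching the result unless DisableCache is set.
func Dir() (string, error) {
	mu.Lock()
	defer mu.Unlock()
	if !DisableCache && cachedDir != "" {
		return cachedDir, nil
	}
	dir := lookupEnv("HOME")
	if dir == "" {
		return "", errors.New("HOME is not set")
	}
	if !DisableCache {
		cachedDir = dir
	}
	return dir, nil
}

// Reset clears the cached value, mirroring homedir.Reset().
func Reset() {
	mu.Lock()
	defer mu.Unlock()
	cachedDir = ""
}

func main() {
	dir, err := Dir()
	fmt.Println(dir, err)
}
```

In a test that changes HOME with t.Setenv(), calling Reset() (or setting DisableCache) beforehand keeps a previously cached value from leaking into the test.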

why

  • To clearly document the specific enhancements made in Atmos's vendored fork of the mitchellh/go-homedir package.
  • To provide users, particularly those writing Go tests, with clear instructions on how to leverage the improved environment variable support and cache management for better testability.
  • The original mitchellh/go-homedir package is deprecated, and this fork is maintained to support these specific testing requirements.

references

  • closes #279

🚀 Enhancements

chore: Update Pro Instances API @milldr (#1721)

what

  • Update endpoint format to include query params for stack & component

why

  • We've updated the API for Atmos Pro so that we can support slashes in component names
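
A short sketch of why query parameters handle slashes well: Go's net/url escapes `/` inside query values, so a component name such as `network/vpc` cannot be mistaken for extra path segments. The base URL, the `/instances` path, and the helper name `buildInstanceURL` are assumptions for illustration. The notes only state that stack and component are sent as query parameters.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// buildInstanceURL is a hypothetical helper. The "/instances" path is a
// placeholder and not the documented Atmos Pro endpoint.
func buildInstanceURL(baseURL, stack, component string) (string, error) {
	u, err := url.Parse(baseURL)
	if err != nil {
		return "", fmt.Errorf("parse base URL: %w", err)
	}
	u.Path = strings.TrimRight(u.Path, "/") + "/instances"

	// Values.Encode escapes reserved characters, including "/" as %2F.
	q := url.Values{}
	q.Set("stack", stack)
	q.Set("component", component)
	u.RawQuery = q.Encode()
	return u.String(), nil
}

func main() {
	got, err := buildInstanceURL("https://pro.example.invalid/api", "plat-ue2-dev", "network/vpc")
	if err != nil {
		panic(err)
	}
	fmt.Println(got)
	// Prints: https://pro.example.invalid/api/instances?component=network%2Fvpc&stack=plat-ue2-dev
}
```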

references

Summary by CodeRabbit

  • Chore
    • Pro Instances API client now sends stack and component as query parameters for more reliable encoding and consistency.
  • Documentation
    • Added a blog post explaining the endpoint format change, impact, and that no configuration or workflow changes are required.
  • Bug Fixes
    • Cleaned up authentication output spacing for more compact, consistent display.
fix: Consolidate credential retrieval logic to fix terraform auth @osterman (#1720)

Summary

This PR fixes a critical bug where atmos terraform plan and other Terraform commands failed to use file-based credentials, while atmos auth whoami and similar commands worked correctly.

The root cause was duplicate credential retrieval code across three methods with inconsistent fallback behavior. Two methods had keyring → identity storage fallback logic, but one (retrieveCachedCredentials) did not, causing Terraform commands to fail when credentials were in files instead of the keyring.

Root Cause Analysis

Three separate code paths retrieved credentials:

  • GetCachedCredentials - Had fallback ✓
  • findFirstValidCachedCredentials - Had fallback ✓
  • retrieveCachedCredentials - NO fallback ✗ (used by Terraform execution)

When users authenticated via AWS SSO, credentials were written to files, not cached in the keyring. Terraform commands would fail because the retrieveCachedCredentials path didn't check identity storage.

Solution

Extracted a shared retrieveCredentialWithFallback method as the single source of truth for credential retrieval:

  • Fast path: Try keyring cache first (immediate)
  • Slow path: Fall back to identity storage if not in keyring (AWS files, etc.)
  • All three code paths now delegate to this single method
  • Ensures consistent behavior across all operations
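
The fast-path/slow-path lookup can be sketched as follows. This is a simplified illustration under stated assumptions: the `CredentialSource` interface, the `mapSource` type, the string-typed credentials, and the function signature are all hypothetical. Only the method name `retrieveCredentialWithFallback` and the keyring-then-identity-storage order come from the notes.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotFound signals that a source holds no credentials for an identity.
var ErrNotFound = errors.New("credentials not found")

// CredentialSource is a hypothetical abstraction over the keyring and
// identity storage (for example, credential files on disk).
type CredentialSource interface {
	Load(identity string) (string, error)
}

// mapSource is an in-memory source used only for this example.
type mapSource map[string]string

func (m mapSource) Load(identity string) (string, error) {
	if v, ok := m[identity]; ok {
		return v, nil
	}
	return "", ErrNotFound
}

// retrieveCredentialWithFallback tries the keyring first and falls back to
// identity storage only when the keyring reports a miss.
func retrieveCredentialWithFallback(keyring, storage CredentialSource, identity string) (string, error) {
	creds, err := keyring.Load(identity)
	if err == nil {
		return creds, nil // fast path: keyring hit
	}
	if !errors.Is(err, ErrNotFound) {
		return "", fmt.Errorf("keyring lookup for %q: %w", identity, err)
	}
	creds, err = storage.Load(identity) // slow path: identity storage
	if err != nil {
		return "", fmt.Errorf("identity storage lookup for %q: %w", identity, err)
	}
	return creds, nil
}

func main() {
	keyring := mapSource{}
	storage := mapSource{"dev-admin": "file-based-credentials"}
	fmt.Println(retrieveCredentialWithFallback(keyring, storage, "dev-admin"))
}
```

Routing every caller through one function like this is what keeps the fallback behavior from diverging between commands.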

Changes

  • Added retrieveCredentialWithFallback() method (38 lines)
  • Refactored GetCachedCredentials() - 40% code reduction
  • Refactored findFirstValidCachedCredentials() - 57% code reduction
  • Refactored retrieveCachedCredentials() - Now uses shared method
  • Fixed TestManager_GetCachedCredentials_Paths to use proper test data
  • Added regression test TestManager_retrieveCachedCredentials_TerraformFlow_Regression
  • Added integration test TestRetrieveCachedCredentials_KeyringMiss_IdentityStorageFallback
  • Show active identities image

Testing

✅ All auth tests pass (12/12 test suites)
✅ Regression test reproduces original bug, passes with fix
✅ Integration tests verify fallback behavior works
✅ Code compiles successfully

Impact

✅ Terraform commands now work with file-based credentials
✅ ~110 lines of duplicate code eliminated
✅ Single source of truth for credential retrieval
✅ Impossible to have divergent fallback behavior in future
✅ Consistent behavior across all auth operations

References

This PR addresses the issue where valid authenticated sessions would fail during Terraform execution with "credentials not found" error, even though atmos auth whoami showed valid credentials.

See docs/prd/credential-retrieval-consolidation.md for detailed architectural analysis.


Summary by CodeRabbit

  • New Features

    • Interactive identity selection when using --identity without a value (CLI and Terraform).
    • New auth logout --all to sign out all identities.
    • ATMOS_IDENTITY env var honored; CLI env outputs add AWS region defaults.
    • Identity list now shows authentication status and credential expiry.
  • Bug Fixes

    • More reliable credential retrieval with keyring → identity-storage fallback.
    • Safer, clearer logout behavior and plain-text summaries.
    • Default files display path updated to ~/.config/atmos.
  • Documentation

    • Help pages and docs updated for interactive identity modes, logout options, and examples.
fix: Restore PATH inheritance in workflow shell commands @osterman (#1719)

what

  • Refactored ExecuteShell() to **always** merge custom env vars with the parent environment
  • Fixes workflow shell commands failing with "executable file not found in $PATH"
  • Adds comprehensive unit and integration tests demonstrating the bug and verifying the fix

why

  • After commit 9fd7d15 (PR #1543), workflow shell commands lost access to PATH environment variable
  • Users reported workflows that worked in v1.189.0 failed in v1.195.0 with commands like env, ls, grep not found
  • This is a critical regression affecting any workflow using external executables
  • Original fix conditionally replaced environment, which was inconsistent with executeCustomCommand behavior

Root Cause

The bug occurred in ExecuteShell() function in internal/exec/shell_utils.go:

  1. Workflow commands call ExecuteShell with empty env slice: []string{}
  2. ExecuteShell appends ATMOS_SHLVL to the slice: []string{"ATMOS_SHLVL=1"}
  3. ShellRunner receives a non-empty env, so it doesn't fall back to os.Environ()
  4. Shell command runs with ONLY ATMOS_SHLVL set, losing PATH and all other environment variables

Solution

Refactored ExecuteShell() to always merge custom env vars with parent environment:

// Always start with parent environment
mergedEnv := os.Environ()

// Merge custom env vars (overriding duplicates)
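for _, envVar := range env {
    // Split "KEY=value" so the key and value can be passed separately
    if key, value, ok := strings.Cut(envVar, "="); ok {
        mergedEnv = u.UpdateEnvVar(mergedEnv, key, value)
    }
}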

// Add ATMOS_SHLVL
mergedEnv = append(mergedEnv, fmt.Sprintf("ATMOS_SHLVL=%d", newShellLevel))

This ensures:

  • ✅ Empty env (workflows): Full parent environment including PATH
  • ✅ Custom env (commands): Custom vars override parent, but PATH is preserved
  • ✅ Consistent behavior: Matches executeCustomCommand pattern (line 393 in cmd_utils.go)
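
The merge behavior can be tried in isolation with the standalone sketch below. The helpers `setEnvVar` and `mergeEnv` are hypothetical stand-ins written for this example. They are not the Atmos functions, and the parent environment is passed in as a slice so the example stays deterministic (real code would pass os.Environ()).

```go
package main

import (
	"fmt"
	"strings"
)

// setEnvVar replaces key in env when present, otherwise appends it.
func setEnvVar(env []string, key, value string) []string {
	prefix := key + "="
	for i, kv := range env {
		if strings.HasPrefix(kv, prefix) {
			env[i] = prefix + value
			return env
		}
	}
	return append(env, prefix+value)
}

// mergeEnv starts from the parent environment, applies custom overrides,
// and finally sets ATMOS_SHLVL.
func mergeEnv(parent, custom []string, shellLevel int) []string {
	merged := append([]string(nil), parent...) // copy so parent is not mutated
	for _, kv := range custom {
		if key, value, ok := strings.Cut(kv, "="); ok {
			merged = setEnvVar(merged, key, value)
		}
	}
	return setEnvVar(merged, "ATMOS_SHLVL", fmt.Sprint(shellLevel))
}

func main() {
	parent := []string{"PATH=/usr/bin:/bin", "FOO=parent"}
	fmt.Println(mergeEnv(parent, nil, 1))
	// Prints: [PATH=/usr/bin:/bin FOO=parent ATMOS_SHLVL=1]
	fmt.Println(mergeEnv(parent, []string{"FOO=custom"}, 1))
	// Prints: [PATH=/usr/bin:/bin FOO=custom ATMOS_SHLVL=1]
}
```

With an empty custom slice, which is the workflow case, PATH survives because the merge always starts from the parent environment.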

Testing

Unit Tests (internal/exec/shell_utils_test.go):

  • TestExecuteShell/empty_env_should_inherit_PATH_from_parent_process - Verifies env command works
  • TestExecuteShell/empty_env_should_inherit_PATH_for_common_commands - Tests ls, env, pwd, echo
  • TestExecuteShell/custom_env_vars_override_parent_env - Verifies custom vars properly override parent

Integration Test (tests/test-cases/workflows.yaml):

  • atmos workflow shell command with PATH - Full end-to-end workflow test using env | grep PATH

All tests pass, including existing workflow tests.

references

Summary by CodeRabbit

  • Bug Fixes

    • Shell commands now correctly inherit environment variables (including PATH) from the parent process, with custom env vars properly overriding parent values.
  • Tests

    • Added tests covering environment inheritance for commands that require PATH, shell builtins, and custom env var overrides.
  • Workflows / Snapshots

    • Added a workflow demonstrating PATH-dependent shell commands and updated related test snapshots and test cases.
test: Improve test coverage for keyring fallback to 78.4% @osterman (#1705)

what

  • Add comprehensive unit tests for no-op keyring and system keyring functionality
  • Improve test coverage from 71.2% to 78.4% (+7.2 percentage points)
  • Add Validate() method to test credential types to satisfy ICredentials interface

why

  • Ensure critical business logic is properly tested (cache management, expiration checking, error handling)
  • Meet 80% test coverage target for new features
  • Prevent regressions in keyring fallback behavior introduced in bde37e334

references

  • Related to commit bde37e334 which introduced graceful keychain fallback for containerized environments
  • Implements test requirements from docs/prd/keyring-fallback-containerized-environments.md

Test Coverage Improvements

Starting Coverage: 71.2%
Final Coverage: 78.4%
Improvement: +7.2 percentage points

Tests Added (8 new test functions):

  1. TestNoopKeyringStore_ValidCache - Tests cache hit with valid credentials
  2. TestNoopKeyringStore_ExpiredInCache - Tests cache hit with expired credentials
  3. TestNoopKeyringStore_StoreWithMockCredentials - Tests storing mock credentials
  4. TestNoopKeyringStore_ExpirationWarning - Tests expiration warning logic
  5. TestSystemKeyringStore_GetAny - Tests retrieving arbitrary data from system keyring
  6. TestSystemKeyringStore_GetAny_NotFound - Tests GetAny error handling
  7. TestSystemKeyringStore_SetAny - Tests storing arbitrary data types
  8. TestNewKeyringAuthStore - Tests deprecated backward-compatible function

Coverage by Function:

File               Function               Before  After   Improvement
keyring_noop.go    Retrieve()             36.8%   57.9%   +21.1%
keyring_system.go  GetAny()               0%      85.7%   +85.7%
keyring_system.go  SetAny()               0%      71.4%   +71.4%
store.go           NewKeyringAuthStore()  0%      100%    +100%

Remaining Uncovered (1.6% to reach 80%):

The uncovered code paths require real AWS credentials and live AWS STS API calls:

  • AWS credential validation success paths (lines 79-95 in Retrieve())
  • AWS STS GetCallerIdentity success (lines 152-168 in validateAWSCredentials())

These are integration-level scenarios better suited for E2E tests with real AWS infrastructure rather than unit tests.

What We Test

  • Validation failure path - AWS SDK without credentials
  • Cache behavior - Hits, misses, expiration, staleness
  • Error handling - Expired/missing credentials
  • Storage operations - Store, Retrieve, Delete, List
  • GetAny/SetAny - Arbitrary data storage for all keyring types
  • Backward compatibility - Deprecated functions

Fix `atmos describe affected --include-dependents --stack <stack>` command to correctly process the dependents only from the provided stack @aknysh (#1703)

Problem

When executing atmos describe affected --include-dependents --stack <stack>, the command was incorrectly processing dependent components from ALL stacks instead of only from the specified stack. This caused:

  1. Performance issues: YAML functions (!terraform.output, !terraform.state, !env) were executed for components in all stacks, not just the filtered stack
  2. Incorrect behavior: Dependents from other stacks were being included in the output
  3. Test gaps: Tests didn't catch this issue because fixtures lacked YAML functions that would fail when processed incorrectly

Root Cause

In internal/exec/describe_dependents.go, the ExecuteDescribeDependents function was calling ExecuteDescribeStacks with an empty string for the stack filter instead of passing the onlyInStack parameter. This caused all stacks to be loaded and processed.

Solution

1. Fixed Stack Filtering

  • Added OnlyInStack parameter to DescribeDependentsArgs struct
  • Updated ExecuteDescribeDependents to pass the stack filter through to ExecuteDescribeStacks
  • Ensured dependents are correctly filtered to only the specified stack

2. Refactored to Options Pattern

  • Created DescribeDependentsArgs struct to replace 8 individual parameters
  • Improved code readability and maintainability
  • Follows the Options Pattern from CLAUDE.md
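
A minimal sketch of the options pattern with a stack filter. Only `DescribeDependentsArgs` and `OnlyInStack` are named in the notes. The other fields, the `Dependent` type, and the `filterDependents` function are hypothetical. The real fix passes the filter down to ExecuteDescribeStacks so other stacks are never loaded, whereas this example filters an in-memory list to keep the illustration short.

```go
package main

import "fmt"

// DescribeDependentsArgs groups what would otherwise be a long positional
// parameter list. Fields other than OnlyInStack are assumptions.
type DescribeDependentsArgs struct {
	Component   string
	Stack       string
	OnlyInStack string // empty means "all stacks"
}

// Dependent is a hypothetical result record.
type Dependent struct {
	Component string
	Stack     string
}

// filterDependents keeps only the dependents in args.OnlyInStack when set.
func filterDependents(all []Dependent, args *DescribeDependentsArgs) []Dependent {
	var out []Dependent
	for _, d := range all {
		if args.OnlyInStack != "" && d.Stack != args.OnlyInStack {
			continue
		}
		out = append(out, d)
	}
	return out
}

func main() {
	all := []Dependent{
		{Component: "tgw/attachment", Stack: "ue1-network"},
		{Component: "tgw/attachment", Stack: "uw2-network"},
	}
	args := &DescribeDependentsArgs{Component: "vpc", Stack: "ue1-network", OnlyInStack: "ue1-network"}
	fmt.Println(filterDependents(all, args))
	// Prints: [{tgw/attachment ue1-network}]
}
```

Adding a new option later means adding a struct field rather than changing every call site's argument list.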

3. Enhanced Test Coverage

  • Added YAML functions (!env) to test fixtures to detect the bug
  • Created new test TestDescribeAffectedWithDependentsStackFilterYamlFunctions to verify:
    • YAML functions are only executed for components in the specified stack
    • Dependents are correctly filtered by stack
    • Environment variables are not accessed for components in other stacks

4. Lintroller Improvements

Added comprehensive exclusions to the custom linter:

  • 29 packages excluded from perf.Track() checks (one-time operations)
  • 7 utility files excluded (not in hot paths)
  • 15 hot-path functions instrumented with perf.Track()
  • os.Args linter exclusions for legitimate test patterns

Testing

Manual Testing

# Test that dependents are filtered by stack
atmos describe affected --include-dependents --stack ue1-network

# Verify YAML functions only execute for the specified stack
ATMOS_TEST_VPC_UE1=test atmos describe affected --include-dependents --stack ue1-network

Automated Testing

go test ./internal/exec -v -run TestDescribeAffectedWithDependentsStackFilterYamlFunctions
go test ./pkg/describe -v

Changes

Core Functionality

  • internal/exec/describe_dependents.go: Added DescribeDependentsArgs struct, fixed stack filtering
  • internal/exec/describe_affected_utils_2.go: Updated to use new struct pattern
  • internal/exec/atmos.go: Updated TUI integration
  • pkg/describe/describe_dependents_test.go: Updated integration tests

Test Fixtures

  • Added !env YAML functions to test fixtures in 4 files:
    • tests/fixtures/scenarios/atmos-describe-affected-with-dependents-and-locked/stacks/deploy/network/us-east-1.yaml
    • tests/fixtures/scenarios/atmos-describe-affected-with-dependents-and-locked/stacks/deploy/network/us-west-2.yaml
    • And their stacks-affected versions

Tests

  • internal/exec/describe_affected_test.go: Added TestDescribeAffectedWithDependentsStackFilterYamlFunctions
  • Updated all mock functions to use new struct signature

Performance Tracking

Added perf.Track() to hot-path functions:

  • Stack processing: ProcessYAMLConfigFiles, ProcessYAMLConfigFile, ProcessStackConfig
  • Component processing: ProcessComponentInStack, ProcessComponentFromContext
  • Describe operations: ExecuteDescribeStacks, ExecuteDescribeComponent
  • Core execution: FilterEmptySections, IsComponentAbstract, FilterComputedFields
  • Template functions: AtmosFuncs.Component, AtmosFuncs.GomplateDatasource

Lintroller

  • tools/lintroller/rule_perf_track.go: Added exclusions for non-hot-path packages and files
  • tools/lintroller/rule_os_args.go: Added exclusions for legitimate os.Args usage in tests

Impact

  • Performance: YAML functions no longer execute for components outside the filtered stack
  • Correctness: Dependents are now correctly limited to the specified stack
  • Test Coverage: New tests prevent regression
  • Code Quality: Improved readability with Options Pattern
  • Linter: All custom linter checks pass

Summary by CodeRabbit

  • New Features

    • Stack-specific filtering for dependent discovery (OnlyInStack).
    • New template helpers under the "atmos" namespace: component, datasource, store.
  • Bug Fixes

    • YAML function execution now respects stack filtering in describe-affected with dependents.
  • Performance

    • Added runtime performance tracking across various describe and processing commands.
  • Chores

    • Updated Atmos version and PostHog dependency; docs updated.
  • Tests

    • Added/updated tests and fixtures for stack-filtering and Terraform-state YAML scenarios.
