Support `!terraform.state` on GCS Backends @shirkevich (#1393)
# Add GCS backend support to `!terraform.state` YAML function
what
- Add Google Cloud Storage (GCS) backend support to `!terraform.state` Atmos YAML function
- Implement performance optimizations (client caching, retry logic, extended timeouts)
- Create unified Google Cloud authentication system for consistency across GCP services
- Update documentation with GCS backend usage examples and authentication methods
why
The !terraform.state YAML function allows reading the outputs (remote state) of components in Atmos stack manifests directly from the configured Terraform/OpenTofu backends.
Previously, the !terraform.state YAML function only supported:
- `local` (Terraform and OpenTofu)
- `s3` (Terraform and OpenTofu)
This PR adds support for:
- `gcs` (Google Cloud Storage - Terraform and OpenTofu)
With GCS backend support, users can now leverage the high-performance !terraform.state function instead of the slower !terraform.output or !store functions when using Google Cloud Storage for Terraform state storage.
Implementation Details
GCS Backend Features
- Full Authentication Support: JSON credentials, service account file paths, and Google Application Default Credentials (ADC)
- Service Account Impersonation: Support for `impersonate_service_account` configuration
- Performance Optimizations:
  - Client caching to avoid recreating GCS clients for repeated operations
  - Retry logic with exponential backoff (up to 3 attempts) for transient failures
  - Extended timeouts (30 seconds) to match S3 backend performance
- Robust Error Handling: Graceful handling of missing state files and detailed error context
- Resource Management: Proper cleanup and explicit resource management
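The client caching and retry behavior listed above can be sketched in plain Go. This is a minimal illustration under stated assumptions, not the Atmos implementation: `stateClient`, `clientCache`, `retry`, and `errTransient` are hypothetical names, and no real GCS client is created.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// errTransient is a hypothetical error used to simulate a temporary failure.
var errTransient = errors.New("transient failure")

// stateClient is a hypothetical stand-in for a storage client (not the GCS SDK type).
type stateClient struct{ bucket string }

// clientCache reuses one client per bucket instead of creating a new one for every read.
type clientCache struct {
	mu      sync.Mutex
	clients map[string]*stateClient
	created int // number of clients actually constructed
}

// get returns the cached client for bucket, creating it on first use.
func (c *clientCache) get(bucket string) *stateClient {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.clients == nil {
		c.clients = make(map[string]*stateClient)
	}
	if cl, ok := c.clients[bucket]; ok {
		return cl
	}
	cl := &stateClient{bucket: bucket}
	c.clients[bucket] = cl
	c.created++
	return cl
}

// retry calls fn up to maxAttempts times, doubling the delay after each failure.
func retry(maxAttempts int, baseDelay time.Duration, fn func() error) error {
	var err error
	delay := baseDelay
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		if err = fn(); err == nil {
			return nil
		}
		if attempt < maxAttempts {
			time.Sleep(delay)
			delay *= 2
		}
	}
	return fmt.Errorf("failed after %d attempts: %w", maxAttempts, err)
}

func main() {
	cache := &clientCache{}
	cache.get("my-terraform-state-bucket")
	cache.get("my-terraform-state-bucket")
	fmt.Println("clients created:", cache.created)

	calls := 0
	err := retry(3, time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return errTransient
		}
		return nil
	})
	fmt.Println("calls:", calls, "err:", err)
}
```

The cache is keyed by bucket only to keep the sketch short; a real cache key would likely also need to account for credentials and impersonation settings.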
Usage
The GCS backend works seamlessly with existing !terraform.state syntax:
# Get the `output` of the `component` in the current stack
subnet_id: !terraform.state vpc private_subnet_id
# Get the `output` of the `component` in the provided `stack`
vpc_id: !terraform.state vpc dev-us-east-1 vpc_id
# Get complex outputs using YQ expressions
first_subnet: !terraform.state vpc .private_subnet_ids[0]
GCS Backend Configuration
The GCS backend supports all standard Terraform GCS backend configurations:
# atmos.yaml
components:
  terraform:
    backend_type: gcs
    backend:
      gcs:
        bucket: "my-terraform-state-bucket"
        prefix: "terraform/state"
        # Authentication options (choose one):
        # Option 1: JSON credentials content
        credentials: |
          {
            "type": "service_account",
            "project_id": "my-project",
            ...
          }
        # Option 2: Service account file path
        credentials: "/path/to/service-account.json"
        # Option 3: Use Application Default Credentials (ADC)
        # (no credentials field needed - uses environment/metadata)
        # Optional: Service account impersonation
        impersonate_service_account: "terraform@my-project.iam.gserviceaccount.com"
Performance Benefits
Compared to !terraform.output, the !terraform.state function with GCS backend:
- ✅ No Terraform execution - Reads state directly from GCS
- ✅ No provider initialization - Skips all module and provider setup
- ✅ No varfile generation - Bypasses Terraform configuration preparation
- ✅ Cached clients - Reuses GCS clients for multiple operations
- ✅ Parallel execution - Multiple state reads can happen concurrently
Testing
- Comprehensive Test Suite: 100% test coverage for all new functionality
- Mock Implementations: Complete interface-based testing for GCS operations
- Authentication Testing: Validates all credential types and authentication flows
- Error Scenario Coverage: Tests for missing files, network failures, and invalid configurations
- Caching Validation: Ensures client caching works correctly across operations
- Retry Logic Testing: Validates exponential backoff and failure recovery
Backward Compatibility
- ✅ No breaking changes to existing configurations
- ✅ Existing backends (`local`, `s3`) remain unchanged
- ✅ Same function syntax - no new parameters or options required
- ✅ Graceful fallbacks - continues to work with `!terraform.output` and `!store` functions
Files Changed
Core Implementation
- `internal/terraform_backend/terraform_backend_gcs.go` - GCS backend implementation
- `internal/terraform_backend/terraform_backend_gcs_test.go` - Comprehensive test suite
- `internal/terraform_backend/terraform_backend_registry.go` - Register GCS backend
- `internal/terraform_backend/terraform_backend_utils.go` - Updated error messages
Unified Authentication System
- `internal/gcp/auth.go` - New unified Google Cloud authentication (created)
- `internal/gcp/auth_test.go` - Authentication tests (created)
- `pkg/store/google_secret_manager_store.go` - Updated to use unified auth
- `internal/gcp_utils/gcp_utils.go` - Removed (replaced by unified auth)
Configuration & Documentation
- `internal/exec/terraform_generate_backend.go` - GCS backend validation
- `website/docs/core-concepts/stacks/yaml-functions/terraform.state.mdx` - Updated documentation
- `errors/errors.go` - Added GCS-specific error types
- `go.mod` - Added GCS storage dependency
Migration Guide
For users currently using !terraform.output or !store with GCS-stored state:
Before (slower)
# Using !terraform.output (requires Terraform execution)
vpc_id: !terraform.output vpc dev-us-east-1 vpc_id
# Using !store (requires separate state management)
vpc_id: !store google-secret-manager dev/vpc/vpc_id
After (faster)
# Using !terraform.state (direct GCS state access)
vpc_id: !terraform.state vpc dev-us-east-1 vpc_id
Simply update your backend configuration to use `gcs` and replace the function calls. No other changes are needed.
Summary by CodeRabbit
- New Features
  - GCS-backed Terraform state support and unified Google Cloud authentication integration.
- Bug Fixes
  - Stricter backend config validation with clearer error responses and updated supported-backends messaging.
- Tests
  - Comprehensive unit tests added for GCS backend behavior and GCP authentication handling.
fix: Improve AWS credential isolation and auth error propagation @osterman (#1712)
## Summary
This PR addresses multiple authentication issues when using Atmos in containerized environments with mounted credential files:
- Auth Pre-Hook Error Propagation - Terraform execution now properly aborts when authentication fails (e.g., Ctrl+C during SSO)
- AWS Credential Loading Strategy - New `LoadAtmosManagedAWSConfig()` function provides proper isolation while preserving Atmos-managed profile selection
- Noop Keyring Validation - Container auth now properly isolated from external environment variables
- Whoami with Noop Keyring - `atmos auth whoami` now works in containerized environments
- Test Coverage - Added test to verify auth errors properly abort execution
Changes
1. Auth Pre-Hook Error Propagation (internal/exec/terraform.go:236)
- Problem: Errors from auth pre-hook were logged but not returned, causing terraform execution to continue even when authentication failed (e.g., user presses Ctrl+C during SSO)
- Fix: Added `return err` after logging auth pre-hook errors
- Impact: Terraform commands now properly abort on auth failures
2. AWS Credential Loading Strategy (pkg/auth/cloud/aws/env.go)
- Problem: SDK's default config loading allowed IMDS access and was affected by external `AWS_PROFILE`, causing conflicts in containers
- Solution: Created `LoadAtmosManagedAWSConfig()` function that:
  - Clears credential env vars (`AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `AWS_SESSION_TOKEN`)
  - Preserves profile/path vars (`AWS_PROFILE`, `AWS_SHARED_CREDENTIALS_FILE`, `AWS_CONFIG_FILE`)
  - Allows SDK to load from Atmos-managed credential files
- Impact: Proper isolation while still using Atmos-managed profiles
3. Noop Keyring Credential Validation (pkg/auth/credentials/keyring_noop.go)
- Problem: Used unrestricted `config.LoadDefaultConfig()` which allowed IMDS access and was affected by external `AWS_PROFILE`
- Fix: Changed to use `LoadAtmosManagedAWSConfig()`
- Impact: Container auth now properly isolated from external env vars
4. Whoami with Noop Keyring (pkg/auth/manager.go)
- Problem: `Whoami()` expected credentials from keyring, but noop keyring returns `ErrCredentialsNotFound` by design
- Fix: Added check for `ErrCredentialsNotFound` and fallback to `buildWhoamiInfoFromEnvironment()`
- Impact: `atmos auth whoami` now works in containerized environments
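The fallback described for `Whoami()` can be sketched as follows. This is a minimal illustration, not the Atmos code: `whoamiInfo` and the `whoami` helper with its two callbacks are hypothetical, and `ErrCredentialsNotFound` is redeclared locally so the example compiles on its own.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrCredentialsNotFound stands in for the sentinel error named in the PR
// (redeclared here for the sketch).
var ErrCredentialsNotFound = errors.New("credentials not found")

// whoamiInfo is a hypothetical result type.
type whoamiInfo struct {
	Identity string
	Source   string
}

// whoami tries the keyring first and falls back to the environment only
// when the keyring reports that it holds no credentials. Any other error
// is returned unchanged.
func whoami(
	identity string,
	fromKeyring func(string) (*whoamiInfo, error),
	fromEnv func(string) *whoamiInfo,
) (*whoamiInfo, error) {
	info, err := fromKeyring(identity)
	if err == nil {
		return info, nil
	}
	if errors.Is(err, ErrCredentialsNotFound) {
		return fromEnv(identity), nil
	}
	return nil, err
}

func main() {
	// A noop keyring never stores anything, so it always reports "not found".
	noopKeyring := func(string) (*whoamiInfo, error) {
		return nil, fmt.Errorf("noop keyring: %w", ErrCredentialsNotFound)
	}
	fromEnv := func(id string) *whoamiInfo {
		return &whoamiInfo{Identity: id, Source: "environment"}
	}
	info, err := whoami("identity-name", noopKeyring, fromEnv)
	fmt.Println(info.Source, err)
}
```

Using `errors.Is` rather than a direct comparison keeps the fallback working even when the sentinel error is wrapped with extra context.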
5. Test Coverage (internal/exec/terraform_test.go)
- Added `TestExecuteTerraform_AuthPreHookErrorPropagation` to verify auth errors properly abort execution
- Test validates that terraform doesn't continue on auth failure
- Updated test fixture to include required `name_pattern` configuration
Technical Details
The key insight is that Atmos sets AWS_PROFILE=identity-name (in pkg/auth/cloud/aws/setup.go:59) but the previous isolation approach cleared ALL AWS env vars including AWS_PROFILE. This caused the SDK to look for a non-existent [default] section.
The new LoadAtmosManagedAWSConfig preserves AWS_PROFILE while still preventing external credential conflicts.
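The isolation rule described here (drop credential variables, keep profile and path variables) can be sketched as a pure function over an environment map. This is an illustration only: `isolateEnv` and `credentialVars` are hypothetical names, and the real `LoadAtmosManagedAWSConfig` may work differently.

```go
package main

import (
	"fmt"
	"sort"
)

// credentialVars lists the variables the PR describes as cleared.
var credentialVars = []string{
	"AWS_ACCESS_KEY_ID",
	"AWS_SECRET_ACCESS_KEY",
	"AWS_SESSION_TOKEN",
}

// isolateEnv returns a copy of env without the credential variables.
// Profile and path variables (AWS_PROFILE, AWS_SHARED_CREDENTIALS_FILE,
// AWS_CONFIG_FILE) are deliberately left untouched.
func isolateEnv(env map[string]string) map[string]string {
	out := make(map[string]string, len(env))
	for k, v := range env {
		out[k] = v
	}
	for _, k := range credentialVars {
		delete(out, k)
	}
	return out
}

func main() {
	env := map[string]string{
		"AWS_ACCESS_KEY_ID": "external-key",
		"AWS_SESSION_TOKEN": "external-token",
		"AWS_PROFILE":       "identity-name",
		"AWS_CONFIG_FILE":   "/path/to/config",
	}
	clean := isolateEnv(env)

	keys := make([]string, 0, len(clean))
	for k := range clean {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	fmt.Println(keys)
}
```

Because `AWS_PROFILE` survives the filtering, the SDK can still resolve the Atmos-managed profile instead of looking for a `[default]` section.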
Test Plan
- `go build .` - Build succeeds
- `go test ./internal/exec -run TestExecuteTerraform` - All terraform tests pass
- `TestExecuteTerraform_AuthPreHookErrorPropagation` - New test passes
- Verified test fails when fix is removed (terraform continues execution)
- Verified test passes when fix is restored (terraform aborts on auth error)
References
Fixes authentication issues in containerized environments with mounted credentials.
🤖 Generated with Claude Code
Summary by CodeRabbit
- New Features
  - Added --login and cached-credentials-first flows across auth commands; whoami now shows validation and expiry.
  - Atmos-managed credentials moved to XDG-compliant locations; improved shell enter/exit messages.
  - Geodesic helper script for building/testing in containerized environments.
- Bug Fixes
  - Terraform pre-hook errors now abort execution.
  - Improved propagation of user-abort during authentication.
- Documentation
  - XDG migration guides and Geodesic/CLI docs updated.
- Tests
  - Broad expansion of auth, AWS credential, auth-context and output-propagation tests.
fix: Relax stack config requirement for commands that don't operate on stacks @osterman (#1717)
## Summary
Fixes stack configuration requirement for 6 commands that don't actually operate on stack manifests. These commands were incorrectly requiring `stacks.base_path` and `stacks.included_paths` to be configured, causing errors like:
Error: failed to initialize atmos config
stack base path must be provided in 'stacks.base_path' config or ATMOS_STACKS_BASE_PATH' ENV variable
What
Updated 6 commands to use processStacks=false in InitCliConfig:
Auth Commands (Commit 1)
- `atmos auth env` - Export cloud credentials as environment variables
- `atmos auth exec` - Execute commands with cloud credentials
- `atmos auth shell` - Launch authenticated shell
List/Docs Commands (Commit 2)
- `atmos list workflows` - List workflows from workflows/ directory
- `atmos list vendor` - List vendor configurations from component.yaml files
- `atmos docs <component>` - Display component README files
Why
These commands only need:
- Auth configuration from `atmos.yaml`
- Component base paths (terraform, helmfile, etc.)
- Workflow or vendor configurations
They do NOT need:
- Stack manifests to exist
- `stacks.base_path` to be configured
- `stacks.included_paths` to be configured
This makes Atmos more flexible for use cases like:
- CI/CD pipelines that only need auth or vendor management
- Development environments without full stack setup
- Documentation browsing without infrastructure configs
- Workflow management separate from stack operations
Technical Details
Changes Made
- `InitCliConfig` parameter: Changed `processStacks` from `true` to `false`
  - Prevents validation requiring `stacks.base_path` and `stacks.included_paths`
  - Skips processing of stack manifest files
- `checkAtmosConfig` option (for `list vendor` only): Added `WithStackValidation(false)`
  - Prevents checking if stacks directory exists
  - Required because `list vendor` calls `checkAtmosConfig()` with additional validation
Files Changed
- `cmd/auth_env.go`
- `cmd/auth_exec.go`
- `cmd/auth_shell.go`
- `cmd/list_workflows.go`
- `cmd/list_vendor.go`
- `cmd/docs.go`
Commands That Still Require Stacks (Unchanged)
These were NOT modified because they genuinely need stack manifests:
- `atmos list stacks`
- `atmos list components`
- `atmos list settings`
- `atmos list values`
- `atmos list metadata`
Testing
✅ All existing tests pass
✅ Linter passes with 0 issues
✅ Pre-commit hooks pass
✅ Manual testing confirms commands work without stack directories
✅ No regressions in existing functionality
References
Addresses a user issue where `atmos auth exec -- aws sts get-caller-identity` failed with a stack configuration error.
🤖 Generated with Claude Code
Co-Authored-By: Claude noreply@anthropic.com
Summary by CodeRabbit
- New Features
  - Auth and utility commands (auth env, auth exec, auth shell, list workflows, list vendor, docs) now run without requiring stack configuration, enabling use in CI/CD, vendor management, and documentation workflows.
- Documentation
  - Added a blog post describing the change, usage examples, migration tips, and CI/CD benefits.
Change runner type in nightly builds workflow @goruha (#1713)
## what
* Use `large` RunsOn runners for GoReleaser
why
- GoReleaser needs more disk space
Summary by CodeRabbit
- Chores
- Updated GitHub Actions runner specifications across feature release, nightly build, and test workflows to standardize build infrastructure configuration.
Update nightlybuilds.yml @goruha (#1711)
## what
* Run GoReleaser on a RunsOn runner
why
- Default runners run out of disk space
Summary by CodeRabbit
- Chores
- Updated nightly release workflow to change how runner selection is provided: the workflow now accepts a JSON-like array of runner specifications, improving and broadening which runner(s) can be targeted for nightly builds.
Fix Terraform state authentication by passing auth context @osterman (#1695)
## what
- Add authentication context parameter to Terraform backend operations
- Refactor PostAuthenticate interface to use parameter struct
- Extract nested logic to reduce complexity
- Fix test coverage for backend functions
why
- Terraform state operations need proper AWS credentials when accessing S3 backends
- Multi-identity scenarios require passing auth context through the call chain
- Reduces function parameter count from 6 to 2 (using PostAuthenticateParams struct)
- Simplifies nested conditional logic for better maintainability
references
- Part of multi-identity authentication context work
- Follows established authentication context patterns
- Related to docs/prd/auth-context-multi-identity.md
Summary by CodeRabbit
- New Features
  - Centralized per-command AuthContext enabling multiple concurrent identities (AWS, GitHub, Azure, etc.) and making in-process SDK and Terraform calls use Atmos-managed credentials.
  - Console session duration configurable via provider console.session_duration with CLI flag override.
- Bug Fixes
  - More reliable in-process authentication for SDK and Terraform state reads.
- Documentation
  - Added design doc, blog post, and CLI docs describing AuthContext and session-duration behavior.
- Tests
  - Expanded tests for auth flows, AWS config loading, and YAML/Terraform tag auth propagation.
Add circular dependency detection for YAML functions @osterman (#1708)
## what
- Implement universal circular dependency detection for all Atmos YAML functions (!terraform.state, !terraform.output, atmos.Component)
- Add goroutine-local resolution context for cycle tracking
- Create comprehensive error messages showing dependency chains
- Fix missing perf.Track() calls in Azure backend wrapper methods
- Refactor code to meet golangci-lint complexity limits
why
- Users experiencing stack overflow panics from circular dependencies in component configurations
- Need to detect cycles before they cause panics and provide actionable error messages
- Performance tracking required for all public functions per Atmos conventions
- Reduce cyclomatic complexity and function length for maintainability
Implementation Details
Architecture
- Goroutine-local storage using sync.Map with goroutine IDs to maintain isolated resolution contexts
- O(1) cycle detection using visited-set pattern with Push/Pop operations
- Call stack tracking for building detailed error messages showing dependency chains
- Negligible performance impact (<10 microseconds overhead, <0.001% of total execution time)
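The visited-set pattern with Push/Pop can be sketched in a few lines of Go. This is a simplified illustration, not the code in `yaml_func_resolution_context.go`: the type and function names are hypothetical, the dependency graph is a plain map, and the goroutine-local storage is omitted.

```go
package main

import (
	"fmt"
	"strings"
)

// resolutionContext tracks which nodes are currently being resolved.
type resolutionContext struct {
	visited map[string]bool // O(1) membership check
	stack   []string        // ordered call stack, used for error messages
}

func newResolutionContext() *resolutionContext {
	return &resolutionContext{visited: make(map[string]bool)}
}

// Push marks node as in progress, or returns an error describing the cycle.
func (c *resolutionContext) Push(node string) error {
	if c.visited[node] {
		chain := append(append([]string{}, c.stack...), node)
		return fmt.Errorf("circular dependency detected: %s", strings.Join(chain, " -> "))
	}
	c.visited[node] = true
	c.stack = append(c.stack, node)
	return nil
}

// Pop removes the most recently pushed node.
func (c *resolutionContext) Pop() {
	if len(c.stack) == 0 {
		return
	}
	last := c.stack[len(c.stack)-1]
	c.stack = c.stack[:len(c.stack)-1]
	delete(c.visited, last)
}

// resolve walks deps depth-first, wrapping each node in Push/Pop.
func resolve(ctx *resolutionContext, node string, deps map[string][]string) error {
	if err := ctx.Push(node); err != nil {
		return err
	}
	defer ctx.Pop()
	for _, d := range deps[node] {
		if err := resolve(ctx, d, deps); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	deps := map[string][]string{
		"core/vpc":             {"core/transit-gateway"},
		"core/transit-gateway": {"core/vpc"},
	}
	fmt.Println(resolve(newResolutionContext(), "core/vpc", deps))
}
```

Because a node is removed from the visited set on Pop, a component reached twice through different branches (a diamond dependency) is not reported as a cycle; only a node that is still on the call stack triggers the error.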
Test Coverage
- 27 comprehensive tests across 4 test files
- 100% coverage on core resolution context logic
- ~75-80% overall coverage (excluding benchmark and integration tests)
- Benchmark tests proving negligible performance impact
- Integration tests for real-world scenarios (currently skipped - require state backends)
Performance
- Push operation: ~266 nanoseconds
- Pop operation: ~70 nanoseconds
- GetGoroutineID: ~2,434 nanoseconds
- Total overhead: <10 microseconds (<0.001% of execution time)
Error Messages
Before (stack overflow panic):
runtime: goroutine stack exceeds 1000000000-byte limit
fatal error: stack overflow
After (actionable error with dependency chain):
circular dependency detected
Dependency chain:
1. Component 'vpc' in stack 'core'
→ !terraform.state transit-gateway core transit_gateway_id
2. Component 'transit-gateway' in stack 'core'
→ !terraform.state vpc core vpc_id
3. Component 'vpc' in stack 'core' (cycle detected)
→ !terraform.state transit-gateway core transit_gateway_id
To fix this issue:
- Review your component dependencies and break the circular reference
- Consider using Terraform data sources or direct remote state instead
- Ensure dependencies flow in one direction only
references
- Fixes community-reported stack overflow issue in YAML function processing
- See `docs/prd/circular-dependency-detection.md` for complete architecture and design decisions
- See `docs/circular-dependency-detection.md` for user documentation and troubleshooting
- See `CIRCULAR_DEPENDENCY_DETECTION_SUMMARY.md` for implementation summary
Files Changed
- Core implementation: `internal/exec/yaml_func_resolution_context.go` (161 lines)
- Tests: 4 test files (1,093 lines total)
- Modified: `yaml_func_utils.go`, `yaml_func_terraform_state.go`, `yaml_func_terraform_output.go`
- Documentation: PRD, user docs, summary
- Test fixtures: 7 YAML files + 2 Terraform components
- Additional: Fixed Azure backend perf.Track() issues
🤖 Generated with Claude Code
Co-Authored-By: Claude noreply@anthropic.com
Summary by CodeRabbit
- New Features
- Automatic circular dependency detection for YAML functions including terraform.state, terraform.output, and custom component functions. The system detects cycles early before runtime failures occur, providing comprehensive error messages that display the full dependency chain and component relationships. Users receive actionable remediation guidance and suggested fixes to resolve circular dependencies in their infrastructure configurations.
fix: Remove exclude directive to enable go install @osterman (#1709)
## what
- Removed `exclude` directive from go.mod that was blocking `go install github.com/cloudposse/atmos@main`
- Updated go install compatibility test to check for both `replace` and `exclude` directives
why
- The `exclude` directive in go.mod prevents users from installing Atmos via `go install`
- `go install pkg@version` rejects modules whose go.mod contains `replace` or `exclude` directives (by design)
- This breaks a documented installation method and creates user friction
- The excluded version (godbus/dbus v0.0.0-20190726142602-4481cbc300e2) is already superseded by explicitly required versions (v4.1.0 and v5.1.0)
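A check for these directives can be sketched with the standard library alone. This is an illustrative line scan, not the project's actual compatibility test; `blockingDirectives` is a hypothetical helper, and a real test might prefer the `golang.org/x/mod/modfile` parser.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// blockingDirectives returns every replace/exclude directive keyword found
// in the given go.mod content, in order of appearance.
func blockingDirectives(gomod string) []string {
	var found []string
	sc := bufio.NewScanner(strings.NewReader(gomod))
	for sc.Scan() {
		line := sc.Text()
		// Drop trailing comments before inspecting the line.
		if i := strings.Index(line, "//"); i >= 0 {
			line = line[:i]
		}
		fields := strings.Fields(line)
		if len(fields) == 0 {
			continue
		}
		// Matches both single-line ("exclude mod v1") and block ("exclude (") forms.
		if fields[0] == "replace" || fields[0] == "exclude" {
			found = append(found, fields[0])
		}
	}
	return found
}

func main() {
	gomod := `module example.com/tool

go 1.25.0

require github.com/godbus/dbus/v5 v5.1.0

exclude github.com/godbus/dbus v0.0.0-20190726142602-4481cbc300e2
`
	fmt.Println(blockingDirectives(gomod))
}
```

A test built on such a helper would fail whenever the returned slice is non-empty, which guards against either directive being reintroduced.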
references
- Related Go issues: golang/go#44840, golang/go#69762, golang/go#50698
- The test now prevents future regressions for both `replace` and `exclude` directives
🤖 Generated with Claude Code
fix: Upgrade to Go 1.25 and make test logging respect -v flag @osterman (#1706)
## what
- Upgraded Go version from 1.24.8 to 1.25.0
- Configured Atmos logger in tests to respect `testing.Verbose()` flag
- Tests are now quiet by default, verbose with `-v` flag
- Added missing `perf.Track()` calls to Azure backend wrapper methods
why
- Go 1.24.8 had a runtime panic bug in `unique_runtime_registerUniqueMapCleanup` on macOS ARM64 (golang/go#69729)
- This caused `TestGetAffectedComponents` to panic during cleanup on macOS CI
- Test output was always verbose because logger was set to `InfoLevel` unconditionally
- Go 1.25.0 fixes the runtime panic bug
- Linter enforcement requires `perf.Track()` on all public functions
changes
- go.mod: Upgraded from `go 1.24.8` to `go 1.25.0`
- tests/cli_test.go:
  - Moved logger level configuration from `init()` to `TestMain()`
  - Logger now respects `-v` flag using switch statement:
    - `ATMOS_TEST_DEBUG=1`: `DebugLevel` (everything)
    - `-v` flag: `InfoLevel` (info, warnings, errors)
    - Default: `WarnLevel` (only warnings and errors)
  - Removed debug pattern logging loop (was spam)
  - All helpful `t.Logf()` messages preserved (work correctly with `-v`)
- internal/terraform_backend/terraform_backend_azurerm.go:
  - Added `perf.Track()` to `GetBody()` wrapper method
  - Added `perf.Track()` to `DownloadStream()` wrapper method
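The log-level precedence described for `TestMain()` can be sketched as a small switch. This is an illustration only: `chooseLogLevel` is a hypothetical helper that returns level names as strings instead of calling the Atmos logger.

```go
package main

import "fmt"

// chooseLogLevel mirrors the precedence described above:
// the debug environment variable wins, then -v, then the quiet default.
func chooseLogLevel(debugEnv string, verbose bool) string {
	switch {
	case debugEnv == "1":
		return "debug" // everything
	case verbose:
		return "info" // info, warnings, errors
	default:
		return "warn" // only warnings and errors
	}
}

func main() {
	fmt.Println(chooseLogLevel("1", false))
	fmt.Println(chooseLogLevel("", true))
	fmt.Println(chooseLogLevel("", false))
}
```

In a real `TestMain`, the inputs would come from `os.Getenv("ATMOS_TEST_DEBUG")` and `testing.Verbose()`. Note that `flag.Parse()` has to be called inside `TestMain` before `testing.Verbose()` reflects the `-v` flag.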
testing
- `go test ./tests` → Quiet (no logger output)
- `go test ./tests -v` → Verbose (shows INFO logs)
- `go test ./internal/exec -run TestGetAffectedComponents` → Passes without panic
references
- Fixes the macOS panic from https://github.com/cloudposse/atmos/actions/runs/18656461566/job/53187085704
- Related Go issue: golang/go#69729
Add Azure Blob Storage (azurerm) backend support for !terraform.state function @jamengual (#1610)
## what
- Implemented Azure Blob Storage backend support for the `!terraform.state` YAML function
- Added comprehensive unit tests with 100% coverage for the new backend
- Updated error definitions, registry, and documentation
why
- The `!terraform.state` function previously only supported `local` and `s3` backends
- Azure users needed native azurerm backend support to read Terraform state directly from Azure Blob Storage
- This provides the fastest way to retrieve Terraform outputs without Terraform initialization overhead
changes
- New Implementation: `internal/terraform_backend/terraform_backend_azurerm.go`
  - Implements azurerm backend reader following S3 backend patterns
  - Uses Azure SDK with DefaultAzureCredential for authentication (Managed Identity, Service Principal, Azure CLI, etc.)
  - Supports workspace-based blob paths (`env:/{workspace}/{key}` for non-default workspaces)
  - Includes client caching, retry logic (2 retries with exponential backoff), and proper error handling
  - Handles 404 (blob not found) gracefully by returning nil (component not provisioned yet)
  - Handles 403 (permission denied) with descriptive error messages
- Comprehensive Tests: `internal/terraform_backend/terraform_backend_azurerm_test.go`
  - 8 test functions covering all scenarios with mocked Azure SDK client
  - Tests workspace handling (default vs non-default), blob not found, permission denied, network errors, retry logic, and error cases
  - All tests pass with no external dependencies required
- Error Definitions: `errors/errors.go`
  - Added 7 new Azure-specific static errors following project patterns
  - ErrGetBlobFromAzure, ErrReadAzureBlobBody, ErrCreateAzureCredential, ErrCreateAzureClient, ErrAzureContainerRequired, ErrStorageAccountRequired, ErrAzurePermissionDenied
- Registry Update: `internal/terraform_backend/terraform_backend_registry.go`
  - Registered ReadTerraformBackendAzurerm in the backend registry
- Error Message Update: `internal/terraform_backend/terraform_backend_utils.go`
  - Updated supported backends list to include `azurerm`
- Documentation Update: `website/docs/functions/yaml/terraform.state.mdx`
  - Added azurerm to the list of supported backend types
  - Updated warning message to reflect azurerm support
- Dependencies: `go.mod`
  - Moved `github.com/Azure/azure-sdk-for-go/sdk/storage/azblob` from indirect to direct dependency (already present in project)
implementation notes
- Follows established patterns from S3 backend implementation
- Uses wrapper pattern (AzureBlobAPI interface) to enable testing without actual Azure connectivity
- Implements proper workspace path handling matching Azure backend behavior (`env:/{workspace}/{key}`)
- All comments end with periods (enforced by golangci-lint)
- Imports organized in 3 groups (stdlib, 3rd-party, atmos) as per CLAUDE.md
- Performance tracking added with `defer perf.Track()` on all functions
- Cross-platform compatible using Azure SDK (not CLI commands)
test results
=== RUN TestReadTerraformBackendAzurermInternal_Success
=== RUN TestReadTerraformBackendAzurermInternal_Success/successful_read_default_workspace
=== RUN TestReadTerraformBackendAzurermInternal_Success/successful_read_dev_workspace
=== RUN TestReadTerraformBackendAzurermInternal_Success/successful_read_prod_workspace
=== RUN TestReadTerraformBackendAzurermInternal_Success/successful_read_empty_workspace
=== RUN TestReadTerraformBackendAzurermInternal_Success/successful_read_default_key
--- PASS: TestReadTerraformBackendAzurermInternal_Success (0.00s)
=== RUN TestReadTerraformBackendAzurermInternal_BlobNotFound
--- PASS: TestReadTerraformBackendAzurermInternal_BlobNotFound (0.00s)
=== RUN TestReadTerraformBackendAzurermInternal_PermissionDenied
--- PASS: TestReadTerraformBackendAzurermInternal_PermissionDenied (0.00s)
=== RUN TestReadTerraformBackendAzurermInternal_NetworkError
--- PASS: TestReadTerraformBackendAzurermInternal_NetworkError (4.00s)
=== RUN TestReadTerraformBackendAzurermInternal_RetrySuccess
--- PASS: TestReadTerraformBackendAzurermInternal_RetrySuccess (2.00s)
=== RUN TestReadTerraformBackendAzurermInternal_MissingContainerName
--- PASS: TestReadTerraformBackendAzurermInternal_MissingContainerName (0.00s)
=== RUN TestReadTerraformBackendAzurermInternal_ReadBodyError
--- PASS: TestReadTerraformBackendAzurermInternal_ReadBodyError (0.00s)
PASS
ok github.com/cloudposse/atmos/internal/terraform_backend 7.011s
Summary by CodeRabbit
- New Features
  - Azure Blob Storage (azurerm) support for reading Terraform state with workspace-aware paths, authentication, retries, and client caching.
- Documentation
  - Added detailed docs and a blog post covering Azure backend usage, examples, migration guidance, and “Try It Now” steps.
- Improvements
  - Clearer permission/not-found reporting and added Azure-specific error signals for more precise error handling.
- Tests
  - Extensive unit and integration tests plus Azure credential precondition checks.
- Chores
  - Updated .gitignore with developer tool patterns.
test(auth): Increase auth test coverage from 6% to 80% with mock provider @osterman (#1702)
## what
- Add comprehensive unit and integration tests for Atmos auth system using the existing mock provider
- Increase test coverage from **6% to ~80%** (target: 80-90% ✅)
- Add regression tests to prevent recurrence of user-reported browser authentication issue
- Achieve **100% coverage** for mock provider implementation
why
- Current auth test coverage was critically low (6%), making it difficult to catch bugs
- User complaint (Bogdan) about browser authentication triggering on every command needed verification and regression protection
- Mock provider was implemented but had zero test coverage
- Need confidence that auth system works correctly without requiring real cloud credentials
Coverage Improvements
| Package | Before | After | Improvement |
|---|---|---|---|
| pkg/auth | 6.2% | 84.6% | +78.4pp |
| pkg/auth/providers/mock | 0% | 100.0% | +100pp |
| pkg/auth/utils | 0% | 100.0% | +100pp |
| pkg/auth/validation | 0% | 90.0% | +90pp |
| pkg/auth/list | 0% | 89.5% | +89.5pp |
| pkg/auth/cloud/aws | 0% | 79.2% | +79.2pp |
| pkg/auth/providers/github | 0% | 78.3% | +78.3pp |
| pkg/auth/factory | 0% | 77.8% | +77.8pp |
| pkg/auth/credentials | 0% | 75.8% | +75.8pp |
| pkg/auth/providers/aws | 0% | 67.8% | +67.8pp |
| pkg/auth/identities/aws | 2.3% | 62.5% | +60.2pp |
Overall: ~6% → ~80% ✅
Key Additions
1. Mock Provider Unit Tests (100% coverage)
- `pkg/auth/providers/mock/provider_test.go` - 15 comprehensive tests
- `pkg/auth/providers/mock/identity_test.go` - 13 comprehensive tests
- Tests cover: authentication, expiration, concurrency, interface compliance
2. Credential Caching Regression Tests
- `cmd/auth_caching_test.go` - 4 test functions with multiple subtests
- Verifies credentials are cached after login and reused
- Ensures fast execution (< 2s) vs browser auth (5-30s)
- Tests multi-identity scenarios
3. Integration Test Scenarios
- `tests/test-cases/auth-mock.yaml` - 20+ test scenarios
- Auth login, whoami, env, exec, list, logout commands
- Multiple output formats (json, bash, dotenv)
- Error handling and edge cases
User Issue: Browser Auth on Every Command
Status: LIKELY FIXED ✅
The issue where browser authentication was triggered on every command appears to have been resolved by recent PRs (#1655, #1653, #1640). This PR adds comprehensive regression tests to:
- Verify credentials are cached after authentication
- Ensure subsequent commands use cached credentials
- Confirm fast execution without browser prompts
- Prevent regression of this issue
Testing
# Run mock provider tests
$ go test ./pkg/auth/providers/mock/... -v
=== RUN TestNewProvider
=== RUN TestProvider_Authenticate
=== RUN TestProvider_Concurrency
... 28 tests PASS
coverage: 100.0% of statements
# Run auth package tests
$ go test -cover ./pkg/auth/...
pkg/auth: 84.6% coverage ✅
pkg/auth/providers/mock: 100% coverage ✅
pkg/auth/utils: 100% coverage ✅
... all passing
Benefits
- No cloud credentials needed for auth testing
- Fast test execution (milliseconds vs seconds)
- Deterministic results (fixed expiration dates)
- CI/CD ready (no secrets required)
- Regression protection for caching issue
- 80% coverage meets industry standards
references
Add auth console command for web console access @osterman (#1684)
## what
- Add `atmos auth console` command to open cloud provider web consoles using authenticated credentials
- Implement AWS console access via federation endpoint (similar to aws-vault login)
- Add 100+ AWS service destination aliases for convenient access
- Create dedicated `pkg/http` package for HTTP client utilities
- Add pretty formatted output using lipgloss with Atmos theme colors
- Consolidate browser opening functionality to existing `OpenUrl` helper
why
- Provides convenient browser access to cloud consoles without manually copying credentials
- Eliminates context switching between terminal and browser for console access
- Uses provider-native federation endpoints for secure temporary access
- Extensible interface pattern supports future Azure/GCP implementations
features
- Service Aliases: Use shorthand like `s3`, `ec2`, `lambda` instead of full console URLs
- Autocomplete: Shell completion for destination and identity flags
- Session Control: Configurable duration (up to 12 hours for AWS) with expiration display
- Clean Output: URL only shown on error or with `--no-open` flag
- Scriptable: `--print-only` flag for piping URLs to other tools
- Provider-Agnostic: Interface design ready for multi-cloud support
implementation
- Created `ConsoleAccessProvider` interface in `pkg/auth/types/interfaces.go`
- Implemented `ConsoleURLGenerator` for AWS using federation endpoint
- Added `ResolveDestination()` with case-insensitive alias lookup
- Moved HTTP utilities from `pkg/utils` to dedicated `pkg/http` package
- Used existing `OpenUrl()` function for cross-platform browser opening
- Added comprehensive tests achieving 85.9% coverage
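Case-insensitive alias resolution of the kind described above can be sketched as follows. This is a hypothetical illustration, not the actual `ResolveDestination()`: the three-entry alias table, its URLs, and the lowercase `resolveDestination` function are assumptions made for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// aliases is a tiny illustrative table; the real command ships 100+ entries
// and the URLs here are assumptions.
var aliases = map[string]string{
	"s3":     "https://console.aws.amazon.com/s3",
	"ec2":    "https://console.aws.amazon.com/ec2",
	"lambda": "https://console.aws.amazon.com/lambda",
}

// resolveDestination accepts either a full HTTPS URL or a service alias.
// Aliases are matched case-insensitively.
func resolveDestination(dest string) (string, error) {
	d := strings.TrimSpace(dest)
	if strings.HasPrefix(d, "https://") {
		return d, nil // already a full URL, pass through unchanged
	}
	if url, ok := aliases[strings.ToLower(d)]; ok {
		return url, nil
	}
	return "", fmt.Errorf("unknown destination alias %q", dest)
}

func main() {
	for _, in := range []string{"S3", "lambda", "https://example.com/console", "nope"} {
		url, err := resolveDestination(in)
		fmt.Println(in, "=>", url, err)
	}
}
```

Normalizing the input with `strings.ToLower` before the map lookup keeps the table itself small, since only one spelling of each alias has to be stored.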
testing
- Unit tests for console URL generation (15 test cases)
- Unit tests for destination alias resolution (100+ aliases tested)
- Mock HTTP client for testing without network calls
- Table-driven tests with edge case coverage
documentation
- CLI reference: `website/docs/cli/commands/auth/console.mdx`
- Blog post: `website/blog/2025-10-20-auth-console-web-access.md`
- Proposal document: `docs/proposals/auth-web-console.md`
- Embedded markdown usage examples
references
- Similar to aws-vault's console login feature
- AWS Federation Endpoint: https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_providers_enable-console-custom-url.html
Summary by CodeRabbit
- New Features
  - Added atmos auth console: opens cloud provider web consoles via temporary sign-in URLs (AWS supported now; Azure/GCP planned).
  - Supports service aliases (s3, ec2, etc.), full destination URLs, session duration (AWS up to 12h), issuer, --print-only, --no-open and identity selection/completion.
- Documentation
  - New CLI docs, usage guide, PRD and blog post with examples and troubleshooting.
- Tests
  - Expanded tests and CI snapshots for the new command and destination resolution.
fix: Only log verbose test output on failure @osterman (#1704)
## what
- Replace unconditional `t.Log()` calls with `t.Cleanup()` handlers that only output verbose YAML/data when tests fail
- Eliminate noisy stderr output during successful test runs while preserving debug information when tests fail
- Add fallback to raw data output (`%+v`) when YAML conversion produces empty strings
why
- CI test runs were showing verbose YAML dumps to stderr even when tests passed
- This cluttered test output and made it difficult to identify actual issues
- Debug information is still valuable when tests fail, but shouldn't appear during successful runs
- In verbose mode (`-v`), Go's `t.Log()` output is printed regardless of test success/failure
demo
Finally clean output!
go mod download
Running tests with subprocess coverage collection
ok github.com/cloudposse/atmos 7.020s coverage: 14.8% of statements in ./...
ok github.com/cloudposse/atmos/cmd 7.581s coverage: 20.7% of statements in ./...
ok github.com/cloudposse/atmos/cmd/about 0.134s coverage: 0.1% of statements in ./...
ok github.com/cloudposse/atmos/cmd/internal 0.099s coverage: 0.1% of statements in ./...
? github.com/cloudposse/atmos/cmd/markdown [no test files]
ok github.com/cloudposse/atmos/cmd/version 1.802s coverage: 1.4% of statements in ./...
ok github.com/cloudposse/atmos/errors 0.213s coverage: 0.4% of statements in ./...
ok github.com/cloudposse/atmos/internal/aws_utils 0.120s coverage: 0.1% of statements in ./...
ok github.com/cloudposse/atmos/internal/exec 84.175s coverage: 32.9% of statements in ./...
ok github.com/cloudposse/atmos/internal/terraform_backend 32.223s coverage: 0.9% of statements in ./...
github.com/cloudposse/atmos/internal/tui/atmos coverage: 0.0% of statements
github.com/cloudposse/atmos/internal/tui/components/code_view coverage: 0.0% of statements
ok github.com/cloudposse/atmos/internal/tui/templates 0.125s coverage: 0.5% of statements in ./...
github.com/cloudposse/atmos/internal/tui/templates/term coverage: 0.0% of statements
ok github.com/cloudposse/atmos/internal/tui/utils 0.218s coverage: 0.2% of statements in ./...
github.com/cloudposse/atmos/internal/tui/workflow coverage: 0.0% of statements
ok github.com/cloudposse/atmos/pkg/atlantis 1.434s coverage: 10.3% of statements in ./...
ok github.com/cloudposse/atmos/pkg/auth 0.141s coverage: 2.1% of statements in ./...
ok github.com/cloudposse/atmos/pkg/auth/cloud/aws 0.113s coverage: 0.8% of statements in ./...
ok github.com/cloudposse/atmos/pkg/auth/credentials 0.316s coverage: 0.9% of statements in ./...
ok github.com/cloudposse/atmos/pkg/auth/factory 0.141s coverage: 0.2% of statements in ./...
ok github.com/cloudposse/atmos/pkg/auth/identities/aws 0.139s coverage: 1.7% of statements in ./...
ok github.com/cloudposse/atmos/pkg/auth/list 0.138s coverage: 1.5% of statements in ./...
ok github.com/cloudposse/atmos/pkg/auth/providers/aws 0.098s coverage: 1.6% of statements in ./...
ok github.com/cloudposse/atmos/pkg/auth/providers/github 0.072s coverage: 0.3% of statements in ./...
ok github.com/cloudposse/atmos/pkg/auth/providers/mock 0.133s coverage: 0.1% of statements in ./...
ok github.com/cloudposse/atmos/pkg/auth/types 0.075s coverage: 0.2% of statements in ./...
ok github.com/cloudposse/atmos/pkg/auth/utils 0.099s coverage: 0.0% of statements in ./...
ok github.com/cloudposse/atmos/pkg/auth/validation 0.150s coverage: 0.7% of statements in ./...
ok github.com/cloudposse/atmos/pkg/aws 0.199s coverage: 2.4% of statements in ./...
ok github.com/cloudposse/atmos/pkg/component 0.898s coverage: 10.1% of statements in ./...
ok github.com/cloudposse/atmos/pkg/component/mock 0.178s coverage: 0.4% of statements in ./...
ok github.com/cloudposse/atmos/pkg/config 3.247s coverage: 5.7% of statements in ./...
ok github.com/cloudposse/atmos/pkg/config/homedir 0.073s coverage: 0.2% of statements in ./...
ok github.com/cloudposse/atmos/pkg/convert 0.048s coverage: 0.0% of statements in ./...
ok github.com/cloudposse/atmos/pkg/datafetcher 0.228s coverage: 0.2% of statements in ./...
ok github.com/cloudposse/atmos/pkg/describe 29.214s coverage: 13.2% of statements in ./...
ok github.com/cloudposse/atmos/pkg/downloader 1.115s coverage: 1.6% of statements in ./...
ok github.com/cloudposse/atmos/pkg/filematch 0.135s coverage: 0.3% of statements in ./...
github.com/cloudposse/atmos/pkg/filesystem coverage: 0.0% of statements
ok github.com/cloudposse/atmos/pkg/filetype 0.078s coverage: 0.4% of statements in ./...
ok github.com/cloudposse/atmos/pkg/generate 0.685s coverage: 7.7% of statements in ./...
ok github.com/cloudposse/atmos/pkg/git 0.164s coverage: 0.3% of statements in ./...
ok github.com/cloudposse/atmos/pkg/github 2.462s coverage: 0.4% of statements in ./...
ok github.com/cloudposse/atmos/pkg/hooks 0.264s coverage: 7.5% of statements in ./...
ok github.com/cloudposse/atmos/pkg/list 2.193s coverage: 12.0% of statements in ./...
ok github.com/cloudposse/atmos/pkg/list/errors 0.073s coverage: 0.1% of statements in ./...
ok github.com/cloudposse/atmos/pkg/list/flags 0.072s coverage: 0.1% of statements in ./...
ok github.com/cloudposse/atmos/pkg/list/format 0.119s coverage: 0.6% of statements in ./...
ok github.com/cloudposse/atmos/pkg/list/utils 0.187s coverage: 0.2% of statements in ./...
ok github.com/cloudposse/atmos/pkg/logger 0.161s coverage: 0.3% of statements in ./...
ok github.com/cloudposse/atmos/pkg/merge 0.227s coverage: 1.7% of statements in ./...
ok github.com/cloudposse/atmos/pkg/pager 0.076s coverage: 0.9% of statements in ./...
ok github.com/cloudposse/atmos/pkg/perf 1.238s coverage: 0.5% of statements in ./...
ok github.com/cloudposse/atmos/pkg/pro 0.177s coverage: 0.8% of statements in ./...
ok github.com/cloudposse/atmos/pkg/pro/dtos 0.051s coverage: 0.0% of statements in ./...
ok github.com/cloudposse/atmos/pkg/profiler 1.861s coverage: 0.4% of statements in ./...
ok github.com/cloudposse/atmos/pkg/provenance 0.130s coverage: 1.8% of statements in ./...
ok github.com/cloudposse/atmos/pkg/retry 0.176s coverage: 0.2% of statements in ./...
ok github.com/cloudposse/atmos/pkg/schema 0.070s coverage: 0.3% of statements in ./...
ok github.com/cloudposse/atmos/pkg/spacelift 0.787s coverage: 8.4% of statements in ./...
ok github.com/cloudposse/atmos/pkg/stack 0.346s coverage: 4.3% of statements in ./...
ok github.com/cloudposse/atmos/pkg/store 0.139s coverage: 1.7% of statements in ./...
ok github.com/cloudposse/atmos/pkg/telemetry 0.518s coverage: 2.7% of statements in ./...
github.com/cloudposse/atmos/pkg/telemetry/mock coverage: 0.0% of statements
ok github.com/cloudposse/atmos/pkg/ui/heatmap 0.129s coverage: 0.9% of statements in ./...
ok github.com/cloudposse/atmos/pkg/ui/markdown 0.138s coverage: 0.4% of statements in ./...
? github.com/cloudposse/atmos/pkg/ui/theme [no test files]
ok github.com/cloudposse/atmos/pkg/utils 0.743s coverage: 4.8% of statements in ./...
ok github.com/cloudposse/atmos/pkg/validate 1.354s coverage: 14.5% of statements in ./...
ok github.com/cloudposse/atmos/pkg/validator 0.116s coverage: 0.2% of statements in ./...
ok github.com/cloudposse/atmos/pkg/vender 3.308s coverage: 3.9% of statements in ./...
ok github.com/cloudposse/atmos/pkg/version 0.069s coverage: 0.0% of statements in ./...
ok github.com/cloudposse/atmos/pkg/xdg 0.046s coverage: 0.1% of statements in ./...
ok github.com/cloudposse/atmos/tests 174.022s coverage: 14.3% of statements in ./...
ok github.com/cloudposse/atmos/tests/testhelpers 90.419s coverage: 1.1% of statements in ./...
Coverage report generated: coverage.out
references
- Affects 9 test files with 29 cleanup handlers added
- Modified files:
  - `pkg/component/component_processor_test.go`
  - `pkg/describe/describe_affected_test.go`
  - `pkg/describe/describe_component_test.go`
  - `pkg/describe/describe_dependents_test.go`
  - `pkg/describe/describe_stacks_test.go`
  - `pkg/list/list_components_test.go`
  - `pkg/merge/merge_test.go`
  - `pkg/spacelift/spacelift_stack_processor_test.go`
  - `pkg/stack/stack_processor_test.go`
🤖 Generated with Claude Code
Add linter rule for missing defer perf.Track() calls @osterman (#1698)
## what
- Added new `perf-track` linter rule to catch missing `defer perf.Track()` calls
- Enabled by default with explicit package and type exclusions
- Integrated with existing lintroller custom linter framework
why
- Enforces coding guidelines requiring performance tracking on all public functions
- Catches violations early in development before code review
- Prevents missing perf tracking that would be tedious to find manually
- Uses explicit exclusions for infrastructure code (logger, profiler, perf, store, ui, tui)
references
- Follows coding guidelines in `CLAUDE.md` for mandatory `defer perf.Track()` usage
- Addresses hundreds of potential violations by catching them at lint time
- Exclusions prevent infinite recursion and avoid tracking overhead in low-level code
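To make the rule concrete, here is a rough sketch of the pattern it checks for. The `track` function below is a hypothetical stand-in written for this example and is not the Atmos `perf.Track` API; the two exported functions only show what a compliant and a non-compliant function body would look like.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

var (
	mu      sync.Mutex
	timings = map[string]time.Duration{}
)

// track is a hypothetical stand-in for a perf-tracking call: it starts a timer
// and returns a function that records the elapsed time when invoked.
func track(name string) func() {
	start := time.Now()
	return func() {
		mu.Lock()
		defer mu.Unlock()
		timings[name] += time.Since(start)
	}
}

// ProcessStack is compliant: its first statement defers the tracking call.
// Note the trailing (): track runs immediately, its result runs on return.
func ProcessStack(name string) string {
	defer track("main.ProcessStack")()
	return "processed " + name
}

// DescribeStack is the shape such a rule would flag: exported, but untracked.
func DescribeStack(name string) string {
	return "described " + name
}

func main() {
	fmt.Println(ProcessStack("dev"))
	fmt.Println(DescribeStack("dev"))
	mu.Lock()
	defer mu.Unlock()
	fmt.Println("tracked functions:", len(timings))
}
```

The sketch also hints at why low-level packages are excluded: if the tracker itself were tracked, every call would recurse into another tracking call.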
🤖 Generated with Claude Code
Summary by CodeRabbit
- New Features
  - Added a lint rule that enforces a defer-based performance-tracking call at the start of exported functions/methods; enabled by default with a config toggle to disable.
- Tests
  - Added unit tests and example cases demonstrating compliant and non-compliant exported functions/methods for the new rule.
- Documentation
  - Updated lint configuration docs to mention the new performance-tracking check and its settings.
Add condition to skip Docker build for prerelease @goruha (#1700)
## what
- Add condition to skip Docker build for prerelease
why
- Exclude prerelease versions from Homebrew workflows
Summary by CodeRabbit
- Chores
- Build workflow updated so Docker image build/push steps are skipped for prerelease releases.
- Dependency review job runner specification changed to a composite runner configuration with additional runner attributes.
feat: Add `atmos auth shell` command @osterman (#1640)
## what
- Add `atmos auth shell` command to launch an interactive shell with authentication environment variables pre-configured
- Implement shell detection that respects `$SHELL` environment variable with fallbacks to bash/sh
- Add `--shell` flag with viper binding to `ATMOS_SHELL` and `SHELL` environment variables
- Support `--` separator for passing custom shell arguments to the launched shell
- Track shell nesting level with `ATMOS_SHLVL` environment variable
- Propagate shell exit codes back to Atmos process
- Set `ATMOS_IDENTITY` environment variable in the shell session
why
- Users need an easy way to work interactively with cloud credentials without manually managing environment variables
- Similar to `atmos terraform shell`, this provides a consistent experience for authenticated sessions
- Allows running multiple commands in a single authenticated session without re-authenticating
- Supports custom shell configurations and arguments for flexibility
references
- Similar to existing `atmos terraform shell` command implementation
- Follows authentication patterns from `atmos auth exec` and `atmos auth env`
testing
- Comprehensive unit tests with 80-100% coverage on testable functions
- 25 passing tests covering:
- Shell detection and fallback logic (100% coverage)
- Environment variable management (100% coverage)
- Shell nesting level tracking (83-100% coverage)
- Exit code propagation (tested with codes 0, 1, 42)
- Flag parsing and viper integration
- Cross-platform support (Unix and Windows)
- All linting checks passing (0 issues)
- Pre-commit hooks passing
documentation
- Added `website/docs/cli/commands/auth/auth-shell.mdx` with full command documentation
- Created `cmd/markdown/atmos_auth_shell_usage.md` with usage examples
- Includes purpose note, usage patterns, examples, and environment variable reference
Summary by CodeRabbit
- New Features
  - Interactive authenticated shell with shell selection, argument passthrough, nested-shell tracking, and identity selection.
  - Pluggable credential storage: system, file (path/password) and memory backends selectable via config/env.
  - Deterministic mock auth provider for testing.
- Documentation
  - New auth-shell docs, usage examples, blog posts, keyring-backends guide, XDG docs, and PRD.
- Tests
  - Expanded unit/integration coverage for shell flows, keyring backends, XDG, and credential stores.
- Chores
  - Added keyring-related dependencies, CI/workflow and tooling adjustments.
Improve auth login with identity selection @osterman (#1655)
## what
- Modified the `auth login` command to automatically prompt for an identity when no `--identity` flag is provided.
- This leverages the existing `authManager.GetDefaultIdentity()` which handles interactive selection and fallback logic.
- Updated documentation to reflect this new behavior.
why
- Users were prompted to manually select an identity in interactive sessions when no default was set.
- This change simplifies the login process by automatically invoking the interactive selector or using the default identity when available, improving user experience and reducing manual input.
references
- No specific issue linked - this is a user experience enhancement.
Replace deny-licenses with allow-licenses and remove redundant workflow @osterman (#1692)
## what
- Delete redundant `.github/workflows/dependabot.yml` workflow file
- Update `dependency-review.yml` to use `allow-licenses` instead of deprecated `deny-licenses` parameter
- Maintain PR commenting functionality with `comment-summary-in-pr: always`
- Allow only permissive licenses: MIT, Apache-2.0, BSD-2-Clause, BSD-3-Clause, ISC, MPL-2.0, 0BSD, Unlicense, CC0-1.0
why
- GitHub deprecated the `deny-licenses` parameter in favor of `allow-licenses` for better security posture
- The `dependabot.yml` workflow was redundant; we already have `dependency-review.yml` that provides more comprehensive dependency review
- Using an allow-list approach is more secure than a deny-list approach
- Consolidating to a single dependency review workflow reduces maintenance overhead
references
Summary by CodeRabbit
- Chores
- Implemented a 2-week minimum age requirement for automated dependency updates
- Updated dependency review workflow to enforce permissive open-source licenses only
- Consolidated dependency management configurations
Compress CLAUDE.md and add size limit enforcement @osterman (#1693)
## what
- Compressed CLAUDE.md from 40.3k chars to 6.3k chars (84% reduction)
- Added GitHub action to enforce 40k character limit on CLAUDE.md
- Refactored into reusable composite action pattern
why
- Large CLAUDE.md files impact performance and token usage
- Need automated enforcement to prevent file bloat
- Reusable action pattern improves maintainability
Compression Details
Metrics:
- Size: 40,300 chars → 6,301 chars (84.4% reduction)
- Lines: 1,183 → 165 (86.0% reduction)
- Current usage: 15% of 40k limit
Techniques Applied:
- Removed verbose explanations, kept terse requirements
- Consolidated redundant examples
- Merged related sections
- Preserved all MANDATORY rules
What's Preserved:
✅ All MANDATORY requirements
✅ Code patterns and conventions
✅ Error handling strategies
✅ Testing requirements
✅ CLI command structure
✅ Development workflows
✅ Cross-platform compatibility rules
✅ Git and PR guidelines
GitHub Action Structure
.github/
├── actions/
│   └── check-claude-md-size/
│       ├── action.yml    # Composite action with all logic
│       └── README.md     # Action documentation
└── workflows/
    └── claude.yml        # Simple 16-line workflow
Action Features:
- Validates file size on PR changes
- Posts/updates intelligent PR comments
- Fails CI if limit exceeded
- Configurable file path and size limit
- Provides outputs: size, exceeds-limit, usage-percent
Triggers:
- Pull requests modifying CLAUDE.md
- Changes to workflow or action files
references
- Follows composite action best practices
- Pattern similar to existing actions in the ecosystem
- Maintains consistency with project's CI/CD approach
Summary by CodeRabbit
- New Features
  - Automated CLAUDE.md size validation with configurable limits; posts and updates PR comments when limits are exceeded or resolved.
- Documentation
  - Reworked CLAUDE.md to emphasize architecture and mandatory design patterns instead of granular step-by-step procedures.
  - Added user-facing documentation for the CLAUDE.md size-check action and its usage.
Add auth logout command @osterman (#1656)
## what
This pull request introduces the `atmos auth logout` command, enabling users to securely remove locally cached credentials. The command supports:
- Identity-specific logout: Removes credentials for a given identity and its entire authentication chain.
- Provider-specific logout: Removes all credentials associated with a particular provider.
- Interactive mode: Prompts the user to select what to logout when no arguments are provided.
- Dry-run mode: Previews what would be removed without making changes.
- Comprehensive cleanup: Deletes credentials from the system keyring and provider-specific files (e.g., AWS credentials).
- Best-effort error handling: Continues cleanup even if individual steps fail, reporting all encountered errors.
why
This feature addresses several key pain points:
- Security: Allows users to securely remove stale credentials, reducing the risk of unauthorized access.
- Developer Experience: Simplifies switching between different identities or environments by providing a clean way to remove existing credentials.
- Compliance: Enables auditing of credential removal and ensures adherence to security policies.
- Troubleshooting: Provides a straightforward method to clear authentication caches when debugging.
The implementation uses native Go operations for file system cleanup and integrates with go-keyring for cross-platform credential store access. It leverages Charmbracelet libraries for a polished interactive user experience and styled output.
references
closes #735
Summary by CodeRabbit
Release Notes
- New Features
  - Added `atmos auth logout` CLI command to remove stored credentials
  - Supports logout by identity, by provider, or all identities at once
  - Interactive mode to select which credentials to remove
  - Dry-run mode to preview credential removals without executing
  - Browser session warning displayed after successful logout
- Documentation
  - Added guides and reference documentation for logout workflows and usage
Replace custom license-check with GitHub dependency-review-action @osterman (#1690)
## what
- Replaced custom license-check action (308 lines) with GitHub's native `dependency-review-action`
- Simplified workflow from 44 lines to 18 lines with better functionality
- Added automated NOTICE file generation and validation to CI
- Workflow now:
  - Validates licenses using GitHub's dependency graph
  - Blocks PRs with forbidden licenses (GPL, AGPL, etc.)
  - Generates NOTICE file using `go-licenses`
  - Fails CI if NOTICE file is out of date
why
- Reduce maintenance burden: GitHub's native action requires zero maintenance vs custom bash fighting `go-licenses` bugs
- Better reliability: Native GitHub solution works across all ecosystems, not just Go
- Automated NOTICE updates: Ensures NOTICE file stays in sync with dependencies automatically
- Clearer error messages: Developers get actionable feedback when NOTICE file needs updating
- Industry standard: Uses same tooling as thousands of other repositories
references
- GitHub dependency-review-action
- google/go-licenses - Still used for NOTICE generation
- Replaces `.github/actions/license-check/` (264 lines) and custom workflow (44 lines)
Troubleshooting Notes
autofix.ci Artifact Upload Errors (RESOLVED)
Error encountered:
Attempt 4 of 5 failed with error: Unexpected token 'O', "Original A"... is not valid JSON
Error: Failed to CreateArtifact: Failed to make request after 5 attempts
Root Cause:
When using RunsOn self-hosted runners with extras=s3-cache, the runs-on/action@v2 step is required for artifact uploads to work. Without it, the artifact API receives HTML error pages instead of JSON responses.
Fix Applied:
- Added `runs-on/action@v2` as first step in autofix.yml (required for S3 cache compatibility)
- Added `permissions: { contents: read, actions: write }` (was empty `{}` which grants NO permissions)
- Upgraded autofix-ci/action from v1.3.1 to v1.3.2
Reference:
- RunsOn S3 Cache Documentation
- Key quote: "If you have enabled the `s3-cache` extra and are using the `actions/upload-artifact@v4` action in your workflows, you must ensure that you have also included the `runs-on/action@v2` action in your jobs."
Time saved for future developers: ~2 hours of debugging 🎯
Summary by CodeRabbit
- New Features
  - Added automatic dependency license review to flag restricted licenses (GPL, LGPL, AGPL) on pull requests.
  - Added vulnerability severity checks to the dependency review process.
  - Introduced comprehensive NOTICE file documenting all third-party dependencies and their licenses.
- Documentation
  - Added documentation for license generation utilities and scripts.
Add Component Registry Pattern and Mock Component @osterman (#1648)
## what
This Pull Request introduces the Component Registry Pattern to Atmos, enabling extensible support for various component types. It lays the foundation for adding new infrastructure tools as plugins in the future.
Key changes include:
- ComponentProvider Interface: A new interface defining the contract for all component providers.
- Component Registry: A thread-safe global registry to manage component providers.
- Mock Component Provider: A proof-of-concept implementation for testing the registry and component lifecycle without external dependencies. It demonstrates inheritance, merging, and cross-component dependencies.
- Hybrid Configuration Schema: `pkg/schema/schema.go` is updated to support both statically defined built-in component types (Terraform, Helmfile, Packer) and dynamically registered plugin types via the `Plugins` map.
- Sentinel Errors: New sentinel errors related to component providers and configurations are added to `errors/errors.go`.
- JSON Schema Updates: Schemas in `pkg/datafetcher/schema/` are modified to allow additional properties for component types, accommodating the hybrid configuration.
- Developer Guide: A new markdown file `docs/developing-component-plugins.md` is added, detailing how to create new component plugins.
why
The existing hardcoded approach for component types (Terraform, Helmfile, Packer) limits extensibility and maintainability. This PR introduces a more robust and flexible pattern:
- Extensibility: Allows easy addition of new component types (e.g., Pulumi, CDK, CloudFormation) without modifying core Atmos code.
- Plugin Support: Paves the way for external component plugins in future phases.
- Testability: The mock component enables thorough testing of the registry pattern, configuration inheritance, and dependency resolution without requiring external tools or cloud provider access.
- Consistency: Adopts a pattern similar to the existing command registry, promoting a unified architectural approach.
- Maintainability: Centralizes component logic within providers, reducing code duplication and improving clarity.
- Backward Compatibility: Existing configurations and functionality remain unaffected. The hybrid schema ensures existing component types continue to work seamlessly while introducing the new pattern.
- Enhanced Testing: Introduces specific test coverage requirements (90%+) for the registry and mock component, including thread-safety and edge-case testing.
references
closes #589
closes #600
closes #601
Summary by CodeRabbit
- New Features
  - Adds a component registry, plugin-style component support, and a mock provider for testing; components can now be discovered at runtime and report available commands.
  - Component configuration now accepts dynamic plugin entries (new Plugins field) for greater flexibility.
- Documentation
  - New developer guide for building component plugins, a registry migration pattern, and expanded development requirements and best practices.
- Tests
  - Comprehensive registry and mock-provider test suites and updated CLI snapshot to show Plugins field.
Fix blog post ordering and add explicit dates @osterman (#1689)
## what
- Add explicit `date:` field to all blog post frontmatter for consistent ordering
- Fix welcome post date to 2025-10-12 so it appears first in the changelog
- Fix chdir post filename and date to 2025-10-19 (actual PR merge date)
- Add truncate markers to chdir and pager posts for proper summaries
- Remove duplicate `index.md` that was causing routing conflicts
why
- Blog posts were displaying in incorrect chronological order
- Some posts were missing truncate markers, causing warnings during build
- Welcome post should appear first as it introduces the changelog
- Duplicate index.md was causing Docusaurus routing conflicts
references
- Fixes blog post ordering issues identified by user
Summary by CodeRabbit
- Documentation
- Added new blog posts covering Atmos authentication, provenance tracking, command registry patterns, AWS SSO verification, version list commands, and authentication tutorials.
- Updated blog post on pager default behavior with migration guidance and configuration instructions.
- Enhanced blog content metadata and organization.
Add license check workflow @osterman (#1680)
## what
- Added a GitHub Actions workflow (`.github/workflows/license-check.yml`) to automatically audit Go project dependencies for license compliance.
- This workflow triggers on pull request events (opened, synchronize, reopened) that affect `go.mod`, `go.sum`, or the workflow file itself.
- It also includes scheduled runs (weekly on Mondays) and manual dispatch for flexibility.
- A new script (`scripts/check-licenses.sh`) was introduced to perform the actual license check using `go-licenses`.
- The script checks for "forbidden" license types and generates a summary report.
- The generated CSV report from `go-licenses report` is now uploaded as a GitHub Actions artifact.
why
- To proactively identify and prevent the introduction of dependencies with problematic licenses (e.g., GPL, AGPL) into the project.
- Automates the license auditing process, reducing manual effort and the risk of oversight.
- Ensures compliance with licensing requirements, especially important for open-source and commercial projects.
- The CI integration provides immediate feedback on PRs affecting dependencies.
- Uploading the report as an artifact allows for easy review of detailed license information.
references
Summary by CodeRabbit
- Chores
- Added automated license compliance checks that run on pull requests, weekly, and on demand, producing a downloadable CSV license report retained for 30 days.
- Added a license-audit workflow and scanning script that installs/checks the scanner as needed, handles known edge cases, summarizes license distribution, and emits clear pass/fail results.
Add atmos auth list command with multiple output formats @osterman (#1645)
## what
- Add new `atmos auth list` command to list all configured authentication providers and identities
- Support multiple output formats: table (default), tree, JSON, YAML, Graphviz, Mermaid, and Markdown
- Implement filtering by providers or identities with optional name filtering
- Add comprehensive documentation and usage examples
why
- Users need visibility into their authentication configuration to understand providers, identities, and their relationships
- Multiple output formats enable different use cases: interactive CLI (table/tree), automation (JSON/YAML), and documentation (Graphviz/Mermaid)
- Visual formats help understand complex authentication chains where identities assume roles through providers or other identities
references
- Implements feature request for authentication configuration visibility
- Follows existing Atmos patterns for command structure and output formatting
Summary by CodeRabbit
- New Features
  - Added an auth list command to view providers and identities with flexible filtering and multiple output formats (table, tree, JSON, YAML, Graphviz, Mermaid, Markdown)
  - Added chain visualization outputs (graph/mermaid/markdown) for easier relationship tracing
- Bug Fixes
  - Support expanded tilde (~) paths for the CLI chdir flag
- Documentation
  - Comprehensive CLI docs, usage guide, and blog post added
- Tests
  - Extensive unit tests and format/diagram validation added
Update mockgen to go.uber.org/mock @osterman (#1681)
## what
- Replaced the usage of the archived `github.com/golang/mock` with `go.uber.org/mock`.
- Updated all import paths from `github.com/golang/mock/gomock` to `go.uber.org/mock/gomock`.
- Updated all `//go:generate mockgen` directives to use `go run go.uber.org/mock/mockgen@v0.6.0` (pinned version for reproducible builds).
- Regenerated all mock files with the pinned version.
- Added a lint rule in `.golangci.yml` to disallow usage of `github.com/golang/mock`.
- Configured `.golangci.yml` to exclude generated mock files (`mock_*.go`) from godot linter checks.
why
- `github.com/golang/mock` is an archived repository and should no longer be used. `go.uber.org/mock` is the maintained successor.
- Pinning to `@v0.6.0` ensures reproducible builds across different environments.
- This change ensures the project uses actively maintained dependencies and prevents accidental use of the deprecated library through a new lint rule.
references
- closes #123
Fix go install compatibility by removing replace directive @osterman (#1685)
## what
- Remove `replace` directive from `go.mod` that breaks `go install github.com/cloudposse/atmos@latest`
- Update Atmos internal code to import from `pkg/config/homedir` directly instead of via replaced module path
- Remove `go.mod` from `pkg/config/homedir` (no longer needed as separate module)
- Add regression test `TestGoModNoReplaceDirectives` to prevent future breakage of `go install` compatibility
why
- The `replace` directive introduced in v1.195.0 (PR #1631) breaks a documented installation method
- `go install cmd@version` intentionally does not support modules with `replace` or `exclude` directives
- This is a fundamental design decision in Go (golang/go#44840, #69762, #50698) that won't be changed
- Users attempting `go install github.com/cloudposse/atmos@latest` get errors and cannot install
- Breaking this installation path creates user friction and support burden
tradeoffs
What we're giving up
The `replace` directive was added to ensure all transitive dependencies (16+ packages) use Atmos's improved fork of the deprecated `mitchellh/go-homedir` package instead of the archived original.
Unfortunately, we must accept that transitive dependencies will use the deprecated package because:
- There's no way to force transitive dependencies to use our fork without `replace`
- We can't publish our fork as `github.com/mitchellh/go-homedir` (we don't own that domain)
- Requiring all 16+ transitive dependencies to update their imports is not feasible
What we're keeping
- Atmos's own code still uses the improved `pkg/config/homedir` implementation with better error handling, refactoring, and security annotations
- The deprecated `mitchellh/go-homedir` package has no known security vulnerabilities (verified via Snyk)
- The package is stable (last commit 2019, archived July 2024 as feature-complete, not broken)
The decision
Restoring `go install` compatibility is more important than forcing transitive dependencies to use our improved fork. The deprecated package works fine, and Atmos's direct usage still benefits from our improvements.
testing
- Added `TestGoModNoReplaceDirectives` to catch future regressions
- Verified `go build` succeeds
- Verified all existing tests pass
- Verified binary runs correctly with `./atmos version`
references
- Original PR that introduced the `replace` directive: #1631
- User report: Slack thread from Jonathan Rose
- Go issues on `replace` directive limitation: golang/go#44840, golang/go#69762, golang/go#50698
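A regression check of this kind can be sketched with the standard library alone. The helper below is a hypothetical illustration, not the actual `TestGoModNoReplaceDirectives` implementation: it scans the text of a `go.mod` file and returns any `replace` directives it finds, in either the single-line or the block form. The module paths in `main` are sample data.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// findReplaceDirectives returns the replace directives declared in the given
// go.mod content. It handles both "replace a => b" and "replace ( ... )".
// Hypothetical helper for illustration only.
func findReplaceDirectives(goMod string) []string {
	var found []string
	inBlock := false
	scanner := bufio.NewScanner(strings.NewReader(goMod))
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		// Skip blank lines and full-line comments.
		if line == "" || strings.HasPrefix(line, "//") {
			continue
		}
		switch {
		case inBlock && line == ")":
			inBlock = false
		case inBlock:
			found = append(found, line)
		case strings.HasPrefix(line, "replace ("), strings.HasPrefix(line, "replace("):
			inBlock = true
		case strings.HasPrefix(line, "replace "):
			found = append(found, strings.TrimPrefix(line, "replace "))
		}
	}
	return found
}

func main() {
	// Sample go.mod content; the module path is made up.
	goMod := "module example.com/app\n\n" +
		"replace github.com/mitchellh/go-homedir => ./pkg/config/homedir\n"
	for _, d := range findReplaceDirectives(goMod) {
		fmt.Println("replace directive breaks `go install ...@version`:", d)
	}
}
```

In a real test the content would come from reading the repository's `go.mod`, and the test would fail when the returned slice is non-empty.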
Replace mitchellh/mapstructure with go-viper/mapstructure @osterman (#1678)
## what
- Replaced direct usage of the archived `github.com/mitchellh/mapstructure` with `github.com/go-viper/mapstructure/v2`.
- Added a `replace` directive in `go.mod` to force all transitive dependencies that use `github.com/mitchellh/mapstructure` to instead use the maintained `github.com/go-viper/mapstructure` fork (v1.6.0).
why
- The `mitchellh/mapstructure` library has been archived, meaning it will no longer receive updates or security patches.
- `github.com/go-viper/mapstructure/v2` is the actively maintained and recommended fork, ensuring continued support and bug fixes.
- Using the `replace` directive ensures that even indirect dependencies use the supported fork, eliminating reliance on the archived library.
references
- closes #123
Summary by CodeRabbit
- Chores
  - Updated internal dependency management to use go-viper/mapstructure v2 instead of the previous mapstructure implementation across the codebase for improved compatibility and maintenance.
Add spinner and TTY dialog for AWS SSO auth @osterman (#1653)
## what
- Enhances the AWS SSO authentication flow by introducing a visually appealing, interactive terminal dialog using the `charmbracelet` library.
- Displays a colored, bordered dialog box in TTY environments showing the AWS SSO verification code and instructions.
- Integrates an animated spinner to indicate when the system is waiting for authentication.
- Gracefully degrades to plain text output in non-TTY environments (e.g., CI pipelines) to ensure compatibility.
why
- Improved User Experience: The charmbracelet dialog provides a more engaging and informative user experience during the AWS SSO authentication process, making it easier to understand and follow.
- Clearer Verification: The prominent display of the verification code with styling helps users visually confirm the code against what is shown in their browser.
- Real-time Feedback: The spinner provides immediate visual feedback that the system is actively waiting for authentication, reducing user uncertainty.
- Universal Compatibility: The graceful degradation ensures that the authentication flow remains functional and usable across all environments, including those without TTY capabilities.
- Enhanced Readability: Color-coded elements and clear messaging improve the readability of important information, especially the verification code and URLs.
references
- closes #123 (Assuming this is the issue being addressed)
- Further context on AWS SSO device authorization flow: AWS SSO Documentation
Summary by CodeRabbit
- New Features
  - Styled verification dialog with automated browser opening, animated spinner during SSO device authorization, and Ctrl+C cancellation.
  - Unified display for authentication results with human-friendly expiration durations and visual expiring indicators.
- Documentation
  - Added detailed AWS IAM Identity Center / device-authorization flow docs and clarified device codes vs. MFA tokens.
- Improvements
  - Graceful degradation for non-TTY/CI environments and consistent UX across auth commands.
Fix segfault in TestGetAffectedComponents when error pointer is corrupted @osterman (#1677)
## what
- Fix segmentation violation in TestGetAffectedComponents at line 247
- Safely convert error to string before passing to `t.Skipf()`
why
- On macOS ARM64, when gomonkey patches fail, the real function gets called with invalid test data
- This can result in a corrupted error pointer being returned (observed address: `0x646e657065646b73`)
- `fmt.Sprintf` with `%v` tries to dereference the corrupt pointer, causing a segfault
- Converting error to string first using `err.Error()` avoids dereferencing the corrupt pointer
references
- Fixes GitHub Actions failure: https://github.com/cloudposse/atmos/actions/runs/18656461566/job/53187085704
- Stack trace showed fault at `terraform_affected_test.go:247`
testing
- Verified test now passes without segfault on macOS ARM64
- Test gracefully skips when gomonkey mocking fails
Fix os.Args in tests with SetArgs @osterman (#1675)
## what
This PR refactors various test files to replace direct manipulation of `os.Args` with Cobra's recommended `RootCmd.SetArgs()` method. This change standardizes how command-line arguments are tested across the codebase and improves test reliability by preventing global state pollution.
Specific changes include:
- `cmd/` package:
  - Replaced `os.Args` assignments with `RootCmd.SetArgs()` in `cmd/root_test.go`, `cmd/auth_login_test.go`.
  - Removed unnecessary manual save/restore of `os.Args` in `cmd/root_test.go`.
  - Documented legitimate usage of `os.Args` in `cmd/cmd_utils_test.go` where the function under test directly reads `os.Args`.
- `pkg/config/` package:
  - Refactored `pkg/config/config.go` to expose `parseFlagsFromArgs(args []string)` for direct testing of flag parsing logic.
  - Updated `pkg/config/config_test.go` to use `parseFlagsFromArgs()` where possible, reducing `os.Args` manipulation.
  - Documented the necessity of `os.Args` manipulation for integration tests within `pkg/config/config_test.go` that call functions like `setLogConfig()`.
- `tests/` package:
  - Replaced `os.Args` assignments with `cmd.RootCmd.SetArgs()` in `tests/cli_describe_component_test.go`, `tests/describe_test.go`, and `tests/validate_schema_test.go`.
why
Directly manipulating `os.Args` in tests is an anti-pattern because:
- Global State Pollution: `os.Args` is global and can cause test leakage, leading to unpredictable failures, especially in parallel test runs.
- Not the Cobra Way: Cobra provides `SetArgs()` as the idiomatic and safe way to test command execution, managing its own state.
- Manual Cleanup Required: Each `os.Args` manipulation requires manual `defer` statements for restoration, adding boilerplate and potential for error.
By adopting `RootCmd.SetArgs()`:
- Tests become more reliable and predictable.
- Boilerplate for argument setup and cleanup is removed.
- The codebase adheres to Cobra's best practices for testing.
- For legitimate uses of `os.Args` (e.g., testing subprocesses that call `os.Exit()` or integration tests of the `main()` function), comments have been added to clarify why this approach is necessary.
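The idea of parsing flags from an explicit slice rather than the global `os.Args` can be sketched with the standard `flag` package. This is an illustration only: `parseArgs`, the `cliFlags` struct, and the two flag names are assumptions for the example and do not reproduce the real `parseFlagsFromArgs` in `pkg/config/config.go`.

```go
package main

import (
	"flag"
	"fmt"
	"io"
)

// cliFlags holds the values this sketch extracts; the fields are illustrative.
type cliFlags struct {
	LogsLevel string
	BasePath  string
}

// parseArgs parses flags from the given slice. Because it never touches
// os.Args, tests can call it directly without saving and restoring globals.
func parseArgs(args []string) (cliFlags, error) {
	var out cliFlags
	fs := flag.NewFlagSet("example", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep usage text out of test output
	fs.StringVar(&out.LogsLevel, "logs-level", "Info", "log level")
	fs.StringVar(&out.BasePath, "base-path", "", "base path")
	if err := fs.Parse(args); err != nil {
		return cliFlags{}, err
	}
	return out, nil
}

func main() {
	flags, err := parseArgs([]string{"--logs-level", "Debug", "--base-path=/tmp/stacks"})
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("logs-level=%s base-path=%s\n", flags.LogsLevel, flags.BasePath)
}
```

Each call builds a fresh `flag.FlagSet`, so parallel tests do not share parser state.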
references
closes #XYZ (if applicable)
Add step to get dependencies in Go setup workflow @goruha (#1679)
## what
- Add step to get dependencies in Go setup workflow
why
- To cache actual dependencies
Summary by CodeRabbit
- Chores
  - CI workflow updated to run dependency fetching during build setup, ensuring dependencies are retrieved earlier and improving build preparation reliability.
Use run-os for setup-go @goruha (#1667)
## what
- Use run-os for setup-go
why
- Reduce cache
references
Summary by CodeRabbit
- Chores
  - CI runner selection switched to dynamic, configuration-driven runner entries across workflows; build/test job names now include target/flavor context and include conditional Linux-specific steps.
  - Pre-commit, lint, autofix and other CI workflows updated to use the new runner configuration.
- New Features
  - Added a scheduled/manual workflow to warm up Go cache and prepare Go tooling.
  - Added a workflow to clear PR-related caches on closed pull requests.
- Tests
  - CI exercises OS/target combinations using the new dynamic runner configuration; Acceptance Tests now depend on the build job.
Add Changelog link and remove old file @osterman (#1676)
## what
- Added a "Changelog" link to the top navigation bar in `website/docusaurus.config.js`. This link points to the `/blog` route, making the blog more accessible to users.
- Removed the old, unmaintained `CHANGELOG.md` file from the root of the repository. This file contained outdated release notes and is no longer necessary as changelogs are now managed as blog posts.
why
- The "Changelog" link was added to the navigation bar as per user request to improve discoverability of blog content, which serves as the current changelog.
- The `CHANGELOG.md` file was removed because it was obsolete and unmaintained, with changelogs now being published as blog posts. This cleans up the repository and avoids confusion.
references
- closes #123 (This is a placeholder, assuming the user implicitly wants to close an issue related to navigation and cleanup.)
- Link to blog: https://atmos.tools/blog/
Summary by CodeRabbit
- Documentation
  - Removed historical version entries from the changelog.
- Chores
  - Added "Changelog" navigation link to the website header for easier access to release information.
`auth` Leapp Migration Guide @Benbentwo (#1633)
This pull request adds documentation to help users migrate from Leapp to Atmos Auth for AWS IAM Identity Center authentication. The main changes introduce a new migration guide and organize authentication documentation under a dedicated category.
Documentation improvements:
- Added a comprehensive migration guide (`migrating-from-leapp.mdx`) that explains how to convert Leapp sessions and providers to Atmos Auth YAML configuration, including field mappings, step-by-step instructions, troubleshooting tips, and a comparison table.
Documentation structure:
- Created a new `_category_.json` file to group authentication documentation under "Authentication (atmos auth)" in the sidebar for improved discoverability.
Summary by CodeRabbit
- Documentation
  - Removed the legacy Atmos Auth User Guide.
  - Added a "Migrating from Leapp" tutorial with migration steps, field mappings, and verification commands.
  - Added a Geodesic configuration tutorial for Atmos Auth integration.
  - Introduced an Auth "Tutorials" category and two new blog posts introducing Atmos Auth and tutorials.
  - Reorganized Auth CLI docs: updated ordering, labels, slugs, subcommand links, and sidebar positions.
  - Expanded the Auth usage guide with AWS Permission Set account specification guidance and examples.
Update homedir README with fork details @osterman (#1673)
## what
- Appended a detailed section to `pkg/config/homedir/README.md` describing the "Atmos Fork Enhancements".
- This new section explains the fork's prioritization of environment variables for test compatibility with `t.Setenv()`.
- It also details cache management strategies, including disabling caching (`homedir.DisableCache = true`) and resetting the cache (`homedir.Reset()`).
- Provides code examples for using these features in Go tests.
why
- To clearly document the specific enhancements made in Atmos's vendored fork of the `mitchellh/go-homedir` package.
- To provide users, particularly those writing Go tests, with clear instructions on how to leverage the improved environment variable support and cache management for better testability.
- The original `mitchellh/go-homedir` package is deprecated, and this fork is maintained to support these specific testing requirements.
references
closes #279
🚀 Enhancements
chore: Update Pro Instances API @milldr (#1721)
## what
- Update endpoint format to include query params for stack & component
why
- We've updated the API for Atmos Pro so that we can support slashes in component names
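The reason query parameters help with slashes can be seen in a small sketch using `net/url`. The base URL, the `/instances` path, and the function name below are assumptions for illustration; the actual Atmos Pro endpoint is not given in these notes. A slash inside a path segment is ambiguous, whereas a query value is percent-encoded as a single unit.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// buildInstanceURL builds a URL that carries stack and component as query
// parameters. The "/instances" path is hypothetical.
func buildInstanceURL(baseURL, stack, component string) (string, error) {
	u, err := url.Parse(baseURL)
	if err != nil {
		return "", err
	}
	u.Path = strings.TrimSuffix(u.Path, "/") + "/instances"
	q := url.Values{}
	q.Set("stack", stack)
	q.Set("component", component)
	// Encode percent-escapes values, so "/" in a component name becomes %2F.
	u.RawQuery = q.Encode()
	return u.String(), nil
}

func main() {
	got, err := buildInstanceURL("https://pro.example.test/api", "dev-us-east-1", "apps/frontend")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(got)
}
```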
references
Summary by CodeRabbit
- Chore
  - Pro Instances API client now sends stack and component as query parameters for more reliable encoding and consistency.
- Documentation
  - Added a blog post explaining the endpoint format change, impact, and that no configuration or workflow changes are required.
- Bug Fixes
  - Cleaned up authentication output spacing for more compact, consistent display.
fix: Consolidate credential retrieval logic to fix terraform auth @osterman (#1720)
## Summary
This PR fixes a critical bug where `atmos terraform plan` and other Terraform commands failed to use file-based credentials, while `atmos auth whoami` and similar commands worked correctly.
The root cause was duplicate credential retrieval code across three methods with inconsistent fallback behavior. Two methods had keyring → identity storage fallback logic, but one (`retrieveCachedCredentials`) did not, causing Terraform commands to fail when credentials were in files instead of the keyring.
Root Cause Analysis
Three separate code paths retrieved credentials:
- `GetCachedCredentials` - Had fallback ✓
- `findFirstValidCachedCredentials` - Had fallback ✓
- `retrieveCachedCredentials` - NO fallback ✗ (used by Terraform execution)
When users authenticated via AWS SSO, credentials were written to files, not cached in the keyring. Terraform commands would fail because the `retrieveCachedCredentials` path didn't check identity storage.
Solution
Extracted a shared retrieveCredentialWithFallback method as the single source of truth for credential retrieval:
- Fast path: Try keyring cache first (immediate)
- Slow path: Fall back to identity storage if not in keyring (AWS files, etc.)
- All three code paths now delegate to this single method
- Ensures consistent behavior across all operations
Changes
- Added `retrieveCredentialWithFallback()` method (38 lines)
- Refactored `GetCachedCredentials()` - 40% code reduction
- Refactored `findFirstValidCachedCredentials()` - 57% code reduction
- Refactored `retrieveCachedCredentials()` - Now uses shared method
- Fixed `TestManager_GetCachedCredentials_Paths` to use proper test data
- Added regression test `TestManager_retrieveCachedCredentials_TerraformFlow_Regression`
- Added integration test `TestRetrieveCachedCredentials_KeyringMiss_IdentityStorageFallback`
- Show active identities
Testing
✅ All auth tests pass (12/12 test suites)
✅ Regression test reproduces original bug, passes with fix
✅ Integration tests verify fallback behavior works
✅ Code compiles successfully
Impact
✅ Terraform commands now work with file-based credentials
✅ ~110 lines of duplicate code eliminated
✅ Single source of truth for credential retrieval
✅ Impossible to have divergent fallback behavior in future
✅ Consistent behavior across all auth operations
References
This PR addresses the issue where valid authenticated sessions would fail during Terraform execution with "credentials not found" error, even though atmos auth whoami showed valid credentials.
See docs/prd/credential-retrieval-consolidation.md for detailed architectural analysis.
Summary by CodeRabbit
- New Features
  - Interactive identity selection when using --identity without a value (CLI and Terraform).
  - New auth logout --all to sign out all identities.
  - ATMOS_IDENTITY env var honored; CLI env outputs add AWS region defaults.
  - Identity list now shows authentication status and credential expiry.
- Bug Fixes
  - More reliable credential retrieval with keyring → identity-storage fallback.
  - Safer, clearer logout behavior and plain-text summaries.
  - Default files display path updated to ~/.config/atmos.
- Documentation
  - Help pages and docs updated for interactive identity modes, logout options, and examples.
fix: Restore PATH inheritance in workflow shell commands @osterman (#1719)
## what
- Refactored `ExecuteShell()` to **always** merge custom env vars with parent environment
- Fixes workflow shell commands failing with "executable file not found in $PATH"
- Adds comprehensive unit and integration tests demonstrating the bug and verifying the fix
why
- After commit 9fd7d15 (PR #1543), workflow shell commands lost access to PATH environment variable
- Users reported workflows that worked in v1.189.0 failed in v1.195.0 with commands like `env`, `ls`, `grep` not found
- This is a critical regression affecting any workflow using external executables
- Original fix conditionally replaced environment, which was inconsistent with `executeCustomCommand` behavior
Root Cause
The bug occurred in `ExecuteShell()` function in `internal/exec/shell_utils.go`:
- Workflow commands call `ExecuteShell` with empty env slice: `[]string{}`
- `ExecuteShell` appends `ATMOS_SHLVL` to the slice: `[]string{"ATMOS_SHLVL=1"}`
- `ShellRunner` receives a non-empty env, so it doesn't fall back to `os.Environ()`
- Shell command runs with ONLY `ATMOS_SHLVL` set, losing PATH and all other environment variables
Solution
Refactored ExecuteShell() to always merge custom env vars with parent environment:
// Always start with parent environment
mergedEnv := os.Environ()
// Merge custom env vars (overriding duplicates)
for _, envVar := range env {
    // Split "KEY=value" into its key and value (uses the strings package)
    if parts := strings.SplitN(envVar, "=", 2); len(parts) == 2 {
        mergedEnv = u.UpdateEnvVar(mergedEnv, parts[0], parts[1])
    }
}
// Add ATMOS_SHLVL
mergedEnv = append(mergedEnv, fmt.Sprintf("ATMOS_SHLVL=%d", newShellLevel))
This ensures:
- ✅ Empty env (workflows): Full parent environment including PATH
- ✅ Custom env (commands): Custom vars override parent, but PATH is preserved
- ✅ Consistent behavior: Matches `executeCustomCommand` pattern (line 393 in `cmd_utils.go`)
Testing
Unit Tests (`internal/exec/shell_utils_test.go`):
- `TestExecuteShell/empty_env_should_inherit_PATH_from_parent_process` - Verifies `env` command works
- `TestExecuteShell/empty_env_should_inherit_PATH_for_common_commands` - Tests `ls`, `env`, `pwd`, `echo`
- `TestExecuteShell/custom_env_vars_override_parent_env` - Verifies custom vars properly override parent
Integration Test (`tests/test-cases/workflows.yaml`):
- `atmos workflow shell command with PATH` - Full end-to-end workflow test using `env | grep PATH`
All tests pass, including existing workflow tests.
references
Summary by CodeRabbit
- Bug Fixes
  - Shell commands now correctly inherit environment variables (including PATH) from the parent process, with custom env vars properly overriding parent values.
- Tests
  - Added tests covering environment inheritance for commands that require PATH, shell builtins, and custom env var overrides.
- Workflows / Snapshots
  - Added a workflow demonstrating PATH-dependent shell commands and updated related test snapshots and test cases.
test: Improve test coverage for keyring fallback to 78.4% @osterman (#1705)
## what
- Add comprehensive unit tests for no-op keyring and system keyring functionality
- Improve test coverage from 71.2% to 78.4% (+7.2 percentage points)
- Add `Validate()` method to test credential types to satisfy `ICredentials` interface
why
- Ensure critical business logic is properly tested (cache management, expiration checking, error handling)
- Meet 80% test coverage target for new features
- Prevent regressions in keyring fallback behavior introduced in bde37e334
references
- Related to commit bde37e334 which introduced graceful keychain fallback for containerized environments
- Implements test requirements from `docs/prd/keyring-fallback-containerized-environments.md`
Test Coverage Improvements
Starting Coverage: 71.2%
Final Coverage: 78.4%
Improvement: +7.2 percentage points
Tests Added (8 new test functions):
- `TestNoopKeyringStore_ValidCache` - Tests cache hit with valid credentials
- `TestNoopKeyringStore_ExpiredInCache` - Tests cache hit with expired credentials
- `TestNoopKeyringStore_StoreWithMockCredentials` - Tests storing mock credentials
- `TestNoopKeyringStore_ExpirationWarning` - Tests expiration warning logic
- `TestSystemKeyringStore_GetAny` - Tests retrieving arbitrary data from system keyring
- `TestSystemKeyringStore_GetAny_NotFound` - Tests GetAny error handling
- `TestSystemKeyringStore_SetAny` - Tests storing arbitrary data types
- `TestNewKeyringAuthStore` - Tests deprecated backward-compatible function
Coverage by Function:
| File | Function | Before | After | Improvement |
|---|---|---|---|---|
| `keyring_noop.go` | `Retrieve()` | 36.8% | 57.9% | +21.1% |
| `keyring_system.go` | `GetAny()` | 0% | 85.7% | +85.7% |
| `keyring_system.go` | `SetAny()` | 0% | 71.4% | +71.4% |
| `store.go` | `NewKeyringAuthStore()` | 0% | 100% | +100% |
Remaining Uncovered (1.6% to reach 80%):
The uncovered code paths require real AWS credentials and live AWS STS API calls:
- AWS credential validation success paths (lines 79-95 in `Retrieve()`)
- AWS STS GetCallerIdentity success (lines 152-168 in `validateAWSCredentials()`)
These are integration-level scenarios better suited for E2E tests with real AWS infrastructure rather than unit tests.
What We Test
✅ Validation failure path - AWS SDK without credentials
✅ Cache behavior - Hits, misses, expiration, staleness
✅ Error handling - Expired/missing credentials
✅ Storage operations - Store, Retrieve, Delete, List
✅ GetAny/SetAny - Arbitrary data storage for all keyring types
✅ Backward compatibility - Deprecated functions
Fix `atmos describe affected --include-dependents --stack <stack>` command to correctly process the dependents only from the provided stack @aknysh (#1703)
## Problem
When executing `atmos describe affected --include-dependents --stack <stack>`, the command was incorrectly processing dependent components from ALL stacks instead of only from the specified stack. This caused:
- Performance issues: YAML functions (`!terraform.output`, `!terraform.state`, `!env`) were executed for components in all stacks, not just the filtered stack
- Incorrect behavior: Dependents from other stacks were being included in the output
- Test gaps: Tests didn't catch this issue because fixtures lacked YAML functions that would fail when processed incorrectly
Root Cause
In internal/exec/describe_dependents.go, the ExecuteDescribeDependents function was calling ExecuteDescribeStacks with an empty string for the stack filter instead of passing the onlyInStack parameter. This caused all stacks to be loaded and processed.
Solution
1. Fixed Stack Filtering
- Added `OnlyInStack` parameter to `DescribeDependentsArgs` struct
- Updated `ExecuteDescribeDependents` to pass the stack filter through to `ExecuteDescribeStacks`
- Ensured dependents are correctly filtered to only the specified stack
2. Refactored to Options Pattern
- Created `DescribeDependentsArgs` struct to replace 8 individual parameters
- Improved code readability and maintainability
- Follows the Options Pattern from CLAUDE.md
3. Enhanced Test Coverage
- Added YAML functions (`!env`) to test fixtures to detect the bug
- Created new test `TestDescribeAffectedWithDependentsStackFilterYamlFunctions` to verify:
  - YAML functions are only executed for components in the specified stack
  - Dependents are correctly filtered by stack
  - Environment variables are not accessed for components in other stacks
4. Lintroller Improvements
Added comprehensive exclusions to the custom linter:
- 29 packages excluded from perf.Track() checks (one-time operations)
- 7 utility files excluded (not in hot paths)
- 15 hot-path functions instrumented with perf.Track()
- os.Args linter exclusions for legitimate test patterns
Testing
Manual Testing
# Test that dependents are filtered by stack
atmos describe affected --include-dependents --stack ue1-network
# Verify YAML functions only execute for the specified stack
ATMOS_TEST_VPC_UE1=test atmos describe affected --include-dependents --stack ue1-network
Automated Testing
go test ./internal/exec -v -run TestDescribeAffectedWithDependentsStackFilterYamlFunctions
go test ./pkg/describe -v
Changes
Core Functionality
- `internal/exec/describe_dependents.go`: Added `DescribeDependentsArgs` struct, fixed stack filtering
- `internal/exec/describe_affected_utils_2.go`: Updated to use new struct pattern
- `internal/exec/atmos.go`: Updated TUI integration
- `pkg/describe/describe_dependents_test.go`: Updated integration tests
Test Fixtures
- Added `!env` YAML functions to test fixtures in 4 files:
  - `tests/fixtures/scenarios/atmos-describe-affected-with-dependents-and-locked/stacks/deploy/network/us-east-1.yaml`
  - `tests/fixtures/scenarios/atmos-describe-affected-with-dependents-and-locked/stacks/deploy/network/us-west-2.yaml`
  - And their `stacks-affected` versions
Tests
- `internal/exec/describe_affected_test.go`: Added `TestDescribeAffectedWithDependentsStackFilterYamlFunctions`
- Updated all mock functions to use new struct signature
Performance Tracking
Added `perf.Track()` to hot-path functions:
- Stack processing: `ProcessYAMLConfigFiles`, `ProcessYAMLConfigFile`, `ProcessStackConfig`
- Component processing: `ProcessComponentInStack`, `ProcessComponentFromContext`
- Describe operations: `ExecuteDescribeStacks`, `ExecuteDescribeComponent`
- Core execution: `FilterEmptySections`, `IsComponentAbstract`, `FilterComputedFields`
- Template functions: `AtmosFuncs.Component`, `AtmosFuncs.GomplateDatasource`
Lintroller
- `tools/lintroller/rule_perf_track.go`: Added exclusions for non-hot-path packages and files
- `tools/lintroller/rule_os_args.go`: Added exclusions for legitimate `os.Args` usage in tests
Impact
✅ Performance: YAML functions no longer execute for components outside the filtered stack
✅ Correctness: Dependents are now correctly limited to the specified stack
✅ Test Coverage: New tests prevent regression
✅ Code Quality: Improved readability with Options Pattern
✅ Linter: All custom linter checks pass
Summary by CodeRabbit
- New Features
  - Stack-specific filtering for dependent discovery (OnlyInStack).
  - New template helpers under the "atmos" namespace: component, datasource, store.
- Bug Fixes
  - YAML function execution now respects stack filtering in describe-affected with dependents.
- Performance
  - Added runtime performance tracking across various describe and processing commands.
- Chores
  - Updated Atmos version and PostHog dependency; docs updated.
- Tests
  - Added/updated tests and fixtures for stack-filtering and Terraform-state YAML scenarios.