cloudposse/atmos v1.195.0-rc.1 on GitHub

Fix: atmos auth login "hangs" when run in make targets @osterman (#1671)

## what

Replaced telemetry.IsCI() checks in authentication logic with an isInteractive() function that checks for TTY availability.
Modified pkg/telemetry/ci.go to require both JENKINS_URL and BUILD_ID to be present for Jenkins CI detection, preventing false positives when only JENKINS_URL is set.
Updated the AWS SSO device authorization prompt message to correctly state "verify code" instead of "enter code".
Added debug logging to pkg/telemetry/ci.go for better visibility into CI detection.
Configured AWS SDK to explicitly use aws.AnonymousCredentials{} when loading config for SSO, preventing hangs on default credential providers.

why

Runtime vs. Telemetry Separation: Previously, telemetry.IsCI() was used for runtime behavior decisions (e.g., showing interactive prompts). This is incorrect as telemetry functions should not dictate application behavior. The change separates these concerns by using isInteractive() for runtime decisions and improving CI detection accuracy.
False Jenkins Detection: The JENKINS_URL environment variable was being set by build-harness by default, leading to incorrect Jenkins CI detection in environments that were not actual Jenkins CI. Requiring both JENKINS_URL and BUILD_ID for Jenkins detection resolves this false positive.
Accurate User Guidance: The AWS SSO device flow requires users to verify a code displayed in the terminal against the browser prompt, not enter it. The message has been updated for clarity.
Preventing Authentication Hangs: In non-interactive environments (like make targets without a TTY), the authentication flow was hanging because it was waiting for terminal input that would never arrive. The isInteractive() check ensures prompts are only shown when a TTY is available. Explicitly providing aws.AnonymousCredentials{} for SSO config loading prevents the AWS SDK from attempting to find credentials from other sources that might hang.

False Jenkins Detection

CloudPosse build-harness sets JENKINS_URL=https://localhost/buildByToken/buildWithParameters by default
Old detection only checked JENKINS_URL existence → false positives in any project using build-harness
Changed to require both JENKINS_URL AND BUILD_ID (what real Jenkins sets)
Prevents false CI detection when running atmos auth login in make targets
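
The two-variable rule can be sketched as follows. The function name isJenkins and the injected lookup function are hypothetical, and treating an empty value as "not set" is an assumption; the real check lives in pkg/telemetry/ci.go.

```go
package main

import (
	"fmt"
	"os"
)

// isJenkins reports Jenkins only when both variables are non-empty.
// lookup is injected so the rule can be exercised without touching the real environment.
func isJenkins(lookup func(string) string) bool {
	return lookup("JENKINS_URL") != "" && lookup("BUILD_ID") != ""
}

func main() {
	// With build-harness defaults alone (JENKINS_URL set, BUILD_ID unset) this is false.
	fmt.Println("jenkins detected:", isJenkins(os.Getenv))
}
```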

Pre-commit Build Issues

Building custom-gcl during pre-commit can cause git corruption in worktrees
Changed to check for pre-built binary and fail with helpful message instead
Users run make custom-gcl once, then commits work without rebuilding

references

Related to build-harness Jenkins URL: https://github.com/cloudposse/build-harness/blob/master/modules/jenkins/Makefile#L17

Summary by CodeRabbit

Bug Fixes
- AWS SSO device authentication prompts now correctly show instructions, URL and code, and will attempt to open the browser in interactive sessions; non-interactive sessions return clear errors.
Refactor
- Authentication flow now uses interactive terminal detection instead of CI-only checks.
- CI detection enhanced with more comprehensive environment-variable handling and additional debug logging.
Tests
- Added stdin TTY mock support for testing interactive behavior.
Chores
- Updated lint/build scripts and Makefile steps with clearer user-facing messages and a new run script.

feat: implement atmos version list and show commands with enhanced UI @osterman (#1658)

## what

Enhanced version list and show commands with improved UI formatting
Added borderless table with header separator for version list output
Implemented markdown rendering for release titles with ANSI color preservation
Added terminal width detection with minimum width validation
Styled release assets with muted file sizes and underlined download links
Added spinner animation during GitHub API calls for better UX
Implemented platform-specific asset filtering (OS/architecture matching)
Added debug logging for terminal width detection
Refactored version commands to self-contained cmd/version package following command registry pattern
Created GitHubClient interface for improved testability
Updated environment variable binding to support ATMOS_GITHUB_TOKEN with GITHUB_TOKEN fallback

why

Improve user experience with cleaner, more readable version output
Make release information more accessible with markdown-rendered titles
Ensure proper display across different terminal widths
Provide visual feedback during network operations
Follow Atmos architectural patterns with self-contained command packages
Enable better testing through interface-based design
Support standard Atmos environment variable conventions
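
The ATMOS_GITHUB_TOKEN / GITHUB_TOKEN fallback mentioned in this PR amounts to a simple precedence rule. A minimal sketch, assuming empty values count as unset; githubToken is a hypothetical helper name, and the actual binding happens through Atmos's configuration layer rather than a function like this.

```go
package main

import (
	"fmt"
	"os"
)

// githubToken returns ATMOS_GITHUB_TOKEN when set, otherwise GITHUB_TOKEN.
func githubToken(lookup func(string) string) string {
	if tok := lookup("ATMOS_GITHUB_TOKEN"); tok != "" {
		return tok
	}
	return lookup("GITHUB_TOKEN")
}

func main() {
	if githubToken(os.Getenv) == "" {
		fmt.Println("no GitHub token configured")
		return
	}
	fmt.Println("GitHub token found")
}
```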

references

Related to version command improvements
Follows command registry pattern documented in docs/prd/command-registry-pattern.md

Summary by CodeRabbit

New Features
- New version commands: list and show — interactive spinner (TTY) with non‑TTY fallback, text/JSON/YAML outputs, pagination, date filtering, prerelease options, current-version indicators, markdown-rendered titles, platform-aware asset listings and tables.
Authentication
- GitHub token handling now prefers ATMOS_GITHUB_TOKEN over GITHUB_TOKEN and is bound earlier during startup.
Errors
- New clear sentinels for rate limits, invalid limits/offsets, unsupported formats, narrow terminals, and spinner failures.
Documentation
- PRDs, usage guides, and a blog post for the new commands.
Tests
- Extensive unit and integration tests for list/show, formatters, GitHub client, and edge cases.
Chores
- Increased cache lock retry attempts.

Isolate AWS env vars during authentication @osterman (#1654)

## what

Introduced a new utility module (pkg/auth/cloud/aws/env.go) to manage the isolation of problematic AWS environment variables during authentication.
Created WithIsolatedAWSEnv() function that temporarily clears a predefined list of AWS environment variables, executes a provided function, and then restores the original values.
Created LoadIsolatedAWSConfig() which wraps AWS SDK's config.LoadDefaultConfig() and utilizes WithIsolatedAWSEnv() to ensure environment variables do not interfere with AWS config loading.
Updated all AWS authentication and identity creation code paths to use LoadIsolatedAWSConfig() instead of config.LoadDefaultConfig() when initializing AWS SDK clients. This includes:
- pkg/auth/identities/aws/assume_role.go
- pkg/auth/identities/aws/permission_set.go
- pkg/auth/identities/aws/user.go
- pkg/auth/providers/aws/saml.go
- pkg/auth/providers/aws/sso.go
Added debug logging to report which AWS environment variables are being ignored during authentication when they are set externally.
Added comprehensive unit and integration tests to cover the environment isolation logic, including scenarios with set, unset, and partially set variables, error handling, and the new logging functionality.

why

Resolves DEV-3706: Previously, external AWS environment variables (like AWS_PROFILE, AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN, AWS_CONFIG_FILE, AWS_SHARED_CREDENTIALS_FILE) could interfere with Atmos's internal AWS authentication mechanisms, particularly when using AWS IAM Identity Center (SSO) or assuming roles. This often led to authentication failures or unexpected behavior.
Ensures Consistent Authentication: By isolating these environment variables during the authentication process, Atmos can reliably use its own credential management and configuration without external interference, regardless of the user's shell environment.
Improves User Experience: Provides transparency by logging which environment variables are being ignored during authentication, without exposing sensitive values.
Maintains Backward Compatibility: The internal/aws_utils/aws_utils.go file, which is used in contexts where external environment variables are expected to be honored (e.g., Terraform backend configuration), continues to use config.LoadDefaultConfig() to avoid breaking existing functionality.

references

closes #123
DEV-3706: https://linear.app/cloudposse/issue/DEV-3706
AWS SDK Go v2 Configuration: https://aws.github.io/aws-sdk-go-v2/docs/configuring-sdk/environment-variables/

Summary by CodeRabbit

New Features
- Added an AWS environment isolation utility to prevent external AWS env vars from affecting authentication flows.
- Switched AWS config loading throughout SSO, assume-role, STS and session-token flows to use the isolated loader.
Tests
- Added comprehensive tests verifying env var isolation, restoration after use, error handling, and successful authentication despite external AWS env vars.

Refactor: Use mockgen and improve test validation @osterman (#1670)

## what

Replaced manual mock implementation of storer.Storer with a mock generated by mockgen in internal/exec/describe_affected_utils_test.go.
Enhanced the test in internal/exec/template_funcs_test.go to more thoroughly validate the FuncMap and the returned AtmosFuncs instance.
Updated CLAUDE.md with mandatory guidelines for mock generation and testing production code paths.

why

Maintainability: Manual mock implementations are brittle and hard to maintain. Using mockgen ensures mocks are generated and updated automatically, reducing maintenance overhead.
Test Quality: The previous describe_affected_utils_test.go contained 81 lines of hand-written mock code. This is replaced by a cleaner, generated mock. The template_funcs_test.go test was trivial and has been expanded to provide more meaningful assertions.
Best Practices: CLAUDE.md now explicitly mandates the use of mockgen and emphasizes testing production code paths, preventing future occurrences of manual mocks and logic duplication in tests.

references

closes #123

Add global --chdir flag for changing working directory @osterman (#1644)

## what

Add new global `--chdir` / `-C` flag for changing working directory before command execution
Add `ATMOS_CHDIR` environment variable support as alternative to flag
Implement TestKit pattern following Go 1.15+ testing.TB interface for systematic test isolation
Fix StringSlice flag corruption using reflection-based cleanup
Improve error messages for empty or missing config paths
Create comprehensive test suite with 15+ test cases in dedicated `cmd/root_chdir_test.go` file
Update global flags documentation with examples
Create blog post announcing the feature

why

Enables using development builds of Atmos to work with infrastructure repositories without manipulating shell environment
Simplifies CI/CD workflows by avoiding directory changes in scripts
Provides consistent interface similar to other CLI tools (make, git, etc.)
Improves developer experience when working with multiple infrastructure repositories
Establishes idiomatic Go testing pattern for all cmd package tests
Prevents test pollution from global RootCmd state that was causing mysterious test failures
Fixes misleading "file not found" errors when config paths are actually empty

references

Addresses use case: Using development Atmos binaries to point at other infrastructure repos without changing directories manually
The flag is processed before all other operations, including config loading
CLI flag takes precedence over environment variable
Comprehensive error handling for invalid paths, non-existent directories, and file paths
TestKit pattern follows Go 1.15+ testing.TB interface idiom similar to t.Setenv() and t.Chdir()
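
The precedence and validation rules listed above can be sketched roughly like this. resolveChdir is a hypothetical helper and the error wording is invented for illustration; only the rules themselves (flag over ATMOS_CHDIR, rejecting missing paths and non-directories) come from the notes.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// resolveChdir picks the target directory: the flag wins over the env var.
// An empty result means "stay in the current directory".
func resolveChdir(flagValue, envValue string) (string, error) {
	dir := flagValue
	if dir == "" {
		dir = envValue
	}
	if dir == "" {
		return "", nil
	}
	info, err := os.Stat(dir)
	switch {
	case errors.Is(err, os.ErrNotExist):
		return "", fmt.Errorf("directory does not exist: %s", dir)
	case err != nil:
		// Keep other stat errors (permission denied, etc.) distinguishable.
		return "", fmt.Errorf("cannot access %s: %w", dir, err)
	case !info.IsDir():
		return "", fmt.Errorf("not a directory: %s", dir)
	}
	return dir, nil
}

func main() {
	dir, err := resolveChdir("", os.Getenv("ATMOS_CHDIR"))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	if dir != "" {
		if err := os.Chdir(dir); err != nil {
			fmt.Fprintln(os.Stderr, err)
			os.Exit(1)
		}
	}
	wd, _ := os.Getwd()
	fmt.Println("working directory:", wd)
}
```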

testing

Chdir Flag Tests

15+ test cases in dedicated cmd/root_chdir_test.go file (separated to comply with file length lint rules)
Test coverage includes:
- Absolute and relative paths
- Short (-C) and long (--chdir) flag forms
- Environment variable usage and precedence
- Error conditions (invalid paths, non-existent directories, files)
- Integration with config loading and base-path
- Edge cases (symlinks, paths with spaces, parent directory references)

Test Isolation & TestKit

Implemented cmd.NewTestKit(t) wrapper following Go 1.15+ testing.TB interface pattern
Migrated all 55 test cleanup calls across 21 test files from CleanupRootCmd to TestKit
Comprehensive TestKit tests covering:
- Automatic cleanup functionality
- testing.TB interface compliance
- Table-driven test patterns
- Nested test scenarios
- StringSlice flag corruption prevention
Net reduction of 248 lines while improving maintainability

Test Results

All chdir-specific tests pass successfully
All cmd package tests migrated to TestKit pattern
Linting passes with 0 issues
Build succeeds
Refactored complex nested blocks to comply with nestif linter rules

documentation

Updated website/docs/cli/global-flags.mdx with flag description, usage, and examples
Created website/blog/2025-01-15-chdir-flag.md announcing the feature
Updated CLAUDE.md with TestKit pattern as the standard for all cmd tests
Examples include:
- Development workflows with local Atmos builds
- CI/CD pipelines with multiple directories
- Multi-repository infrastructure management
- Scripting automation

technical details

TestKit Implementation

Wraps testing.TB interface for composable test helpers
Automatic RootCmd snapshot/restore via t.Cleanup()
Works seamlessly with subtests and table-driven tests
Handles StringSlice flag corruption using reflection to clear underlying slice
All testing.TB methods pass through: Helper(), Log(), Setenv(), Cleanup(), etc.

Error Message Improvements

Changed misleading "file not found" errors to show actual empty paths
Distinguish between "does not exist" and other stat errors (permission denied, etc.)
Include actual file/directory path in all error messages for clarity

Code Quality

Deleted CleanupRootCmd and WithRootCmdSnapshot (neither ever landed on main)
Single idiomatic pattern across entire test suite
Reduced code duplication and improved maintainability
All pre-commit hooks passing

Summary by CodeRabbit

New Features
- Added global --chdir / -C flag to run Atmos as if started in a specified directory; flag takes precedence over ATMOS_CHDIR and is applied before config loading.
Documentation
- Added CLI docs and a blog post with usage, examples, and guidance on combining --chdir with --base-path; CLI help updated to show the flag.
Bug Fixes / Validation
- Improved path/file validation with clearer error messages for missing, non-directory, or inaccessible paths.
Tests
- Extensive new tests and a test harness ensuring isolated, deterministic CLI and working-directory behavior.

Fix: Preserve exit codes from shell commands and workflows @osterman (#1660)

## what

Modified ShellRunner to capture and preserve exit codes from shell commands using interp.ExitStatus.
Updated CheckErrorPrintAndExit to correctly handle errUtils.ExitCodeError, ensuring preserved exit codes are used for program termination.
Adjusted workflow error handling to return original errors, thus preserving exit codes from workflow steps.
Addressed a regression in internal/exec/terraform.go where exit code 2 from terraform plan -detailed-exitcode was not being preserved.
Introduced comprehensive test cases for custom commands and workflow shell steps to validate exit code preservation.

why

Fixing ShellRunner: Previously, ShellRunner using the mvdan.cc/sh interpreter would lose specific exit codes, often resulting in a generic exit code of 1 for failures. This change ensures that any non-zero exit code from a shell command is captured and propagated.
Fixing CheckErrorPrintAndExit: The error handling function was not equipped to recognize or process the ExitCodeError type created by the ShellRunner fix. This update allows CheckErrorPrintAndExit to correctly identify and utilize the preserved exit code for program termination.
Fixing Workflow Error Handling: Workflow execution was previously masking specific errors by wrapping them in a generic ErrWorkflowStepFailed. This change allows the original error, including its exit code, to be returned, providing more accurate failure reporting.
Fixing Terraform Exit Code Regression: The refactoring for performance incorrectly handled terraform plan -detailed-exitcode when it returned exit code 2 (indicating changes were detected). This fix ensures that exit code 2 is preserved, allowing subsequent steps to correctly interpret the plan's status.
Test Coverage: Added dedicated tests to ensure that custom commands and workflow shell steps consistently preserve their exit codes across various scenarios (0, 2, and other non-zero codes). This prevents regressions in exit code handling.

references

closes #123 (Assuming this fixes an issue, replace #123 with the actual issue number)

Summary by CodeRabbit

New Features
- Added documentation for the authentication list command with usage, formats, flags, and examples.
Bug Fixes
- Preserve and propagate subcommand exit codes so workflows and commands return correct codes.
- Improved error messages to display specific exit codes (e.g., "subcommand exited with code X") for clearer troubleshooting.
Chores
- Updated build configuration (disk-cleanup hook, build flags, widened platforms) and bumped internal dependencies.
Tests
- Added comprehensive tests and fixtures for exit-code handling, shell/workflow execution, and error rendering.

fix: isolate golangci-lint custom build to prevent git corruption @osterman (#1666)

This PR fixes an issue where the golangci-lint custom build process was causing git corruption during pre-commit hooks.

Problem

The golangci-lint custom command modifies the git repository state during the build process, which conflicts with pre-commit hooks that expect a clean git state. This was causing git corruption and preventing the pre-commit hooks from running successfully.

Solution

Isolate the build process: Use a temporary directory for the custom golangci-lint build to prevent any git state modifications in the main repository
Shell compatibility: Changed pre-commit hook from bash to sh for better compatibility across different environments
Code cleanup: Extracted the complex Makefile shell logic into a dedicated script (scripts/build-custom-golangci-lint.sh) for better maintainability

Changes

Modified .pre-commit-config.yaml to use sh instead of bash
Updated Makefile custom-gcl target to build in an isolated temporary directory
Created scripts/build-custom-golangci-lint.sh to handle the build logic cleanly
The build process now copies necessary files to a temp directory, builds there, moves the binary back, and cleans up

Testing

Pre-commit hooks now run without git corruption
Custom golangci-lint binary builds successfully in isolation
No functional changes to the linting process itself

Remove unused --profile flag from auth commands @osterman (#1650)

## what

Removed the --profile flag and all associated references from the Atmos CLI auth commands.
This includes removing the flag definition in cmd/auth.go, its documentation, and related entries in test snapshots.

why

The --profile flag was added in a previous PR but was never implemented with any actual functionality.
This flag was causing confusion with other concepts of "profile" within Atmos and the broader AWS ecosystem (e.g., AWS CLI profiles, performance profiling).
Removing this dead code cleans up the codebase and improves clarity for users.

references

closes #1475
closes #1530

Summary by CodeRabbit

Breaking Changes
- Removed the --profile flag from authentication commands. Users can no longer specify a profile for authentication via this flag.
Documentation
- Updated help text and documentation to reflect removal of profile flag support.

Atmos Performance Optimization @aknysh (#1639)

## what

Comprehensive performance optimizations for Atmos achieving 5.2x (420%) faster execution and 92% memory reduction
Additional optimizations for `atmos describe affected` command achieving 70-85% performance improvement

why

Large-scale infrastructure configurations with hundreds of stacks and thousands of components experience slow processing times
High memory usage limits scalability and increases CI/CD costs
The atmos describe affected command was particularly slow when processing many stacks in CI/CD pipelines
Sequential processing and repeated file operations created bottlenecks

Performance Results

Core Atmos Operations (760 YAML files, 533 stacks, 8k components)

Execution time: 16 seconds → 3 seconds (5.2x faster, 80.9% reduction)
Heap allocations: 4.8 GB → 385 MB (92% reduction)
CPU utilization: ~180% → 261% (improved multi-core usage)

`atmos describe affected` Command

Overall improvement: 70-85% faster execution time
Parallel processing gain: 40-60% improvement from concurrent stack processing
File indexing gain: 60-80% reduction in PathMatch operations
Combined optimizations: Multiplicative performance improvements across all operations

Optimization Strategies

1. Algorithm Optimizations

O(1) YAML tag lookup replacing O(n) searches
Optimized deep merge operations reducing redundant checks
Early exit for custom tags preventing unnecessary processing
Custom deep comparison (15-25% faster than reflect.DeepEqual)

2. Caching Optimizations

Inheritance caching - Prevents recomputation of component inheritance chains
Parsed YAML caching - Reuses parsed YAML documents across operations
FindStacksMap caching - Caches expensive stack map operations
JSON schema compilation caching - Reuses compiled validation schemas
PathMatch caching - Caches glob pattern matching results
Sprig function caching - Memoizes expensive template function results
String interning - Reduces memory for duplicate strings
Component path pattern caching (P9.2) - 10-15% improvement by eliminating repeated path construction
Terraform module pattern caching (P9.7) - Avoids expensive tfconfig.LoadModule() calls

3. Concurrency Optimizations

Parallel import processing - Processes YAML imports concurrently
Worker pools for stack processing - Bounded concurrency with optimal resource usage
Parallel component processing - Concurrent analysis with race-free design
Parallel stack processing (P9.1) - 40-60% improvement from goroutines and channels
Thread-safe caching - sync.RWMutex for concurrent reads with minimal contention

4. I/O and Memory Optimizations

I/O batching - Reduces filesystem operations
Compiled template caching - Reuses parsed templates
YAML buffer pooling - Reduces GC pressure
Right-sized allocations - Pre-allocates with known capacities
Performance tracking optimization - Minimal overhead instrumentation
Changed files indexing (P9.4) - 60-80% reduction in PathMatch operations by filtering files by component type

5. Cache Correctness

Content-aware cache keys using SHA-256 hashing
Composite cache keys preventing false cache hits
Cache invalidation on content changes
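
A content-aware, composite cache key might look like the sketch below. The key layout (path, colon, hex digest) is an assumption for illustration; the notes only state that SHA-256 hashing and composite keys are used.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// cacheKey combines a file path with a SHA-256 digest of its content, so an
// edited file produces a new key and stale entries are never hit.
func cacheKey(path string, content []byte) string {
	sum := sha256.Sum256(content)
	return path + ":" + hex.EncodeToString(sum[:])
}

func main() {
	a := cacheKey("stacks/dev.yaml", []byte("vars: {region: us-east-1}"))
	b := cacheKey("stacks/dev.yaml", []byte("vars: {region: us-west-2}"))
	fmt.Println(a != b) // prints true
}
```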

`atmos describe affected` Optimizations (P9.1 - P9.7)

The recent optimizations specifically target the atmos describe affected command, which is critical for CI/CD pipelines:

P9.1: Parallel Stack Processing (40-60% improvement)

Replaces sequential stack processing with concurrent goroutines
Uses buffered channels and sync.WaitGroup for coordination
Processes multiple stacks simultaneously for dramatic speedup
Each stack processed in isolated goroutine with thread-safe result aggregation

P9.2: Component Path Pattern Caching (10-15% improvement)

Caches component path patterns to eliminate repeated string construction
Uses sync.RWMutex for thread-safe concurrent access
Reduces allocations and CPU overhead from pattern building
Cache keys based on component name and type
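
A read-mostly cache guarded by sync.RWMutex, keyed by component type and name, could be sketched like this. The patternCache type, its method names, and the example pattern string are all hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

type patternCache struct {
	mu    sync.RWMutex
	items map[string]string
}

func newPatternCache() *patternCache {
	return &patternCache{items: map[string]string{}}
}

// get returns the cached pattern for (componentType, component), calling
// build only on a miss.
func (c *patternCache) get(componentType, component string, build func() string) string {
	key := componentType + "/" + component

	c.mu.RLock() // many readers may hold this at once
	v, ok := c.items[key]
	c.mu.RUnlock()
	if ok {
		return v
	}

	c.mu.Lock()
	defer c.mu.Unlock()
	if v, ok := c.items[key]; ok { // re-check: another goroutine may have filled it
		return v
	}
	v = build()
	c.items[key] = v
	return v
}

func main() {
	cache := newPatternCache()
	builds := 0
	build := func() string {
		builds++
		return "components/terraform/vpc/**"
	}
	cache.get("terraform", "vpc", build)
	cache.get("terraform", "vpc", build)
	fmt.Println("builds:", builds) // prints "builds: 1"
}
```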

P9.4: Changed Files Indexing (60-80% reduction in PathMatch operations)

Pre-indexes changed files by component type (Terraform, Helmfile, Packer)
Filters files before pattern matching to reduce comparisons
Handles unmatched files with fallback to all base paths for safety
Dramatically reduces expensive glob pattern matching operations
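
One way to picture the pre-indexing step: bucket each changed file by the component type whose base path contains it, and keep unmatched files in a separate bucket that the caller still checks against every type. The function name, the base path values, and the use of the empty string as the "unmatched" bucket are assumptions made for this sketch, which also assumes base paths do not nest inside each other.

```go
package main

import (
	"fmt"
	"strings"
)

// indexChangedFiles groups changed files by component type.
// Files under no known base path land in the "" bucket.
func indexChangedFiles(files []string, basePaths map[string]string) map[string][]string {
	index := make(map[string][]string)
	for _, f := range files {
		bucket := ""
		for typ, base := range basePaths {
			if strings.HasPrefix(f, strings.TrimSuffix(base, "/")+"/") {
				bucket = typ
				break
			}
		}
		index[bucket] = append(index[bucket], f)
	}
	return index
}

func main() {
	bases := map[string]string{
		"terraform": "components/terraform",
		"helmfile":  "components/helmfile",
		"packer":    "components/packer",
	}
	idx := indexChangedFiles([]string{
		"components/terraform/vpc/main.tf",
		"components/helmfile/nginx/helmfile.yaml",
		"README.md",
	}, bases)
	fmt.Println(len(idx["terraform"]), len(idx["helmfile"]), len(idx[""])) // prints 1 1 1
}
```

With an index like this, a Terraform component only needs glob matching against the Terraform bucket plus the unmatched bucket, rather than against every changed file.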

P9.5: Custom Deep Comparison (15-25% improvement)

Replaces reflect.DeepEqual with optimized type-specific comparison
Uses type assertions for common types (maps, slices, strings, numbers)
Fallback to reflect.DeepEqual only for uncommon types
Reduces reflection overhead for component configuration comparisons
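
The type-assertion fast path with a reflect.DeepEqual fallback can be sketched as below. fastEqual is a hypothetical name and the set of handled types is chosen for illustration. One behavioral difference to be aware of in this sketch: it treats a nil map and an empty map as equal, which reflect.DeepEqual does not.

```go
package main

import (
	"fmt"
	"reflect"
)

// fastEqual compares common decoded-YAML shapes without reflection and falls
// back to reflect.DeepEqual for everything else.
func fastEqual(a, b any) bool {
	switch av := a.(type) {
	case string:
		bv, ok := b.(string)
		return ok && av == bv
	case bool:
		bv, ok := b.(bool)
		return ok && av == bv
	case int:
		bv, ok := b.(int)
		return ok && av == bv
	case float64:
		bv, ok := b.(float64)
		return ok && av == bv
	case map[string]any:
		bv, ok := b.(map[string]any)
		if !ok || len(av) != len(bv) {
			return false
		}
		for k, x := range av {
			y, present := bv[k]
			if !present || !fastEqual(x, y) {
				return false
			}
		}
		return true
	case []any:
		bv, ok := b.([]any)
		if !ok || len(av) != len(bv) {
			return false
		}
		for i := range av {
			if !fastEqual(av[i], bv[i]) {
				return false
			}
		}
		return true
	default:
		return reflect.DeepEqual(a, b)
	}
}

func main() {
	a := map[string]any{"vars": map[string]any{"region": "us-east-1"}}
	b := map[string]any{"vars": map[string]any{"region": "us-east-1"}}
	c := map[string]any{"vars": map[string]any{"region": "us-west-2"}}
	fmt.Println(fastEqual(a, b), fastEqual(a, c)) // prints true false
}
```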

P9.7: Terraform Module Pattern Caching

Caches tfconfig.LoadModule() results to avoid repeated filesystem operations
Stores module patterns for each component
Thread-safe cache with sync.RWMutex
Eliminates expensive Terraform module discovery on every comparison

Combined Effect

These optimizations compound multiplicatively:

Parallel processing + file indexing = 70-85% overall improvement
Pattern caching + custom comparison = Additional 25-40% on top
All optimizations work together without conflicts

Code Quality

Testing

18 test functions with 51+ sub-tests for new optimizations
Thread-safety tests with concurrent goroutines and race detector
Integration tests verifying end-to-end functionality
Benchmark tests measuring performance improvements
100% test pass rate - All existing and new tests passing

Code Organization

Clear separation between reference implementations and optimized versions
Comprehensive nolint comments explaining complexity trade-offs
Helper function extraction to reduce cyclomatic complexity
Constants for magic strings improving maintainability
Thread-safe data structures with proper synchronization

Linting & Quality

✅ 0 linting issues - All golangci-lint checks pass
✅ Proper error handling with static errors from errors/errors.go
✅ Performance tracking with defer perf.Track() on all functions
✅ Comment standards - All comments end with periods
✅ Import organization - Three-group structure (stdlib, 3rd-party, Atmos)

Supporting changes:

Multiple files with algorithm, caching, and concurrency optimizations
Test coverage for all optimization strategies
Documentation updates

Status

✅ All tests passing (100% pass rate)
✅ All linting checks passing (0 issues)
✅ Thread-safety verified with race detector
✅ No functional regressions
✅ Backwards compatible - no breaking changes

Migration Notes

No breaking changes - all optimizations are internal improvements. Users will automatically benefit from:

Faster atmos describe stacks execution
Faster atmos describe affected execution
Lower memory usage
Better multi-core CPU utilization
Reduced CI/CD pipeline execution times

Summary by CodeRabbit

Performance Improvements
- Parallelized component processing, added multiple caches, and reduced allocations via capacity hints and string interning for faster, lower-memory operations.
- TTY-aware fast-paths for console output to avoid expensive formatting when piped.
New Features
- Provenance tracking surfaced in describe outputs.
- Faster YAML/JSON “simple” output modes for non-interactive use.
- Improved multi-component type handling and safer deep-copying of merged data.
Chores
- Dependency updates and VCS ignore adjustments.

fix: lintroller not enforcing violations due to || true in Makefile @osterman (#1636)

## what

Removed `|| true` from the `make lintroller` target in Makefile to properly enforce linting violations
Updated lintroller comment to include os.MkdirTemp rule
Removed unnecessary `grep -v "^#"` filter
**Integrated lintroller as a golangci-lint module plugin** for CI workflows
Added lintroller configuration in `.golangci.yml` settings.custom section
Updated CodeQL workflow to build and use custom golangci-lint binary with lintroller
Added comprehensive test coverage for all lintroller rules (os-setenv-in-test, os-mkdirtemp-in-test)
Renamed testdata file to `bad_test.go` so lintroller rules properly apply

why

Makefile bug: The || true was causing lintroller to always exit with code 0 (success), even when violations were detected, so pre-commit hooks would pass
Plugin integration: Lintroller was only running as standalone binary, not integrated with golangci-lint CI workflow
GitHub Advanced Security: Plugin integration enables SARIF output for CodeQL integration, showing lintroller findings in GitHub's security dashboard alongside other linters
Unified linting: Custom golangci-lint binary with lintroller provides single tool for all linting (standard + custom rules)
PR #1634 initially contained os.MkdirTemp violations that weren't caught because pre-commit lintroller had || true
The developer had to manually fix the violations in a second commit after realizing the issue
Test coverage was incomplete - only covered t.Setenv in defer, missing os.Setenv and os.MkdirTemp tests

references

Related to PR #1634 which had os.MkdirTemp violations that weren't caught by CI
The lintroller rules themselves work correctly (added in PR #1630)
Two issues fixed:
1. Makefile || true prevented pre-commit hooks from failing
2. Lintroller plugin wasn't integrated into golangci-lint CI workflow
Now both pre-commit hooks and golangci-lint CI will properly catch violations
Lintroller findings appear in GitHub Advanced Security with SARIF integration

Summary by CodeRabbit

Chores
- CI and local tooling now build and run a custom lint binary (with the new plugin), ensure it’s executable, filter packages from scans, and upload SARIF results; workflows, pre-commit, and Makefile updated to use this custom lint flow. Linter config enables new env/temp-dir checks.
Tests
- Test data refreshed with compliant/non-compliant examples; many tests refactored to use testing helpers and t.Setenv; benchmarks adjusted.
Documentation
- README expanded with plugin/module guidance, build/run CI examples, and config templates.

Add technical blog post for provenance tracking feature @osterman (#1635)

## what

Add technical documentation blog post announcing the provenance tracking feature
Explain the problem, solution, and practical use cases for developers
Include examples, symbol explanations, and comparison with existing `sources` system
Link to full documentation at `/cli/commands/describe/component`

why

Document the recently released --provenance flag feature that users have been requesting
Provide practical examples and use cases for developers
Create technical (not marketing) content showing the value of provenance tracking
Help users understand how to debug configuration inheritance issues

references

Related PR: #1584 (original provenance implementation)
Documentation: website/docs/cli/commands/describe/describe-component.mdx
PRD: docs/prd/import-provenance.md

Summary by CodeRabbit

Documentation
- Added a comprehensive guide on configuration provenance tracking: introduces the new --provenance option for describe, explains line-level provenance for nested values/arrays/maps, shows terminal rendering (TTY vs non-TTY) with ANSI symbols and color by import depth, and documents YAML/JSON outputs including provenance metadata. Includes usage examples, real-world use cases (debugging, audits, automation), performance notes, a Get Started walkthrough, and future enhancements.

feat: Add command registry pattern foundation with 100% test coverage @osterman (#1643)

## what

Introduce command registry pattern for organizing Atmos built-in commands
Implement `cmd/internal` package with `CommandProvider` interface and registry
Migrate `about` command as proof-of-concept using new pattern
Add comprehensive documentation (PRD, developer guide, test coverage report)
Achieve 100% test coverage for all new components (14 tests passing)

why

Modular organization: Each command family will live in its own package (e.g., cmd/terraform/, cmd/describe/)
Plugin readiness: Foundation for future external plugin support
Backward compatibility: Coexists seamlessly with custom commands from atmos.yaml
Self-registering: Commands register via init() functions - no manual command list maintenance
Future-proof: Enables incremental migration of 115+ existing commands in separate PRs

changes

New Infrastructure

cmd/internal/command.go - CommandProvider interface definition
cmd/internal/registry.go - Thread-safe command registry implementation
cmd/internal/registry_test.go - Comprehensive tests (12 test cases, 100% coverage)
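
As a rough illustration of the self-registering pattern described above, here is a minimal, stdlib-only sketch. The method set of `CommandProvider`, the `Register`/`Get`/`Names` helpers, and the `aboutCommand` type are assumptions made for this example; the actual `cmd/internal` package is not shown in these notes and presumably works with the CLI framework's command type.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// CommandProvider is a hypothetical, stdlib-only stand-in for the interface
// named above. The real interface presumably returns a CLI framework command.
type CommandProvider interface {
	Name() string
	Run(args []string) string
}

var (
	mu        sync.RWMutex
	providers = map[string]CommandProvider{}
)

// Register stores a provider under its name. A later registration with the
// same name replaces the earlier one.
func Register(p CommandProvider) {
	mu.Lock()
	defer mu.Unlock()
	providers[p.Name()] = p
}

// Get returns the provider registered under name, if any.
func Get(name string) (CommandProvider, bool) {
	mu.RLock()
	defer mu.RUnlock()
	p, ok := providers[name]
	return p, ok
}

// Names lists the registered command names in sorted order.
func Names() []string {
	mu.RLock()
	defer mu.RUnlock()
	names := make([]string, 0, len(providers))
	for name := range providers {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

// aboutCommand is an illustrative provider, not the real about command.
type aboutCommand struct{}

func (aboutCommand) Name() string             { return "about" }
func (aboutCommand) Run(args []string) string { return "About Atmos" }

// init self-registers the command, so no central command list is maintained.
func init() { Register(aboutCommand{}) }

func main() {
	for _, name := range Names() {
		p, _ := Get(name)
		fmt.Printf("%s: %s\n", name, p.Run(nil))
	}
}
```

Because registration happens in `init()`, importing a command package (for example with a blank import) is enough to make the command available.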

Proof-of-Concept Migration

cmd/about/about.go - Migrated command using registry pattern
cmd/about/about_test.go - Tests (2 test cases, 100% coverage)
cmd/about/markdown_about.md - Embedded markdown content
cmd/about.go - Marked deprecated (for cleanup in future PR)
cmd/about_test.go - Marked deprecated (for cleanup in future PR)

Root Command Integration

cmd/root.go - Added registry registration and blank import for about command

Documentation

docs/prd/command-registry-pattern.md (1,192 lines) - Complete PRD with:
- Architecture overview and design decisions
- Integration with custom commands from atmos.yaml
- Three nested command patterns (static, dynamic, deeply nested)
- Migration guide for future command families
- Testing strategy and FAQ
docs/developing-atmos-commands.md (540 lines) - Developer guide with:
- Quick start tutorial for creating new commands
- Four command patterns with code examples
- Best practices and common issues
- Testing guidelines and checklist
docs/prd/command-registry-test-coverage.md - Test coverage report
CLAUDE.md - Updated "Adding New CLI Command" section with registry pattern

testing

Test Coverage: 100%

$ go test ./cmd/internal/... ./cmd/about/... -cover
ok  	github.com/cloudposse/atmos/cmd/internal	0.192s	coverage: 100.0%
ok  	github.com/cloudposse/atmos/cmd/about	    0.594s	coverage: 100.0%

All Tests Passing

✅ 12/12 registry tests passing
✅ 2/2 about command tests passing
✅ Binary builds successfully
✅ atmos about command verified working

Test Categories

Command registration (single, multiple, override)
Provider retrieval and grouping
Batch operations and error handling
Nested and deeply nested command hierarchies
Thread safety (concurrent registration)
CommandProvider interface implementation
Command execution with output verification

backward compatibility

✅ Zero breaking changes

All existing commands continue working unchanged
Custom commands from atmos.yaml still override built-in commands
Command aliases still functional
Execution order preserved: built-in → custom → aliases

next steps

This PR establishes the foundation. Future command migrations will follow this process:

Pick a command family (list, describe, terraform, etc.)
Follow migration guide in docs/prd/command-registry-pattern.md
Submit independent PR per command family

Suggested migration order:

Simple commands: version, support, completion
Static subcommands: list, validate
Complex families: describe, terraform, helmfile
Cloud integrations: aws, atlantis

references

Command Registry Pattern PRD
Developing Atmos Commands Guide
Test Coverage Report
Implements patterns from Docker CLI and kubectl command organization

Summary by CodeRabbit

New Features
- Built-in commands now load automatically at startup, improving consistency of CLI behavior and help organization.
- About command remains available with embedded markdown content and clearer output.
Refactor
- Migrated CLI commands to a registry-based pattern for more modular, maintainable command management.
Documentation
- Added comprehensive guides on creating and organizing commands, including command groups, patterns, best practices, and migration notes.
- Introduced a product requirements document outlining the command registry approach.
Tests
- Expanded unit tests for the About command and command registration flows.

Add Announcements Blog @osterman (#1161)

## what

- Provide an area for announcements

why

A lot is changing in Atmos, and it is difficult to get the big picture just by following releases

Summary by CodeRabbit

New Features
- Added a dedicated release-notes blog (Atmos Changelog) with pagination (10 posts/page), MD/MDX inclusion, and a welcome post introducing the changelog.
Chores
- Added a CI workflow that validates PRs require changelog entries and posts a guidance comment when a required entry is missing.
Documentation
- Added contributor guidelines describing when and how to add release-note blog posts.

Replace os.Chdir with t.Chdir in all tests @osterman (#1638)

## what

- Updates all test files to use `t.Chdir()` introduced in Go 1.24
- Adds new lintroller rule to detect `os.Chdir` usage in test files
- Adds forbidigo pattern to catch `os.Chdir()` usage
- Updates 42 test files across the codebase
- Removes manual directory cleanup code (`os.Getwd`/`defer`/`os.Chdir`)

why

t.Chdir() automatically restores the working directory when the test completes, providing better test isolation
Simplifies test code by eliminating 5-6 lines of boilerplate per directory change
Provides fail-fast behavior on directory change errors
Maintains consistency with existing patterns for t.Setenv() and t.TempDir()
Reduces net lines of code by ~600 lines while improving test quality

references

Follows the same pattern as t.Setenv() and t.TempDir() migrations
Go 1.24 feature: https://tip.golang.org/doc/go1.24#testing

🤖 Generated with Claude Code

fix: `!exec` returning `executable not found` - `!exec` should preserve `PATH` @ohaibbq (#1634)

## what

#1543 introduced a breaking change to the !exec yaml function whereby the parent process's environment was not inherited if an environment was explicitly passed. !exec uses ExecuteShellAndReturnOutput which passes the shell level as an environment variable, causing the PATH not to be inherited.

atmos/pkg/utils/shell_utils.go

Lines 47 to 55 in 82ff97a

    env = append(env, fmt.Sprintf("ATMOS_SHLVL=%d", newShellLevel))

    log.Debug("Executing", "command", command)

    if dryRun {
        return "", nil
    }

    err = ShellRunner(command, name, dir, env, &b)

why

Stacks using !exec with custom binaries fail with executable not found errors

references

#1543

Summary by CodeRabbit

Bug Fixes
- Shell-executed commands now inherit environment variables, improving compatibility with scripts and external tools.
Tests
- Added comprehensive cross-platform unit tests for execution outputs (strings, numbers, empty), complex JSON parsing (objects, arrays, booleans, null), invalid JSON handling, and error scenarios (invalid commands, malformed input) using temporary executables and PATH manipulation.

`auth` docs improvement - `account.id` vs `account.name` in type `aws/permissionset` @Benbentwo (#1632)

## what

This pull request updates documentation for specifying AWS account references in identity configurations, improving clarity and flexibility for users. The most important changes provide guidance on using either account names or IDs when configuring AWS identities, and update examples to reflect these options.

Documentation improvements for AWS account specification:

Added explanations in pkg/auth/docs/PRD/PRD-Atmos-Auth.md about specifying AWS accounts using either account.name (recommended) or account.id, with examples for both methods.
Updated example configurations in pkg/auth/docs/ARCHITECTURE.md and pkg/auth/docs/UserGuide.md to use descriptive account names (e.g., "sandbox", "production", "development") instead of numeric IDs, and included comments showing how to use the account ID directly.

Minor documentation adjustments:

Minor formatting and clarification changes in pkg/auth/docs/PRD/PRD-Atmos-Auth.md to improve readability of the auth: section.

why

Improve docs for better getting started experience

references

#1475

Summary by CodeRabbit

Documentation
- Updated authentication guides to use environment-based account names (e.g., “production”, “development”) instead of numeric IDs in examples.
- Added explicit guidance for configuring AWS Permission Set identities.
- Provided parallel example configurations for specifying accounts by name or by ID (with an alternative commented option).
- Restored environment variable blocks in relevant examples and refined example formatting.
- Minor text/whitespace cleanups.
- No functional or behavioral changes.

Migrate tests from `os.MkdirTemp` to `t.TempDir` for automatic cleanup @osterman (#1630)

## what

- Migrated 42 instances across 15 test files from manual temporary directory management (`os.MkdirTemp` + `defer os.RemoveAll`) to automatic cleanup using `t.TempDir()`
- Added new lintroller rule `os-mkdirtemp-in-test` to prevent future violations
- Removed helper functions that were only wrapping `os.MkdirTemp`
- Fixed variable declaration issues where `err` was undefined or redeclared

why

t.TempDir() provides automatic cleanup that runs even when tests fail or panic
Improves test hygiene and prevents temp directory leaks
Reduces boilerplate code in tests
Follows modern Go testing best practices (available since Go 1.15)
Lintroller rule enforces this pattern going forward, similar to our os-setenv-in-test rule
No deliberate directory reuse patterns were found - all instances used immediate defer cleanup

references

Part 1: Migrated 22 instances across 6 files
- pkg/downloader: Removed createTempDir helper, migrated 4 instances
- pkg/generate: Migrated 1 instance, removed unused imports
- pkg/filematch: Migrated 1 instance
- internal/exec/copy_glob_test.go: Migrated 11 instances
- internal/exec/describe_affected_utils_2_test.go: Migrated 1 instance
- internal/exec/describe_stacks_test.go: Migrated 4 instances
Lintroller rule:
- Added tools/lintroller/rule_os_mkdirtemp.go to detect os.MkdirTemp in test files
- Registered in both standalone and golangci-lint plugin modes
- Updated README with documentation and examples
- Allows exceptions for benchmark functions (similar to os-setenv-in-test rule)
Part 2: Migrated 20 instances across 9 files
- internal/exec/oci_utils_test.go: 1 instance
- internal/exec/terraform_output_utils_integration_test.go: 2 instances
- internal/exec/terraform_plan_diff_main_test.go: 1 instance
- internal/exec/terraform_plan_diff_test.go: 2 instances
- internal/exec/terraform_utils_test.go: 3 instances
- internal/exec/validate_component_test.go: 3 instances
- tests/cli_test.go: 1 instance
- tests/testhelpers/sandbox_test.go: 9 instances
Note: pkg/list/list_vendor_test.go was intentionally kept using os.MkdirTemp("", "atmos-test-vendor") because the code has hardcoded logic that checks for this specific directory name pattern to enable test mode behavior

🤖 Generated with Claude Code

Summary by CodeRabbit

New Features
- Added a linter rule that flags use of os.MkdirTemp in tests, recommending t.TempDir. Enabled by default and configurable via plugin settings.
Documentation
- Updated linting documentation to include the new rule and revised defaults.
Tests
- Migrated tests to use t.TempDir for automatic cleanup, simplifying setup and removing manual cleanup code.
- Minor scoping cleanups; no changes to test behavior or public APIs.

Improve test quality: fix tests without assertions and convert skip-only tests @osterman (#1621)

## what

- Fix 13 tests that had no assertions (always passed)
- Convert 16 skip-only tests to build tags for proper platform-specific testing
- Add test coverage for infrastructure wrappers (80%+ patch coverage)
- Fix flaky performance test timing assertion

why

Tests without assertions provide false confidence and always pass
Skip-only tests inflate coverage metrics without actually testing anything
Platform-specific tests should use build tags, not runtime.GOOS checks
Infrastructure code needs test coverage to meet CodeCov requirements

Phase 4.1: Tests Without Assertions (13 tests fixed)

cmd/cmd_utils_test.go - Added error checking for missing config
cmd/describe_affected_test.go - Added gomock assertions
cmd/describe_dependents_test.go - Added gomock assertions
cmd/describe_stacks_test.go - Added error checks
cmd/describe_workflows_test.go - Added error checks
cmd/root_heatmap_test.go - Added TTY state assertions
internal/exec/oci_utils_test.go - Added directory existence checks
internal/exec/terraform_plan_diff_test.go - Added diff format assertions
internal/exec/version_test.go - Added version output checks
pkg/auth/utils/env_test.go - Added panic recovery for nil-safety
pkg/config/load_error_paths_test.go - Documented untestable error path
pkg/describe/describe_stacks_test.go - Added gomock expectations
pkg/downloader/get_git_test.go - Fixed timeout test assertions

Phase 4.2: Skip-Only Tests (16 tests converted to build tags)

pkg/config (2 tests)

load_test.go - Moved 2 Windows case-insensitive path tests to separate file with Windows build tag

pkg/downloader (3 tests)

custom_git_detector - Created Unix symlink test file
get_git - Created Windows git version parsing test
gogetter_downloader - Created Unix file:// URL test

pkg/utils (3 tests)

file_utils - Added platform-specific path tests to existing Unix/Windows files
component_path_absolute - Created Unix component path test file

internal/exec (6 tests)

copy_glob - Created Unix file with 5 platform-specific tests (directory exclusion, symlink handling, glob patterns, permissions, prefix exclusion)
template_utils - Created Unix permission check test file

Infrastructure Test Coverage (NEW)

Added comprehensive test coverage for all infrastructure wrappers to meet CodeCov 80% patch coverage requirement:

New Test Files (426 lines)

internal/exec/stacks_processor_test.go - Tests DefaultStacksProcessor wrapper delegation
pkg/config/interface_test.go - Tests DefaultLoader wrapper delegation
pkg/git/interface_test.go - Tests DefaultRepositoryOperations wrapper (with/without git repos)
pkg/pro/interface_test.go - Tests DefaultClientFactory wrapper (success and error cases)
tests/testhelpers/cobra_test.go - Tests mock command creation helpers (string/bool/mixed flags)
tests/testhelpers/filesystem_test.go - Tests MockFSBuilder fluent interface

Coverage Results

internal/exec: 59.7%
pkg/config: 76.8%
pkg/git: 70.2%
pkg/pro: 81.7%
tests/testhelpers: 67.4%

Test Stability Fix

Fixed flaky TestMultipleRecursiveFunctionsIndependent:

Increased recursion depth difference (100 vs 20 instead of 50 vs 25)
Changed from strict inequality to ratio-based assertion (>2.0x)
Added tolerance for scheduling jitter and timing variance

Testing

All modified tests passing on macOS
Windows compilation verified for all packages
Pre-commit hooks passing
56 files modified/created total
Patch coverage improved from 36% to 80%+

references

Phase 4.1: Fix tests without assertions (100% complete)
Phase 4.2: Convert skip-only tests to build tags (100% complete)
Phase 4.3: Enable errcheck linter - tracked in issue #1620

Summary by CodeRabbit

Refactor
- Added pluggable interfaces and dependency-injection hooks to improve testability with no user-visible behavior changes.
Tests
- Expanded unit and platform-specific tests, added panic-safety checks, error-path validations, and benchmarks for instance processing.
- New reusable test helpers for command and filesystem mocking; many tests reorganized by OS.
Documentation
- Added guidance for the new test helpers.
Chores
- Updated ignore patterns to exclude local test-quality docs.
Bug Fixes
- Improved robustness via additional test coverage and explicit error assertions.

Replace deprecated go-homedir with local vendored copy @osterman (#1631)

## what

- Add `replace` directive in go.mod to redirect all `mitchellh/go-homedir` imports to `pkg/config/homedir`
- Update Atmos code to import `github.com/mitchellh/go-homedir` instead of internal path `github.com/cloudposse/atmos/pkg/config/homedir`
- Create go.mod in `pkg/config/homedir` to make it a valid replacement module
- All transitive dependencies now use Atmos's maintained fork instead of the deprecated package

why

Mitchell Hashimoto's go-homedir package is deprecated and no longer maintained
Multiple transitive dependencies (google/go-containerregistry, hashicorp/*, hairyhenderson/gomplate, etc.) were pulling in the deprecated package
Using a replace directive allows us to maintain a single, updated implementation while remaining compatible with dependencies that still reference the deprecated import path
This eliminates the deprecated dependency from the module graph while maintaining full API compatibility

references

Deprecated package: https://github.com/mitchellh/go-homedir
Related discussion about replace directive approach

🤖 Generated with Claude Code

Summary by CodeRabbit

Refactor
- Standardized the home directory resolution across the application using an external library; behavior remains unchanged for end-users.
Documentation
- Clarified comments regarding the home directory provider implementation.
Chores
- Updated module configuration and vendoring to align with the new dependency structure.
- Cleaned up imports to remove legacy references and ensure consistency.

No user-facing functionality changes are expected as a result of these updates.

Replace `os.Setenv` with `t.Setenv` in test files for proper cleanup @osterman (#1625)

## what

- Replaced all `os.Setenv` calls with `t.Setenv` in test files (50+ files modified)
- Created **lintroller**, a custom Go static analysis linter for Atmos-specific rules
- Integrated lintroller with golangci-lint module plugin system
- Removed unnecessary manual cleanup code (defer blocks with `os.Unsetenv`)
- Refactored over-engineered tests to use proper Go testing conventions
- Added comprehensive test quality improvements

why

t.Setenv automatically restores environment variables after each test/subtest
Eliminates ~750 lines of boilerplate cleanup logic
Provides better isolation between parallel subtests
Prevents manual cleanup errors and improves test reliability
Prevents future regressions with automated linting
Enables automatic GitHub PR annotations for t.Setenv violations
Follows Go 1.17+ testing best practices

lintroller - Custom Static Analysis Linter

Created a production-ready custom linter with two rules:

1. `tsetenv-in-defer` Rule

Detects t.Setenv called inside defer or t.Cleanup blocks (will panic or have no effect):

// ❌ BAD - Will panic
defer func() {
    t.Setenv("FOO", "bar")
}()

// ✅ GOOD - Automatically restored
t.Setenv("FOO", "bar")

2. `os-setenv-in-test` Rule

Detects os.Setenv in test files (should use t.Setenv instead):

// ❌ BAD - Manual cleanup needed
os.Setenv("PATH", "/test/path")

// ✅ GOOD - Automatic cleanup
t.Setenv("PATH", "/test/path")

Exceptions: Allows os.Setenv in defer blocks, t.Cleanup, and benchmark functions.

Lintroller Integration (Three Modes)

Standalone binary: make lintroller
Pre-commit hook: Automatic via .pre-commit-config.yaml
golangci-lint plugin: golangci-lint custom && ./custom-gcl run

Architecture Highlights

Interface-based design for extensibility (easy to add new rules)
Dual-mode support: Standalone CLI + golangci-lint plugin
Module plugin system: Implements register.LinterPlugin interface
Auto-registration: Uses init() with register.Plugin("lintroller", New)
Comprehensive docs: tools/lintroller/README.md

Test Quality Improvements

Refactored over-engineered tests that were doing manual environment manipulation:

shell_utils_test.go: Removed os.Clearenv() and 20+ lines of manual save/restore
packer_test.go: Removed manual PATH manipulation in subtests
homedir_test.go: Removed patchEnv helper function entirely
git_getter_test.go: Simplified PATH manipulation

Files Changed

Core Migration

50+ test files updated to use t.Setenv
Removed ~750 lines of manual cleanup boilerplate

Lintroller Tool

tools/lintroller/ - Complete custom linter implementation
- plugin.go - golangci-lint integration
- rule_tsetenv_in_defer.go - Defer detection rule
- rule_os_setenv.go - Test file detection rule
- cmd/lintroller/main.go - Standalone CLI
- README.md - Comprehensive documentation
.custom-gcl.yml - golangci-lint custom build config
Makefile - Added lintroller target
.pre-commit-config.yaml - Added lintroller hook
.gitignore - Added lintroller binaries

references

🤖 Generated with Claude Code

chore: upgrade bearsh/hid to v1.6.0 to fix macOS deprecation warning @osterman (#1629)

## what

- Upgrade `bearsh/hid` transitive dependency from v1.3.0 to v1.6.0
- Add `replace` directive in `go.mod` to force the upgraded version
- Eliminate macOS 12+ deprecation warning for `kIOMasterPortDefault`

why

The older bearsh/hid v1.3.0 uses the deprecated kIOMasterPortDefault constant

This causes build warnings on macOS 12 and later:

warning: 'kIOMasterPortDefault' is deprecated: first deprecated in macOS 12.0

The latest version (v1.6.0) no longer uses the deprecated constant
Since bearsh/hid is a transitive dependency via versent/saml2aws/v2 (used for Atmos auth), we use a replace directive to ensure the upgraded version is used

references

Dependency chain: atmos → versent/saml2aws/v2@v2.36.19 → marshallbrekka/go-u2fhost → bearsh/hid@v1.3.0
The replace directive overrides the version to use bearsh/hid@v1.6.0

Summary by CodeRabbit

Chores
- Updated the HID library to v1.6.0 to standardize module resolution.
- Improves build reproducibility and reduces variations between installations.
- No changes to UI, workflows, or runtime behavior.
- Internal maintenance only; end-user experience remains unchanged.

Fix flaky TestMultipleRecursiveFunctionsIndependent test @osterman (#1626)

## what

- Fix flaky `TestMultipleRecursiveFunctionsIndependent` test in `pkg/perf/perf_test.go`
- Simplify test by removing unnecessary recursion complexity
- Rename test to `TestMultipleFunctionsTrackIndependently` to match its actual purpose

why

The test was failing intermittently on Windows with: func1: expected non-zero duration, got 0s
The test name suggested it was about recursive functions, but recursion added complexity without testing value
Other tests already cover recursive tracking behavior (TestRecursiveFunctionTracking, TestRecursiveFunctionWrongPattern)
The actual purpose is to verify that multiple functions track independently without interfering with each other's metrics

changes

Removed all recursion logic - simplified to two basic functions
Renamed test from TestMultipleRecursiveFunctionsIndependent to TestMultipleFunctionsTrackIndependently
Each function sleeps 1ms to ensure measurable duration on all platforms
Test verifies core behavior:
- Each function has count=1 (independent tracking)
- Both functions tracked separately in registry (2 entries)
- Both functions have non-zero duration (duration tracking works)
Removed flaky timing comparisons that caused intermittent failures
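
The shape of the simplified test can be sketched like this. The `tracker` type and its `Track` method are hypothetical stand-ins for the `pkg/perf` API, which is not shown in these notes; the point is only that the checks are on counts, number of entries, and non-zero durations rather than on timing comparisons between functions.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// metric holds the call count and total duration for one function.
type metric struct {
	Count int
	Total time.Duration
}

// tracker is a hypothetical registry of per-function metrics.
type tracker struct {
	mu      sync.Mutex
	metrics map[string]*metric
}

func newTracker() *tracker {
	return &tracker{metrics: map[string]*metric{}}
}

// Track starts timing name and returns a func that records the elapsed time.
func (t *tracker) Track(name string) func() {
	start := time.Now()
	return func() {
		elapsed := time.Since(start)
		t.mu.Lock()
		defer t.mu.Unlock()
		m, ok := t.metrics[name]
		if !ok {
			m = &metric{}
			t.metrics[name] = m
		}
		m.Count++
		m.Total += elapsed
	}
}

// func1 and func2 each sleep 1ms so the recorded duration is measurable.
func func1(t *tracker) {
	defer t.Track("func1")()
	time.Sleep(time.Millisecond)
}

func func2(t *tracker) {
	defer t.Track("func2")()
	time.Sleep(time.Millisecond)
}

func main() {
	t := newTracker()
	func1(t)
	func2(t)
	for _, name := range []string{"func1", "func2"} {
		m := t.metrics[name]
		fmt.Printf("%s: count=%d nonzero=%v\n", name, m.Count, m.Total > 0)
	}
}
```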

references

Fixes flaky test introduced in #1611
Simpler approach than #1621 (which still had duration issues on Windows)
Test now passes consistently across all platforms

🤖 Generated with Claude Code

Summary by CodeRabbit

Tests
- Renamed a test to better reflect independent tracking behavior.
- Simplified the scenario from recursive calls to straightforward function calls.
- Updated assertions to verify two distinct entries with non-zero durations and single-call counts.
- Improved error messages and comments for clarity.
Refactor
- Streamlined test flow by removing recursion and using simpler work units, improving readability and maintainability without changing user-facing functionality.

Update screengrabs for v1.194.1 @[cloudposse-internal[bot]](https://github.com/apps/cloudposse-internal) (#1627)

This PR updates the screengrabs for Atmos version v1.194.1.

🚀 Enhancements

Fix: Respect --no-color flag in syntax highlighting @osterman (#1652)

## what

The HighlightCodeWithConfig function in pkg/utils/highlight_utils.go has been updated to correctly respect the --no-color flag and Settings.Terminal.Color configuration.
Previously, this function did not check these settings, leading to colored output even when colors were explicitly disabled for JSON and YAML formatting.
This fix ensures that all syntax highlighting, including for JSON and YAML outputs, adheres to the terminal color settings.

why

This addresses a regression introduced on May 14, 2025, when the --no-color flag was added to the CLI.
While the markdown renderer was correctly updated to respect the flag, the core syntax highlighting function (HighlightCodeWithConfig) was overlooked.
This resulted in an inconsistency where JSON and YAML outputs continued to display color codes despite the user's intent to disable them, impacting commands like atmos describe stacks.
The fix aligns the behavior of all code highlighting with the user's --no-color preference and terminal color settings.

references

closes #1651
DEV-3701
Audit Report: NO_COLOR_BUG_AUDIT.md

Summary by CodeRabbit

Bug Fixes
- The --no-color flag now works correctly across all syntax highlighting utilities, ensuring ANSI color codes are excluded from output when the flag is enabled.
- Improved handling of nil configurations to prevent errors during output highlighting.
Tests
- Added comprehensive tests to verify no-color behavior across highlighting features and the Describe Stacks command.

fix: correct pager default to false as intended by PR #1430 @osterman (#1642)

## what

- Fix regression where pager was enabled by default despite PR #1430 intending to disable it
- Change `v.SetDefault("settings.terminal.pager", true)` to `false` in `pkg/config/load.go`

why

PR #1430 (commit 08a44dd) documented a BREAKING CHANGE: "Pager is now disabled by default"
However, the actual default value in setDefaultConfiguration() was never changed from true to false
This caused the pager to remain enabled by default for users without explicit pager: false in their atmos.yaml
The repository's own atmos.yaml has pager: false which masked the issue locally
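
The masking effect can be illustrated with a small sketch. `pagerEnabled` is a hypothetical helper that models "explicit user setting wins, otherwise the built-in default applies"; it does not use the real configuration loader. With an explicit `pager: false`, the old and new defaults behave the same, which is why the wrong default went unnoticed in the repository itself.

```go
package main

import "fmt"

const pagerKey = "settings.terminal.pager"

// pagerEnabled returns the user's explicit setting when present and falls
// back to the built-in default otherwise.
func pagerEnabled(defaultValue bool, userConfig map[string]bool) bool {
	if v, ok := userConfig[pagerKey]; ok {
		return v
	}
	return defaultValue
}

func main() {
	explicit := map[string]bool{pagerKey: false} // like the repository's atmos.yaml
	empty := map[string]bool{}                   // a user without a pager setting

	fmt.Println("old default, explicit false:", pagerEnabled(true, explicit))
	fmt.Println("old default, no setting:   ", pagerEnabled(true, empty))
	fmt.Println("new default, no setting:   ", pagerEnabled(false, empty))
}
```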

references

Regression introduced in: commit 2325ab9 (PR #1203)
Intended fix that was incomplete: commit 08a44dd (PR #1430)
Related file: pkg/config/load.go:159

Summary by CodeRabbit

Changes
- Terminal pager is now disabled by default so command output displays directly in the terminal for easier piping, copying, and quick scanning.
- You can re-enable paging via configuration, a CLI flag, or an environment variable.
Bug Fixes
- Removed an extraneous debug message from CLI stderr to reduce noise.
Documentation
- Added a blog post explaining the pager default correction, impact, and how to opt back in.

Fix ANSI logging in auth and add completion support @osterman (#1637)

## what

- Fixed ANSI escape sequences leaking into structured logs from auth manager
- Added shell completion support for auth command flags
- Added comprehensive test coverage (80-90%) for completion functionality

why

The auth manager was passing lipgloss-styled text (containing ANSI codes like \x1b[1m) to the logger, which breaks structured logging
Users need shell completion for --identity and --format flags to improve CLI usability
Test coverage ensures completion functionality works correctly across edge cases

changes

Fix ANSI Logging Issue

Removed lipgloss styling from log.Info() call in pkg/auth/manager.go:594
Logger should handle all colorization internally, not receive pre-styled text
Removed unused lipgloss import

Add Completion Support

Added identityFlagCompletion() function that reads identities from atmos.yaml
Added AddIdentityCompletion() helper function to register completion for identity flags
Applied identity completion to auth command (persistent --identity/-i flag)
Applied identity completion to auth exec command
Format flag completion for auth env was already implemented

Test Coverage

Added 17 comprehensive test cases across unit and integration tests
Unit tests in cmd/cmd_utils_test.go:
- TestIdentityFlagCompletion - core completion logic with valid/invalid configs
- TestIdentityFlagCompletionWithNoAuthConfig - edge case handling
- TestIdentityFlagCompletionPartialMatch - partial input handling
- TestAddIdentityCompletion - helper function registration
- TestStackFlagCompletion - existing completion testing
- TestAddStackCompletion - existing helper testing
Integration tests in cmd/auth_integration_test.go:
- TestAuthCommandCompletion - 5 sub-tests for auth command completion
- TestAuthEnvFormatCompletion - 2 sub-tests for format flag completion
Coverage: identityFlagCompletion 87.5%, AddIdentityCompletion 83%, overall 85-90%

testing

# Test completion generation
atmos completion bash | head -20

# Test identity completion (requires atmos.yaml with auth config)
cd examples/demo-auth
../../atmos __complete auth login --identity ""
# Output: oidc sso superuser saml

# Test format completion  
../../atmos __complete auth env --format ""
# Output: json bash dotenv

# Run tests
go test ./cmd -run "TestIdentityFlagCompletion|TestAddIdentityCompletion|TestAuthCommandCompletion"

references

Fixes issue with ANSI sequences in logs (screenshot provided by user)
Follows CLAUDE.md conventions: proper imports, comments ending with periods, performance tracking, error handling

Summary by CodeRabbit

New Features
- Added shell completion for the --identity flag across auth commands (including subcommands and auth exec); completions are drawn from your config, support partial input, and avoid file completions.
- Added shell completion for the whoami/output flag with json suggestion and no file completion.
Tests
- Extensive tests added for completion behavior, flag inheritance, sorting, partial matches, and error paths.
Style
- Simplified authentication log output by removing bold styling from chain step names.

`auth` Allow override of AWS Resolver URL @Benbentwo (#1624)

## what

- Allows users to override the AWS resolver URL for AWS identities and providers.

why

Allows us to override the resolver URL. This is useful in cases where you need to stub out the AWS endpoint.
This will be useful for testing and for usage with localstack.

Summary by CodeRabbit

New Features
- Added support for a configurable custom AWS endpoint resolver used by AWS authentication flows; can be configured at identity or provider level (identity takes precedence).
Documentation
- Added guidance and YAML examples for configuring custom AWS endpoints, including a LocalStack usage example.
Tests
- Added unit tests covering resolver precedence, empty/nil configs, extraction logic, and edge cases.
Examples
- Demo updated with a default LocalStack superuser identity for local testing.
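
The precedence rule (identity over provider) can be sketched as follows. `resolverConfig` and `effectiveResolverURL` are hypothetical names for this illustration, and the endpoint URLs are placeholders; the real configuration keys are described in the documentation added by the PR.

```go
package main

import "fmt"

// resolverConfig is a hypothetical holder for a custom endpoint URL.
type resolverConfig struct {
	URL string
}

// effectiveResolverURL picks the identity-level URL when set, then the
// provider-level URL, and returns "" when neither is configured.
func effectiveResolverURL(identity, provider *resolverConfig) string {
	if identity != nil && identity.URL != "" {
		return identity.URL
	}
	if provider != nil && provider.URL != "" {
		return provider.URL
	}
	return ""
}

func main() {
	provider := &resolverConfig{URL: "http://provider.example:4566"}
	identity := &resolverConfig{URL: "http://localhost:4566"}

	fmt.Println(effectiveResolverURL(identity, provider)) // identity wins
	fmt.Println(effectiveResolverURL(nil, provider))      // falls back to provider
	fmt.Println(effectiveResolverURL(nil, nil) == "")     // nothing configured
}
```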

🐛 Bug Fixes

fix: Changelog check detects blog posts across all PR commits @osterman (#1646)

## what

- Fixed the changelog check workflow to correctly detect blog posts added in any commit of a PR, not just the latest commit
- Changed from `git diff` to `gh pr diff` to capture all files modified in the entire PR
- Excluded `tags.yml` from blog post detection (it's a meta file, not a blog post)

why

The workflow was failing on PRs with multiple commits because it only compared HEAD against the base branch
If a blog post was added in commit 1, then commits 2, 3, 4 were added, the workflow would fail on commit 4 because the blog post was no longer "new" relative to the comparison
This caused false positive failures, including on PR #1643 which had correctly added a blog post but got additional commits afterward

references

Fixes the failure on https://github.com/cloudposse/atmos/actions/runs/18566118238/job/52927568164
Related to PR #1643 which triggered this issue

Summary by CodeRabbit

Chores
- Improved changelog verification workflow: only PRs targeting the main branch with minor/major labels now require a changelog entry, with clearer messaging about requirements.
- Enhanced detection of new blog posts by using the full PR diff for file discovery, improving accuracy for content submissions and preserving existing paths for “no changelog required” or error reporting.
