feat: Add configuration profiles support @osterman (#1752)
what
- Add comprehensive test coverage for profile CLI commands (
cmd/profile/show.goandcmd/profile/list.go) - Improve test coverage from 0% to 59.8% for profile command layer
- Add 1,003 lines of table-driven tests across 2 new test files
- Fix cross-platform compatibility issues (Windows path separators)
- Remove unreachable dead code in existing tests
- Correct misleading test case names to match actual behavior
why
- CodeCov analysis identified profile CLI commands with 0% patch coverage
- While overall project coverage is at 70.04%, new profile commands lacked any tests
- Without tests, we cannot verify correct behavior across different scenarios
- Cross-platform issues and misleading test names reduce code quality
- Comprehensive test coverage ensures profile commands work reliably in production
references
- Addresses CodeCov patch coverage gaps identified in recent commits
- Follows Go testing best practices with table-driven tests
- Uses testify library for assertions (assert/require)
- Tests cover all output formats (JSON, YAML, table), error cases, and edge cases
- Ensures cross-platform compatibility with
filepath.Joinfor Windows support
Test Coverage Improvements
Files Added
cmd/profile/show_test.go (477 lines)
- Format flag completion tests
- Profile name completion tests
- Error builder tests (profile not found, invalid format)
- JSON rendering tests (basic profiles, profiles with metadata)
- YAML rendering tests (basic profiles, deprecated profiles)
- Format dispatcher tests (text/json/yaml formats, invalid formats)
- Profile info retrieval tests (existing/non-existent profiles)
- Edge case tests (no files, many files, nil/empty metadata)
cmd/profile/list_test.go (526 lines)
- Format flag completion tests
- JSON rendering tests (empty lists, single profile, multiple profiles with metadata)
- YAML rendering tests (empty lists, single profile, multiple profiles)
- Profile discovery error builder tests
- Format dispatcher tests (table/json/yaml formats, invalid formats)
- Empty profile list tests (all formats)
- Complex profile tests (long paths, many files, rich metadata)
- Integration test skeleton (skipped, covered in tests/ directory)
Coverage Results
- Before: cmd/profile/show.go - 0% coverage
- After: cmd/profile/show.go - 59.8% coverage
- Before: cmd/profile/list.go - 0% coverage
- After: cmd/profile/list.go - 59.8% coverage
Fixes Applied
-
Misleading Test Name (cmd/profile/show_test.go:300)
- Changed "empty format defaults to text" → "empty format is invalid"
- Test validates error behavior, name now matches implementation
-
Unreachable Dead Code (pkg/profile/list/formatter_table_test.go:234-239)
- Removed conditional check for path length exceeding width
- Prior assertion already guaranteed path ≤ pathWidth, making code unreachable
- Removed unused
stringsimport
-
Windows Path Separator (pkg/profile/manager_test.go:607)
- Changed hardcoded "stacks/dev.yaml" →
filepath.Join("stacks", "dev.yaml") - Ensures tests pass on Windows (backslashes) and Unix (forward slashes)
- Changed hardcoded "stacks/dev.yaml" →
Testing Strategy
All tests follow table-driven pattern with:
- Descriptive test case names
- Multiple scenarios per function
- Validation functions for complex assertions
- JSON/YAML unmarshaling verification
- Error type checking with
assert.ErrorIs - Edge cases (empty inputs, nil values, large datasets)
Tests verify:
- ✅ Shell completion for format flags
- ✅ JSON output (valid structure, correct field names)
- ✅ YAML output (valid structure, correct field names)
- ✅ Error messages and error types
- ✅ Profile discovery and retrieval
- ✅ Complex metadata handling
- ✅ Cross-platform compatibility
Summary by CodeRabbit
-
New Features
- Configuration profiles: multi-location discovery, precedence-based composition, activate via --profile or ATMOS_PROFILE; new profile commands: list and show (table/json/yaml) and alias "atmos list profiles".
-
Documentation
- PRDs, CLI docs, blog post, examples and test fixtures for developer/CI/production profiles and migration guidance.
-
UI/Style
- Profile list/show renderers and a new Notice theme style for messaging.
-
User-facing errors
- Clear, actionable errors for profile discovery, missing profiles, load/merge failures and invalid output formats.
-
Tests
- Extensive unit and end-to-end CLI tests, snapshots, and fixtures covering profiles and help output.
-
Chores
- Minor dependency updates.
feat: Remove deep exits from version command @osterman (#1783)
Summary
Refactors the version command to remove deep exits and allow execution even with invalid or missing configuration. Version is a diagnostic tool that must always work, making it the first step in troubleshooting.
Key Changes:
- Removes
log.Fatal()deep exit in version execution logic - Adds version command detection in
PersistentPreRunEto suppress config errors - Moves
InitializeMarkdown()after config error checking to prevent deep exits - Adds 10 comprehensive integration tests for invalid configs (YAML, aliases, schema)
- Documents version command design and avoiding deep exits pattern in PRDs
This establishes the pattern for refactoring all commands to remove deep exits, enabling proper error handling, testability, and composability.
Test Plan
- ✅ All existing version command tests passing
- ✅ 10 new integration tests verify version works with invalid YAML syntax
- ✅ 10 new integration tests verify version works with invalid command aliases
- ✅ 10 new integration tests verify version works with invalid config schema
- ✅ Both
atmos versionandatmos --versionwork with broken configs
Summary by CodeRabbit
-
New Features
- Version command reliably outputs formatted JSON/YAML and performs update checks even when user config is invalid.
-
Bug Fixes
- Reduced abrupt exits: startup and version flows return enriched, recoverable errors and emit clearer non-fatal messages/warnings.
-
Documentation
- Added PRDs and guides for the version command, avoiding deep-exit patterns, unified flag handling, error message examples, and format usage examples.
-
Tests
- Added fixtures and extensive tests for invalid YAML, schema, aliases, and version scenarios (formatting, checks, and error paths).
-
Chores
- Expanded YAML check exclusions and linter rules to enforce unified flag parsing.
test: add comprehensive env inheritance tests @osterman (#1789)
what
- Fix bug where
!envYAML function only read from OS environment variables, ignoring env variables defined in stack manifests - Add comprehensive test coverage for env variable inheritance across stack manifests and components
- Update
!envdocumentation to explain resolution order and limitations
why
- Users expected
!env FOOto read fromenv: { FOO: "bar" }defined in globals.yaml or component env sections, but it only checked OS environment variables - Env variable inheritance was undocumented and untested, making it unclear how env sections merge across globals, imports, and components
- Single-pass YAML function processing creates limitations that need to be clearly documented
references
- Fixes the bug where
!envcouldn't access stack manifest env variables - Validates that env inheritance follows merge priority:
GlobalEnv<BaseComponentEnv<ComponentEnv<ComponentOverridesEnv - Documents that YAML functions cannot reference results from other YAML functions in the same component section (single-pass limitation)
refactor: Migrate list commands to flag handler pattern @osterman (#1788)
## Summary Migrated all 10 `atmos list` subcommands to the StandardParser flag handler pattern. Reorganized into `cmd/list/` directory following command registry pattern. Added comprehensive environment variable support and eliminated deep exits for better testability.Changes
- Moved 10 list commands from root cmd/ to cmd/list/ directory
- Replaced legacy flag handling with StandardParser + Options structs
- Added ATMOS_* environment variable support for all flags
- Created newCommonListParser() factory to eliminate flag duplication
- Refactored checkAtmosConfig() to return errors instead of calling Exit()
Test Plan
- ✓ All existing unit tests pass
- ✓ Commands compile without errors
- ✓ Help text shows all flags with proper descriptions
- ✓ Environment variables work end-to-end
Summary by CodeRabbit
- New Features
- Adds a unified "list" command with dedicated subcommands (stacks, components, instances, workflows, metadata, settings, vendor, themes, values) and consistent output formatting.
- Enhanced filtering (stack/component/query), flag + env-var support, shell completion for stack flags, formats/delimiters, max-columns, and home-directory obfuscation.
- Clearer UX: consistent validation and user-friendly "No results" messages.
- Documentation
- Adds test-coverage report and a test-coverage improvement plan.
feat: Safe logout preserves keychain credentials by default @osterman (#1791)
Summary
Implement safe-by-default logout behavior where session data (tokens, cached credentials) is cleared, but keychain credentials (access keys, service account credentials) are preserved for faster re-authentication. Add --keychain flag with interactive confirmation for permanent credential deletion.
Changes
- Safe-by-default behavior:
atmos auth logoutclears only session data - New
--keychainflag to permanently delete credentials with confirmation - New
--forceflag for CI/CD environments to bypass confirmation - Cloud-provider-agnostic implementation working with all auth providers
- Comprehensive documentation and blog post
References
See blog post and documentation for usage examples and migration guide for existing scripts.
Summary by CodeRabbit
-
New Features
- Added --keychain to optionally delete stored credentials during logout (preserved by default).
- Added --force to bypass interactive confirmation.
- Interactive confirmation prompts for keychain deletion in TTY; non‑TTY and force handling supported.
- Detects and warns about external cloud provider credentials (AWS/Azure/GCP env vars) after logout.
- Extended dry‑run to preview keychain-related items when --keychain is used.
-
Documentation
- Updated CLI docs and examples, added PRD and blog post explaining the change and migration guidance.
feat: add Version Management Patterns documentation @osterman (#1499)
## what - Add comprehensive documentation for Version Management Patterns in Atmos - Document five distinct patterns for managing component versions across environments - Provide practical examples, implementation details, and migration strategieswhy
- Teams struggle with balancing stability and velocity when managing infrastructure versions
- "Just pin everything" often creates more problems than it solves at scale
- No clear guidance exists on when to use which versioning strategy
- Teams need to understand trade-offs between reproducibility, convergence, and operational overhead
references
- Addresses common questions about version management in infrastructure-as-code
- Provides alternatives to traditional strict pinning approaches
- Helps teams choose patterns based on their scale and operational maturity
Summary
This PR adds comprehensive documentation for Version Management Patterns in Atmos, covering five distinct strategies:
📚 Patterns Documented
-
Version Management Overview (
version-management.mdx)- Explains deployment vs. release concepts
- Provides pattern comparison table
- Offers selection criteria
-
Strict Version Pinning (
strict-version-pinning.mdx)- Traditional per-environment pinning
- High reproducibility, high overhead
- Requires lockstep promotions
-
Release Tracks/Channels (
release-tracks-channels.mdx)- Environments subscribe to moving tracks
- Promotes convergence and feedback
- Medium operational overhead
-
Folder-Based Versioning (
folder-based-versioning.mdx)- Version through repository structure
- Explicit boundaries for major changes
- Gradual migration support
-
Vendoring Components (
vendoring-components.mdx)- Local control over dependencies
- Predictable update windows
- Custom patch support
-
Git Flow: Branches as Channels (
git-flow-branches-as-channels.mdx)- Long-lived branches as channels
- Familiar Git workflows
- Strong CI/CD integration
✨ Key Features
Each pattern includes:
- Use cases and problem analysis
- Detailed Atmos implementation
- Real-world examples
- Benefits and drawbacks
- Best practices
- Migration strategies
🎯 Impact
This documentation helps teams:
- Choose appropriate versioning strategies
- Avoid common pitfalls with version management
- Implement patterns correctly with Atmos
- Migrate between patterns as needs evolve
Summary by CodeRabbit
-
New Features
- Expanded Atmos CLI help coverage for Terraform subcommands (apply, destroy, import, init, output, plan, refresh, state).
-
Documentation
- Added extensive Version Management guidance (continuous deployment, folder-based versioning, release tracks/channels, strict pinning, vendoring, versioning schemes) and integrated examples across core docs.
- Added Terraform configuration guidance (clean, deploy, shell prompts, workspace naming) and updated Component Catalog docs and navigation with redirects.
-
Chores
- Increased CI/tools timeout.
feat: Add ui.Toast() pattern for status notifications @osterman (#1794)
what
- Add new
ui.Toast()andui.Toastf()functions for flexible toast-style status notifications - Extract toast functionality from the larger toolchain PR (#1686) for independent review and merge
- Update documentation to show Toast pattern as the primary approach for status notifications
why
- The toolchain PR (#1686) is large and complex - extracting independent features allows faster review and merge
- Toast pattern provides a unified, flexible approach for all user-facing status messages
- Custom icon support enables better visual communication without creating new wrapper functions
- Improves code maintainability by establishing a clear pattern for status notifications
Implementation Details
New Functions:
// Primary toast pattern with custom icons
ui.Toast("📦", "Using latest version: 1.2.3")
ui.Toastf("🔧", "Tool %s is not installed", toolName)
// Existing convenience wrappers (now documented as Toast wrappers)
ui.Success("Done!") // ✓ Done! (green)
ui.Error("Failed!") // ✗ Failed! (red)
ui.Warning("Deprecated") // ⚠ Deprecated (yellow)
ui.Info("Processing...") // ℹ Processing... (cyan)Benefits:
- ✅ Consistent pattern for all toast notifications
- ✅ Flexible icon support (custom emojis or themed icons)
- ✅ Automatic channel routing (stderr for UI)
- ✅ Automatic secret masking via I/O layer
- ✅ Zero breaking changes - all existing functions work as before
Documentation Updates:
- Updated
docs/io-and-ui-output.mdwith Toast API reference - Updated
docs/prd/io-handling-strategy.mdwith Toast pattern examples - Enhanced comments in
pkg/ui/formatter.goto clarify Toast pattern
Testing
- ✅ All existing tests pass
- ✅ Builds successfully
- ✅ Linter checks pass
- ✅ No breaking changes to existing API
references
- Extracted from #1686 (toolchain PR)
- Part of ongoing UI/UX improvements for Atmos CLI
Summary by CodeRabbit
-
New Features
- Added toast-style notifications with icon support, multiline toasts, and consistent icon+text formatting.
- Introduced convenience functions for success, error, warning, and info messages with formatting variants.
-
Documentation
- Updated UI API docs with toast examples and a new Plain UI Text section (Write/Writef/Writeln examples).
-
Tests
- Added comprehensive tests covering toast outputs, multiline/unicode handling, and formatting variants.
fix: Use YAML !env function in Sentry config examples @osterman (#1793)
Summary
- Fixed incorrect shell-style variable expansion to proper YAML !env function syntax in Sentry configuration examples
- Removed redundant blog post sections (Real-World Impact, Why This Matters, What's Next)
Changes
Updated configuration examples to use !env VARIABLE_NAME instead of ${VARIABLE_NAME} syntax across documentation and blog post.
Summary by CodeRabbit
- Documentation
- Configuration syntax examples have been updated throughout documentation to provide improved clarity and consistency across all setup instructions and configuration best practices.
- Blog post has been streamlined and refined with condensed narrative sections while fully preserving essential configuration examples, comprehensive technical guidance, and practical recommendations for users.
docs: Add identity provider file isolation PRDs @osterman (#1792)
what
- Create universal identity provider file isolation pattern PRD defining canonical pattern for all providers (AWS, Azure, GCP, etc.)
- Document AWS authentication file isolation as reference implementation showing how existing code implements the pattern
- Document Azure authentication file isolation as planned implementation following the universal pattern
- Establish clear separation between Atmos-managed enterprise/customer credentials and developer's personal hobby accounts
why
- Protect developer's personal credentials: Most developers have personal AWS/Azure/GCP accounts for hobby projects that are manually configured with
aws configure,az login,gcloud init. Atmos must never modify these personal accounts. - Critical multi-customer use case: When managing infrastructure for multiple customers (Cloud Posse use case), need physically separate credential files to make it "provably impossible" to accidentally use wrong customer's credentials.
- Establish universal pattern: All identity providers (AWS, Azure, GCP) must follow the same XDG-compliant file isolation pattern for consistency.
- Enable clean logout: Deleting an Atmos identity removes all work credentials without affecting personal hobby accounts.
- Azure needs implementation: Current Azure implementation writes to
~/.azure/which breaks developer's personal Azure CLI setup. This PRD documents the required changes to match AWS pattern.
Key architectural decision: Atmos-managed credentials go in ~/.config/atmos/{cloud}/, personal credentials stay in default locations (~/.aws/, ~/.azure/, ~/.config/gcloud/).
references
- Implements XDG Base Directory Specification for credential storage
- Documents existing AWS implementation that successfully isolates credentials using
AWS_SHARED_CREDENTIALS_FILEandAWS_CONFIG_FILE - Plans Azure implementation using
AZURE_CONFIG_DIRenvironment variable for isolation - Related to ongoing Azure authentication work
Summary by CodeRabbit
- Documentation
- Added a universal authentication file isolation pattern covering per-provider credential isolation, logout/cleanup semantics, XDG-compliant storage, environment variable wiring, security guidance, testing strategy, and migration steps
- Added AWS-specific implementation and environment mappings
- Added Azure-specific implementation guidance, XDG storage guidance, environment mappings, migration guidance, and testing recommendations
fix: Upgrade CodeQL Action from v3 to v4 @osterman (#1790)
what
- Upgrade all CodeQL Action references from deprecated v3 to v4
- Updates github/codeql-action/init, autobuild, analyze, and upload-sarif actions
- Resolves deprecation warning about v3 being removed in December 2026
why
CodeQL Action v3 is deprecated and will be removed on December 28, 2026. This PR ensures the workflow continues to function with the supported version.
references
Summary by CodeRabbit
- Chores
- Updated GitHub Actions workflow dependencies to latest compatible versions for improved reliability and security in the continuous integration pipeline.
feat: Migrate theme commands to StandardFlagParser @osterman (#1772)
## SummaryThis PR has two main components:
- Theme Command Migration: Migrated theme list and show commands to use the modern StandardFlagParser pattern
- Error Handling Documentation: Comprehensive documentation improvements for the Atmos error handling system
Changes
Theme Commands
- Theme list command: Removed global variables, added type-safe options struct, enabled environment variable support (
ATMOS_RECOMMENDED), and implemented proper flag precedence (CLI > env > config > default) - Theme show command: Established consistent StandardFlagParser pattern for future flag additions
- Error handling: Improved theme command errors to use builder pattern with actionable hints
- Test coverage: Added 30 comprehensive test cases validating flag handling, Viper integration, and flag precedence behavior
Error Handling Documentation
- atmos-errors agent: Created comprehensive agent guide (14.7KB) for designing user-friendly error messages
- Key principles documented:
- Hints = WHAT TO DO (actionable steps) - NOT "what happened"
- Explanations = WHAT HAPPENED (educational context)
- Context = WHERE/HOW (debugging details, non-redundant)
- Critical patterns:
- Subprocess exit code preservation with
exec.ExitError errors.Join()order non-preservation warning- Error builder pattern with formatted methods (
WithHintf,WithExplanationf) - Avoiding redundancy across builder methods
- Subprocess exit code preservation with
- Error docs improvements: Updated
docs/errors.mdwith formatted builder methods and clearer examples
Benefits
Theme Commands
- Type-safe options with proper encapsulation
- Full flag precedence support (CLI > env > config > default)
- Environment variable support for all flags
- Consistency with other Atmos commands
- Better error messages with actionable hints
Error Handling System
- Clear guidance for developers on creating user-friendly errors
- Prevents common anti-patterns (explanatory hints, redundancy, wrong exit codes)
- Ensures consistent error experience across Atmos
- Proactive agent that reviews error handling code
Testing
- 30+ theme command test cases covering flag handling and precedence
- All error documentation examples validated for correctness
- Error builder pattern verified with proper separation of hints/explanations/context
Summary by CodeRabbit
-
New Features
- Theme commands now respect ATMOS_THEME/THEME with consistent precedence (CLI > env > config > default).
- Added a "recommended only" option and unified flag parsing for theme list/show.
- New public theme errors for clearer "not found" and "invalid" theme cases.
-
Tests
- Expanded coverage for flag/env/config precedence, option parsing, theme resolution, and command executions.
-
Documentation
- Blog post documenting env-var support and usage examples.
-
Chores
- CI: pin Helm version for Helmfile steps.
feat: Add native Azure authentication support @jamengual (#1768)
Summary
This PR adds comprehensive native Azure authentication support to Atmos, enabling seamless authentication to Azure with full Terraform provider compatibility.
Features
Three Authentication Methods
- ✅ Device Code Flow: Browser-based authentication for interactive developer sessions with MFA support
- ✅ OIDC: Workload identity federation for GitHub Actions, GitLab CI, and Azure DevOps pipelines
- ✅ Service Principals: Client credential authentication for automation and service accounts
Full Terraform Provider Support
Works seamlessly with all Azure Terraform providers out of the box:
- azurerm - Complete Azure Resource Manager support including KeyVault operations
- azuread - Azure Active Directory management
- azapi - Alternative Azure management interface
Identical to az login Behavior
- Writes credentials to
~/.azure/msal_token_cache.json(Azure CLI MSAL cache) - Updates
~/.azure/azureProfile.jsonwith subscription configuration - Sets
ARM_USE_CLI=truefor Terraform providers - Drop-in replacement - existing Terraform code works without changes
- Provides all three token scopes for complete Azure functionality:
https://management.azure.com/.default- Azure Resource Managerhttps://graph.microsoft.com/.default- Azure AD operationshttps://vault.azure.net/.default- Azure KeyVault operations
Multi-Subscription & Multi-Region Support
Easy switching between different Azure subscriptions, regions, and environments:
auth:
identities:
azure-dev:
kind: azure/subscription
principal:
subscription_id: "DEV_SUBSCRIPTION_ID"
location: "eastus"
azure-prod:
kind: azure/subscription
principal:
subscription_id: "PROD_SUBSCRIPTION_ID"
location: "westus"Quick Start
Configuration
auth:
providers:
azure-dev:
kind: azure/device-code
tenant_id: "12345678-1234-1234-1234-123456789012"
subscription_id: "87654321-4321-4321-4321-210987654321"
location: "eastus"
identities:
azure-dev-subscription:
default: true
kind: azure/subscription
via:
provider: azure-dev
principal:
subscription_id: "87654321-4321-4321-4321-210987654321"
location: "eastus"Usage
# Authenticate to Azure
atmos auth login
# Use with Terraform
atmos terraform plan my-component -s my-stack
atmos terraform apply my-component -s my-stack
# Switch subscriptions
atmos terraform apply my-component -s prod --identity azure-prodImplementation Details
New Packages
pkg/auth/providers/azure/
device_code.go- Device code flow authentication with interactive browser flowoidc.go- OIDC workload identity federation for CI/CDservice_principal.go- Client credentials authenticationcli.go- Azure CLI compatibility utilitiesdevice_code_cache.go- Token caching and MSAL cache management
pkg/auth/identities/azure/
subscription.go- Azure subscription identity with location support
pkg/auth/cloud/azure/
setup.go- MSAL cache and Azure profile file managementenv.go- Environment variable configuration for Terraformfiles.go- Credential file operations with proper lockingconsole.go- Azure Portal URL generation
pkg/auth/types/
azure_credentials.go- Azure credential type implementation
Architecture
Follows Atmos architectural patterns:
- Registry Pattern: Azure providers/identities register via factory
- Interface-Driven: All components implement Provider/Identity interfaces
- Provider-Agnostic Core: No Azure-specific code in core auth manager
- Testable: Comprehensive unit tests with mocked dependencies
Complete Token Support
Atmos provides all three Azure token scopes (matching az login exactly):
-
Management Token (
https://management.azure.com/.default)- Used by azurerm and azapi providers
- Enables all Azure Resource Manager operations
-
Graph API Token (
https://graph.microsoft.com/.default)- Used by azuread provider
- Enables Azure AD operations (users, groups, service principals)
-
KeyVault Token (
https://vault.azure.net/.default)- Used by azurerm provider for KeyVault operations
- Enables secret, key, and certificate management
This comprehensive token support ensures all Terraform resources work correctly, including KeyVault certificate contacts, secret management, and AD group operations.
CI/CD Integration Examples
GitHub Actions with OIDC
name: Deploy Infrastructure
on:
push:
branches: [main]
permissions:
id-token: write # Required for OIDC
contents: read
jobs:
deploy:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Install Atmos
uses: cloudposse/github-action-setup-atmos@v2
- name: Authenticate to Azure
run: atmos auth login --identity azure-prod-ci
env:
AZURE_CLIENT_ID: ${{ secrets.AZURE_CLIENT_ID }}
AZURE_TENANT_ID: ${{ secrets.AZURE_TENANT_ID }}
AZURE_SUBSCRIPTION_ID: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
- name: Deploy
run: atmos terraform apply my-component -s prodService Principal Authentication
auth:
providers:
azure-automation:
kind: azure/service-principal
tenant_id: "YOUR_TENANT_ID"
client_id: "YOUR_SERVICE_PRINCIPAL_CLIENT_ID"
subscription_id: "YOUR_SUBSCRIPTION_ID"
identities:
azure-automation-prod:
kind: azure/subscription
via:
provider: azure-automation
principal:
subscription_id: "YOUR_SUBSCRIPTION_ID"Testing
Unit Test Coverage
Added comprehensive test coverage (3,401 lines of test code):
pkg/auth/cloud/azure/
files_test.go- File manager, locking, permissions (112 lines)setup_test.go- MSAL cache updates, JWT extraction, profile management (336 lines)
pkg/auth/providers/azure/
device_code_test.go- Device code provider, validation, spinner UI (302 lines)device_code_cache_test.go- Token caching, expiration, MSAL updates (286 lines)cli_test.go- CLI provider validation and environment prep (179 lines)oidc_test.go- OIDC provider with workload identity federationservice_principal_test.go- Service principal authentication
pkg/auth/identities/azure/
subscription_test.go- Subscription identity, location overrides (141 lines)
pkg/auth/types/
azure_credentials_test.go- Credential type, expiration, validation
Coverage Results
- pkg/auth/cloud/azure: 81.9% ✅ (exceeded 80% target)
- pkg/auth/providers/azure: 54.7%
- pkg/auth/identities/azure: 62.5%
- Overall patch coverage: 64.42% (comparable to AWS SSO at 62.64%)
Coverage gap is primarily in Azure SDK integration code (device code authentication flow, token acquisition) which follows the same pattern as AWS implementation (no SDK mocking).
Manual Testing
Verified with:
- ✅ Device code authentication flow with browser interaction
- ✅ Multi-subscription workflows with location overrides
- ✅ Terraform azurerm provider with KeyVault resources
- ✅ Terraform azuread provider with AD group operations
- ✅ Terraform azapi provider
- ✅ Token caching and automatic reuse
- ✅ MSAL cache format compatibility with Azure CLI
- ✅ Cross-platform testing (macOS, Linux, Windows)
Documentation
Comprehensive Tutorial
Created detailed Azure authentication guide:
website/docs/cli/commands/auth/tutorials/azure-authentication.mdx(689 lines)- Covers all three authentication methods with step-by-step examples
- Multi-subscription workflows and CI/CD patterns
- Troubleshooting guide and common scenarios
- Security best practices
Updated Command Documentation
- Updated
website/docs/cli/commands/auth/auth-login.mdxwith Azure examples - Added authentication methods comparison (AWS vs Azure)
- Added provider-specific configuration examples
Feature Announcement Blog Post
website/blog/2025-11-07-azure-authentication-support.mdx(447 lines)- Feature announcement with usage examples
- Migration guide from
az login - CI/CD integration patterns (GitHub Actions, service principals)
- Implementation details (MSAL cache, token scopes)
- Security features and best practices
Migration from az login
Atmos is a drop-in replacement for az login:
Before:
az login
az account set --subscription "YOUR_SUBSCRIPTION_ID"
terraform applyAfter:
atmos auth login --identity azure-dev
atmos terraform apply my-component -s my-stackBoth write to the same Azure CLI files (~/.azure/msal_token_cache.json and ~/.azure/azureProfile.json), so existing Terraform code works without any changes.
Security Features
- Secure Storage: Credentials stored in OS keyring (Keychain on macOS, Secret Service on Linux, Credential Manager on Windows)
- MSAL Cache Compatibility: Tokens also written to Azure CLI MSAL cache for Terraform provider compatibility
- Token Expiration: Automatic detection and handling of expired tokens (1-hour default)
- File Permissions: Credential files created with 0600 permissions (user read/write only)
- Least Privilege: Supports Azure RBAC for minimal access configuration
- No Plaintext Secrets: Service principal secrets stored in keyring, not on disk
Files Changed
New Implementation Files (27 files)
Core Azure Auth
pkg/auth/types/azure_credentials.go- Azure credential typepkg/auth/cloud/azure/*.go- Azure cloud utilities (5 files)pkg/auth/providers/azure/*.go- Azure providers (5 files)pkg/auth/identities/azure/*.go- Azure identities (1 file)
Tests
pkg/auth/types/azure_credentials_test.gopkg/auth/cloud/azure/*_test.go(2 files)pkg/auth/providers/azure/*_test.go(3 files)pkg/auth/identities/azure/*_test.go(1 file)
Integration
pkg/auth/factory/factory.go- Register Azure providers/identitiespkg/auth/types/constants.go- Azure provider kind constantspkg/schema/schema.go- Azure auth context schemaerrors/errors.go- Azure error definitions
Documentation Files (4 files)
website/docs/cli/commands/auth/tutorials/azure-authentication.mdxwebsite/docs/cli/commands/auth/auth-login.mdx(updated)website/blog/2025-11-07-azure-authentication-support.mdxwebsite/blog/authors.yml(updated)
Modified Core Files (6 files)
internal/exec/terraform_generate_backend.go- Azure backend authinternal/exec/terraform_utils.go- Azure provider authinternal/exec/utils.go- Azure auth context handlingcmd/auth_console.go- Azure console URL supportgo.mod/go.sum- Dependencies already present
Breaking Changes
None. This is a new feature that doesn't affect existing functionality.
Checklist
- Code compiles successfully
- All existing tests pass
- Added comprehensive unit tests (3,401 lines)
- Test coverage >80% on core packages
- Cross-platform compatibility (macOS, Linux, Windows)
- Manual testing with all Azure Terraform providers
- Documentation added (tutorial + command docs + blog post)
- Blog post required for minor feature ✅
- No breaking changes
- Follows conventional commits format
- CodeQL security scan passing
Future Enhancements
Potential additions:
- Azure Managed Identity support for VM/container workloads
- Azure Government Cloud / sovereign cloud support
- Azure CLI credential migration/import tools
- Enhanced Azure-specific debugging and logging
- Certificate-based service principal authentication
References
- Azure Device Code Flow Docs
- Azure OIDC/Workload Identity
- Azure Service Principals
- Terraform azurerm Provider
- Terraform azuread Provider
- Terraform azapi Provider
- Azure CLI MSAL Cache Format
Co-Authored-By: Claude noreply@anthropic.com
Summary by CodeRabbit
-
New Features
- Native Azure authentication (device-code, CLI, OIDC), subscription-scoped identities, tenant-aware portal sign-in links, in-process credential handling, secure on-disk credential management, MSAL/token cache and Azure CLI profile sync, and environment preparation for Terraform/tool compatibility.
-
Bug Fixes
- Azure console access now returns tenant-scoped portal links instead of an error.
-
Documentation
- Added Azure auth guide, tutorials, CLI docs, and a blog post.
-
Tests
- Extensive unit tests covering Azure providers, identities, file/cache, MSAL cache, console URLs, and env prep.
fix: Pin Helm to v3.19.2 to avoid Helm 4.0 plugin verification issues @osterman (#1785)
what
- Pins Helm to v3.19.2 (latest 3.x version) in CI workflows
- Updates helmfile-action to use pinned helm-version parameter
- Replaces apt-get helm installation with azure/setup-helm action for version control
why
Helm 4.0 was released with breaking changes to plugin verification that causes the helm-diff plugin installation to fail with "Error: plugin source does not support verification". Pinning to Helm 3.x ensures compatibility with the existing helm-diff plugin until it's updated to support Helm 4.0.
references
- Closes plugin verification issues in Helm 4.0
- Maintains compatibility with current helm-diff plugin
Summary by CodeRabbit
- Chores
- Updated test workflow configuration to explicitly specify Helm version for improved consistency and reliability in CI/CD testing infrastructure.
feat: add error handling infrastructure with context-aware capabilities @osterman (#1763)
## what- Add comprehensive error handling infrastructure extracted from PR #1599
- Provide foundation for rich, user-friendly error messages without migrating existing code
- Add error builder with fluent API for hints, context, exit codes, and explanations
- Add smart error formatting with TTY detection, markdown rendering, and color support
- Add verbose mode with
--verboseflag for context table display - Add Sentry integration for optional error reporting
- Add Claude agent for error message design expertise
- Add linter rules to enforce error handling best practices
why
- Lower Risk: Extract infrastructure only, no migration of existing code (20 files vs 100 in original PR)
- Enable Future Work: Provides foundation for incremental migration in focused follow-up PRs
- Better Developer Experience: Rich error messages with actionable hints guide users toward solutions
- Testable: 78.8% test coverage, all tests passing
- Well Documented: Complete developer guide, architecture PRDs, and Claude agent
- Enforced Patterns: Linter rules prevent bad error handling patterns
Infrastructure Added
Error Builder Pattern
Fluent API for constructing enriched errors:
err := errUtils.Build(errUtils.ErrComponentNotFound).
WithHintf("Component '%s' not found in stack '%s'", component, stack).
WithHint("Run 'atmos list components -s %s' to see available components", stack).
WithContext("component", component).
WithContext("stack", stack).
WithExitCode(2).
Err()Smart Formatting
- TTY-aware rendering with markdown support
- Automatic color degradation (TrueColor → 256 → 16 → None)
- Context table display in verbose mode
- Markdown-rendered error messages
Verbose Mode
New --verbose / -v flag (also ATMOS_VERBOSE=true) enables:
- Context table display showing structured error details
- Full stack traces for debugging
- Detailed error information
Normal output:
✗ Component 'vpc' not found
💡 Component 'vpc' not found in stack 'prod/us-east-1'
💡 Run 'atmos list components -s prod/us-east-1' to see available components
Verbose output (--verbose):
✗ Component 'vpc' not found
💡 Component 'vpc' not found in stack 'prod/us-east-1'
💡 Run 'atmos list components -s prod/us-east-1' to see available components
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┓
┃ Context ┃ Value ┃
┣━━━━━━━━━━━╋━━━━━━━━━━━━━━━━━━━━━┫
┃ component ┃ vpc ┃
┃ stack ┃ prod/us-east-1 ┃
┗━━━━━━━━━━━┻━━━━━━━━━━━━━━━━━━━━━┛
Sentry Integration
Optional error reporting with:
- Automatic context extraction (hints → breadcrumbs, context → tags)
- PII-safe reporting
- Atmos-specific tags (component, stack, exit code)
- Configurable in
atmos.yaml
Static Sentinel Errors
Type-safe error definitions in errors/errors.go:
- Enables
errors.Is()checking across wrapped errors - Prevents typos and inconsistencies
- Makes error handling testable
Exit Code Support
- Custom exit codes (2 for config/usage errors, 1 for runtime errors)
- Proper error classification
- Consistent error signaling
Documentation
- Developer Guide:
docs/errors.md- Complete API reference with examples - Architecture PRD:
docs/prd/error-handling.md- Design decisions and rationale - Error Types:
docs/prd/error-types-and-sentinels.md- Error catalog - Exit Codes:
docs/prd/exit-codes.md- Exit code standards - Implementation Plan:
docs/prd/atmos-error-implementation-plan.md- Migration phases - CLAUDE.md Updates: Enhanced error handling patterns for contributors
- Claude Agent:
.claude/agents/atmos-errors.md- Expert system for error message design
Linter Enforcement
Added rules to .golangci.yml:
- Require static sentinel errors (no dynamic errors like
errors.New(fmt.Sprintf(...))) - Prevent deprecated
github.com/pkg/errorsusage - Encourage
WithHintf()overWithHint(fmt.Sprintf())
Schema Updates
Added to pkg/schema/schema.go:
ErrorsConfig- Error handling configurationErrorFormatConfig- Formatting options (verbose, color)SentryConfig- Sentry integration settings
Configuration example:
errors:
format:
verbose: false # Enable with --verbose flag
color: auto # auto|always|never
sentry:
enabled: true
dsn: "https://..."
environment: "production"Testing
- ✅ 78.8% test coverage for error handling components
- ✅ All tests passing
- ✅ Build succeeds
- ✅ No existing tests broken
- ✅ Zero integration with existing code (no risk)
What Was NOT Extracted (Deferred to Future PRs)
- ❌ Migration of existing errors to use new system
- ❌ Context-aware hints for specific error scenarios (can be added incrementally)
- ❌ Changes to existing error handling in
cmd/,internal/exec/,pkg/ - ❌ Race condition fixes (can be handled separately)
Risk Assessment
Risk Level: VERY LOW ✅
- No Breaking Changes - All new code, no existing code modified
- Zero Integration - Error package is standalone, not used by existing code yet
- Fully Tested - Complete test coverage, all tests passing
- Well Documented - Comprehensive documentation for developers
- Linter Enforced - Prevents bad patterns in new code
Files Changed
New Files (20):
errors/package (14 files): builder, formatter, exit codes, Sentry, testsdocs/errors.md- Developer guidedocs/prd/- 4 PRD documents.claude/agents/atmos-errors.md- Claude agent
Modified Files (6):
CLAUDE.md- Error handling documentation.golangci.yml- Linter rulescmd/root.go- Verbose flag onlypkg/schema/schema.go- Error config schemago.mod,go.sum- Dependencies
Total: 26 files (vs 100 files in original PR #1599)
Future Migration Strategy
With this infrastructure in place, future PRs can:
- Migrate specific commands incrementally - One command at a time (e.g.,
terraformcommands) - Add context-aware hints gradually - Low risk, focused changes for each error scenario
- Fix race conditions independently - Separate from error handling changes
Each follow-up PR will be:
- Small and focused (5-10 files)
- Easy to review
- Low risk to existing functionality
Dependencies Added
github.com/cockroachdb/errors- Core error handling library (drop-in replacement for stdliberrors)github.com/getsentry/sentry-go- Optional error reporting
references
- Extracted from PR #1599
- cockroachdb/errors: https://github.com/cockroachdb/errors
- Sentry Go SDK: https://docs.sentry.io/platforms/go/
Co-Authored-By: Claude noreply@anthropic.com
Summary by CodeRabbit
-
New Features
- New verbose flag (-v/--verbose); Markdown-styled error output with Error/Explanation/Hints/Examples/Context; error builder API; improved exit-code propagation; dedicated exec/workflow error types; configurable Sentry reporting with per-component clients and registry.
-
Bug Fixes
- More consistent, deterministic error presentation and reliable extraction/propagation of subprocess and wrapped error exit codes.
-
Documentation
- Extensive developer guides, PRDs, website docs, and a blog post on error handling and monitoring.
-
Tests
- Expanded coverage for formatting, builder, exit codes, Sentry integration, renderer, and CLI snapshots.
fix: Resolve changelog check failures on large PRs @osterman (#1782)
## SummaryReplace GitHub's diff API with local git diff to avoid API limitations that cause changelog check failures on large PRs. GitHub's diff API has hard limits (~300 files or ~3000 lines), but using local git diff with base/head SHAs works reliably for any PR size.
Test Plan
- Changelog check should pass on large PRs
- Changelog check should still work on normal-sized PRs
- Blog post detection logic unchanged
Summary by CodeRabbit
- Chores
- Optimized changelog verification workflow to more reliably detect blog file changes in pull requests while reducing dependency on external API limitations.
Refine homepage hero and typing animation @osterman (#1781)
Summary
- Remove excessive padding around hero demo image and screenshot container
- Redesign typing cursor as a thin, blinking block character with subtle CRT-style glow
- Remove "and more..." from typing animation word list
- Left-align feature card descriptions
- Constrain hero demo image to 70% viewport width for better layout
Summary by CodeRabbit
-
New Features
- Added an InstallWidget to the quick-start page for selecting and copying install commands.
-
Style
- Refined typing animation cursor with a stronger block glyph, glow pulse, and contrast-aware coloring
- Enhanced landing hero spacing, image sizing, and feature card alignment
- Shortened product list in typing animation
- Enabled smooth scrolling site-wide
-
UX
- Hero intro now animates into view with a subtle fade-in motion
Add custom sanitization map to test CLI for test-specific output rules @osterman (#1779)
This PR introduces a custom sanitization map for the test CLI to enable test-specific output rules. The implementation adds comprehensive test coverage for path sanitization across different formats (Unix, Windows, debug logs) and standardizes snapshot testing behavior.Tests verify sanitization of absolute paths, Windows-style backslashes, debug logs with import prefixes, multiple occurrences, and path normalization. Documentation updates explain the sanitization testing strategy.
🚀 Enhancements
fix: Prevent usage error after successful workflow TUI execution @aknysh (#1796)
what
- Fixed workflow command to return immediately after successful TUI execution
- Added comprehensive tests for
ExecuteWorkflowCmdfunction - Increased workflow test coverage from 1.7% to 2.8% (+64.7%)
- Added regression test to prevent future occurrences of this bug
why
- When using the workflow TUI (
atmos workflowwith no args), the command would show "Incorrect Usage" message after successfully selecting and displaying a workflow - This happened because after the TUI execution returned successfully, the code continued to check for the
--fileflag, which was never set when using the TUI - The fix adds an early
return nilafter successful TUI execution to prevent the unwanted usage error - The regression test ensures that workflow execution with the
--fileflag continues to work correctly
Summary by CodeRabbit
-
Bug Fixes
- Workflow command no longer displays spurious usage errors when run without arguments or without a file flag.
-
Tests
- Added comprehensive tests covering workflow execution, flag handling, path resolution, dry-run, stack/from-step/identity flags, and error cases.
-
Chores
- Bumped several dependency versions and updated license entries.
-
Fixtures
- Added a test workflow that exercises a failing shell command scenario.
fix: Propagate auth context through nested `!terraform.state` functions @aknysh (#1786)
what
- Fixed authentication context not propagating through nested
!terraform.stateand!terraform.outputYAML function evaluations - Added
AuthManagerfield toConfigAndStacksInfostruct to enable auth propagation through the execution pipeline - Implemented component-level authentication override for nested functions, allowing each component to optionally define its own
auth:configuration - Enhanced auth resolver to check for default identities before creating component-specific AuthManager
- Updated
TerraformStateGetterandTerraformOutputGetterinterfaces to acceptauthManagerparameter - Added comprehensive test fixtures and test suites for nested authentication scenarios (18 tests covering 5 scenarios)
- Fixed identity selector exit handling: Pressing Ctrl+C or ESC now immediately exits with proper POSIX exit code (130) instead of requiring multiple presses or continuing execution
- Fixed authentication prompt for invalid components: Component validation now occurs before authentication, preventing identity selection prompts when the component doesn't exist
why
Problem 1: Nested Authentication Propagation
When executing Atmos commands with authentication enabled, nested !terraform.state functions failed with IMDS timeout errors even though the top-level command had valid authenticated credentials. This occurred when a component's configuration contained !terraform.state functions that referenced other components which themselves contained !terraform.state functions.
Root Cause: The GetTerraformState() function received an authContext parameter but did not have access to the AuthManager. When processing nested components, it called ExecuteDescribeComponent() without an AuthManager, breaking the authentication chain at level 2+ of nesting.
Example Failure:
# Level 1: tgw/routes (top-level, ✅ works)
tgw/routes:
vars:
routes:
- attachment_id: !terraform.state tgw/attachment vpc_attachment_id
# Level 2: tgw/attachment (nested, ✅ works)
tgw/attachment:
vars:
transit_gateway_id: !terraform.state tgw/hub core-use2-network transit_gateway_id # ❌ FAILS - no auth
# Level 3: tgw/hub (nested within nested, ❌ fails)Solution: Added AuthManager to ConfigAndStacksInfo struct and threaded it through the entire execution pipeline, enabling all nested function evaluations to access authenticated credentials. Additionally implemented component-level auth override to support cross-account state reading in nested scenarios.
Problem 2: Identity Selector Exit Handling
When the identity selector appeared (either from --identity flag without value, or when processing YAML functions with no default identity configured), pressing Ctrl+C would not exit the program. Instead:
- First Ctrl+C press was consumed by the
huhTUI library but returnedErrUserAborted - The
autoDetectDefaultIdentity()function intentionally swallowed ALL errors (includingErrUserAborted) for "backward compatibility" - The function returned
("", nil), causing execution to continue without authentication - User had to press Ctrl+C a second time to actually exit
Root Cause: The error handling in pkg/auth/manager_helpers.go:autoDetectDefaultIdentity() was catching ErrUserAborted from the identity selector and converting it to a successful empty result for backward compatibility, preventing proper exit handling.
Solution:
- Modified
autoDetectDefaultIdentity()to propagateErrUserAbortedwhile preserving backward compatibility for other errors - Added exit handlers in
terraform.goandterraform_utils.goto immediately exit with code 130 when user aborts - Enhanced identity selector with custom KeyMap to support both Ctrl+C and ESC keys
- Added visible instruction: "Press ctrl+c or esc to exit"
- Created constant
ExitCodeSIGINT = 130for POSIX-compliant signal exit codes
Problem 3: Authentication Before Component Validation
When running a command with an invalid component name, Atmos would prompt for identity selection before checking if the component exists:
atmos terraform apply bad-component -s core-euc1-network
# Prompted for identity selection first
# Then showed error: Could not find the component 'bad-component' in the stackRoot Cause: The authentication flow in ExecuteTerraform() was calling CreateAndAuthenticateManager() before the component existence check, causing unnecessary user interaction for invalid components.
Solution: Modified the component auth config retrieval logic to immediately exit if ExecuteDescribeComponent() returns ErrInvalidComponent, preventing authentication attempts for non-existent components.
Benefits:
- ✅ Nested
!terraform.statefunctions now work at any depth with proper authentication - ✅ Components can override authentication at any nesting level for cross-account scenarios
- ✅ No IMDS timeout errors when processing nested component configurations
- ✅ Identity selector no longer shows incorrectly for components without default identity
- ✅ Cleaner debug logs with reduced noise from expected auth resolution paths
- ✅ Ctrl+C and ESC immediately exit the identity selector (single keypress, exit code 130)
- ✅ No error message displayed on user abort (clean exit)
- ✅ Clear exit instructions shown to users in the selector UI
- ✅ No authentication prompt for invalid components (validation happens first)
Summary by CodeRabbit
-
New Features
- Component-level authentication overrides for nested Terraform functions enable fine-grained control over credentials in multi-account setups.
-
Bug Fixes
- Fixed authentication context propagation through nested Terraform state and output evaluations.
- Improved user experience when canceling interactive identity selection.
-
Documentation
- Added comprehensive guides on authentication flows for Terraform YAML functions and nested authentication handling.
-
Chores
- Updated AWS SDK and Go dependencies to latest versions.
fix: Reduce log spam for imports outside base directory @osterman (#1780)
what
- Changed import path validation logging from WARN to TRACE level
- Added test coverage for imports outside base directory
why
When using import paths that resolve outside the base directory, the warning message was being logged repeatedly during CI/CD workflows, creating excessive log spam. This is particularly noticeable in GitHub Actions where atmos is invoked multiple times.
The message "Import path is outside of base directory" is informational trace data about the import resolution process, not a user-actionable warning. Imports outside the base directory are a valid use case for shared configurations.
references
- Consistent with other import-related logging in the same file (lines 108, 111, 114 of
pkg/config/imports.go) which use TRACE level - Follows the logging level pattern:
- TRACE: Detailed import flow (what's happening during resolution)
- DEBUG: Error conditions (what went wrong)
- WARN: User-actionable problems (what needs fixing)
Summary by CodeRabbit
-
Tests
- Added test coverage for local imports outside the base directory to ensure proper resolution behavior.
-
Bug Fixes
- Reduced log verbosity by changing import path warnings to trace-level logs, decreasing warning-level noise in logs.