github cloudposse/atmos v1.199.0

7 hours ago
feat: Add configuration profiles support @osterman (#1752)

what

  • Add comprehensive test coverage for profile CLI commands (cmd/profile/show.go and cmd/profile/list.go)
  • Improve test coverage from 0% to 59.8% for profile command layer
  • Add 1,003 lines of table-driven tests across 2 new test files
  • Fix cross-platform compatibility issues (Windows path separators)
  • Remove unreachable dead code in existing tests
  • Correct misleading test case names to match actual behavior

why

  • CodeCov analysis identified profile CLI commands with 0% patch coverage
  • While overall project coverage is at 70.04%, new profile commands lacked any tests
  • Without tests, we cannot verify correct behavior across different scenarios
  • Cross-platform issues and misleading test names reduce code quality
  • Comprehensive test coverage ensures profile commands work reliably in production

references

  • Addresses CodeCov patch coverage gaps identified in recent commits
  • Follows Go testing best practices with table-driven tests
  • Uses testify library for assertions (assert/require)
  • Tests cover all output formats (JSON, YAML, table), error cases, and edge cases
  • Ensures cross-platform compatibility with filepath.Join for Windows support

Test Coverage Improvements

Files Added

cmd/profile/show_test.go (477 lines)

  • Format flag completion tests
  • Profile name completion tests
  • Error builder tests (profile not found, invalid format)
  • JSON rendering tests (basic profiles, profiles with metadata)
  • YAML rendering tests (basic profiles, deprecated profiles)
  • Format dispatcher tests (text/json/yaml formats, invalid formats)
  • Profile info retrieval tests (existing/non-existent profiles)
  • Edge case tests (no files, many files, nil/empty metadata)

cmd/profile/list_test.go (526 lines)

  • Format flag completion tests
  • JSON rendering tests (empty lists, single profile, multiple profiles with metadata)
  • YAML rendering tests (empty lists, single profile, multiple profiles)
  • Profile discovery error builder tests
  • Format dispatcher tests (table/json/yaml formats, invalid formats)
  • Empty profile list tests (all formats)
  • Complex profile tests (long paths, many files, rich metadata)
  • Integration test skeleton (skipped, covered in tests/ directory)

Coverage Results

  • Before: cmd/profile/show.go - 0% coverage
  • After: cmd/profile/show.go - 59.8% coverage
  • Before: cmd/profile/list.go - 0% coverage
  • After: cmd/profile/list.go - 59.8% coverage

Fixes Applied

  1. Misleading Test Name (cmd/profile/show_test.go:300)

    • Changed "empty format defaults to text" → "empty format is invalid"
    • Test validates error behavior, name now matches implementation
  2. Unreachable Dead Code (pkg/profile/list/formatter_table_test.go:234-239)

    • Removed conditional check for path length exceeding width
    • Prior assertion already guaranteed path ≤ pathWidth, making code unreachable
    • Removed unused strings import
  3. Windows Path Separator (pkg/profile/manager_test.go:607)

    • Changed hardcoded "stacks/dev.yaml" → filepath.Join("stacks", "dev.yaml")
    • Ensures tests pass on Windows (backslashes) and Unix (forward slashes)

Testing Strategy

All tests follow table-driven pattern with:

  • Descriptive test case names
  • Multiple scenarios per function
  • Validation functions for complex assertions
  • JSON/YAML unmarshaling verification
  • Error type checking with assert.ErrorIs
  • Edge cases (empty inputs, nil values, large datasets)

Tests verify:

  • ✅ Shell completion for format flags
  • ✅ JSON output (valid structure, correct field names)
  • ✅ YAML output (valid structure, correct field names)
  • ✅ Error messages and error types
  • ✅ Profile discovery and retrieval
  • ✅ Complex metadata handling
  • ✅ Cross-platform compatibility

Summary by CodeRabbit

  • New Features

    • Configuration profiles: multi-location discovery, precedence-based composition, activate via --profile or ATMOS_PROFILE; new profile commands: list and show (table/json/yaml) and alias "atmos list profiles".
  • Documentation

    • PRDs, CLI docs, blog post, examples and test fixtures for developer/CI/production profiles and migration guidance.
  • UI/Style

    • Profile list/show renderers and a new Notice theme style for messaging.
  • User-facing errors

    • Clear, actionable errors for profile discovery, missing profiles, load/merge failures and invalid output formats.
  • Tests

    • Extensive unit and end-to-end CLI tests, snapshots, and fixtures covering profiles and help output.
  • Chores

    • Minor dependency updates.
feat: Remove deep exits from version command @osterman (#1783)

Summary

Refactors the version command to remove deep exits and allow execution even with invalid or missing configuration. Version is a diagnostic tool that must always work, making it the first step in troubleshooting.

Key Changes:

  • Removes log.Fatal() deep exit in version execution logic
  • Adds version command detection in PersistentPreRunE to suppress config errors
  • Moves InitializeMarkdown() after config error checking to prevent deep exits
  • Adds 10 comprehensive integration tests for invalid configs (YAML, aliases, schema)
  • Documents version command design and avoiding deep exits pattern in PRDs

This establishes the pattern for refactoring all commands to remove deep exits, enabling proper error handling, testability, and composability.

Test Plan

  • ✅ All existing version command tests passing
  • ✅ 10 new integration tests verify version works with invalid YAML syntax
  • ✅ 10 new integration tests verify version works with invalid command aliases
  • ✅ 10 new integration tests verify version works with invalid config schema
  • ✅ Both atmos version and atmos --version work with broken configs

Summary by CodeRabbit

  • New Features

    • Version command reliably outputs formatted JSON/YAML and performs update checks even when user config is invalid.
  • Bug Fixes

    • Reduced abrupt exits: startup and version flows return enriched, recoverable errors and emit clearer non-fatal messages/warnings.
  • Documentation

    • Added PRDs and guides for the version command, avoiding deep-exit patterns, unified flag handling, error message examples, and format usage examples.
  • Tests

    • Added fixtures and extensive tests for invalid YAML, schema, aliases, and version scenarios (formatting, checks, and error paths).
  • Chores

    • Expanded YAML check exclusions and linter rules to enforce unified flag parsing.
test: add comprehensive env inheritance tests @osterman (#1789)

what

  • Fix bug where !env YAML function only read from OS environment variables, ignoring env variables defined in stack manifests
  • Add comprehensive test coverage for env variable inheritance across stack manifests and components
  • Update !env documentation to explain resolution order and limitations

why

  • Users expected !env FOO to read from env: { FOO: "bar" } defined in globals.yaml or component env sections, but it only checked OS environment variables
  • Env variable inheritance was undocumented and untested, making it unclear how env sections merge across globals, imports, and components
  • Single-pass YAML function processing creates limitations that need to be clearly documented

references

  • Fixes the bug where !env couldn't access stack manifest env variables
  • Validates that env inheritance follows merge priority: GlobalEnv < BaseComponentEnv < ComponentEnv < ComponentOverridesEnv
  • Documents that YAML functions cannot reference results from other YAML functions in the same component section (single-pass limitation)
refactor: Migrate list commands to flag handler pattern @osterman (#1788) ## Summary Migrated all 10 `atmos list` subcommands to the StandardParser flag handler pattern. Reorganized into `cmd/list/` directory following command registry pattern. Added comprehensive environment variable support and eliminated deep exits for better testability.

Changes

  • Moved 10 list commands from root cmd/ to cmd/list/ directory
  • Replaced legacy flag handling with StandardParser + Options structs
  • Added ATMOS_* environment variable support for all flags
  • Created newCommonListParser() factory to eliminate flag duplication
  • Refactored checkAtmosConfig() to return errors instead of calling Exit()

Test Plan

  • ✓ All existing unit tests pass
  • ✓ Commands compile without errors
  • ✓ Help text shows all flags with proper descriptions
  • ✓ Environment variables work end-to-end

Summary by CodeRabbit

  • New Features
    • Adds a unified "list" command with dedicated subcommands (stacks, components, instances, workflows, metadata, settings, vendor, themes, values) and consistent output formatting.
    • Enhanced filtering (stack/component/query), flag + env-var support, shell completion for stack flags, formats/delimiters, max-columns, and home-directory obfuscation.
    • Clearer UX: consistent validation and user-friendly "No results" messages.
  • Documentation
    • Adds test-coverage report and a test-coverage improvement plan.
feat: Safe logout preserves keychain credentials by default @osterman (#1791)

Summary

Implement safe-by-default logout behavior where session data (tokens, cached credentials) is cleared, but keychain credentials (access keys, service account credentials) are preserved for faster re-authentication. Add --keychain flag with interactive confirmation for permanent credential deletion.

Changes

  • Safe-by-default behavior: atmos auth logout clears only session data
  • New --keychain flag to permanently delete credentials with confirmation
  • New --force flag for CI/CD environments to bypass confirmation
  • Cloud-provider-agnostic implementation working with all auth providers
  • Comprehensive documentation and blog post

References

See blog post and documentation for usage examples and migration guide for existing scripts.

Summary by CodeRabbit

  • New Features

    • Added --keychain to optionally delete stored credentials during logout (preserved by default).
    • Added --force to bypass interactive confirmation.
    • Interactive confirmation prompts for keychain deletion in TTY; non‑TTY and force handling supported.
    • Detects and warns about external cloud provider credentials (AWS/Azure/GCP env vars) after logout.
    • Extended dry‑run to preview keychain-related items when --keychain is used.
  • Documentation

    • Updated CLI docs and examples, added PRD and blog post explaining the change and migration guidance.
feat: add Version Management Patterns documentation @osterman (#1499) ## what - Add comprehensive documentation for Version Management Patterns in Atmos - Document five distinct patterns for managing component versions across environments - Provide practical examples, implementation details, and migration strategies

why

  • Teams struggle with balancing stability and velocity when managing infrastructure versions
  • "Just pin everything" often creates more problems than it solves at scale
  • No clear guidance exists on when to use which versioning strategy
  • Teams need to understand trade-offs between reproducibility, convergence, and operational overhead

references

  • Addresses common questions about version management in infrastructure-as-code
  • Provides alternatives to traditional strict pinning approaches
  • Helps teams choose patterns based on their scale and operational maturity

Summary

This PR adds comprehensive documentation for Version Management Patterns in Atmos, covering five distinct strategies:

📚 Patterns Documented

  1. Version Management Overview (version-management.mdx)

    • Explains deployment vs. release concepts
    • Provides pattern comparison table
    • Offers selection criteria
  2. Strict Version Pinning (strict-version-pinning.mdx)

    • Traditional per-environment pinning
    • High reproducibility, high overhead
    • Requires lockstep promotions
  3. Release Tracks/Channels (release-tracks-channels.mdx)

    • Environments subscribe to moving tracks
    • Promotes convergence and feedback
    • Medium operational overhead
  4. Folder-Based Versioning (folder-based-versioning.mdx)

    • Version through repository structure
    • Explicit boundaries for major changes
    • Gradual migration support
  5. Vendoring Components (vendoring-components.mdx)

    • Local control over dependencies
    • Predictable update windows
    • Custom patch support
  6. Git Flow: Branches as Channels (git-flow-branches-as-channels.mdx)

    • Long-lived branches as channels
    • Familiar Git workflows
    • Strong CI/CD integration

✨ Key Features

Each pattern includes:

  • Use cases and problem analysis
  • Detailed Atmos implementation
  • Real-world examples
  • Benefits and drawbacks
  • Best practices
  • Migration strategies

🎯 Impact

This documentation helps teams:

  • Choose appropriate versioning strategies
  • Avoid common pitfalls with version management
  • Implement patterns correctly with Atmos
  • Migrate between patterns as needs evolve

Summary by CodeRabbit

  • New Features

    • Expanded Atmos CLI help coverage for Terraform subcommands (apply, destroy, import, init, output, plan, refresh, state).
  • Documentation

    • Added extensive Version Management guidance (continuous deployment, folder-based versioning, release tracks/channels, strict pinning, vendoring, versioning schemes) and integrated examples across core docs.
    • Added Terraform configuration guidance (clean, deploy, shell prompts, workspace naming) and updated Component Catalog docs and navigation with redirects.
  • Chores

    • Increased CI/tools timeout.
feat: Add ui.Toast() pattern for status notifications @osterman (#1794)

what

  • Add new ui.Toast() and ui.Toastf() functions for flexible toast-style status notifications
  • Extract toast functionality from the larger toolchain PR (#1686) for independent review and merge
  • Update documentation to show Toast pattern as the primary approach for status notifications

why

  • The toolchain PR (#1686) is large and complex - extracting independent features allows faster review and merge
  • Toast pattern provides a unified, flexible approach for all user-facing status messages
  • Custom icon support enables better visual communication without creating new wrapper functions
  • Improves code maintainability by establishing a clear pattern for status notifications

Implementation Details

New Functions:

// Primary toast pattern with custom icons
ui.Toast("📦", "Using latest version: 1.2.3")
ui.Toastf("🔧", "Tool %s is not installed", toolName)

// Existing convenience wrappers (now documented as Toast wrappers)
ui.Success("Done!")      // ✓ Done! (green)
ui.Error("Failed!")      // ✗ Failed! (red)
ui.Warning("Deprecated") // ⚠ Deprecated (yellow)
ui.Info("Processing...")  // ℹ Processing... (cyan)

Benefits:

  • ✅ Consistent pattern for all toast notifications
  • ✅ Flexible icon support (custom emojis or themed icons)
  • ✅ Automatic channel routing (stderr for UI)
  • ✅ Automatic secret masking via I/O layer
  • ✅ Zero breaking changes - all existing functions work as before

Documentation Updates:

  • Updated docs/io-and-ui-output.md with Toast API reference
  • Updated docs/prd/io-handling-strategy.md with Toast pattern examples
  • Enhanced comments in pkg/ui/formatter.go to clarify Toast pattern

Testing

  • ✅ All existing tests pass
  • ✅ Builds successfully
  • ✅ Linter checks pass
  • ✅ No breaking changes to existing API

references

  • Extracted from #1686 (toolchain PR)
  • Part of ongoing UI/UX improvements for Atmos CLI

Summary by CodeRabbit

  • New Features

    • Added toast-style notifications with icon support, multiline toasts, and consistent icon+text formatting.
    • Introduced convenience functions for success, error, warning, and info messages with formatting variants.
  • Documentation

    • Updated UI API docs with toast examples and a new Plain UI Text section (Write/Writef/Writeln examples).
  • Tests

    • Added comprehensive tests covering toast outputs, multiline/unicode handling, and formatting variants.
fix: Use YAML !env function in Sentry config examples @osterman (#1793)

Summary

  • Fixed incorrect shell-style variable expansion to proper YAML !env function syntax in Sentry configuration examples
  • Removed redundant blog post sections (Real-World Impact, Why This Matters, What's Next)

Changes

Updated configuration examples to use !env VARIABLE_NAME instead of ${VARIABLE_NAME} syntax across documentation and blog post.

Summary by CodeRabbit

  • Documentation
    • Configuration syntax examples have been updated throughout documentation to provide improved clarity and consistency across all setup instructions and configuration best practices.
    • Blog post has been streamlined and refined with condensed narrative sections while fully preserving essential configuration examples, comprehensive technical guidance, and practical recommendations for users.
docs: Add identity provider file isolation PRDs @osterman (#1792)

what

  • Create universal identity provider file isolation pattern PRD defining canonical pattern for all providers (AWS, Azure, GCP, etc.)
  • Document AWS authentication file isolation as reference implementation showing how existing code implements the pattern
  • Document Azure authentication file isolation as planned implementation following the universal pattern
  • Establish clear separation between Atmos-managed enterprise/customer credentials and developer's personal hobby accounts

why

  • Protect developer's personal credentials: Most developers have personal AWS/Azure/GCP accounts for hobby projects that are manually configured with aws configure, az login, gcloud init. Atmos must never modify these personal accounts.
  • Critical multi-customer use case: When managing infrastructure for multiple customers (Cloud Posse use case), need physically separate credential files to make it "provably impossible" to accidentally use wrong customer's credentials.
  • Establish universal pattern: All identity providers (AWS, Azure, GCP) must follow the same XDG-compliant file isolation pattern for consistency.
  • Enable clean logout: Deleting an Atmos identity removes all work credentials without affecting personal hobby accounts.
  • Azure needs implementation: Current Azure implementation writes to ~/.azure/ which breaks developer's personal Azure CLI setup. This PRD documents the required changes to match AWS pattern.

Key architectural decision: Atmos-managed credentials go in ~/.config/atmos/{cloud}/, personal credentials stay in default locations (~/.aws/, ~/.azure/, ~/.config/gcloud/).

references

  • Implements XDG Base Directory Specification for credential storage
  • Documents existing AWS implementation that successfully isolates credentials using AWS_SHARED_CREDENTIALS_FILE and AWS_CONFIG_FILE
  • Plans Azure implementation using AZURE_CONFIG_DIR environment variable for isolation
  • Related to ongoing Azure authentication work

Summary by CodeRabbit

  • Documentation
    • Added a universal authentication file isolation pattern covering per-provider credential isolation, logout/cleanup semantics, XDG-compliant storage, environment variable wiring, security guidance, testing strategy, and migration steps
    • Added AWS-specific implementation and environment mappings
    • Added Azure-specific implementation guidance, XDG storage guidance, environment mappings, migration guidance, and testing recommendations
fix: Upgrade CodeQL Action from v3 to v4 @osterman (#1790)

what

  • Upgrade all CodeQL Action references from deprecated v3 to v4
  • Updates github/codeql-action/init, autobuild, analyze, and upload-sarif actions
  • Resolves deprecation warning about v3 being removed in December 2026

why

CodeQL Action v3 is deprecated and will be removed on December 28, 2026. This PR ensures the workflow continues to function with the supported version.

references

Summary by CodeRabbit

  • Chores
    • Updated GitHub Actions workflow dependencies to latest compatible versions for improved reliability and security in the continuous integration pipeline.
feat: Migrate theme commands to StandardFlagParser @osterman (#1772) ## Summary

This PR has two main components:

  1. Theme Command Migration: Migrated theme list and show commands to use the modern StandardFlagParser pattern
  2. Error Handling Documentation: Comprehensive documentation improvements for the Atmos error handling system

Changes

Theme Commands

  • Theme list command: Removed global variables, added type-safe options struct, enabled environment variable support (ATMOS_RECOMMENDED), and implemented proper flag precedence (CLI > env > config > default)
  • Theme show command: Established consistent StandardFlagParser pattern for future flag additions
  • Error handling: Improved theme command errors to use builder pattern with actionable hints
  • Test coverage: Added 30 comprehensive test cases validating flag handling, Viper integration, and flag precedence behavior

Error Handling Documentation

  • atmos-errors agent: Created comprehensive agent guide (14.7KB) for designing user-friendly error messages
  • Key principles documented:
    • Hints = WHAT TO DO (actionable steps) - NOT "what happened"
    • Explanations = WHAT HAPPENED (educational context)
    • Context = WHERE/HOW (debugging details, non-redundant)
  • Critical patterns:
    • Subprocess exit code preservation with exec.ExitError
    • errors.Join() order non-preservation warning
    • Error builder pattern with formatted methods (WithHintf, WithExplanationf)
    • Avoiding redundancy across builder methods
  • Error docs improvements: Updated docs/errors.md with formatted builder methods and clearer examples

Benefits

Theme Commands

  • Type-safe options with proper encapsulation
  • Full flag precedence support (CLI > env > config > default)
  • Environment variable support for all flags
  • Consistency with other Atmos commands
  • Better error messages with actionable hints

Error Handling System

  • Clear guidance for developers on creating user-friendly errors
  • Prevents common anti-patterns (explanatory hints, redundancy, wrong exit codes)
  • Ensures consistent error experience across Atmos
  • Proactive agent that reviews error handling code

Testing

  • 30+ theme command test cases covering flag handling and precedence
  • All error documentation examples validated for correctness
  • Error builder pattern verified with proper separation of hints/explanations/context

Summary by CodeRabbit

  • New Features

    • Theme commands now respect ATMOS_THEME/THEME with consistent precedence (CLI > env > config > default).
    • Added a "recommended only" option and unified flag parsing for theme list/show.
    • New public theme errors for clearer "not found" and "invalid" theme cases.
  • Tests

    • Expanded coverage for flag/env/config precedence, option parsing, theme resolution, and command executions.
  • Documentation

    • Blog post documenting env-var support and usage examples.
  • Chores

    • CI: pin Helm version for Helmfile steps.
feat: Add native Azure authentication support @jamengual (#1768)

Summary

This PR adds comprehensive native Azure authentication support to Atmos, enabling seamless authentication to Azure with full Terraform provider compatibility.

Features

Three Authentication Methods

  • Device Code Flow: Browser-based authentication for interactive developer sessions with MFA support
  • OIDC: Workload identity federation for GitHub Actions, GitLab CI, and Azure DevOps pipelines
  • Service Principals: Client credential authentication for automation and service accounts

Full Terraform Provider Support

Works seamlessly with all Azure Terraform providers out of the box:

  • azurerm - Complete Azure Resource Manager support including KeyVault operations
  • azuread - Azure Active Directory management
  • azapi - Alternative Azure management interface

Identical to az login Behavior

  • Writes credentials to ~/.azure/msal_token_cache.json (Azure CLI MSAL cache)
  • Updates ~/.azure/azureProfile.json with subscription configuration
  • Sets ARM_USE_CLI=true for Terraform providers
  • Drop-in replacement - existing Terraform code works without changes
  • Provides all three token scopes for complete Azure functionality:
    • https://management.azure.com/.default - Azure Resource Manager
    • https://graph.microsoft.com/.default - Azure AD operations
    • https://vault.azure.net/.default - Azure KeyVault operations

Multi-Subscription & Multi-Region Support

Easy switching between different Azure subscriptions, regions, and environments:

auth:
  identities:
    azure-dev:
      kind: azure/subscription
      principal:
        subscription_id: "DEV_SUBSCRIPTION_ID"
        location: "eastus"
    
    azure-prod:
      kind: azure/subscription
      principal:
        subscription_id: "PROD_SUBSCRIPTION_ID"
        location: "westus"

Quick Start

Configuration

auth:
  providers:
    azure-dev:
      kind: azure/device-code
      tenant_id: "12345678-1234-1234-1234-123456789012"
      subscription_id: "87654321-4321-4321-4321-210987654321"
      location: "eastus"

  identities:
    azure-dev-subscription:
      default: true
      kind: azure/subscription
      via:
        provider: azure-dev
      principal:
        subscription_id: "87654321-4321-4321-4321-210987654321"
        location: "eastus"

Usage

# Authenticate to Azure
atmos auth login

# Use with Terraform
atmos terraform plan my-component -s my-stack
atmos terraform apply my-component -s my-stack

# Switch subscriptions
atmos terraform apply my-component -s prod --identity azure-prod

Implementation Details

New Packages

pkg/auth/providers/azure/

  • device_code.go - Device code flow authentication with interactive browser flow
  • oidc.go - OIDC workload identity federation for CI/CD
  • service_principal.go - Client credentials authentication
  • cli.go - Azure CLI compatibility utilities
  • device_code_cache.go - Token caching and MSAL cache management

pkg/auth/identities/azure/

  • subscription.go - Azure subscription identity with location support

pkg/auth/cloud/azure/

  • setup.go - MSAL cache and Azure profile file management
  • env.go - Environment variable configuration for Terraform
  • files.go - Credential file operations with proper locking
  • console.go - Azure Portal URL generation

pkg/auth/types/

  • azure_credentials.go - Azure credential type implementation

Architecture

Follows Atmos architectural patterns:

  • Registry Pattern: Azure providers/identities register via factory
  • Interface-Driven: All components implement Provider/Identity interfaces
  • Provider-Agnostic Core: No Azure-specific code in core auth manager
  • Testable: Comprehensive unit tests with mocked dependencies

Complete Token Support

Atmos provides all three Azure token scopes (matching az login exactly):

  1. Management Token (https://management.azure.com/.default)

    • Used by azurerm and azapi providers
    • Enables all Azure Resource Manager operations
  2. Graph API Token (https://graph.microsoft.com/.default)

    • Used by azuread provider
    • Enables Azure AD operations (users, groups, service principals)
  3. KeyVault Token (https://vault.azure.net/.default)

    • Used by azurerm provider for KeyVault operations
    • Enables secret, key, and certificate management

This comprehensive token support ensures all Terraform resources work correctly, including KeyVault certificate contacts, secret management, and AD group operations.

CI/CD Integration Examples

GitHub Actions with OIDC

name: Deploy Infrastructure
on:
  push:
    branches: [main]

permissions:
  id-token: write  # Required for OIDC
  contents: read

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Install Atmos
        uses: cloudposse/github-action-setup-atmos@v2

      - name: Authenticate to Azure
        run: atmos auth login --identity azure-prod-ci
        env:
          AZURE_CLIENT_ID: ${{ secrets.AZURE_CLIENT_ID }}
          AZURE_TENANT_ID: ${{ secrets.AZURE_TENANT_ID }}
          AZURE_SUBSCRIPTION_ID: ${{ secrets.AZURE_SUBSCRIPTION_ID }}

      - name: Deploy
        run: atmos terraform apply my-component -s prod

Service Principal Authentication

auth:
  providers:
    azure-automation:
      kind: azure/service-principal
      tenant_id: "YOUR_TENANT_ID"
      client_id: "YOUR_SERVICE_PRINCIPAL_CLIENT_ID"
      subscription_id: "YOUR_SUBSCRIPTION_ID"

  identities:
    azure-automation-prod:
      kind: azure/subscription
      via:
        provider: azure-automation
      principal:
        subscription_id: "YOUR_SUBSCRIPTION_ID"

Testing

Unit Test Coverage

Added comprehensive test coverage (3,401 lines of test code):

pkg/auth/cloud/azure/

  • files_test.go - File manager, locking, permissions (112 lines)
  • setup_test.go - MSAL cache updates, JWT extraction, profile management (336 lines)

pkg/auth/providers/azure/

  • device_code_test.go - Device code provider, validation, spinner UI (302 lines)
  • device_code_cache_test.go - Token caching, expiration, MSAL updates (286 lines)
  • cli_test.go - CLI provider validation and environment prep (179 lines)
  • oidc_test.go - OIDC provider with workload identity federation
  • service_principal_test.go - Service principal authentication

pkg/auth/identities/azure/

  • subscription_test.go - Subscription identity, location overrides (141 lines)

pkg/auth/types/

  • azure_credentials_test.go - Credential type, expiration, validation

Coverage Results

  • pkg/auth/cloud/azure: 81.9% ✅ (exceeded 80% target)
  • pkg/auth/providers/azure: 54.7%
  • pkg/auth/identities/azure: 62.5%
  • Overall patch coverage: 64.42% (comparable to AWS SSO at 62.64%)

Coverage gap is primarily in Azure SDK integration code (device code authentication flow, token acquisition) which follows the same pattern as AWS implementation (no SDK mocking).

Manual Testing

Verified with:

  • ✅ Device code authentication flow with browser interaction
  • ✅ Multi-subscription workflows with location overrides
  • ✅ Terraform azurerm provider with KeyVault resources
  • ✅ Terraform azuread provider with AD group operations
  • ✅ Terraform azapi provider
  • ✅ Token caching and automatic reuse
  • ✅ MSAL cache format compatibility with Azure CLI
  • ✅ Cross-platform testing (macOS, Linux, Windows)

Documentation

Comprehensive Tutorial

Created detailed Azure authentication guide:

  • website/docs/cli/commands/auth/tutorials/azure-authentication.mdx (689 lines)
  • Covers all three authentication methods with step-by-step examples
  • Multi-subscription workflows and CI/CD patterns
  • Troubleshooting guide and common scenarios
  • Security best practices

Updated Command Documentation

  • Updated website/docs/cli/commands/auth/auth-login.mdx with Azure examples
  • Added authentication methods comparison (AWS vs Azure)
  • Added provider-specific configuration examples

Feature Announcement Blog Post

  • website/blog/2025-11-07-azure-authentication-support.mdx (447 lines)
  • Feature announcement with usage examples
  • Migration guide from az login
  • CI/CD integration patterns (GitHub Actions, service principals)
  • Implementation details (MSAL cache, token scopes)
  • Security features and best practices

Migration from az login

Atmos is a drop-in replacement for az login:

Before:

az login
az account set --subscription "YOUR_SUBSCRIPTION_ID"
terraform apply

After:

atmos auth login --identity azure-dev
atmos terraform apply my-component -s my-stack

Both write to the same Azure CLI files (~/.azure/msal_token_cache.json and ~/.azure/azureProfile.json), so existing Terraform code works without any changes.

Security Features

  • Secure Storage: Credentials stored in OS keyring (Keychain on macOS, Secret Service on Linux, Credential Manager on Windows)
  • MSAL Cache Compatibility: Tokens also written to Azure CLI MSAL cache for Terraform provider compatibility
  • Token Expiration: Automatic detection and handling of expired tokens (1-hour default)
  • File Permissions: Credential files created with 0600 permissions (user read/write only)
  • Least Privilege: Supports Azure RBAC for minimal access configuration
  • No Plaintext Secrets: Service principal secrets stored in keyring, not on disk

Files Changed

New Implementation Files (27 files)

Core Azure Auth

  • pkg/auth/types/azure_credentials.go - Azure credential type
  • pkg/auth/cloud/azure/*.go - Azure cloud utilities (5 files)
  • pkg/auth/providers/azure/*.go - Azure providers (5 files)
  • pkg/auth/identities/azure/*.go - Azure identities (1 file)

Tests

  • pkg/auth/types/azure_credentials_test.go
  • pkg/auth/cloud/azure/*_test.go (2 files)
  • pkg/auth/providers/azure/*_test.go (3 files)
  • pkg/auth/identities/azure/*_test.go (1 file)

Integration

  • pkg/auth/factory/factory.go - Register Azure providers/identities
  • pkg/auth/types/constants.go - Azure provider kind constants
  • pkg/schema/schema.go - Azure auth context schema
  • errors/errors.go - Azure error definitions

Documentation Files (4 files)

  • website/docs/cli/commands/auth/tutorials/azure-authentication.mdx
  • website/docs/cli/commands/auth/auth-login.mdx (updated)
  • website/blog/2025-11-07-azure-authentication-support.mdx
  • website/blog/authors.yml (updated)

Modified Core Files (6 files)

  • internal/exec/terraform_generate_backend.go - Azure backend auth
  • internal/exec/terraform_utils.go - Azure provider auth
  • internal/exec/utils.go - Azure auth context handling
  • cmd/auth_console.go - Azure console URL support
  • go.mod / go.sum - Dependencies already present

Breaking Changes

None. This is a new feature that doesn't affect existing functionality.

Checklist

  • Code compiles successfully
  • All existing tests pass
  • Added comprehensive unit tests (3,401 lines)
  • Test coverage >80% on core packages
  • Cross-platform compatibility (macOS, Linux, Windows)
  • Manual testing with all Azure Terraform providers
  • Documentation added (tutorial + command docs + blog post)
  • Blog post required for minor feature ✅
  • No breaking changes
  • Follows conventional commits format
  • CodeQL security scan passing

Future Enhancements

Potential additions:

  • Azure Managed Identity support for VM/container workloads
  • Azure Government Cloud / sovereign cloud support
  • Azure CLI credential migration/import tools
  • Enhanced Azure-specific debugging and logging
  • Certificate-based service principal authentication

References


Co-Authored-By: Claude noreply@anthropic.com

Summary by CodeRabbit

  • New Features

    • Native Azure authentication (device-code, CLI, OIDC), subscription-scoped identities, tenant-aware portal sign-in links, in-process credential handling, secure on-disk credential management, MSAL/token cache and Azure CLI profile sync, and environment preparation for Terraform/tool compatibility.
  • Bug Fixes

    • Azure console access now returns tenant-scoped portal links instead of an error.
  • Documentation

    • Added Azure auth guide, tutorials, CLI docs, and a blog post.
  • Tests

    • Extensive unit tests covering Azure providers, identities, file/cache, MSAL cache, console URLs, and env prep.
fix: Pin Helm to v3.19.2 to avoid Helm 4.0 plugin verification issues @osterman (#1785)

what

  • Pins Helm to v3.19.2 (latest 3.x version) in CI workflows
  • Updates helmfile-action to use pinned helm-version parameter
  • Replaces apt-get helm installation with azure/setup-helm action for version control

why

Helm 4.0 was released with breaking changes to plugin verification that causes the helm-diff plugin installation to fail with "Error: plugin source does not support verification". Pinning to Helm 3.x ensures compatibility with the existing helm-diff plugin until it's updated to support Helm 4.0.

references

  • Closes plugin verification issues in Helm 4.0
  • Maintains compatibility with current helm-diff plugin

Summary by CodeRabbit

  • Chores
    • Updated test workflow configuration to explicitly specify Helm version for improved consistency and reliability in CI/CD testing infrastructure.
feat: add error handling infrastructure with context-aware capabilities @osterman (#1763) ## what
  • Add comprehensive error handling infrastructure extracted from PR #1599
  • Provide foundation for rich, user-friendly error messages without migrating existing code
  • Add error builder with fluent API for hints, context, exit codes, and explanations
  • Add smart error formatting with TTY detection, markdown rendering, and color support
  • Add verbose mode with --verbose flag for context table display
  • Add Sentry integration for optional error reporting
  • Add Claude agent for error message design expertise
  • Add linter rules to enforce error handling best practices

why

  • Lower Risk: Extract infrastructure only, no migration of existing code (20 files vs 100 in original PR)
  • Enable Future Work: Provides foundation for incremental migration in focused follow-up PRs
  • Better Developer Experience: Rich error messages with actionable hints guide users toward solutions
  • Testable: 78.8% test coverage, all tests passing
  • Well Documented: Complete developer guide, architecture PRDs, and Claude agent
  • Enforced Patterns: Linter rules prevent bad error handling patterns

Infrastructure Added

Error Builder Pattern

Fluent API for constructing enriched errors:

err := errUtils.Build(errUtils.ErrComponentNotFound).
    WithHintf("Component '%s' not found in stack '%s'", component, stack).
    WithHint("Run 'atmos list components -s %s' to see available components", stack).
    WithContext("component", component).
    WithContext("stack", stack).
    WithExitCode(2).
    Err()

Smart Formatting

  • TTY-aware rendering with markdown support
  • Automatic color degradation (TrueColor → 256 → 16 → None)
  • Context table display in verbose mode
  • Markdown-rendered error messages

Verbose Mode

New --verbose / -v flag (also ATMOS_VERBOSE=true) enables:

  • Context table display showing structured error details
  • Full stack traces for debugging
  • Detailed error information

Normal output:

✗ Component 'vpc' not found

💡 Component 'vpc' not found in stack 'prod/us-east-1'
💡 Run 'atmos list components -s prod/us-east-1' to see available components

Verbose output (--verbose):

✗ Component 'vpc' not found

💡 Component 'vpc' not found in stack 'prod/us-east-1'
💡 Run 'atmos list components -s prod/us-east-1' to see available components

┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┓
┃ Context   ┃ Value               ┃
┣━━━━━━━━━━━╋━━━━━━━━━━━━━━━━━━━━━┫
┃ component ┃ vpc                 ┃
┃ stack     ┃ prod/us-east-1      ┃
┗━━━━━━━━━━━┻━━━━━━━━━━━━━━━━━━━━━┛

Sentry Integration

Optional error reporting with:

  • Automatic context extraction (hints → breadcrumbs, context → tags)
  • PII-safe reporting
  • Atmos-specific tags (component, stack, exit code)
  • Configurable in atmos.yaml

Static Sentinel Errors

Type-safe error definitions in errors/errors.go:

  • Enables errors.Is() checking across wrapped errors
  • Prevents typos and inconsistencies
  • Makes error handling testable

Exit Code Support

  • Custom exit codes (2 for config/usage errors, 1 for runtime errors)
  • Proper error classification
  • Consistent error signaling

Documentation

  • Developer Guide: docs/errors.md - Complete API reference with examples
  • Architecture PRD: docs/prd/error-handling.md - Design decisions and rationale
  • Error Types: docs/prd/error-types-and-sentinels.md - Error catalog
  • Exit Codes: docs/prd/exit-codes.md - Exit code standards
  • Implementation Plan: docs/prd/atmos-error-implementation-plan.md - Migration phases
  • CLAUDE.md Updates: Enhanced error handling patterns for contributors
  • Claude Agent: .claude/agents/atmos-errors.md - Expert system for error message design

Linter Enforcement

Added rules to .golangci.yml:

  • Require static sentinel errors (no dynamic errors like errors.New(fmt.Sprintf(...)))
  • Prevent deprecated github.com/pkg/errors usage
  • Encourage WithHintf() over WithHint(fmt.Sprintf())

Schema Updates

Added to pkg/schema/schema.go:

  • ErrorsConfig - Error handling configuration
  • ErrorFormatConfig - Formatting options (verbose, color)
  • SentryConfig - Sentry integration settings

Configuration example:

errors:
  format:
    verbose: false  # Enable with --verbose flag
    color: auto     # auto|always|never
  sentry:
    enabled: true
    dsn: "https://..."
    environment: "production"

Testing

  • ✅ 78.8% test coverage for error handling components
  • ✅ All tests passing
  • ✅ Build succeeds
  • ✅ No existing tests broken
  • ✅ Zero integration with existing code (no risk)

What Was NOT Extracted (Deferred to Future PRs)

  • ❌ Migration of existing errors to use new system
  • ❌ Context-aware hints for specific error scenarios (can be added incrementally)
  • ❌ Changes to existing error handling in cmd/, internal/exec/, pkg/
  • ❌ Race condition fixes (can be handled separately)

Risk Assessment

Risk Level: VERY LOW

  1. No Breaking Changes - All new code, no existing code modified
  2. Zero Integration - Error package is standalone, not used by existing code yet
  3. Fully Tested - Complete test coverage, all tests passing
  4. Well Documented - Comprehensive documentation for developers
  5. Linter Enforced - Prevents bad patterns in new code

Files Changed

New Files (20):

  • errors/ package (14 files): builder, formatter, exit codes, Sentry, tests
  • docs/errors.md - Developer guide
  • docs/prd/ - 4 PRD documents
  • .claude/agents/atmos-errors.md - Claude agent

Modified Files (6):

  • CLAUDE.md - Error handling documentation
  • .golangci.yml - Linter rules
  • cmd/root.go - Verbose flag only
  • pkg/schema/schema.go - Error config schema
  • go.mod, go.sum - Dependencies

Total: 26 files (vs 100 files in original PR #1599)

Future Migration Strategy

With this infrastructure in place, future PRs can:

  1. Migrate specific commands incrementally - One command at a time (e.g., terraform commands)
  2. Add context-aware hints gradually - Low risk, focused changes for each error scenario
  3. Fix race conditions independently - Separate from error handling changes

Each follow-up PR will be:

  • Small and focused (5-10 files)
  • Easy to review
  • Low risk to existing functionality

Dependencies Added

  • github.com/cockroachdb/errors - Core error handling library (drop-in replacement for stdlib errors)
  • github.com/getsentry/sentry-go - Optional error reporting

references

Co-Authored-By: Claude noreply@anthropic.com

Summary by CodeRabbit

  • New Features

    • New verbose flag (-v/--verbose); Markdown-styled error output with Error/Explanation/Hints/Examples/Context; error builder API; improved exit-code propagation; dedicated exec/workflow error types; configurable Sentry reporting with per-component clients and registry.
  • Bug Fixes

    • More consistent, deterministic error presentation and reliable extraction/propagation of subprocess and wrapped error exit codes.
  • Documentation

    • Extensive developer guides, PRDs, website docs, and a blog post on error handling and monitoring.
  • Tests

    • Expanded coverage for formatting, builder, exit codes, Sentry integration, renderer, and CLI snapshots.
fix: Resolve changelog check failures on large PRs @osterman (#1782) ## Summary

Replace GitHub's diff API with local git diff to avoid API limitations that cause changelog check failures on large PRs. GitHub's diff API has hard limits (~300 files or ~3000 lines), but using local git diff with base/head SHAs works reliably for any PR size.

Test Plan

  • Changelog check should pass on large PRs
  • Changelog check should still work on normal-sized PRs
  • Blog post detection logic unchanged

Summary by CodeRabbit

  • Chores
    • Optimized changelog verification workflow to more reliably detect blog file changes in pull requests while reducing dependency on external API limitations.
Refine homepage hero and typing animation @osterman (#1781)

Summary

  • Remove excessive padding around hero demo image and screenshot container
  • Redesign typing cursor as a thin, blinking block character with subtle CRT-style glow
  • Remove "and more..." from typing animation word list
  • Left-align feature card descriptions
  • Constrain hero demo image to 70% viewport width for better layout

Summary by CodeRabbit

  • New Features

    • Added an InstallWidget to the quick-start page for selecting and copying install commands.
  • Style

    • Refined typing animation cursor with a stronger block glyph, glow pulse, and contrast-aware coloring
    • Enhanced landing hero spacing, image sizing, and feature card alignment
    • Shortened product list in typing animation
    • Enabled smooth scrolling site-wide
  • UX

    • Hero intro now animates into view with a subtle fade-in motion
Add custom sanitization map to test CLI for test-specific output rules @osterman (#1779) This PR introduces a custom sanitization map for the test CLI to enable test-specific output rules. The implementation adds comprehensive test coverage for path sanitization across different formats (Unix, Windows, debug logs) and standardizes snapshot testing behavior.

Tests verify sanitization of absolute paths, Windows-style backslashes, debug logs with import prefixes, multiple occurrences, and path normalization. Documentation updates explain the sanitization testing strategy.

🚀 Enhancements

fix: Prevent usage error after successful workflow TUI execution @aknysh (#1796)

what

  • Fixed workflow command to return immediately after successful TUI execution
  • Added comprehensive tests for ExecuteWorkflowCmd function
  • Increased workflow test coverage from 1.7% to 2.8% (+64.7%)
  • Added regression test to prevent future occurrences of this bug

why

  • When using the workflow TUI (atmos workflow with no args), the command would show "Incorrect Usage" message after successfully selecting and displaying a workflow
  • This happened because after the TUI execution returned successfully, the code continued to check for the --file flag, which was never set when using the TUI
  • The fix adds an early return nil after successful TUI execution to prevent the unwanted usage error
  • The regression test ensures that workflow execution with the --file flag continues to work correctly

Summary by CodeRabbit

  • Bug Fixes

    • Workflow command no longer displays spurious usage errors when run without arguments or without a file flag.
  • Tests

    • Added comprehensive tests covering workflow execution, flag handling, path resolution, dry-run, stack/from-step/identity flags, and error cases.
  • Chores

    • Bumped several dependency versions and updated license entries.
  • Fixtures

    • Added a test workflow that exercises a failing shell command scenario.
fix: Propagate auth context through nested `!terraform.state` functions @aknysh (#1786)

what

  • Fixed authentication context not propagating through nested !terraform.state and !terraform.output YAML function evaluations
  • Added AuthManager field to ConfigAndStacksInfo struct to enable auth propagation through the execution pipeline
  • Implemented component-level authentication override for nested functions, allowing each component to optionally define its own auth: configuration
  • Enhanced auth resolver to check for default identities before creating component-specific AuthManager
  • Updated TerraformStateGetter and TerraformOutputGetter interfaces to accept authManager parameter
  • Added comprehensive test fixtures and test suites for nested authentication scenarios (18 tests covering 5 scenarios)
  • Fixed identity selector exit handling: Pressing Ctrl+C or ESC now immediately exits with proper POSIX exit code (130) instead of requiring multiple presses or continuing execution
  • Fixed authentication prompt for invalid components: Component validation now occurs before authentication, preventing identity selection prompts when the component doesn't exist

why

Problem 1: Nested Authentication Propagation

When executing Atmos commands with authentication enabled, nested !terraform.state functions failed with IMDS timeout errors even though the top-level command had valid authenticated credentials. This occurred when a component's configuration contained !terraform.state functions that referenced other components which themselves contained !terraform.state functions.

Root Cause: The GetTerraformState() function received an authContext parameter but did not have access to the AuthManager. When processing nested components, it called ExecuteDescribeComponent() without an AuthManager, breaking the authentication chain at level 2+ of nesting.

Example Failure:

# Level 1: tgw/routes (top-level, ✅ works)
tgw/routes:
  vars:
    routes:
      - attachment_id: !terraform.state tgw/attachment vpc_attachment_id

# Level 2: tgw/attachment (nested, ✅ works)
tgw/attachment:
  vars:
    transit_gateway_id: !terraform.state tgw/hub core-use2-network transit_gateway_id  # ❌ FAILS - no auth

# Level 3: tgw/hub (nested within nested, ❌ fails)

Solution: Added AuthManager to ConfigAndStacksInfo struct and threaded it through the entire execution pipeline, enabling all nested function evaluations to access authenticated credentials. Additionally implemented component-level auth override to support cross-account state reading in nested scenarios.


Problem 2: Identity Selector Exit Handling

When the identity selector appeared (either from --identity flag without value, or when processing YAML functions with no default identity configured), pressing Ctrl+C would not exit the program. Instead:

  • First Ctrl+C press was consumed by the huh TUI library but returned ErrUserAborted
  • The autoDetectDefaultIdentity() function intentionally swallowed ALL errors (including ErrUserAborted) for "backward compatibility"
  • The function returned ("", nil), causing execution to continue without authentication
  • User had to press Ctrl+C a second time to actually exit

Root Cause: The error handling in pkg/auth/manager_helpers.go:autoDetectDefaultIdentity() was catching ErrUserAborted from the identity selector and converting it to a successful empty result for backward compatibility, preventing proper exit handling.

Solution:

  1. Modified autoDetectDefaultIdentity() to propagate ErrUserAborted while preserving backward compatibility for other errors
  2. Added exit handlers in terraform.go and terraform_utils.go to immediately exit with code 130 when user aborts
  3. Enhanced identity selector with custom KeyMap to support both Ctrl+C and ESC keys
  4. Added visible instruction: "Press ctrl+c or esc to exit"
  5. Created constant ExitCodeSIGINT = 130 for POSIX-compliant signal exit codes

Problem 3: Authentication Before Component Validation

When running a command with an invalid component name, Atmos would prompt for identity selection before checking if the component exists:

atmos terraform apply bad-component -s core-euc1-network
# Prompted for identity selection first
# Then showed error: Could not find the component 'bad-component' in the stack

Root Cause: The authentication flow in ExecuteTerraform() was calling CreateAndAuthenticateManager() before the component existence check, causing unnecessary user interaction for invalid components.

Solution: Modified the component auth config retrieval logic to immediately exit if ExecuteDescribeComponent() returns ErrInvalidComponent, preventing authentication attempts for non-existent components.


Benefits:

  • ✅ Nested !terraform.state functions now work at any depth with proper authentication
  • ✅ Components can override authentication at any nesting level for cross-account scenarios
  • ✅ No IMDS timeout errors when processing nested component configurations
  • ✅ Identity selector no longer shows incorrectly for components without default identity
  • ✅ Cleaner debug logs with reduced noise from expected auth resolution paths
  • Ctrl+C and ESC immediately exit the identity selector (single keypress, exit code 130)
  • No error message displayed on user abort (clean exit)
  • Clear exit instructions shown to users in the selector UI
  • No authentication prompt for invalid components (validation happens first)

Summary by CodeRabbit

  • New Features

    • Component-level authentication overrides for nested Terraform functions enable fine-grained control over credentials in multi-account setups.
  • Bug Fixes

    • Fixed authentication context propagation through nested Terraform state and output evaluations.
    • Improved user experience when canceling interactive identity selection.
  • Documentation

    • Added comprehensive guides on authentication flows for Terraform YAML functions and nested authentication handling.
  • Chores

    • Updated AWS SDK and Go dependencies to latest versions.
fix: Reduce log spam for imports outside base directory @osterman (#1780)

what

  • Changed import path validation logging from WARN to TRACE level
  • Added test coverage for imports outside base directory

why

When using import paths that resolve outside the base directory, the warning message was being logged repeatedly during CI/CD workflows, creating excessive log spam. This is particularly noticeable in GitHub Actions where atmos is invoked multiple times.

The message "Import path is outside of base directory" is informational trace data about the import resolution process, not a user-actionable warning. Imports outside the base directory are a valid use case for shared configurations.

references

  • Consistent with other import-related logging in the same file (lines 108, 111, 114 of pkg/config/imports.go) which use TRACE level
  • Follows the logging level pattern:
    • TRACE: Detailed import flow (what's happening during resolution)
    • DEBUG: Error conditions (what went wrong)
    • WARN: User-actionable problems (what needs fixing)

Summary by CodeRabbit

  • Tests

    • Added test coverage for local imports outside the base directory to ensure proper resolution behavior.
  • Bug Fixes

    • Reduced log verbosity by changing import path warnings to trace-level logs, decreasing warning-level noise in logs.

Don't miss a new atmos release

NewReleases is sending notifications on new releases.