github RunOnFlux/flux v7.2.2

7 hours ago

Summary

Version 7.2.2 fixes critical container mount timing issues and syncthing initialization race conditions
that occur after OS restarts.

Problem Statement

After an OS-level restart of a FluxOS node, two critical race conditions could occur:

  1. Container Mount Timing Issue: Docker's auto-start feature would launch Flux containers before FluxOS could create
    the necessary volume mounts, causing containers to run with improper mount configurations.
  2. Syncthing Initialization Race: The syncthing service status check would delay the webserver startup, even though
    syncthing verification runs asynchronously in the background.

These issues resulted in:

  • Applications running with incorrect or missing volume mounts
  • "Syncthing not running" errors during startup
  • Delayed node availability after system reboots

Solution

  1. Container Mount Recovery System

Implemented a new dedicated service (containerMountRecovery.js) that:

  • Detects OS restart events by comparing system boot time against application state
  • Identifies containers that started before their volume mounts were created
  • Automatically restarts affected containers with proper mount configurations
  • Uses staged restart logic (2-second delays) to prevent resource contention
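
The boot-time comparison described above could be sketched roughly as follows. This is an illustration only, not the code shipped in containerMountRecovery.js: the function names and the idea of a persisted "last recorded state" timestamp are assumptions, and the boot time is approximated from Node's os.uptime().

```javascript
const os = require('os');

// Approximate system boot time (ms since epoch) from the current time and uptime.
// Both parameters default to live values but can be injected for testing.
function getBootTimeMs(nowMs = Date.now(), uptimeSec = os.uptime()) {
  return nowMs - uptimeSec * 1000;
}

// Hypothetical check: the OS restarted if it booted after the last moment
// at which the application recorded its state (lastStateMs, ms since epoch).
function osRestartedSince(lastStateMs, bootTimeMs = getBootTimeMs()) {
  return bootTimeMs > lastStateMs;
}
```

Injecting the clock values keeps the check deterministic in unit tests; a real service would also need to decide where the last-state timestamp is stored.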

Key Functions:

  • containerStartedBeforeMounts(): Compares container start timestamps against mount creation times
  • getContainersNeedingRestart(): Filters running Flux containers requiring recovery
  • restartContainersWithProperMounts(): Executes safe, sequential container restarts
  • performContainerMountRecovery(): Orchestrates the complete recovery workflow
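
As a rough illustration of how the timestamp comparison and the staged restarts fit together, here is a minimal sketch. The helper names, the container object shape (name, running, startedAt, mountCreatedMs) and the injected restartFn are all hypothetical; only the 2-second delay comes from the notes above. The startedAt field is assumed to be an ISO-8601 string such as the State.StartedAt value reported by docker inspect.

```javascript
// True if the container started before its mount was created.
// startedAt: ISO-8601 string; mountCreatedMs: ms since epoch (assumed shapes).
function startedBeforeMount(startedAt, mountCreatedMs) {
  return Date.parse(startedAt) < mountCreatedMs;
}

// Keep only running containers whose start predates their mount.
function selectContainersNeedingRestart(containers) {
  return containers.filter(
    (c) => c.running && startedBeforeMount(c.startedAt, c.mountCreatedMs),
  );
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Restart containers one at a time, pausing between restarts so they do not
// compete for resources. restartFn is a caller-supplied async function.
async function restartSequentially(containers, restartFn, delayMs = 2000) {
  const restarted = [];
  for (let i = 0; i < containers.length; i += 1) {
    await restartFn(containers[i]);
    restarted.push(containers[i].name);
    if (i < containers.length - 1) {
      await sleep(delayMs); // staged delay, skipped after the last container
    }
  }
  return restarted;
}
```

Passing the restart action in as a function keeps the sequencing logic independent of any particular Docker client.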

  2. Service Startup Sequence Optimization

Reorganized the service initialization order in serviceManager.js:

  1. Crontab and mount cleanup (now synchronous; previously deferred by 30 s)
  2. Container mount recovery check
  3. Syncthing service startup
  4. Other services

This ensures volume mounts are properly configured before containers rely on them.
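
The ordering guarantee amounts to awaiting each step before starting the next. A minimal sketch, assuming each step is exposed as a function; runStartupSequence and the step names in the comment are hypothetical and do not reflect the actual serviceManager.js code:

```javascript
// Run named startup steps strictly in order; each step (sync or async)
// must finish before the next one begins. Returns the completed step names.
async function runStartupSequence(steps) {
  const completed = [];
  for (const [name, fn] of steps) {
    await fn();
    completed.push(name);
  }
  return completed;
}

// Example wiring (placeholder functions, shown as a comment only):
// runStartupSequence([
//   ['cleanup', cleanCrontabAndMounts],
//   ['mountRecovery', recoverContainerMounts],
//   ['syncthing', startSyncthing],
//   ['other', startOtherServices],
// ]);
```

If a step throws, the loop stops and later steps never run, which matches the goal of not starting dependants before mounts are ready.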

  3. Syncthing Initialization Optimization

Modified syncthingService.js to set syncthingStatusOk = true optimistically on startup:

  • Prevents unnecessary webserver startup delays
  • Allows async syncthing verification to run in background
  • Eliminates false "syncthing not running" errors during normal operation
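
The optimistic pattern can be sketched as below. Only the flag name syncthingStatusOk comes from the notes; isSyncthingOk, verifySyncthing and the injected checkFn probe are hypothetical stand-ins, not the actual syncthingService.js API.

```javascript
// Optimistic default: report healthy until a background check says otherwise.
let syncthingStatusOk = true;

// Synchronous read used by callers (for example, webserver startup) that
// should not block on the health check.
function isSyncthingOk() {
  return syncthingStatusOk;
}

// Background verification. checkFn is a caller-supplied async probe that
// resolves to a truthy value when syncthing is healthy.
async function verifySyncthing(checkFn) {
  try {
    syncthingStatusOk = Boolean(await checkFn());
  } catch (err) {
    syncthingStatusOk = false; // a failed probe counts as not running
  }
  return syncthingStatusOk;
}
```

The trade-off of this approach is that a syncthing instance that is genuinely down is reported as healthy until the first background check completes.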

Files Changed

File                                            Changes           Description
ZelBack/src/services/containerMountRecovery.js  +234 lines (new)  Container mount recovery service
ZelBack/src/services/serviceManager.js          +15 / -4 lines    Startup sequence reorganization
ZelBack/src/services/syncthingService.js        +5 / -3 lines     Optimistic status initialization
package.json                                    version bump      7.2.1 → 7.2.2
tests/unit/containerMountRecovery.test.js       new file          Unit tests for recovery module

Testing

  • Unit tests added for container mount recovery module
  • Tested OS restart scenarios to verify container recovery
  • Validated syncthing initialization improvements

Impact

  • ✅ Eliminates container mount timing issues after OS restarts
  • ✅ Ensures all applications run with correct volume configurations
  • ✅ Reduces false syncthing errors during startup
  • ✅ Improves node reliability and uptime
  • ✅ Faster webserver availability after system reboots

Related PRs

  • Fixes from PR #1616: Syncthing not running errors
  • Fixes from PR #1617: Container mount timing issues after OS restart
