Summary
Release 7.2.2 fixes a critical container mount timing issue and a syncthing initialization race condition,
both of which occur after OS restarts.
Problem Statement
When a FluxOS node experiences an OS-level restart, two critical race conditions occurred:
- Container Mount Timing Issue: Docker's auto-start feature would launch Flux containers before FluxOS could create
the necessary volume mounts, causing containers to run with improper mount configurations.
- Syncthing Initialization Race: The syncthing service status check would delay the webserver startup, even though
syncthing verification runs asynchronously in the background.
These issues resulted in:
- Applications running with incorrect or missing volume mounts
- "Syncthing not running" errors during startup
- Delayed node availability after system reboots
Solution
- Container Mount Recovery System
Implemented a new dedicated service (containerMountRecovery.js) that:
- Detects OS restart events by comparing system boot time against application state
- Identifies containers that started before their volume mounts were created
- Automatically restarts affected containers with proper mount configurations
- Uses staged restart logic (2-second delays) to prevent resource contention
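The restart detection described above can be sketched as follows. This is a minimal illustration, not the code in containerMountRecovery.js: it assumes the "application state" is a persisted timestamp of when the application was last known to be running, and the helper names `getBootTimeMs` and `osRestartedSince` are hypothetical.

```javascript
const os = require('os');

// Hypothetical helper: derives the system boot time (epoch milliseconds)
// from os.uptime(), which reports seconds since boot.
function getBootTimeMs(nowMs = Date.now(), uptimeSeconds = os.uptime()) {
  return nowMs - Math.round(uptimeSeconds * 1000);
}

// Hypothetical check: an OS restart is assumed if the machine booted after
// the last moment the application recorded itself as running.
// lastSeenRunningMs is an assumed persisted epoch-millisecond timestamp.
function osRestartedSince(lastSeenRunningMs, bootTimeMs = getBootTimeMs()) {
  return bootTimeMs > lastSeenRunningMs;
}
```

Passing the clock and uptime as parameters keeps the check easy to unit test without mocking the `os` module.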
Key Functions:
- containerStartedBeforeMounts(): Compares container start timestamps against mount creation times
- getContainersNeedingRestart(): Filters running Flux containers requiring recovery
- restartContainersWithProperMounts(): Executes safe, sequential container restarts
- performContainerMountRecovery(): Orchestrates the complete recovery workflow
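The following sketch shows one way the first three functions could fit together. It reuses the function names listed above for readability, but the bodies, the input shape (`name`, `startedAt`, `mountCreatedAtMs`), and the `restartFn` callback are assumptions made for illustration, not the actual implementation.

```javascript
// Assumed input: startedAt is an ISO 8601 string and mountCreatedAtMs is an
// epoch-millisecond timestamp. Both field names are hypothetical.
function containerStartedBeforeMounts(startedAt, mountCreatedAtMs) {
  const startedMs = Date.parse(startedAt);
  // An unparseable start time is treated as "no restart needed".
  if (Number.isNaN(startedMs)) return false;
  return startedMs < mountCreatedAtMs;
}

// Keeps only the containers whose start time precedes their mount creation.
function getContainersNeedingRestart(containers) {
  return containers.filter((c) => containerStartedBeforeMounts(c.startedAt, c.mountCreatedAtMs));
}

const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Restarts containers one at a time, pausing between restarts to limit
// resource contention. restartFn is a hypothetical async callback that
// restarts a single container by name.
async function restartContainersWithProperMounts(containers, restartFn, delayMs = 2000) {
  const results = [];
  for (let i = 0; i < containers.length; i += 1) {
    const { name } = containers[i];
    try {
      await restartFn(name);
      results.push({ name, ok: true });
    } catch (error) {
      // A failed restart is recorded and does not stop the remaining ones.
      results.push({ name, ok: false, error: error.message });
    }
    if (i < containers.length - 1) await delay(delayMs);
  }
  return results;
}
```

Restarting sequentially rather than in parallel trades total recovery time for predictable load, which matches the staged 2-second delay described above.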
- Service Startup Sequence Optimization
Reorganized the service initialization order in serviceManager.js:
1. Crontab and mount cleanup (now synchronous; previously deferred by 30 s)
2. Container mount recovery check
3. Syncthing service startup
4. Other services
This ensures volume mounts are properly configured before containers rely on them.
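The ordering above can be sketched as a strictly sequential runner. This is an illustrative sketch only: `runStartupSequence` and the `{ name, run }` step shape are hypothetical, and the choice to log a failed step and continue is an assumption, not confirmed behavior of serviceManager.js.

```javascript
// Runs startup steps strictly in order, awaiting each before the next.
// Each step is a hypothetical { name, run } object where run is async.
async function runStartupSequence(steps, log = console.log) {
  const completed = [];
  for (const step of steps) {
    try {
      await step.run();
      completed.push(step.name);
    } catch (error) {
      // Assumption: a failing step is logged and later steps still run.
      log(`Startup step "${step.name}" failed: ${error.message}`);
    }
  }
  return completed; // names of the steps that finished without error
}
```

Awaiting each step before starting the next is what guarantees that mount cleanup and recovery have finished before syncthing and the remaining services begin.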
- Syncthing Initialization Optimization
Modified syncthingService.js to set syncthingStatusOk = true optimistically on startup:
- Prevents unnecessary webserver startup delays
- Allows async syncthing verification to run in background
- Eliminates false "syncthing not running" errors during normal operation
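A minimal sketch of the optimistic pattern follows. Only the `syncthingStatusOk` flag name comes from this PR; the `createSyncthingStatus` factory and the `checkFn` probe are hypothetical names introduced for illustration.

```javascript
// Hypothetical factory wrapping the status flag so it can be tested in isolation.
function createSyncthingStatus() {
  // Optimistic default: assume syncthing is healthy until a check says otherwise.
  let syncthingStatusOk = true;

  return {
    // Synchronous read, so callers such as webserver startup never wait on a probe.
    isOk: () => syncthingStatusOk,

    // Background verification. checkFn is an assumed async probe that
    // resolves to a truthy value when syncthing is healthy.
    async verify(checkFn) {
      try {
        syncthingStatusOk = Boolean(await checkFn());
      } catch (error) {
        syncthingStatusOk = false;
      }
      return syncthingStatusOk;
    },
  };
}
```

The trade-off is that the flag can briefly report a healthy state before the first verification completes, so callers that need certainty should wait for `verify` to resolve.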
Files Changed
| File | Changes | Description |
|---|---|---|
| ZelBack/src/services/containerMountRecovery.js | +234 lines (new) | Container mount recovery service |
| ZelBack/src/services/serviceManager.js | +15 / -4 lines | Startup sequence reorganization |
| ZelBack/src/services/syncthingService.js | +5 / -3 lines | Optimistic status initialization |
| package.json | version bump | 7.2.1 → 7.2.2 |
| tests/unit/containerMountRecovery.test.js | new file | Unit tests for recovery module |
Testing
- Unit tests added for container mount recovery module
- Tested OS restart scenarios to verify container recovery
- Validated syncthing initialization improvements
Impact
- ✅ Eliminates container mount timing issues after OS restarts
- ✅ Ensures all applications run with correct volume configurations
- ✅ Reduces false syncthing errors during startup
- ✅ Improves node reliability and uptime
- ✅ Speeds up webserver availability after system reboots
Related PRs