Background
An encrypted enterprise syncthing app (openclawpro1774571881282) was installed on a node (92.170.17.66:16187). For syncthing apps with g: containerData, the installer correctly creates the container but does not start it — the container sits stopped while the syncthing state machine syncs data from the primary, then masterSlaveApps() starts it when ready.
During another app's installation, pruneContainers() ran and permanently deleted the stopped container. The node then spent 13+ hours broadcasting the app as "running" to the network with no Docker container, while masterSlaveApps() silently failed to start it every 30 seconds.
Root cause
The pruneContainers() guard in registerAppLocally (appInstaller.js:460-480) builds a list of installed app component names by iterating compose arrays to detect stopped containers. But installedApps() returns raw DB records — for encrypted enterprise apps, compose is [] because the specs are encrypted. So encrypted enterprise apps produce zero component names, the guard thinks there are no stopped apps, and pruneContainers() deletes the stopped container.
peerNotification.js:143 already calls decryptEnterpriseApps() before the same pattern. appInstaller.js did not.
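The failure mode can be reproduced in isolation. The sketch below is not the code in appInstaller.js: `collectComponentNames`, `isPruneSafe`, and `decryptEnterpriseAppsStub` are hypothetical helpers, the `decryptedCompose` field is invented to stand in for real decryption, and the `component_app` name format is an assumption. It only illustrates why an empty compose array makes the guard conclude that pruning is safe.

```javascript
// Hypothetical stand-in for decryptEnterpriseApps(): the real function's
// signature is not shown in this description. Here "decryption" just copies
// a pre-made compose array (invented field: decryptedCompose) into place.
function decryptEnterpriseAppsStub(apps) {
  return apps.map((app) => (
    Array.isArray(app.decryptedCompose) && app.compose.length === 0
      ? { ...app, compose: app.decryptedCompose }
      : app
  ));
}

// Collect one name per component. The `${component}_${app}` format is an
// assumption made for this example.
function collectComponentNames(apps) {
  const names = [];
  for (const app of apps) {
    for (const component of app.compose || []) {
      names.push(`${component.name}_${app.name}`);
    }
  }
  return names;
}

// Pruning is treated as safe only when no installed component is stopped.
function isPruneSafe(componentNames, stoppedContainerNames) {
  return !componentNames.some((name) => stoppedContainerNames.includes(name));
}

// Raw DB record of an encrypted enterprise app: compose is empty.
const rawRecords = [{
  name: 'exampleapp',
  compose: [],
  decryptedCompose: [{ name: 'sync' }],
}];
const stoppedContainers = ['sync_exampleapp'];

// Without decryption the guard sees zero names and allows the prune.
const safeBeforeFix = isPruneSafe(collectComponentNames(rawRecords), stoppedContainers);

// With decryption first, the stopped component is found and the prune is blocked.
const safeAfterFix = isPruneSafe(
  collectComponentNames(decryptEnterpriseAppsStub(rawRecords)),
  stoppedContainers,
);

console.log({ safeBeforeFix, safeAfterFix });
```

The point of the design is ordering: the guard logic itself is unchanged, and only the input it iterates over differs.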
Secondary issue
Master/slave (g:) apps were completely excluded from the stopped-app recovery loop in peerNotification.js (line 188: if (appDetails && !appInstalledMasterSlaveCheck)). When a master/slave app's container goes missing, it never hits the !containerExists check that would trigger recreateMissingContainers or removeAppLocally. The node just keeps trying to start a non-existent container and broadcasting it as running forever.
Changes
appInstaller.js — Call decryptEnterpriseApps() on the installed apps list before iterating compose arrays for the prune guard.
peerNotification.js — Add handleMissingMasterSlaveContainer() for master/slave apps with missing containers. A stopped container is normal (syncthing secondary) and left alone. A missing container triggers recreation via the existing recreateMissingContainers(), with:
- Backup/restore awareness (skips if app is being backed up)
- TOCTOU protection (if recreation fails but another process created the container, skip removal)
- Fallback to removeAppLocally if recreation fails and container is still missing
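The decision flow in the list above could be sketched as follows. This is a minimal illustration, not the function added to peerNotification.js: the injected `deps` object, the `isBackupOrRestoreRunning` helper, the string return values, and the order of the checks are all assumptions made so the example is self-contained and testable.

```javascript
// Sketch of the missing-container handling for a master/slave (g:) app.
// All helpers are injected so the flow can be exercised without Docker;
// their names and signatures are assumptions for this example.
async function handleMissingMasterSlaveContainer(appName, deps) {
  const {
    containerExists,
    isBackupOrRestoreRunning,
    recreateMissingContainers,
    removeAppLocally,
  } = deps;

  // A container that exists (even stopped) is normal for a syncthing secondary.
  if (await containerExists(appName)) return 'container-present';

  // Do not interfere while the app is being backed up or restored.
  if (isBackupOrRestoreRunning(appName)) return 'skipped-backup';

  try {
    await recreateMissingContainers(appName);
    return 'recreated';
  } catch (error) {
    // TOCTOU check: another process may have created the container while
    // recreation was failing. If so, leave the app installed.
    if (await containerExists(appName)) return 'created-elsewhere';

    // Recreation failed and the container is still missing: remove the app.
    await removeAppLocally(appName);
    return 'removed';
  }
}
```

Re-checking for the container inside the `catch` block is what keeps a failed recreation from removing an app that a concurrent process has just repaired.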
Test plan
- decryptEnterpriseApps called before prune guard component name iteration
- Encrypted enterprise app with stopped container prevents pruneContainers from running
- handleMissingMasterSlaveContainer returns early when container exists
- Missing container triggers recreation, with fallback to app removal
- TOCTOU: if another process creates container during recreation failure, skip removal
- 21 unit tests passing across both files