github RunOnFlux/flux v8.11.0

11 hours ago

Release v8.11.0 bundles app tampering detection and enforcement with several lifecycle and networking fixes.

App tampering detection & DOS enforcement

  • appTamperingDetectionService records lifecycle anomalies (container_vanished, network_pruned, mount_vanished, crontab_wiped,
    recreation_failed, frequent_restart) into a new apptamperingevents collection with a 30-day TTL. History is queryable via GET /apps/tamperingevents/:appname.
  • appTamperingBlocklistService fetches helpers/tamperingblockednodes.json from master every 12h. If this node's collateral txhash is listed
    and the node has more than 10 tampering events, a sticky DOS message is set via a new stickyDosMessage/stickyDosState slot in
    fluxNetworkHelper. Unrelated DOS checks cannot clear it; only the next tick can (if the txhash leaves the list or the event count drops).
  • Frequent restart tracking: a nodestartuptracker doc records the previous startup; a restart less than one hour after it emits a
    frequent_restart event under a synthetic __system__ app.
  • ArcaneOS nodes are skipped via a fluxbenchd-backed three-state check (true/false/null). A null result (bench unreachable) skips the
    tick so a real ArcaneOS node is never falsely DOSed during an outage.
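
The enforcement tick described above can be sketched roughly as follows. This is a minimal illustration, not the actual Flux code: the function names (setStickyDos, clearOwnStickyDos, enforcementTick), the prefix string, the DOS state value of 100, and the in-memory sticky object are all assumptions made for the example. Only the threshold of more than 10 events, the three-state ArcaneOS result, and the prefix-scoped clear come from the notes. Whether the real service clears sticky state for ArcaneOS nodes is not stated, so the sketch leaves it untouched.

```javascript
// Hypothetical sketch of one blocklist enforcement tick (names are assumed).
const DOS_MESSAGE_PREFIX = 'App tampering:'; // assumed prefix value
const EVENT_THRESHOLD = 10; // notes: "more than 10 tampering events"

// Stand-in for the stickyDosMessage/stickyDosState slot in fluxNetworkHelper.
const sticky = { message: null, state: null };

function setStickyDos(message, state) {
  sticky.message = message;
  sticky.state = state;
}

// Clear only a message this service wrote, so another writer's state survives.
function clearOwnStickyDos() {
  if (typeof sticky.message === 'string' && sticky.message.startsWith(DOS_MESSAGE_PREFIX)) {
    sticky.message = null;
    sticky.state = null;
  }
}

// isArcane is true, false, or null (null meaning the bench was unreachable).
function enforcementTick({ isArcane, blockedTxhashes, myTxhash, eventCount }) {
  if (isArcane === null) return 'skipped'; // unknown: change nothing this tick
  if (isArcane === true) return 'exempt'; // ArcaneOS: no enforcement (assumed: state untouched)

  const listed = blockedTxhashes.includes(myTxhash);
  if (listed && eventCount > EVENT_THRESHOLD) {
    // 100 is a placeholder DOS state, not a value taken from Flux.
    setStickyDos(`${DOS_MESSAGE_PREFIX} node ${myTxhash} is blocklisted`, 100);
    return 'enforced';
  }
  clearOwnStickyDos(); // txhash left the list or the event count dropped
  return 'cleared';
}

// Usage: a listed node with 11 events is enforced; an unreachable bench
// skips the tick and leaves the sticky message in place.
const node = { blockedTxhashes: ['abc'], myTxhash: 'abc' };
console.log(enforcementTick({ ...node, isArcane: false, eventCount: 11 })); // enforced
console.log(enforcementTick({ ...node, isArcane: null, eventCount: 11 })); // skipped
console.log(enforcementTick({ ...node, isArcane: false, eventCount: 10 })); // cleared
```

Treating null as "do nothing" rather than as false is the point of the three-state check: a two-state boolean would force an unreachable bench to be read as "not ArcaneOS".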

Fixes

  • getCgroupBurstPath now early-returns on pid 0 from stopped/exited containers instead of throwing through /proc/0/cgroup.
  • monitorFolderHealth now passes sendMessage=true to removeAppLocally when hard-removing after prolonged cannot_sync, so peers drop this IP
    from appLocations and the next spawner doesn't reselect the unreachable node.
  • Sticky-DOS clear is scoped to our own DOS_MESSAGE_PREFIX so it can't clobber state owned by a hypothetical future writer.
  • start() in the blocklist service now respects stop() during the daemon-sync wait: a cancellable sleep plus a stopping flag short-circuit the polling.
  • recordEvent('mount_vanished') is now awaited in syncthingFolderStateMachine for consistency.
  • Dropped tamperingBlocklistCache: its 6h TTL was shorter than the 12h enforcement interval, so every fetch missed anyway.
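
The cancellable-sleep fix for start()/stop() can be illustrated with a small sketch. It assumes a hypothetical createPoller helper and an isReady callback standing in for the daemon-sync check; neither name comes from the Flux source, and the real service may be structured differently.

```javascript
// Hypothetical sketch: a polling loop whose wait can be interrupted by stop().
function createPoller(isReady, intervalMs) {
  let stopping = false;
  let wake = null; // when set, calling it ends the pending sleep early

  function cancellableSleep(ms) {
    return new Promise((resolve) => {
      const timer = setTimeout(() => {
        wake = null;
        resolve();
      }, ms);
      wake = () => {
        clearTimeout(timer);
        wake = null;
        resolve();
      };
    });
  }

  async function start() {
    stopping = false;
    while (!stopping) {
      if (await isReady()) return 'started';
      if (stopping) break; // stop() arrived while isReady() was pending
      await cancellableSleep(intervalMs);
    }
    return 'stopped';
  }

  function stop() {
    stopping = true;
    if (wake) wake(); // cut the current sleep short instead of waiting it out
  }

  return { start, stop };
}

// Usage: stop() interrupts a 60 s wait after roughly 10 ms.
const poller = createPoller(async () => false, 60000);
poller.start().then((result) => console.log(result));
setTimeout(() => poller.stop(), 10);
```

The flag alone would only take effect after the current sleep finished; clearing the timer is what makes stop() return control promptly.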

Tests

New unit coverage for appTamperingBlocklistService, appTamperingDetectionService, and sticky-DOS behavior in fluxNetworkHelper.

Test plan

  • npm run lint
  • npm run test:zelback:unit
  • Verify tampering events are recorded and queryable via GET /apps/tamperingevents/:appname
  • Confirm sticky DOS only clears on its own 12h tick and persists through unrelated DOS checks
  • Confirm an ArcaneOS node is not DOSed (bench returns true) and that the tick is skipped when the bench is unreachable (bench returns null)
