V17.4.7 — 409 Conflict circuit breaker on the polling path
Direct response to @chapapagit's log on discussion #441, which captured the exact failure shape this release fixes.
What was happening
V17.4.4 added a "skip our restart, let the library retry naturally" handler for ETELEGRAM: 409 Conflict. That is the correct approach for the transient same-process race, where a getUpdates from our previous polling cycle has not yet been cleaned up server-side and the conflict clears within seconds. It is the wrong approach when a separate bot instance is actively polling the same token (second Node-RED, forgotten Docker container, accidentally registered webhook), because that conflict never clears: the library's retry re-enters the same 409 forever, filling logs with thousands of identical errors while the bot does nothing useful.
chapapagit's log file from #441 showed this shape exactly: 9 repeated 409s in under 1 second, each correctly identified by the V17.4.4 handler, with the library happily retrying into the same conflict. The plugin was doing its job — but its job was wrong for this scenario.
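To make the failure shape concrete, here is a minimal sketch of how a polling-error handler might recognise this error. It assumes the error object carries a code of ETELEGRAM and a message containing "409 Conflict", which matches the error string quoted above. The helper name isConflict409 and the sample error objects are hypothetical and are not taken from the plugin.

```javascript
// Hypothetical helper: true when a polling error looks like Telegram's 409 Conflict.
// Assumption: the error exposes `code` and `message`, as in "ETELEGRAM: 409 Conflict: ...".
function isConflict409(err) {
  if (!err || typeof err.message !== 'string') return false;
  const looksLikeTelegramError =
    err.code === 'ETELEGRAM' || err.message.startsWith('ETELEGRAM');
  return looksLikeTelegramError && err.message.includes('409 Conflict');
}

// Made-up error objects for illustration only.
const conflict = {
  code: 'ETELEGRAM',
  message: 'ETELEGRAM: 409 Conflict: terminated by other getUpdates request',
};
const network = { code: 'EFATAL', message: 'EFATAL: socket hang up' };

console.log(isConflict409(conflict)); // true
console.log(isConflict409(network));  // false
```

A check like this only identifies a single 409. It cannot tell the transient race from a persistent duplicate poller, which is why counting over time is needed.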
What V17.4.7 changes
A sliding-window circuit breaker on the polling-error path. The relevant excerpt from bot-node.js:
    const CONFLICT_409_THRESHOLD = 10;
    const CONFLICT_409_WINDOW_MS = 30000;
    this.record409Conflict = function () {
        const now = Date.now();
        self.conflict409Times.push(now);
        while (self.conflict409Times.length > 0 && now - self.conflict409Times[0] > CONFLICT_409_WINDOW_MS) {
            self.conflict409Times.shift();
        }
        let trip = false;
        if (self.conflict409Times.length >= CONFLICT_409_THRESHOLD) {
            self.conflict409Times = [];
            trip = true;
        }
        return trip;
    };
If 10 409 Conflict events fire within a 30-second window, the breaker trips: abortBot stops polling cleanly, the request pool is destroyed, and one node.error line is logged with actionable guidance:
Bot <name> stopped: 10 "409 Conflict" responses in 30s — another getUpdates is
actively in flight for this bot token. Check https://api.telegram.org/bot<TOKEN>/getWebhookInfo,
kill any duplicate bot instance, then redeploy.
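The flow described above (count each 409, act once when the breaker trips, ignore everything afterwards) can be sketched roughly as follows. This is not the plugin's actual wiring: attachBreaker, onTrip, the stand-in EventEmitter and the stub recorder are all hypothetical. The only assumptions are that the polling library emits a polling_error event and that record409Conflict returns true on a trip.

```javascript
const { EventEmitter } = require('node:events');

// Hypothetical wiring; every name except record409Conflict is illustrative only.
function attachBreaker(bot, record409Conflict, onTrip) {
  let stopped = false;
  bot.on('polling_error', (err) => {
    if (stopped) return;                          // ignore anything after the trip
    const message = String(err && err.message);
    if (!message.includes('409 Conflict')) return; // other errors take other paths
    if (record409Conflict()) {                    // true only when the window is full
      stopped = true;
      onTrip();                                   // e.g. stop polling and log once
    }
  });
}

// Demo with a stand-in emitter and a stub recorder that trips on the 3rd call.
const fakeBot = new EventEmitter();
let calls = 0;
let trips = 0;
attachBreaker(fakeBot, () => ++calls >= 3, () => { trips += 1; });

for (let i = 0; i < 5; i++) {
  fakeBot.emit('polling_error', new Error('ETELEGRAM: 409 Conflict'));
}
console.log(calls, trips); // 3 1
```

The stopped flag is what keeps the output to a single error line per outage in this sketch. Events that arrive after the trip are dropped rather than counted again.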
No auto-recovery. A persistent 409 means a second poller exists somewhere; auto-restart would just re-enter the loop. Operator intervention is required:
- Check the webhook: curl "https://api.telegram.org/bot<TOKEN>/getWebhookInfo". If url is non-empty, an earlier setWebHook call has switched the bot to webhook mode and getUpdates will conflict. Delete it with setWebhook?url=.
- Search for a second instance: grep your network for the bot token. The most common offenders are a backup Node-RED VM, an old Docker container that survived a host reboot, a forgotten test script, or a second telegram bot config node in the same Node-RED with the same token.
After fixing the duplicate, redeploy the flow — the breaker resets at construction.
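For the webhook check above, the JSON returned by getWebhookInfo can be interpreted with a few lines of code. This is a sketch under assumptions: webhookIsRegistered is a hypothetical helper, and the two sample bodies are hand-written to match the documented response shape (an ok flag plus a result object whose url is empty when no webhook is set), not captured from a real bot.

```javascript
// Hypothetical helper: interprets a parsed getWebhookInfo response body.
// Returns true when a webhook URL is registered, meaning getUpdates will conflict.
function webhookIsRegistered(body) {
  return Boolean(body && body.ok && body.result && body.result.url);
}

// Hand-written sample bodies for illustration.
const pollingMode = {
  ok: true,
  result: { url: '', has_custom_certificate: false, pending_update_count: 0 },
};
const webhookMode = {
  ok: true,
  result: {
    url: 'https://example.invalid/hook',
    has_custom_certificate: false,
    pending_update_count: 3,
  },
};

console.log(webhookIsRegistered(pollingMode)); // false
console.log(webhookIsRegistered(webhookMode)); // true
```

Paste the curl output into JSON.parse and pass the result to the helper. A true result points at the webhook rather than at a second poller.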
What V17.4.7 does NOT do
- Does not fix the underlying duplicate-instance problem (out of plugin scope).
- Does not change behaviour for transient 409s (the V17.4.4 "let it clear naturally" handler still runs for the first 9 in any 30s window). The threshold is well above what any normal same-process race would produce.
- Does not auto-recover. By design — operator action is required.
Tests
228 passing (up from 223 in V17.4.6). 5 new mocha cases in test/nodes/bot-node-restart.test.js:
- conflict409Times initialised as [], record409Conflict exposed
- Does not trip below the threshold (9 calls → false)
- Trips on the 10th 409 in the window and resets the array (one error per outage, not per overflow)
- Prunes timestamps older than the 30 s window (9 ancient + 1 fresh → no trip)
- Only trips when the 10 calls fall inside the window (5 ancient + 5 fresh → no trip)
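The cases listed above can be reproduced outside the plugin with a standalone re-creation of the breaker. This sketch is not the code in test/nodes/bot-node-restart.test.js: createBreaker, the injectable now clock and the lastResult helper are assumptions introduced so the window can be exercised with fake timestamps instead of real waiting.

```javascript
// Standalone re-creation of the breaker for illustration; createBreaker and the
// injectable `now` clock are assumptions, not the plugin's API.
function createBreaker(threshold = 10, windowMs = 30000, now = Date.now) {
  let times = [];
  return function record409Conflict() {
    const t = now();
    times.push(t);
    // Drop timestamps that have fallen out of the window.
    while (times.length > 0 && t - times[0] > windowMs) times.shift();
    if (times.length >= threshold) {
      times = []; // reset so one outage produces one trip
      return true;
    }
    return false;
  };
}

// Runs a fresh breaker at the given fake timestamps and returns the last result.
function lastResult(timestamps) {
  let clock = 0;
  const record = createBreaker(10, 30000, () => clock);
  let result = false;
  for (const ts of timestamps) {
    clock = ts;
    result = record();
  }
  return result;
}

// n consecutive millisecond timestamps beginning at `start`.
const burst = (n, start) => Array.from({ length: n }, (_, i) => start + i);

console.log(lastResult(burst(9, 0)));                          // false
console.log(lastResult(burst(10, 0)));                         // true
console.log(lastResult([...burst(9, 0), 60000]));              // false
console.log(lastResult([...burst(5, 0), ...burst(5, 60000)])); // false
```

Injecting the clock keeps the pruning cases deterministic. The "ancient" timestamps sit 60 seconds before the fresh ones, so they are always outside the 30-second window.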
What to look for after upgrading
If you've been seeing pages of 409 Conflict logs and a frozen bot, you should now see one cleanly formatted node.error instead, plus the bot in a stopped state until you fix the duplicate poller. Once fixed and redeployed, the breaker is back to its idle state.