Highlights: This release adds PAUSE, RESUME, and RECONNECT admin commands for zero-downtime PostgreSQL maintenance. Backend health monitoring now detects dead connections while a client holds an idle transaction. Nine bug fixes address connection lifetime defaults, idle timeout behavior, and retain cycle fairness.
Features
- Add PAUSE, RESUME, RECONNECT admin commands for runtime pool control. PAUSE [db] blocks new backend connections (active transactions finish). RESUME [db] lifts the pause. RECONNECT [db] rotates all connections: idle ones close immediately, active ones close on return. Use case: PAUSE → upgrade PostgreSQL → RESUME. (#141 by @vadv)
- Add stale backend detection during client idle-in-transaction. Previously, if the backend died (failover, OOM kill, network partition) while a client held a transaction open, pg_doorman only discovered this when the client sent the next query. Until then the dead connection occupied a pool slot, blocking new clients from being served. Now pg_doorman actively probes the backend during idle-in-transaction and detects failures within ~100ms, freeing the pool slot immediately. (#144 by @vadv)
- Add pool_size column to SHOW POOLS / SHOW POOLS_EXTENDED and new pg_doorman_pool_size Prometheus gauge. Compare active connections against configured capacity without checking the config file. (#149 by @Fingolfin-Anekane)
- Add auth_query.min_pool_size for dynamic user pools in passthrough mode. Keeps a minimum number of backend connections per user, prewarmed on pool creation and replenished by the retain cycle. Pools with min_pool_size > 0 are never garbage-collected. Default 0. (#147 by @Fingolfin-Anekane)
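The stale backend detection described above can be illustrated with a generic socket liveness check. This is a minimal sketch, not pg_doorman's implementation (which these notes do not describe): the function name `backend_is_alive` and the use of `select` with `MSG_PEEK` are assumptions chosen for illustration, and the 100 ms default only mirrors the detection window quoted above.

```python
import select
import socket


def backend_is_alive(sock: socket.socket, timeout_s: float = 0.1) -> bool:
    """Return False if the peer has closed or reset the connection.

    Hypothetical helper: waits up to timeout_s for the socket to become
    readable. A healthy backend that is idle sends nothing, so select()
    times out and the connection is reported as alive.
    """
    readable, _, _ = select.select([sock], [], [], timeout_s)
    if not readable:
        return True  # nothing to read: the connection is quiet but open
    try:
        # MSG_PEEK inspects pending bytes without consuming them, so a later
        # protocol read still sees anything the backend really sent.
        data = sock.recv(1, socket.MSG_PEEK)
    except (ConnectionResetError, BrokenPipeError):
        return False
    return len(data) > 0  # b"" signals an orderly shutdown (EOF)


# Demo with a local socket pair standing in for the client/backend link.
pooler_side, backend_side = socket.socketpair()
alive_before = backend_is_alive(pooler_side, timeout_s=0.01)
backend_side.close()  # simulate the backend process going away
alive_after = backend_is_alive(pooler_side, timeout_s=0.01)
pooler_side.close()
print(alive_before, alive_after)
```

A check like this only notices a connection that the peer or the kernel has closed. A silent network partition produces no such signal, so detecting it would need something more, such as TCP keepalive or an active round trip.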
Changes
- Rename auth_query.pool_size → auth_query.workers (executor connections) and auth_query.default_pool_size → auth_query.pool_size (data connections). Migration: rename pool_size to workers and default_pool_size to pool_size in your auth_query config. (#148 by @Fingolfin-Anekane)
- Change idle_timeout default from 300,000,000 ms (~83 hours) to 600,000 ms (10 minutes). The old default effectively disabled idle cleanup. (#139)
- Change server_lifetime default from 5 minutes to 20 minutes. The old value was shorter than idle_timeout, preventing idle cleanup from ever triggering. (#139)
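The auth_query rename is order-sensitive, because the name pool_size is freed by one rename and reused by the other. The sketch below applies the migration to a config section held as a plain dictionary. The helper `migrate_auth_query` and the sample values are hypothetical, and the function assumes its input uses the old names unless a workers key is already present.

```python
def migrate_auth_query(section: dict) -> dict:
    """Return a copy of an auth_query section using the new key names.

    Assumption: a section that already has "workers" is treated as
    migrated and returned unchanged.
    """
    migrated = dict(section)
    if "workers" in migrated:
        return migrated
    # Step 1: free the pool_size name by moving the executor setting.
    if "pool_size" in migrated:
        migrated["workers"] = migrated.pop("pool_size")
    # Step 2: the data-connection setting takes over the pool_size name.
    if "default_pool_size" in migrated:
        migrated["pool_size"] = migrated.pop("default_pool_size")
    return migrated


# Sample values are made up for illustration.
old_section = {"pool_size": 2, "default_pool_size": 40, "min_pool_size": 0}
new_section = migrate_auth_query(old_section)
print(new_section)
```

Doing the two steps in the opposite order would overwrite the executor setting before it is moved, which is the mistake to avoid when editing the file by hand.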
Fixes
- Fix session mode connections destroyed after ordinary SQL errors. A failed query (e.g. SELECT 1/0) marked the backend as bad, destroying a working connection at session end. (#152 by @vadv)
- Fix pool-level server_lifetime and idle_timeout overrides silently ignored; global values were always used. (#139 by @vadv)
- Fix idle_timeout = 0 closing connections after ~1ms instead of disabling idle cleanup. Now matches PgBouncer's server_idle_timeout = 0 semantics. (#139)
- Fix idle timeout having no jitter: all connections expired simultaneously after a traffic burst, causing mass closures. Now applies ±20% per-connection jitter. (#139)
- Fix retain_connections_max draining a random pool when quota was exhausted. A zero remainder was misinterpreted as "unlimited", closing all idle connections in one cycle. (#139)
- Fix unfair retain quota distribution: HashMap iteration gave the same pool priority every cycle, starving others. Fixed by shuffling iteration order. (#139)
- Fix retain and replenish using different pool snapshots when config reload happened between phases. (#139)
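Three of the fixes above (idle_timeout = 0, the ±20% jitter, and the retain quota handling) are easier to follow with a small model. The following is a sketch under stated assumptions, not pg_doorman's code: both function names are hypothetical, and the quota is passed as `None` for "no limit" so that a remaining quota of 0 can only mean "stop".

```python
import random
from typing import Dict, Optional


def jittered_idle_timeout_ms(base_ms: int, rng: random.Random,
                             spread: float = 0.20) -> Optional[float]:
    """Per-connection idle timeout, or None when idle cleanup is disabled."""
    if base_ms == 0:
        return None  # 0 disables idle cleanup instead of expiring at once
    # Each connection draws its own factor in [1 - spread, 1 + spread].
    return base_ms * rng.uniform(1.0 - spread, 1.0 + spread)


def plan_retain_cycle(idle_by_pool: Dict[str, int], quota: Optional[int],
                      rng: random.Random) -> Dict[str, int]:
    """Choose how many idle connections to close per pool in one cycle."""
    names = list(idle_by_pool)
    rng.shuffle(names)  # a fresh order each cycle, so no pool always goes first
    remaining = quota
    plan: Dict[str, int] = {}
    for name in names:
        if remaining is not None and remaining <= 0:
            break  # an exhausted quota ends the cycle; it is not "unlimited"
        closable = idle_by_pool[name]
        take = closable if remaining is None else min(closable, remaining)
        if take > 0:
            plan[name] = take
        if remaining is not None:
            remaining -= take
    return plan


rng = random.Random(7)  # fixed seed only to make the demo repeatable
timeouts = [jittered_idle_timeout_ms(600_000, rng) for _ in range(5)]
plan = plan_retain_cycle({"app": 4, "reports": 3, "batch": 5}, quota=6, rng=rng)
print(timeouts, plan)
```

With a 600,000 ms base, the jittered timeouts fall between 480,000 ms and 720,000 ms, so connections opened in the same burst no longer expire at the same instant.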