Adaptive Floor Control
Concept
The system now supports a dynamic minimum writer target (adaptive floor).
Two operating modes are available:
- static: fixed floor
- adaptive: conditionally reduced floor for single-endpoint DCs during idle
Adaptive mode reduces operational churn while preserving fast recovery under load.
Hot reload behavior:
- When configuration is reloaded, the adaptive floor is raised immediately if the new settings require it
- This prevents under-provisioning after policy changes
Modes
Static Mode
Concept
- Fixed required floor per DC-group
- No time-based or metric-based adjustments
- Deterministic and predictable behavior
Properties
- No adaptive logic applied
- Immediate full target maintained at all times
Adaptive Mode
Concept
For single-endpoint DCs, floor may temporarily decrease during sustained idle periods.
The goal is to reduce unnecessary reconnect/refill churn when there is no user traffic.
Floor reduction occurs only if all conditions are met:
- DC-group is single-endpoint
- No bound client sessions on writers
- Idle duration ≥ me_adaptive_floor_idle_secs
When activity resumes:
- A recover grace window is activated
- During me_adaptive_floor_recover_grace_secs:
  - Floor behaves as in static
  - Prevents flapping during short bursts
Minimum adaptive floor is bounded by me_adaptive_floor_min_writers_single_endpoint (1..32)
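The conditions above can be read as one decision function. The following is a minimal sketch under stated assumptions, not the project's actual code. The type names (`FloorConfig`, `GroupState`), the `static_floor` field, and the function `required_floor` are hypothetical. The release does not spell out the exact formula, so the sketch assumes the reduced floor is the configured minimum, clamped to 1..32 and never above the static floor.

```rust
use std::time::Duration;

/// The two floor modes described above.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum FloorMode {
    Static,
    Adaptive,
}

/// Hypothetical config holder; the three adaptive fields map to the documented keys.
#[derive(Clone, Copy, Debug)]
pub struct FloorConfig {
    pub mode: FloorMode,
    /// Assumed name for the fixed per-DC-group floor.
    pub static_floor: usize,
    /// me_adaptive_floor_idle_secs
    pub idle_secs: u64,
    /// me_adaptive_floor_recover_grace_secs
    pub recover_grace_secs: u64,
    /// me_adaptive_floor_min_writers_single_endpoint (1..32)
    pub min_writers_single_endpoint: usize,
}

/// Hypothetical snapshot of one DC-group as seen by the health path.
#[derive(Clone, Copy, Debug)]
pub struct GroupState {
    pub endpoint_count: usize,
    pub bound_client_sessions: usize,
    pub idle_for: Duration,
    /// Time since activity last resumed after an idle period, if it ever did.
    pub since_activity_resumed: Option<Duration>,
}

/// Returns the writer floor the health path should maintain for this group.
pub fn required_floor(cfg: &FloorConfig, st: &GroupState) -> usize {
    if cfg.mode == FloorMode::Static {
        return cfg.static_floor;
    }
    // Inside the recover grace window the floor behaves as in static mode.
    let in_grace = st
        .since_activity_resumed
        .map_or(false, |d| d < Duration::from_secs(cfg.recover_grace_secs));
    // All three documented conditions must hold before the floor is reduced.
    let may_reduce = st.endpoint_count == 1
        && st.bound_client_sessions == 0
        && st.idle_for >= Duration::from_secs(cfg.idle_secs)
        && !in_grace;
    if !may_reduce {
        return cfg.static_floor;
    }
    // Assumption: the reduced floor never exceeds the static floor.
    cfg.min_writers_single_endpoint.clamp(1, 32).min(cfg.static_floor)
}
```

With a static floor of 4 and a minimum of 1, an idle single-endpoint group with no bound sessions would drop to 1 writer, while any bound session, a second endpoint, or an active grace window keeps the floor at 4.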
Implementation
- Introduced adaptive floor mode
- Configurable switch between static and adaptive
- Adaptive floor computed via deterministic formula using:
  - current state
  - runtime metrics
  - configured bounds
- Floor dynamically adjusts based on:
  - idle state
  - activity recovery
- Immediate floor raise on hot reload
- Adaptive logic currently applied in health path only
- No persistence of adaptive state across full reinitialization
- Switching adaptive → static:
  - Clears adaptive state
  - Next health cycle restores full fixed floor immediately
Mode Switching
Behavior
- Controlled via configuration
- Switch is immediate
- Calculation strategy changes instantly
Notes:
- Adaptive state is not preserved across mode switches
- Health-path logic recalculates floor on next cycle
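The switching and reload rules can be sketched as a small state holder. This is an illustration under assumptions, not the actual implementation. `FloorController` and all of its methods are hypothetical names, and the adaptive state is reduced to a single `reduced` flag for brevity.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum FloorMode {
    Static,
    Adaptive,
}

/// Hypothetical holder for the floor state of one single-endpoint DC-group.
pub struct FloorController {
    mode: FloorMode,
    static_floor: usize,
    min_writers: usize,
    /// Adaptive state: true while the floor is reduced because of idle.
    reduced: bool,
    effective_floor: usize,
}

impl FloorController {
    pub fn new(mode: FloorMode, static_floor: usize, min_writers: usize) -> Self {
        Self { mode, static_floor, min_writers, reduced: false, effective_floor: static_floor }
    }

    pub fn effective_floor(&self) -> usize {
        self.effective_floor
    }

    /// Assumption: the reduced floor never exceeds the static floor.
    fn reduced_target(&self) -> usize {
        self.min_writers.min(self.static_floor)
    }

    /// Called by the health path once the idle conditions are met (adaptive only).
    pub fn mark_idle_reduced(&mut self) {
        if self.mode == FloorMode::Adaptive {
            self.reduced = true;
            self.effective_floor = self.reduced_target();
        }
    }

    /// Applies a hot-reloaded mode and minimum-writers bound.
    pub fn on_reload(&mut self, new_mode: FloorMode, new_min_writers: usize) {
        // Switching adaptive -> static clears the adaptive state.
        if self.mode == FloorMode::Adaptive && new_mode == FloorMode::Static {
            self.reduced = false;
        }
        self.mode = new_mode;
        self.min_writers = new_min_writers;
        // Immediate raise: a higher bound takes effect without waiting for a health cycle.
        if self.mode == FloorMode::Adaptive && self.reduced {
            self.effective_floor = self.effective_floor.max(self.reduced_target());
        }
    }

    /// The health path recalculates the floor on every cycle.
    pub fn on_health_cycle(&mut self) {
        self.effective_floor = if self.mode == FloorMode::Adaptive && self.reduced {
            self.reduced_target()
        } else {
            self.static_floor
        };
    }
}
```

In this sketch a reload never lowers the effective floor. Lowering happens only through the health path, which matches the note that adaptive logic is currently applied there.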
Impact on System Behavior
Benefits
- Reduced churn during prolonged idle
- Fewer reconnect/refill cycles
- Lower risk of idle-close/reconnect loops
- Cleaner health signal under low traffic
- Fast recovery when activity resumes due to:
  - Recover grace window
  - Standard reconnect mechanisms
Trade-offs
- During deep idle, fewer writers are maintained
- First traffic burst after idle may incur short warm-up period
- Adaptive mode affects only the operational floor (health target); it does not alter reinit/warmup base targets
Limitations & Nuances
- Adaptive logic currently scoped to health path
- Immediate floor raise on hot reload may temporarily affect metrics
- No full state reinitialization when switching modes
- Adaptive does not override reinit generation targets
Observability
Added Metrics
- Current floor mode (static/adaptive)
- Mode switch counters
- Adaptive floor value tracking
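As a rough illustration of how these three metrics could be tracked, the sketch below uses only standard-library atomics. The struct `FloorMetrics`, its methods, and the metric names in `render` are hypothetical. The release does not list the real metric names or the export format.

```rust
use std::sync::atomic::{AtomicU64, AtomicU8, Ordering};

/// Hypothetical in-process counters for the floor metrics listed above.
#[derive(Default)]
pub struct FloorMetrics {
    /// 0 = static, 1 = adaptive.
    mode: AtomicU8,
    /// Number of observed mode transitions.
    mode_switches: AtomicU64,
    /// Most recently computed floor value.
    floor_value: AtomicU64,
}

impl FloorMetrics {
    /// Records the current mode and counts a switch only when it changes.
    pub fn set_mode(&self, adaptive: bool) {
        let new = adaptive as u8;
        let old = self.mode.swap(new, Ordering::Relaxed);
        if old != new {
            self.mode_switches.fetch_add(1, Ordering::Relaxed);
        }
    }

    /// Records the floor computed in the latest health cycle.
    pub fn set_floor(&self, value: u64) {
        self.floor_value.store(value, Ordering::Relaxed);
    }

    pub fn mode_switches(&self) -> u64 {
        self.mode_switches.load(Ordering::Relaxed)
    }

    /// Renders a plain-text snapshot; the metric names are placeholders.
    pub fn render(&self) -> String {
        format!(
            "me_floor_mode {}\nme_floor_mode_switches_total {}\nme_adaptive_floor_value {}\n",
            self.mode.load(Ordering::Relaxed),
            self.mode_switches.load(Ordering::Relaxed),
            self.floor_value.load(Ordering::Relaxed),
        )
    }
}
```

Counting switches only on an actual change keeps repeated reloads with the same mode from inflating the counter.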
Recommended Monitoring
- Floor value over time
- Mode transitions (static ↔ adaptive)
- Churn metrics
- Latency impact during reload events
- Correlation between idle periods and floor reduction
What's Changed
- ME Healthcheck + ME Keepalive + ME Pool in Metrics by @axkurcom in #297
- Update Cargo.toml by @axkurcom in #298
- ME Adaptive Floor by @axkurcom in #299
Full Changelog: 3.1.4...3.1.5