ProxCenter v1.4.1
Patch release: DRS automatic-mode hardening, vSphere migration fixes,
rolling-update reliability, security.
DRS, automatic mode hardening
The automatic-mode DRS is significantly safer in this release. Several
latent issues surfaced under multi-cluster workloads with scheduled
rebalancing and are now closed.
- Per-cluster migration cap replaces the global cap as the user-facing
throttle. Operators reason per cluster, not globally, so the legacy
global setting is removed from the UI. Each cluster gets its own slot
budget, preventing one busy cluster from monopolizing every scheduled
tick (backend #5, frontend #333). - Per-target inflow cap (opt-in). Caps how many migrations may target
the same node within one Rebalance cycle, preventing ping-pong when
several recommendations converge on the same "least-loaded" node and
overshoot it. - Post-migration snowball fix.
triggerPostMigrationEvaluationno
longer auto-runsRebalance()after every load-balance or rule-violation
migration completes. The scheduled cron is now the only periodic
trigger. Maintenance evacuations still self-loop so node drains progress
batch by batch (backend #4). - Vague 1 hardening bundle (backend #3, 10 commits):
- 64-bit UUID generation prevents recommendation / migration ID collisions.
- Post-migration singleflight guards against concurrent re-entry into
Evaluate. - Exclusive scheduler registration so cron overlaps cannot fire two cycles in parallel.
- Freshness gate (StaleTTL) rejects recommendations that aged out before execution.
- Storage gate scoped strictly to maintenance evacuation + QEMU, never to automatic rule-violation migrations on local-disk VMs.
- Affinity enforcement preserved across PVE flap (zero LastSeenAt fail-closed).
- Target mutation in mergeRecommendations resets confirmation counter.
mergeRecommendationspartitions by recommendation origin (maintenance / rule / load-balance) so flags can't bleed across classes.EvacuateNodesupersedes prior pending evacuations for the same (cluster, source-node).
- DRS settings UI cleanup. Advanced section reorganized into Migration
limits, Migration behavior, and Resource weights subsections with icons.
Slider helper text moved to tooltip with?icon. Dead-knob toggles
removed ("Migrate larger first", "Prevent overprovisioning" with
misleading scope). EWMA formula descriptive block removed. - Rebalance interval supports 15m and 30m in addition to hourly options.
Migration
- Migrate-to-Proxmox button greyed out on single-disk Proxmox nodes
fixed (#331). When/is the only large filesystem on the target
(no separate/var/lib/vz, LVM-thin storage not visible todf), the
preflight returned an emptytempStorageslist, hiding the Temporary
Storage selector and silently disabling the Migrate button./tmpis
now synthesized as a fallback when the root filesystem has at least
1 GiB free, and a defensive Alert surfaces the truly degenerate case. - Long-running PVE config PUT timeouts on slow storage fixed
(#332, #334). ZFS-over-iSCSI auto-attach was failing with an 8s
timeout while the underlying PUT took 20+ seconds. The failover
masquerade then reported a fake "all cluster nodes unreachable"
error. All migration-time/qemu/{vmid}/configPUTs now use a 120s
timeout. - curl stderr surfaced + orphan LVM freed on stream failure (#316).
Rolling update
- Respect
reboot_timeoutend-to-end with sustained-online polling- verify retry (backend #2).
waitForNodeOnlineused to return on the
first online sighting. After a reboot pmxcfs can briefly report online
while corosync re-joins, then flip back to unknown. Now requires 3
consecutive online sightings (10s sustained), andverifyNodeHealth
polls for up to 60s instead of one-shot.
- verify retry (backend #2).
- Scope
reboot_timeoutdeadline to the reboot path only so a short
value doesn't shorten the standalone verify window for non-reboot
updates. - Run apt / ha-manager / ceph / reboot as root via
sudo -nwhen
the PVE connection uses a non-root SSH user (backend #1). - Surface node version and API token permission errors in the
rolling-update UI (#318).
Deployment & install
- Backfill
ORCHESTRATOR_API_KEYon upgrade and refuse placeholder
at startup (#330). Pre-v1.4.0 installs without the key, and
.env.exampleplaceholder leakage, are detected and fixed by the
installer. The frontend container refuses to boot with the placeholder. - Auto-generate
INTERNAL_API_TOKENoutside Docker for source-built
installs. - Install URL uses
proxcenter.io/install/*instead of theget.
subdomain.
AI
- Test connection for Ollama fixed (#314, #315). The provider check
was broken since the auth refactor.
Security
- SSRF guard on AI test and models endpoints (#335). The
ai/test
andai/modelsroutes accept user-supplied base URLs for Ollama and
OpenAI-compatible providers. The validator now blocks cloud metadata
endpoints (AWS 169.254.169.254, Alibaba 100.100.100.200, OCI
192.0.0.192, AWS IPv6 IMDS), strips IPv6 brackets before comparison,
and performs a DNS lookup so DNS aliases (such as*.nip.iostyle
hostnames) that resolve to blocked addresses are rejected. Loopback
and RFC1918 remain reachable for legitimate local Ollama setups.
Closes the 2 critical CodeQL alerts (js/request-forgery) on these
routes. - Bump
wsto 8.20.1 (CVE-2026-45736) (#335). - Bump bundled
npmto 11.15.0 in the runner image (covers
brace-expansion< 5.0.6 / CVE-2026-45149) (#335). - Patch bundled npm in runner image for CVE-2026-42338 (
ip-address
< 10.1.1) (#311). - Tighten enterprise.proxmox.com URL spoofing check in the license
validator (#306). - Backend: bump
go-ntlmsspto v0.1.1, Alpine base 3.19 to 3.22,
sanitise filename components against path injection in the reports
generator. - Frontend: add shell-arg validators on routes that build SSH commands,
allow testing SSH against unsaved form values.
Quality & test coverage
- New PR-only SonarCloud Quality Gate workflow with proper LCOV path
rewriting and baseline analysis on main. - New Vitest route-handler harness, with tests for the connections POST
route, SSH test endpoint, orchestrator client, and SSH helpers. - Multiple Sonar bug, vulnerability, and smell cleanups.
Dependencies
- Node 22-alpine to 26-alpine on the frontend image.
- Various dependabot bumps (eslint-config-next, stylelint,
softprops/action-gh-release).
Upgrade notes
- No schema changes since v1.4.0. PostgreSQL connection string unchanged.
- The DRS Settings UI presents
max_concurrent_migrations_per_cluster
instead ofmax_concurrent_migrations. Existing configs are
auto-migrated at runtime (per-cluster falls back to 2 when persisted
value is 0). The legacymax_concurrent_migrationsfield is still
parsed from existing YAML / DB rows but no longer enforced. rebalance_intervalnow accepts 15m and 30m in addition to the
hourly options.- AI provider URLs are now subject to an SSRF guard. URLs targeting
cloud metadata endpoints (link-local 169.254.x, Alibaba, OCI, AWS
IPv6 IMDS) are refused with an explicit error message. Local Ollama
setups usinglocalhost,127.0.0.1, or RFC1918 addresses are
unaffected.
Docker Images
docker pull ghcr.io/adminsyspro/proxcenter-frontend:v1.4.1
docker pull ghcr.io/adminsyspro/proxcenter-orchestrator:v1.4.1
docker pull ghcr.io/adminsyspro/proxcenter-weasyprint:v1.4.1