New permission system
The release has three big stories — a new core permission system with optional client-cert principals on NRPE, a
PDH overhaul that fixes long-standing counter-collection crashes and adds counter functions, and a WEB hardening
option that lets monitoring-only deployments expose the WEB UI without seeding a privileged admin account. Everything
else is bug fixes, small features, and follow-ups around those three threads.
Highlights
- Core permission system: opt-in policy layer that gates which caller can run which command. Configured under /settings/permissions. Disabled by default; existing installs keep working. See https://nsclient.org/docs/concepts/permissions/ for the model, identity table, and rollout recipe.
- NRPE client identity from cert CN: when client identity source = cn is set on NRPEServer and the listener verifies the client cert, the CN is stamped as the policy principal so rules can be written per-cert (NRPEServer:icinga-master = ...). Hard guardrail at module start refuses to load the module if the TLS verify mode would let the CN be attacker-supplied.
- Global allow exec toggle: exec is now gated by a single on/off switch under /settings/permissions. The per-command rule table applies to queries only. Default true so enabling the policy system does not break exec callers.
- PDH (performance counter) overhaul: fixes for service crashes when PDH misbehaves (#592, #547), counter retry when temporarily unavailable (#634), reliable English counter lookup (#652, #906), a resource leak in the counter-lookup path, and a refactor to smart-buffer-based PDH enumeration. Most users running CheckSystem on Windows should see meaningfully better reliability.
- check_pdh counter scaling and functions (#281): details-syntax and related rendering paths can now apply scaling and other functions, e.g. '${counter}'=${value:scale(/1024)}MB.
- check_network: human-readable strings, scaling, speed, and percentages (#329); team-network statistics (#625). See https://nsclient.org/docs/reference/check/CheckNet.
- Nagios range syntax in performance data (#748): 1:10, ~:5, @10:20 etc. work in perfdata thresholds, matching the Nagios plugin spec.
- disable admin user on WEBServer: monitoring-only deployments can expose the WEB UI without ever seeding the built-in admin (and previously seeded admin entries are ignored). Pairs naturally with the new permission system to lock down reconfiguration surfaces.
- Path overrides moved to boot.ini + new --path-override CLI flag: path tokens (module-path, certificate-path, etc.) are now declared early in boot.ini so they take effect before the main config is loaded. Per-invocation overrides via --path-override KEY=VALUE. See https://nsclient.org/docs/concepts/settings.
- NRPE startup is no longer fatal on listener failure: bad bind address / port already in use logs a clear error and leaves the module loaded so settings and commands stay usable for diagnostics.
- Dual-stack listening fixed (#312): v4 and v6 acceptors no longer trample each other's pending connection slot.
disable admin user, client identity source, allow exec, and the policy table are all documented in
https://nsclient.org/docs/concepts/permissions/ and https://nsclient.org/docs/setup/securing. Treat those two as the
starting point for any new install.
Detailed changes
Security and permissions
Core permission system
A policy layer in the core decides whether a given caller may run a given command. Disabled by default; when enabled,
rules form a strict allow-list.
[/settings/permissions]
enabled = true
log denials = true
log allows = false ; noisy, only flip on while rolling out
allow exec = true ; queries-only rule table; exec is a global toggle
[/settings/permissions/policies]
NRPEServer = CheckHelpers.*, CheckSystem.check_cpu
WEBServer:admin = *
WEBServer:viewer = CheckSystem.check_cpu, CheckSystem.check_drivesize
Scheduler = CheckHelpers.*, CheckSystem.*
Subject is module[:principal]; object is module.command. Wildcards (*, ?) supported. Rules combine additively.
See https://nsclient.org/docs/concepts/permissions/ for the full identity model, the
CheckHelpers identity-forwarding behaviour, and a step-by-step rollout recipe.
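As a rough illustration of how an additive allow-list like the one above could be evaluated, here is a minimal Python sketch. It is not NSClient++'s implementation: the POLICIES dictionary and the is_query_allowed function are hypothetical names, and it assumes shell-style matching via fnmatch on both the subject and the object side. Whether a bare-module rule such as NRPEServer also covers NRPEServer:principal callers is not stated here, so the sketch matches subjects exactly as written.

```python
from fnmatch import fnmatchcase

# Hypothetical in-memory copy of the example policy table above.
POLICIES = {
    "NRPEServer": ["CheckHelpers.*", "CheckSystem.check_cpu"],
    "WEBServer:admin": ["*"],
    "WEBServer:viewer": ["CheckSystem.check_cpu", "CheckSystem.check_drivesize"],
    "Scheduler": ["CheckHelpers.*", "CheckSystem.*"],
}


def is_query_allowed(subject, command, policies=POLICIES, enabled=True):
    """Return True if `subject` (module[:principal]) may run `command` (module.command)."""
    if not enabled:
        return True  # policy layer off: everything behaves as before
    for rule_subject, patterns in policies.items():
        # Assumption: the subject key is compared against the caller as written.
        if not fnmatchcase(subject, rule_subject):
            continue
        # Rules are additive: a single matching pattern is enough to allow.
        if any(fnmatchcase(command, pattern) for pattern in patterns):
            return True
    return False  # strict allow-list: no matching rule means denied
```

With this table, is_query_allowed("NRPEServer", "CheckSystem.check_cpu") is allowed, while is_query_allowed("NRPEServer", "CheckSystem.check_drivesize") is denied because no rule lists it.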
NRPE client cert CN as principal
When two-way TLS is configured and verifying client certs against your CA, the Common Name is stamped as the policy
principal:
[/settings/NRPE/server]
client identity source = cn ; default: none
verify mode = peer-cert
ca = /etc/nsclient/ca.pem
[/settings/permissions/policies]
NRPEServer:icinga-master = CheckHelpers.*, CheckSystem.*
NRPEServer:metrics-shipper = CheckSystem.check_cpu, CheckSystem.check_drivesize
Guardrails: the module refuses to start if client identity source = cn is configured without SSL, without a
verify mode containing peer and fail-if-no-peer-cert (or the peer-cert alias), or without a non-empty
ca path. The CN is logged at debug level on every accepted handshake for diagnostics. Only the CN is used (not the
full DN) because INI key syntax uses = as the key/value separator and would corrupt DN-shaped policy keys; see the
"Why CN-only" section of the permissions doc. See https://nsclient.org/docs/reference/client/NRPEServer.
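The guardrail conditions can be sketched as a pre-flight check you might run against a config before deploying it. This is an illustrative Python sketch, not the module's code: cn_identity_errors is a hypothetical helper, the "use ssl" key name is an assumption (the text above only says "without SSL"), and it assumes verify mode is a comma-separated list of flags.

```python
def cn_identity_errors(server):
    """Return reasons a CN-as-principal setup would be refused (empty list means OK)."""
    if server.get("client identity source", "none") != "cn":
        return []  # the guardrail only applies when CN identities are requested
    errors = []
    # "use ssl" is a hypothetical key name standing in for "SSL is enabled".
    if not server.get("use ssl", False):
        errors.append("SSL is not enabled")
    # Assumption: verify mode is a comma-separated list of flags.
    raw_flags = server.get("verify mode", "").split(",")
    flags = {flag.strip() for flag in raw_flags if flag.strip()}
    strict = "peer-cert" in flags or {"peer", "fail-if-no-peer-cert"} <= flags
    if not strict:
        errors.append("verify mode must include peer and fail-if-no-peer-cert (or peer-cert)")
    if not server.get("ca", "").strip():
        errors.append("ca path is empty")
    return errors
```

An empty result means the three conditions named above hold; anything else lists which condition would block startup.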
Global allow exec toggle
Per-command rules apply to queries only. The exec surface (WEB scripts UI, lua/python core:simple_exec(...), CLI
exec) is gated by a single boolean:
[/settings/permissions]
allow exec = false ; hard lockdown; default is true
When false and enabled = true, every exec call returns
Permission denied: exec is globally disabled (/settings/permissions/allow exec = false).
See "Why exec is a single toggle" in https://nsclient.org/docs/concepts/permissions/.
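The exec gate is small enough to sketch in a few lines. The function name check_exec is hypothetical and this is only a model of the decision described above: the per-command rule table is never consulted, and the denial text is the message quoted above.

```python
EXEC_DENIED = (
    "Permission denied: exec is globally disabled "
    "(/settings/permissions/allow exec = false)"
)


def check_exec(enabled, allow_exec=True):
    """Return (allowed, message) for an exec call; per-command rules play no part."""
    if enabled and not allow_exec:
        return False, EXEC_DENIED
    # Policy layer off, or allow exec left at its default of true.
    return True, ""
```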
disable admin user on WEBServer
For installations that expose the WEB UI for status/visualisation only and never want a remote-reconfiguration surface:
[/settings/WEB/server]
disable admin user = true
With this set, the built-in admin is not seeded on first boot, and any existing admin entry in the user settings is
ignored at load time.
Security guide updates
https://nsclient.org/docs/setup/securing was rewritten with concrete configurations for NRPE (with
and without mTLS) and the WEB server. Read it before exposing either to a network you don't fully control.
Performance counters / PDH
The PDH subsystem (the Windows performance-counter collection backbone behind CheckSystem, check_cpu, check_pdh,
check_network, etc.) got a substantial reliability pass. Most users running NSClient++ as a long-running service on
Windows should see fewer crashes and more consistent results.
- Service crashes when PDH misbehaves on a particular machine (#592, #547): root-caused and fixed. Misbehaving counter registrations no longer take the service down.
- Counter not retried if unavailable (#634): counters that fail to bind at first sight now get retried on subsequent collection cycles, instead of being permanently unhealthy for the lifetime of the process.
- English counter lookup improved (#652, #906): addresses reading of localised counters by their canonical English names on non-English Windows installs.
- Resource leak in PDH counter lookup fixed.
- PDH enumeration refactored to smart buffers: clearer memory ownership across the enumeration path, fewer footguns for future changes.
- check_pdh counter scaling and functions (#281): all the details-syntax / rendering paths can now apply functions. Example:
check_pdh "counter=\Processor(_Total)\% Processor Time" \
  "details-syntax=${counter} = ${value:round(2)}%"
See https://nsclient.org/docs/reference/check/CheckSystem for the function reference.
check_network
- Human-readable strings, scaling, speed, and percentages (#329): perfdata and message output now render numbers in human-readable form:
check_network 'filter=interface=Ethernet' \
  'top-syntax=${list}' \
  'detail-syntax=${interface}: ${total_rx_human}/s in, ${total_tx_human}/s out'
- Team network statistics (#625): aggregate stats across Windows NIC teams.
See https://nsclient.org/docs/check/CheckNet.
Performance data formatting
- Nagios range syntax in performance data (#748): the perfdata threshold fields now accept the standard Nagios
range syntax: 5:10, ~:5, @10:20, etc. Brings NSClient++ into line with what Nagios consumers already expect.
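For readers unfamiliar with the range notation, the following Python sketch shows how such ranges are conventionally interpreted under the Nagios plugin guidelines. It is a reference model only, not NSClient++'s parser; parse_range and alerts are hypothetical names.

```python
import math


def parse_range(spec):
    """Parse a Nagios range string into (start, end, inside).

    inside=True is the '@' form, where values inside the range raise the alert.
    """
    inside = spec.startswith("@")
    if inside:
        spec = spec[1:]
    if ":" in spec:
        low, high = spec.split(":", 1)
    else:
        low, high = "0", spec  # a bare "10" means the range 0..10
    start = -math.inf if low == "~" else float(low or 0)  # "~" is negative infinity
    end = math.inf if high == "" else float(high)  # "10:" has no upper bound
    if start > end:
        raise ValueError(f"range start exceeds end in {spec!r}")
    return start, end, inside


def alerts(value, spec):
    """Return True if `value` should raise an alert for the range `spec`."""
    start, end, inside = parse_range(spec)
    in_range = start <= value <= end
    return in_range if inside else not in_range
```

Under this reading, 5:10 alerts on anything below 5 or above 10, ~:5 alerts only above 5, and @10:20 alerts when the value lies between 10 and 20 inclusive.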
Settings, paths, and CLI
- Path overrides moved to boot.ini: path tokens (module-path, certificate-path, data-path, log-path, …) now live under [paths] in boot.ini (next to nscp.exe), not in nsclient.ini. Overrides take effect before the main config is loaded, including the bootstrap step that decides where the main config itself lives.
; boot.ini
[paths]
module-path = D:\monitoring\modules
certificate-path = D:\monitoring\certs
- --path-override CLI flag: per-invocation override, repeatable. (Renamed from --path to avoid colliding with the nscp settings --path subcommand option.)
nscp client --path-override module-path=/build/modules --path-override log-path=. ...
- See https://nsclient.org/docs/concepts/settings for the precedence rules and the migration note for installs that had a [/paths] section in nsclient.ini.
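To make the layering concrete, here is a small Python sketch of how the two override sources might combine. The precedence shown (built-in defaults, then boot.ini [paths], then --path-override values) is an assumption for illustration; the settings documentation linked above is the authority on the real rules. resolve_paths is a hypothetical helper, not part of nscp.

```python
import configparser


def resolve_paths(defaults, boot_ini_text, cli_overrides):
    """Merge path tokens: defaults < boot.ini [paths] < --path-override KEY=VALUE.

    The ordering is an assumed precedence, used here only for illustration.
    """
    paths = dict(defaults)
    # interpolation=None keeps '%' and backslashes in Windows paths untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(boot_ini_text)
    if parser.has_section("paths"):
        paths.update(parser["paths"])
    for item in cli_overrides:
        key, sep, value = item.partition("=")
        if not sep or not key:
            raise ValueError(f"expected KEY=VALUE, got {item!r}")
        paths[key] = value
    return paths
```

Calling it with the boot.ini example above and cli_overrides=["log-path=."] would yield the D:\monitoring values for module-path and certificate-path and "." for log-path.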
Aliases and command registration
- CheckHelpers alias: aliases can now be defined under [/settings/check helpers/alias] and are registered by CheckHelpers directly, without requiring CheckExternalScripts to be loaded. This is the preferred place going forward; the legacy [/settings/external scripts/alias] is still honoured for backward compatibility.
- API to list registered query aliases (#506): programmatic introspection of the alias table, useful for tooling.
- simple_command / simple_command_map: internal refactor that streamlines how modules register aliases. No user-visible behaviour change, but module authors may want to look at the new pattern.
- Icinga client alias (7c49a3d3): minor module-specific addition.
NRPEServer
- Listener failure no longer kills the module: a bad bind to address that the resolver can't look up, or a port already in use, used to make the whole module fail to load. Now the failure is logged clearly, the listener stays down, and the module's settings and commands remain accessible for diagnostics and reconfiguration. Fix the config and reload; no service restart is needed.
- Dual-stack fixed (#312): the v4 and v6 acceptors used to share a single pending-connection slot, which caused intermittent Already open errors on v6 once v4 accepted a client. Each family now owns its own slot.
- Insecure mode produces an error-level log line: flipping insecure = true (for legacy check_nrpe interop) now surfaces as an ERROR so it shows up in monitoring dashboards, instead of silently disabling cert-based peer auth.
Plugin lifecycle
- prepare_shutdown hook: modules can opt in to a first-phase shutdown pass before any plugin is unloaded. Used by the Scheduler and similar long-running submitters to finish in-flight work cleanly. Operators see fewer "submission failed during shutdown" lines during service stop.
Settings store
- simpleini buffer NUL-termination fix: fixes a buffer allocation issue in the INI parser that could affect non-UTF-8 data paths.
- cache allowed host is now a real boolean: previously parsed as a string with surprising truthiness; matches what the docs always claimed.
Modules and clean-ups
- WMI module refactor — target handling and settings management cleaned up.
- IcingaClient cleanup — removed unused command-handling code paths.
- CheckLogFile config and descriptions — fixed misleading defaults and improved the help text.
- Web UI improvements: more settings elements exposed under modules, simpler module configuration. Web dependencies refreshed.
- Installer: UninstallString is now correct (#495), so removal via Windows "Apps & Features" works again.
- Rust dependencies bumped.
Upgrade notes
Most installs can upgrade in place — defaults are preserved. Read the specific items below if any of them apply.
Permission system
The new policy layer is disabled by default. Existing installs continue to behave exactly as before until an
operator opts in via /settings/permissions/enabled = true.
If you do opt in:
- Per-command rules under /settings/permissions/policies apply to queries only. Any rules you might have written for exec command patterns will be silently ignored for the exec dispatch path; exec is gated by the single global allow exec boolean.
- The default for allow exec is true, so enabling the policy will not silently break the WEB scripts UI, lua/python core:simple_exec(...), or CLI exec. Flip to false only if you want a hard exec lockdown.
- Roll out with log allows = true first so you can inventory what your actual traffic looks like before tightening to a real allow-list. See the step-by-step recipe in https://nsclient.org/docs/concepts/permissions/.
NRPEServer
- The new client identity source setting defaults to none, which matches the previous behaviour (subject is bare NRPEServer). Set to cn only when you want per-cert principals, and only after you've configured verify mode = peer-cert and a ca path. The module will refuse to start with a clear error if you set cn without those.
- Pin the ca path to your private monitoring CA. The system trust store (Windows root store / Linux distro bundle) accepts certs from every public CA on the planet and would let an attacker with a public cert choose their own CN. See "Pin to a private CA" in the permissions doc.
Path overrides
- If you had a [/paths] section in nsclient.ini from an older NSClient++ install, those overrides moved to [paths] in boot.ini (note: the section is [paths], without the leading slash, and it lives in a different file). There is no automatic migration. Copy each key = value to a [paths] section in boot.ini (next to nscp.exe) and delete the old section from nsclient.ini.
WEB server
- The new disable admin user = true setting is opt-in. Existing installs keep their admin and continue to work unchanged. Use this when you want to expose the WEB UI for status-only viewing and have no need to reconfigure the agent through the web.
NRPEServer startup robustness
- A failed listener (bad bind address, port in use) used to make the whole NRPEServer module fail to load. It now logs an ERROR and leaves the module loaded with no active listener, so you can reconfigure via nscp settings --path /settings/NRPE/server --key ... --set ... and reload, without restarting the service. If you had monitoring on "module load failed" specifically, you may want to add "NRPE listener failed" as a separate signal.
insecure = true on NRPEServer
- This option (for legacy check_nrpe interop) now logs at ERROR rather than DEBUG/INFO. Behaviour is unchanged; the message is louder so it shows up in dashboards. If your monitoring filters by severity, you may want to whitelist this specific message on agents that intentionally run in insecure mode.
cache allowed host
- Previously parsed as a string with surprising truthiness; now a real boolean. If you had cache allowed host = yes or = on, switch to true. Numeric 1/0 still work.
Nagios range syntax in performance data
- This is additive — existing perfdata that doesn't use range syntax continues to work. Plain numbers still parse as
before. Only consumers that previously had to special-case NSClient++'s output may need adjusting, but most
Nagios-ecosystem tools handle both forms.
Full Changelog: 0.12.5...0.12.6