github robintra/perf-sentinel chart-v0.2.28
perf-sentinel chart v0.2.28

one month ago

What's new in chart-v0.2.28

This is a daemon-version-only chart bump: appVersion advances from 0.5.24 to 0.5.25, the default image.tag now resolves to ghcr.io/robintra/perf-sentinel:0.5.25, and the artifacthub.io/images annotation is updated in lockstep so the Artifact Hub listing advertises the matching image. No chart-level template diff, no values.yaml schema change, no new RBAC, no new optional ConfigMap or Secret. The chart-v0.2.27 surface is preserved byte-for-byte.

The 0.5.25 daemon image adds two Prometheus counters on the daemon-side Scaphandre scraper. perf_sentinel_scaphandre_scrape_total{status} partitions every scrape attempt into success or failed (cached IntCounter children, lock-free fetch_add per tick), and perf_sentinel_scaphandre_scrape_failed_total{reason} partitions each failure into one of six closed-enum reasons (unreachable, timeout, http_error, body_read_error, request_error, invalid_utf8). The full reason set is pre-warmed to zero at daemon startup, so dashboards can use rate() queries without absent() guards. Cardinality is bounded at 8 series (2 status values plus 6 reasons), and all label values are &'static str constants from compile-time enums.
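To make the pre-warming and cardinality claims concrete, here is a minimal Python sketch of closed-enum counters. It is an illustration only: the daemon itself uses Prometheus IntCounter children, and the class and method names below (ScrapeCounters, record_failure, series) are hypothetical. Only the metric names and label values come from these notes.

```python
from enum import Enum


class Status(Enum):
    SUCCESS = "success"
    FAILED = "failed"


class Reason(Enum):
    UNREACHABLE = "unreachable"
    TIMEOUT = "timeout"
    HTTP_ERROR = "http_error"
    BODY_READ_ERROR = "body_read_error"
    REQUEST_ERROR = "request_error"
    INVALID_UTF8 = "invalid_utf8"


class ScrapeCounters:
    """Hypothetical stand-in for the two counter families (not the daemon's code)."""

    def __init__(self):
        # Pre-warm every label value to zero so each series exists from startup.
        self.scrape_total = {status: 0 for status in Status}
        self.scrape_failed_total = {reason: 0 for reason in Reason}

    def record_success(self):
        self.scrape_total[Status.SUCCESS] += 1

    def record_failure(self, reason: Reason):
        # A failure bumps both the status counter and the per-reason counter.
        self.scrape_total[Status.FAILED] += 1
        self.scrape_failed_total[reason] += 1

    def series(self):
        """Return one (metric, label, value) row per exposed series."""
        rows = []
        for status, value in self.scrape_total.items():
            rows.append(("perf_sentinel_scaphandre_scrape_total",
                         f'status="{status.value}"', value))
        for reason, value in self.scrape_failed_total.items():
            rows.append(("perf_sentinel_scaphandre_scrape_failed_total",
                         f'reason="{reason.value}"', value))
        return rows
```

Because the label values come from enums rather than from runtime strings, the number of series is fixed at 2 + 6 = 8 no matter what the scraper encounters.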

From a chart perspective, the new counters need no action. The ServiceMonitor rendering already scrapes /metrics, so the new counters flow through the same path with no template change, no new scrape config, and no new alert rule shipped by the chart. Operators who wire their own Prometheus alerts can target the new series directly: see docs/METRICS.md "Scaphandre scrape counters" for sample queries (success ratio over 5 minutes, dominant failure reason over 1 hour). The counters are gated behind the daemon's "daemon" feature, the same gate that already protects the optional [green.scaphandre] scraper itself.

The 0.5.25 daemon image also locks in, with tests, the long-standing contract that acknowledgments survive service restarts. The signature format <finding_type>:<service>:<sanitized_endpoint>:<sha256-prefix-of-template> deliberately excludes trace_id and span_id, and v0.5.25 ships eleven new tests across compute_signature, the OTLP ingest path (http.route > http.url > url.full precedence), and the Jaeger / Zipkin paths (http.route > http.target precedence). The chart-rendered daemon Service exposes the OTLP receivers and the HTTP API exactly as before; the new lock is a producer-side contract on the trace stream feeding the daemon. No chart template impact.
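The signature contract can be illustrated with a short Python sketch. This is not the daemon's compute_signature: the sanitizer, the 16-character digest prefix, and the example finding type and service name are all assumptions made for illustration. Only the field order and the attribute precedence come from these notes.

```python
import hashlib

# Attribute precedence per ingest path, as listed in the release notes.
OTLP_PRECEDENCE = ("http.route", "http.url", "url.full")
JAEGER_ZIPKIN_PRECEDENCE = ("http.route", "http.target")


def pick_endpoint(attrs, precedence):
    """Return the first non-empty attribute in precedence order."""
    for key in precedence:
        value = attrs.get(key)
        if value:
            return value
    return ""


def sanitize(endpoint):
    # Assumption: the real sanitizer is not described in these notes.
    # Here ':' is replaced so the colon-separated fields stay unambiguous.
    return endpoint.replace(":", "_")


def compute_signature(finding_type, service, attrs, template,
                      precedence=OTLP_PRECEDENCE, prefix_len=16):
    """Build <finding_type>:<service>:<sanitized_endpoint>:<sha256-prefix>.

    trace_id and span_id are never read, so they cannot change the result.
    prefix_len is a hypothetical value; the real prefix length is not stated.
    """
    endpoint = sanitize(pick_endpoint(attrs, precedence))
    digest = hashlib.sha256(template.encode("utf-8")).hexdigest()[:prefix_len]
    return f"{finding_type}:{service}:{endpoint}:{digest}"
```

With http.route present, two requests to /users/1 and /users/2 in different traces produce the same signature, so one acknowledgment covers both. Without it, the fallback to the raw URL yields one signature per distinct URL.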

The HTTP API surface, the v0.5.21 ack Prometheus counters, the v0.5.23 [daemon.cors] config section, the ServiceMonitor rendering, the NetworkPolicy rendering, and the optional [daemon.ack] ConfigMap-and-Secret plumbing all keep their prior contracts. A helm upgrade from chart-v0.2.27 to chart-v0.2.28 is metadata-only.

Changed

  • appVersion bumped from 0.5.24 to 0.5.25, default image.tag now resolves to ghcr.io/robintra/perf-sentinel:0.5.25. The artifacthub.io/images annotation tracks the bump.
  • No chart-level config change. values.yaml, every template, the ServiceMonitor rendering, the NetworkPolicy rendering, the optional [daemon.ack] ConfigMap-and-Secret plumbing, the optional [daemon.cors] plumbing, and the ack-toml-baseline mount are byte-for-byte identical to chart-v0.2.27.

Behavior

  • Two new Prometheus counters on /metrics, surfaced automatically through the existing ServiceMonitor rendering: perf_sentinel_scaphandre_scrape_total{status="success|failed"} and perf_sentinel_scaphandre_scrape_failed_total{reason=...}. Pre-warmed at daemon startup. Reason labels: unreachable, timeout, http_error, body_read_error, request_error, invalid_utf8. No new scrape config, no new alert rule shipped by the chart.
  • No HTTP-shape change on the daemon side. The three ack endpoints, the v0.5.21 ack /metrics counters, the /api/findings, /api/status, /api/correlations, /api/explain/*, /api/export/report routes, and every other route keep their v0.5.24 status codes and JSON shapes. Existing scrapers, dashboards, and automation continue to work without adjustment.
  • No upgrade hook required, no on-disk migration. The runtime ack store JSONL schema is unchanged. A helm upgrade from chart-v0.2.27 keeps the daemon's existing acks.jsonl intact: the daemon replays and atomically rewrites it at startup, as it did before.
  • [daemon.cors] and [daemon.ack] opt-in surfaces are unchanged. The v0.5.23 CORS layer scoping (only /api/*, never /v1/traces or /metrics or /health), the v0.5.20 ack store gating, the v0.5.21 ack counters, and the cross-source ack precedence (TOML wins over JSONL) all keep their v0.5.24 contracts.
  • Signature stability is now test-locked end-to-end. The signature <finding_type>:<service>:<sanitized_endpoint>:<sha256-prefix-of-template> excludes trace_id and span_id by design, so acknowledgments survive daemon restarts and service rolls. The endpoint component is sourced from http.route on the parent HTTP span (precedence http.route > http.url > url.full on OTLP, http.route > http.target on Jaeger and Zipkin). Standard OpenTelemetry agents (Spring Boot 3+ Java agent, ASP.NET Core .NET SDK, Express.js / Fastify / Koa @opentelemetry/instrumentation-*) emit http.route automatically. Producer-side instrumentation that omits it falls back to less-stable URL forms, which causes ack churn proportional to URL cardinality.
  • Scaphandre scrape counters do not affect GreenOps scoring. They are operational metadata on the energy-source ingestion side. Daemon RSS impact is below 1 KB resident for the 8 fixed series, and the throughput target of more than 100k events/sec is unaffected.
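The replay-and-atomic-rewrite behavior mentioned above for acks.jsonl follows a common pattern: read the log, compact it in memory, write the result to a temporary file, then rename it over the original. The Python sketch below shows that general pattern only. The record key "signature" and the "later record wins" rule are assumptions, since the ack store schema is not described in these notes.

```python
import json
import os
import tempfile


def replay(path):
    """Read a JSONL file into a dict; a later record for the same key wins."""
    acks = {}
    if not os.path.exists(path):
        return acks
    with open(path, "r", encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            record = json.loads(line)
            acks[record["signature"]] = record  # "signature" is a hypothetical key
    return acks


def atomic_rewrite(path, acks):
    """Write compacted records to a temp file, then rename it over the original."""
    directory = os.path.dirname(os.path.abspath(path))
    # Same directory as the target, so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            for record in acks.values():
                fh.write(json.dumps(record) + "\n")
            fh.flush()
            os.fsync(fh.fileno())
        # Readers see either the old file or the new one, never a partial write.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

The point of the rename step is that a crash mid-write leaves the previous acks.jsonl untouched, which is consistent with the "no on-disk migration" guarantee stated in the list above.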

Install

helm install perf-sentinel oci://ghcr.io/robintra/charts/perf-sentinel --version 0.2.28

Upgrade an existing release:

helm upgrade perf-sentinel oci://ghcr.io/robintra/charts/perf-sentinel --version 0.2.28

Sample PromQL queries against the new Scaphandre counters once the daemon is scraping its configured [green.scaphandre] endpoint:

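# Scrape success ratio over 5 minutes (alert on < 0.95 for 15 min)
# sum() drops the status label on both sides so the division matches;
# without it only the status="success" series pair up and the ratio is always 1.
sum(rate(perf_sentinel_scaphandre_scrape_total{status="success"}[5m]))
  / sum(rate(perf_sentinel_scaphandre_scrape_total[5m]))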

# Dominant failure reason over the past hour
topk(1, increase(perf_sentinel_scaphandre_scrape_failed_total[1h]))

See docs/METRICS.md "Scaphandre scrape counters" for the full label spec and additional examples.

Full Changelog: chart-v0.2.27...chart-v0.2.28
