github robintra/perf-sentinel v0.4.0

2 months ago

perf-sentinel v0.4.0

Phase 6 release: turns the daemon from a pattern detector into an insight engine. Four headline features, plus hardening.

Highlights

  • Cross-trace temporal correlation (daemon mode). New detect/correlate_cross.rs module with a CrossTraceCorrelator that detects recurring co-occurrences between findings from different services within a rolling window (default 10 minutes). Uses Algorithm R reservoir sampling (capped at 256 samples per pair) seeded by a deterministic SplitMix64 + FNV-1a endpoint hash so sampled lags stay reproducible across runs. Incremental source_totals updates and bounded eviction via select_nth_unstable_by_key keep the per-tick cost at O(occurrences) instead of rebuilding state on every tick. Opt in via [daemon.correlation] enabled = true; tune with the window_minutes, lag_threshold_ms, min_co_occurrences, min_confidence, and max_tracked_pairs knobs.
  • OTel source code attributes. Findings now carry a code_location field populated from code.function, code.filepath, code.lineno, and code.namespace span attributes. Supported across OTLP (gRPC + HTTP), Jaeger, and Zipkin ingesters. The CLI renders a new Source: line on findings (namespace.function (filepath:lineno), omitting absent parts), and SARIF v2.1.0 output gains physicalLocation entries when filepath is present, enabling inline annotations in GitHub and GitLab code scanning views. Hostile filepath values (literal and percent-encoded .. traversal, absolute paths, URL schemes, overlong UTF-8, BiDi / invisible Unicode) are rejected in the SARIF sanitizer to close Trojan Source (CVE-2021-42574) vectors.
  • Automated pg_stat ingestion from Prometheus. The new perf-sentinel pg-stat --prometheus http://prometheus:9090 flag scrapes pg_stat_statements_seconds_total via the Prometheus HTTP API and produces the same PgStatReport as the existing file-based path. No new dependencies are added: the feature reuses the http_client module introduced in v0.3.0. Endpoints are validated at config load (scheme must be http/https, userinfo rejected) and redacted in error messages. File-based input (--input traces.csv) continues to work unchanged.
  • Daemon query API and query subcommand. The daemon exposes its internal state via five HTTP endpoints on the existing port 4318 (alongside /v1/traces and /metrics): GET /api/findings (filterable by service, type, severity, limit, capped at 1000), GET /api/findings/{trace_id}, GET /api/explain/{trace_id} (tree with findings inline, served from the in-memory trace window), GET /api/correlations (active cross-trace correlations), and GET /api/status (uptime, active traces, stored findings count, version). A new FindingsStore ring buffer (default 10000, configurable via [daemon] max_retained_findings) retains recent findings for querying. A new perf-sentinel query --daemon http://localhost:4318 <action> CLI subcommand queries these endpoints with five sub-actions (findings, explain, inspect, correlations, status), rendering colored terminal output by default and --format json when scripting. inspect fetches explain trees in parallel via tokio::task::JoinSet (concurrency 16) so the TUI opens in ~300 ms on 100 traces instead of ~5 s sequentially. Gated by [daemon] api_enabled (default true); see docs/LIMITATIONS.md for the no-auth threat model.
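
To make the sampling scheme in the cross-trace correlation item concrete, here is a minimal sketch of Algorithm R driven by a SplitMix64 generator whose seed is an FNV-1a hash of the pair's endpoints. It is not the code from detect/correlate_cross.rs: the LagReservoir type, its fields, and the pair-key string format are hypothetical names chosen for illustration, and only the FNV-1a and SplitMix64 constants are the standard published ones.

```rust
/// FNV-1a 64-bit hash (standard offset basis and prime).
pub fn fnv1a_64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325;
    for &b in bytes {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3);
    }
    hash
}

/// SplitMix64 pseudo-random generator (standard constants).
pub struct SplitMix64(pub u64);

impl SplitMix64 {
    pub fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9e37_79b9_7f4a_7c15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xbf58_476d_1ce4_e5b9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94d0_49bb_1331_11eb);
        z ^ (z >> 31)
    }
}

/// Hypothetical per-pair reservoir of observed lags, in milliseconds.
pub struct LagReservoir {
    pub cap: usize,
    pub seen: u64,
    pub samples: Vec<u64>,
    rng: SplitMix64,
}

impl LagReservoir {
    /// The seed depends only on the pair key, so identical input streams
    /// yield identical samples on every run.
    pub fn new(pair_key: &str, cap: usize) -> Self {
        Self {
            cap,
            seen: 0,
            samples: Vec::with_capacity(cap),
            rng: SplitMix64(fnv1a_64(pair_key.as_bytes())),
        }
    }

    /// Algorithm R: fill the reservoir first, then replace slot j when a
    /// draw j from [0, seen) lands inside it.
    pub fn push(&mut self, lag_ms: u64) {
        self.seen += 1;
        if self.samples.len() < self.cap {
            self.samples.push(lag_ms);
            return;
        }
        // The modulo introduces a negligible bias, acceptable for a sketch.
        let j = self.rng.next_u64() % self.seen;
        if j < self.cap as u64 {
            self.samples[j as usize] = lag_ms;
        }
    }
}
```

With a cap of 256, memory per pair stays constant however many co-occurrences are observed, and two reservoirs built from the same key and fed the same lags end up with identical samples.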

Breaking changes

  • DaemonError::TlsConfig(Box<dyn std::error::Error>) replaced by a typed TlsConfigError enum with five concrete variants (ReadCert, ReadKey, ParseCerts, ParseKey, ServerConfig). Callers that matched on the boxed error must now match on the enum. Source chains are preserved via #[source].
  • All public error enums are now #[non_exhaustive] (DaemonError, TlsConfigError, ConfigError, PgStatError, JsonIngestError, JaegerIngestError, ZipkinIngestError, TempoError, FetchError, CalibrationError, SarifError). External match expressions on these types must include a catch-all arm going forward; this lets subsequent minor releases add variants without a major bump.
  • SpanEvent gains four optional fields (code_function, code_filepath, code_lineno, code_namespace). Finding gains an optional code_location field. All marked #[serde(default, skip_serializing_if = "Option::is_none")], so JSON consumers that ignore unknown fields are unaffected.
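
For callers migrating their error handling, the sketch below shows the shape of a match that keeps compiling when new variants appear. The enum is a local stand-in that only mirrors the five variant names listed above; the payload types and the describe helper are assumptions, not the crate's actual definitions. Note that #[non_exhaustive] only constrains matches written in other crates, so inside a single file the compiler does not demand the catch-all arm shown here.

```rust
use std::io;

/// Local stand-in mirroring the variant names from the release notes.
/// Payload types are assumed for illustration only.
#[derive(Debug)]
#[non_exhaustive]
pub enum TlsConfigError {
    ReadCert(io::Error),
    ReadKey(io::Error),
    ParseCerts(String),
    ParseKey(String),
    ServerConfig(String),
}

/// Hypothetical helper that turns an error into an operator-facing message.
pub fn describe(err: &TlsConfigError) -> String {
    match err {
        TlsConfigError::ReadCert(e) => format!("cannot read certificate: {e}"),
        TlsConfigError::ReadKey(e) => format!("cannot read private key: {e}"),
        TlsConfigError::ParseCerts(msg) => format!("invalid certificate data: {msg}"),
        TlsConfigError::ParseKey(msg) => format!("invalid private key data: {msg}"),
        // Catch-all arm: covers ServerConfig here and, in a downstream
        // crate, any variant added by a later minor release.
        _ => "other TLS configuration error".to_string(),
    }
}
```

Handling the variants you care about explicitly and routing the rest through the catch-all arm keeps downstream code source-compatible across minor releases.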

Observability, performance and hardening

  • Daemon query API performance. Endpoint responses cap at 1000 items to bound payload size under pathological queries. The findings store clones outside the lock to minimize hold time, short-circuits on max_size == 0, and sizes the initial VecDeque capacity at min(max_size, 4096). The /api/explain/{trace_id} endpoint reads from the in-memory trace window and runs detect::detect() inline, with no per-request disk I/O.
  • Runtime reuse. perf-sentinel query and pg-stat --prometheus use the parent #[tokio::main] runtime directly instead of constructing a nested Runtime (which would have panicked at runtime).
  • Cognitive complexity reduction. Four functions flagged by SonarCloud's rust:S3776 rule were split into named helpers without behavioral change: print_findings (33 → under 15, extracted into print_finding_entry, print_finding_impact, format_code_location, and severity helpers), cmd_query (62 → under 15, one helper per action plus build_findings_path and print_pretty_json), daemon::run (19 → under 15, extracted ingest_event_batch, evict_expired_traces, flush_evicted, shutdown_listeners, and a ServiceMeter struct), CrossTraceCorrelator::ingest (18 → under 15, extracted evict_stale, record_co_occurrences, enforce_pair_cap).
  • TUI Source line. The inspect TUI detail panel now renders the same Source: line as the CLI text output when findings carry a code_location.
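
The findings store behavior described in this section (bounded ring buffer, short-circuit on max_size == 0, initial capacity of min(max_size, 4096), responses capped at 1000 items) can be sketched as follows. This is an illustrative reconstruction, not the daemon's implementation: the Finding fields, the method names, and the choice of std::sync::Mutex are all assumptions.

```rust
use std::collections::VecDeque;
use std::sync::Mutex;

/// Hypothetical, reduced finding record.
#[derive(Clone, Debug, PartialEq)]
pub struct Finding {
    pub trace_id: String,
    pub service: String,
}

const MAX_RESPONSE_ITEMS: usize = 1000; // response cap from the notes
const INITIAL_CAPACITY_CAP: usize = 4096; // initial allocation bound

pub struct FindingsStore {
    max_size: usize,
    inner: Mutex<VecDeque<Finding>>,
}

impl FindingsStore {
    pub fn new(max_size: usize) -> Self {
        // Avoid a large up-front allocation when max_size is big.
        let capacity = max_size.min(INITIAL_CAPACITY_CAP);
        Self {
            max_size,
            inner: Mutex::new(VecDeque::with_capacity(capacity)),
        }
    }

    /// Takes an owned finding, so any clone happens in the caller
    /// before the lock is acquired.
    pub fn push(&self, finding: Finding) {
        if self.max_size == 0 {
            return; // retention disabled: never touch the lock
        }
        let mut buf = self.inner.lock().expect("findings store lock poisoned");
        if buf.len() == self.max_size {
            buf.pop_front(); // drop the oldest entry
        }
        buf.push_back(finding);
    }

    /// Returns the newest findings first, optionally filtered by service.
    pub fn recent(&self, service: Option<&str>, limit: usize) -> Vec<Finding> {
        let limit = limit.min(MAX_RESPONSE_ITEMS);
        let buf = self.inner.lock().expect("findings store lock poisoned");
        buf.iter()
            .rev()
            .filter(|f| service.map_or(true, |s| f.service == s))
            .take(limit)
            .cloned()
            .collect()
    }
}
```

In this sketch the read path still clones under the lock; storing entries behind Arc would be one way to shorten that hold time further.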

Docs and assets

  • Design docs updated EN + FR: 04-DETECTION (cross-trace correlation algorithm), 06-INGESTION-AND-DAEMON (daemon query API), 07-CLI-CONFIG-RELEASE (query subcommand, pg-stat Prometheus flag). New Mermaid diagram query-api.mmd with light + dark SVG exports, wired into docs/ARCHITECTURE.md and docs/design/06-INGESTION-AND-DAEMON.md (and their FR counterparts).
  • docs/CONFIGURATION.md, docs/LIMITATIONS.md, GUIDED-TOUR.md, ENTERPRISE-JAVA-INTEGRATION-FR.md updated with Phase 6 content.
  • docs/img/analyze/* and docs/img/inspect/* regenerated so the new Source: line shows on every n+1, redundant, and slow finding. Demo fixture tests/fixtures/demo.json gained plausible code.* attributes per trace (repository / client classes, Java filepaths, line numbers).

Install

# Prebuilt binaries (Linux amd64 / arm64, macOS arm64, Windows amd64)
curl -LO https://github.com/robintra/perf-sentinel/releases/download/v0.4.0/perf-sentinel-linux-amd64
chmod +x perf-sentinel-linux-amd64
sudo mv perf-sentinel-linux-amd64 /usr/local/bin/perf-sentinel
# From crates.io
cargo install perf-sentinel
# Docker
docker pull robintrassard/perf-sentinel:0.4.0

Also available on GHCR: ghcr.io/robintra/perf-sentinel:0.4.0

Verify the binary against SHA256SUMS.txt:

curl -LO https://github.com/robintra/perf-sentinel/releases/download/v0.4.0/SHA256SUMS.txt
sha256sum -c SHA256SUMS.txt --ignore-missing

Full changelog

Diff against the previous release: v0.3.2...v0.4.0. It spans 4 commits covering Phase 6 (cross-trace correlation, source causality, automated pg_stat, daemon query API) plus the SonarCloud cognitive-complexity refactor.
