github ruvnet/RuView v1024
Release v1024

2 hours ago

Automated release from CI pipeline

Changes:
feat(cog-person-count): train count_v1.safetensors — honest v0.0.1 (ADR-103) (#695)

Phase 2 of ADR-103: trained count head on the existing 1,077 paired
samples (the same data that produced pose_v1 yesterday).

Honest result: 65.1% eval accuracy / 100% within ±1 / MAE 0.349 on
the held-out time-window. Per-class: 100% on "empty room" / 0% on
"1 person". The model overfit by epoch 100 (train_acc → 1.0,
eval_loss climbed 0.67 → 7.8) and the "best" checkpoint is the
snapshot that happened to predict the eval window's class
distribution (140/215 = 65.1%, matches eval_acc exactly). Confidence
head Spearman = 0.023 ⇒ uncalibrated. Same data-bound failure mode
as pose_v1 (#645), bounded by single-session training data; same
fix path (multi-room).
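
The reported numbers can be reproduced with a few lines of arithmetic. The sketch below is not the project's evaluation code. It assumes the held-out window holds 140 "empty room" and 75 "1 person" samples (215 - 140 = 75; the source states only the 140/215 ratio and the two per-class figures) and that the checkpoint predicts "empty room" for every sample. Under those assumptions a constant predictor gives the same accuracy, ±1 rate, MAE and per-class results as reported. The helper name `count_metrics` is hypothetical.

```python
from collections import Counter

def count_metrics(y_true, y_pred):
    """Accuracy, within-±1 rate, MAE and per-class recall for integer counts."""
    n = len(y_true)
    errors = [abs(t - p) for t, p in zip(y_true, y_pred)]
    per_class_total = Counter(y_true)
    per_class_hit = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {
        "accuracy": sum(e == 0 for e in errors) / n,
        "within_1": sum(e <= 1 for e in errors) / n,
        "mae": sum(errors) / n,
        "recall": {c: per_class_hit[c] / per_class_total[c]
                   for c in sorted(per_class_total)},
    }

# ASSUMPTION: eval window = 140 "empty room" (0) + 75 "1 person" (1) samples.
y_true = [0] * 140 + [1] * 75
# ASSUMPTION: the checkpoint behaves like a constant "empty room" predictor.
y_pred = [0] * 215

m = count_metrics(y_true, y_pred)
print(round(m["accuracy"], 3), m["within_1"], round(m["mae"], 3), m["recall"])
```

Since a predictor that ignores its input scores the same on all four metrics, 65.1% is best read as the majority-class baseline for this window, not as evidence of learned counting.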

What v0.0.1 still validates end-to-end:

  • PyTorch → safetensors → Candle Rust loads cleanly on first try.
    cog-person-count health reports backend: candle-cpu and emits
    real per-frame predictions instead of the stub backend's hard-coded
    {1 person, 0 confidence}. Architecture parity between train-count.py
    and src/inference.rs::CountNet is bit-exact.
  • ONNX export bit-clean (16 KB, opset 18, dynamic batch axis).
  • Training wall time: 5.6 s for 400 epochs on RTX 5080.
  • Binary size unchanged (2.36 MB stripped), model loads via mmap at
    runtime.
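
For readers unfamiliar with the container format, here is a minimal stdlib-only sketch of what a safetensors file looks like on disk and how a tensor can be read through a memory map. It is hand-rolled for illustration and uses neither the `safetensors` package nor Candle. The tensor names (`head.weight`, `head.bias`) and the helpers `write_safetensors` / `read_tensor_mmap` are hypothetical, and only 1-D float32 tensors are handled.

```python
import json
import mmap
import os
import struct
import tempfile

def write_safetensors(path, tensors):
    """Write {name: flat list of floats} as a minimal safetensors file (F32, 1-D)."""
    header, payload = {}, b""
    for name, values in tensors.items():
        raw = struct.pack(f"<{len(values)}f", *values)
        header[name] = {
            "dtype": "F32",
            "shape": [len(values)],
            # Offsets are relative to the start of the data section.
            "data_offsets": [len(payload), len(payload) + len(raw)],
        }
        payload += raw
    header_bytes = json.dumps(header).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(header_bytes)))  # 8-byte little-endian length
        f.write(header_bytes)                          # JSON header
        f.write(payload)                               # raw tensor bytes

def read_tensor_mmap(path, name):
    """Read one F32 tensor through a read-only memory map."""
    with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        (header_len,) = struct.unpack("<Q", mm[:8])
        header = json.loads(mm[8:8 + header_len])
        begin, end = header[name]["data_offsets"]
        base = 8 + header_len
        count = (end - begin) // 4  # 4 bytes per float32
        return list(struct.unpack(f"<{count}f", mm[base + begin:base + end]))

path = os.path.join(tempfile.mkdtemp(), "toy.safetensors")
write_safetensors(path, {"head.weight": [0.5, -1.0, 2.0], "head.bias": [0.25]})
print(read_tensor_mmap(path, "head.weight"))
```

Because the header records byte offsets for every tensor, a loader can map the file and slice out weights without copying the whole file into memory first.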

This commit ships:

  • scripts/align-ground-truth.js: extended to emit n_persons_mode +
    n_persons_max per window so the training pipeline has count
    labels. Backwards-compatible (additive fields).
  • scripts/train-count.py: new — mirrors CountNet architecture
    exactly, loads paired.jsonl, trains 400 epochs with
    CE+BCE+Brier loss, exports safetensors + ONNX + per-epoch JSON.
  • v2/.../cog/artifacts/{count_v1.safetensors,count_v1.onnx,
    count_train_results.json}: the trained artifacts.
  • v2/.../cog/README.md: Status table updated with the v0.0.1 numbers
    + an Honest Caveat section explaining the data-bound result.
  • docs/benchmarks/person-count-cog.md: new — full v0.0.1 benchmark
    log mirroring the format docs/benchmarks/pose-estimation-cog.md
    established. Includes comparison to ADR-103 v0.1.0 acceptance
    gates and per-class breakdown.
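
As an illustration of the per-window count labels, the sketch below derives `n_persons_mode` and `n_persons_max` from a list of per-frame person counts. It is written in Python and does not reproduce scripts/align-ground-truth.js. The input layout (`window_id`, `frames`), the helper name `window_count_labels`, and the tie-breaking rule (smaller count wins) are all assumptions.

```python
from collections import Counter

def window_count_labels(per_frame_counts):
    """Return the most frequent and the maximum person count in one window."""
    if not per_frame_counts:
        raise ValueError("window has no frames")
    tally = Counter(per_frame_counts)
    top = max(tally.values())
    # ASSUMPTION: on a tie, prefer the smaller count.
    mode = min(c for c, k in tally.items() if k == top)
    return {"n_persons_mode": mode, "n_persons_max": max(per_frame_counts)}

# Hypothetical window record; only the two n_persons_* names come from the source.
window = {"window_id": 7, "frames": [0, 1, 1, 2, 1, 0]}
record = {**window, **window_count_labels(window["frames"])}
print(record)
```

Merging the new keys into the existing record keeps the change additive: consumers that read only the older fields see them unchanged.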

Still pending:

  • run subcommand wiring (long-running polling loop, same as pose)
  • Cross-compile + sign + GCS upload (mirror of pose cog pipeline)
  • Live install on cognitum-v0
  • v0.2.0: re-train on multi-room data, LoRA per-room adapters,
    Stoer-Wagner min-cut clip in fusion stage

Docker Image:
ghcr.io/ruvnet/RuView:6b4994e1052873e4249a5fbed7db54878e319e2d
