Automated release from CI pipeline
Changes:
feat(cog-person-count): train count_v1.safetensors — honest v0.0.1 (ADR-103) (#695)
Phase 2 of ADR-103: trained count head on the existing 1,077 paired
samples (the same data that produced pose_v1 yesterday).
Honest result: 65.1% eval accuracy / 100% within ±1 / MAE 0.349 on
the held-out time-window. Per-class: 100% on "empty room" / 0% on
"1 person". The model overfit by epoch 100 (train_acc → 1.0,
eval_loss climbed 0.67 → 7.8) and the "best" checkpoint is the
snapshot that happened to predict the eval window's class
distribution (140/215 = 65.1%, matches eval_acc exactly). Confidence
head Spearman = 0.023 ⇒ uncalibrated. Same data-bound failure mode
as pose_v1 (#645), bounded by single-session training data; same
fix path (multi-room).
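
To make the arithmetic above concrete, here is a minimal sketch of how the three headline numbers fall out of a predictor that always answers "empty room". It assumes the 215-sample eval window contains only the two classes listed in the per-class breakdown, so the 75 "1 person" samples are inferred as 215 - 140. The function name count_metrics is hypothetical and is not taken from train-count.py.

```python
def count_metrics(y_true, y_pred):
    """Exact accuracy, within-±1 accuracy, and mean absolute error for integer counts."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("need two non-empty sequences of equal length")
    abs_err = [abs(t - p) for t, p in zip(y_true, y_pred)]
    n = len(abs_err)
    return {
        "accuracy": sum(e == 0 for e in abs_err) / n,
        "within_1": sum(e <= 1 for e in abs_err) / n,
        "mae": sum(abs_err) / n,
    }


# Assumed eval window: 140 empty-room samples and 75 one-person samples (215 - 140).
y_true = [0] * 140 + [1] * 75
# A model that collapsed onto the majority class predicts 0 for every sample.
y_pred = [0] * 215

metrics = count_metrics(y_true, y_pred)
print({k: round(v, 3) for k, v in metrics.items()})
```

Under that assumption the constant predictor gives 140/215 accuracy, 100% within ±1, and an MAE of 75/215, which is consistent with the reported 65.1% / 100% / 0.349.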
What v0.0.1 still validates end-to-end:
- PyTorch → safetensors → Candle Rust loads cleanly on first try.
cog-person-count health reports backend: candle-cpu and emits
real per-frame predictions instead of the stub backend's hard-coded
{1 person, 0 confidence}. Architecture parity between train-count.py
and src/inference.rs::CountNet is bit-exact.
- ONNX export bit-clean (16 KB, opset 18, dynamic batch axis).
- Training wall time: 5.6 s for 400 epochs on RTX 5080.
- Binary size unchanged (2.36 MB stripped), model loads via mmap at
runtime.
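
As an illustration of why the safetensors handoff and the mmap load are cheap, the sketch below writes and reads the safetensors container using only the Python standard library: an 8-byte little-endian header length, a JSON header with dtype, shape, and data_offsets per tensor, then the raw tensor bytes. It is not the loader used by cog-person-count (that goes through Candle in Rust); the helper names and the tensor names count_head.weight and count_head.bias are hypothetical.

```python
import json
import mmap
import os
import struct
import tempfile


def write_safetensors(path, tensors):
    """Write float32 tensors given as {name: (shape, flat_values)}."""
    header, blobs, offset = {}, [], 0
    for name, (shape, values) in tensors.items():
        raw = struct.pack(f"<{len(values)}f", *values)
        # data_offsets are relative to the start of the data section.
        header[name] = {"dtype": "F32", "shape": shape,
                        "data_offsets": [offset, offset + len(raw)]}
        blobs.append(raw)
        offset += len(raw)
    header_bytes = json.dumps(header).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(header_bytes)))  # 8-byte LE header size
        f.write(header_bytes)
        f.write(b"".join(blobs))


def read_tensor_mmap(path, name):
    """Memory-map the file and decode one F32 tensor without reading the rest."""
    with open(path, "rb") as f:
        mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
        try:
            (header_len,) = struct.unpack_from("<Q", mm, 0)
            header = json.loads(mm[8:8 + header_len].decode("utf-8"))
            meta = header[name]
            begin, end = meta["data_offsets"]
            data_start = 8 + header_len
            count = (end - begin) // 4  # 4 bytes per float32
            values = struct.unpack_from(f"<{count}f", mm, data_start + begin)
            return meta["shape"], list(values)
        finally:
            mm.close()


# Round-trip demo with hypothetical tensor names and float32-exact values.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.safetensors")
    write_safetensors(demo_path, {
        "count_head.weight": ([2, 2], [0.5, -1.25, 2.0, 0.0]),
        "count_head.bias": ([2], [1.0, -3.0]),
    })
    weight_shape, weight_values = read_tensor_mmap(demo_path, "count_head.weight")
    bias_shape, bias_values = read_tensor_mmap(demo_path, "count_head.bias")
```

Because the header records byte offsets for every tensor, a reader can map the file once and slice out only the tensors it needs, which is the property the runtime mmap load relies on.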
This commit ships:
- scripts/align-ground-truth.js: extended to emit n_persons_mode +
n_persons_max per window so the training pipeline has count
labels. Backwards-compatible (additive fields).
- scripts/train-count.py (new): mirrors the CountNet architecture
exactly, loads paired.jsonl, trains 400 epochs with
CE+BCE+Brier loss, exports safetensors + ONNX + per-epoch JSON.
- v2/.../cog/artifacts/{count_v1.safetensors,count_v1.onnx,
count_train_results.json}: the trained artifacts.
- v2/.../cog/README.md: Status table updated with the v0.0.1 numbers,
plus an Honest Caveat section explaining the data-bound result.
- docs/benchmarks/person-count-cog.md (new): full v0.0.1 benchmark
log mirroring the format docs/benchmarks/pose-estimation-cog.md
established. Includes comparison to ADR-103 v0.1.0 acceptance
gates and per-class breakdown.
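
For readers unfamiliar with the two new label fields, the following is a rough sketch of how a per-window mode and max could be derived from per-frame person counts. The real logic lives in scripts/align-ground-truth.js and may differ; the function name window_count_labels, the input shape, and the tie-breaking rule (smallest count wins) are all assumptions made for illustration.

```python
from collections import Counter


def window_count_labels(per_frame_counts):
    """Summarise one window's per-frame person counts as mode and max labels."""
    if not per_frame_counts:
        raise ValueError("window has no frames")
    freq = Counter(per_frame_counts)
    top = max(freq.values())
    # Assumed tie-break: prefer the smallest count among equally frequent values.
    mode = min(value for value, hits in freq.items() if hits == top)
    return {"n_persons_mode": mode, "n_persons_max": max(per_frame_counts)}


labels = window_count_labels([0, 1, 1, 2, 1])
tied = window_count_labels([0, 0, 1, 1])
```

Emitting both fields lets the training script choose between a label that is robust to brief detections (mode) and one that captures peak occupancy (max) without re-running alignment.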
Still pending:
- run subcommand wiring (long-running polling loop, same as pose)
- Cross-compile + sign + GCS upload (mirror of pose cog pipeline)
- Live install on cognitum-v0
- v0.2.0: re-train on multi-room data, LoRA per-room adapters,
Stoer-Wagner min-cut clip in fusion stage
Docker Image:
ghcr.io/ruvnet/RuView:6b4994e1052873e4249a5fbed7db54878e319e2d