What's new
Fixes
- Ollama VRAM exhaustion (#798): `num_ctx` is now derived from the actual chunk size instead of the hardcoded 131072. With `--token-budget 8192`, the old value forced Ollama to allocate 128k KV-cache slots on a 31B model; by chunk 4, four 128k allocations had accumulated and caused an OOM. New formula: `min(input_tokens + output_cap + 2000, 131072)`, so an 8k chunk gets ~26k instead.
- Hollow-response warning improved: it now mentions VRAM pressure and points to the `GRAPHIFY_OLLAMA_NUM_CTX`/`GRAPHIFY_OLLAMA_KEEP_ALIVE` env vars as tuning knobs.
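The new context-size derivation can be sketched as follows. The function name and the 16k output cap are illustrative; the 2000-token overhead and the 131072 ceiling come from the formula above.

```python
def derive_num_ctx(input_tokens: int, output_cap: int) -> int:
    """Sketch of the new num_ctx formula: chunk size plus output budget
    plus a 2000-token overhead, capped at the old hardcoded 131072."""
    return min(input_tokens + output_cap + 2000, 131072)

# An 8k chunk with an assumed 16k output cap gets ~26k slots, not 128k:
print(derive_num_ctx(8192, 16384))    # 26576
# Oversized inputs still hit the 131072 ceiling:
print(derive_num_ctx(200_000, 16384)) # 131072
```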
Features
- `graphify export callflow-html` (#797): generates a self-contained Mermaid architecture/call-flow HTML page from `graphify-out/graph.json`, with community sections, interactive flowcharts with zoom/pan, call detail tables, and graph report highlights.
- Living architecture diagram (#800): the callflow HTML now auto-regenerates on every `--watch` rebuild and on the post-commit hook if the file already exists. Run the export once and the page stays current.
Upgrade
```shell
uv tool upgrade graphifyy
# or: pip install --upgrade graphifyy
```