vellum-ai/vellum-assistant v0.6.6 on GitHub

Highlights

Unified plugin architecture for the agentic loop — The assistant's core reasoning loop has been refactored around a new unified plugin system, making the agent more reliable and extensible. This includes hardened failure modes, proper timeout handling, and cleaner separation between built-in and user-provided behaviors.
Expanded image generation support — Vellum now supports multiple image generation providers, including OpenAI's gpt-image-2 alongside the existing options. A new provider dispatcher routes image requests appropriately, and settings now include a provider-aware API key field so you can configure your preferred image generation service.
Improved permissions and auto-approve controls — You can now configure fine-grained auto-approve thresholds for tool calls, with a redesigned risk tolerance UI that includes a high-risk threshold level. The network request approval prompt has also been enriched with more HTTP context to help you make better-informed decisions.
Onboarding and first-run experience improvements — The onboarding flow now delivers a personalized templated greeting based on pre-chat selections, warms the LLM prompt cache after the initial greeting for faster responses, and fixes several issues around double-hatching and assistant name display after setup.
Smarter conversation and compaction handling — Conversation history is no longer stripped and re-fetched unnecessarily on thread switches, the compaction pipeline now correctly strips injections and rewrites prompts so summaries are more accurate, and the default LLM model has been updated to Claude Sonnet 4.6.

Build: 0.6.6
Commit: 707be9616
Built at: 2026-04-23 23:20:07 UTC
