github volcengine/OpenViking v0.3.20

OpenViking v0.3.20 Release Notes

Release date: 2026-05-25

Full Changelog: v0.3.19...v0.3.20

This release contains 12 commits over v0.3.19, with 82 changed files.


Overview

v0.3.20 focuses on production debugging, faster session ingestion, memory retrieval quality, and semantic indexing reliability. The release adds request-scoped HTTP profiling, a batch Session message API, memory embedding templates, semantic target-sync and lock-handoff fixes, plus CLI, LangChain, VLM, documentation, and test updates.

Highlights

  • Request-scoped HTTP profiling: Servers can enable server.profile_enabled. Requests with profile=1 then run cProfile for only that HTTP request and append profile lines to JSON responses. The ov CLI can enable and display this with --profile.
  • Batch Session message ingestion: Added POST /api/v1/sessions/{session_id}/messages/batch plus Python HTTP client and Session wrapper support through batch_add_messages. A single request can add up to 100 messages, reducing HTTP round trips for LangChain/LangGraph-style integrations.
  • Memory embedding templates: Memory schemas now support top-level embedding_template, replacing field-level searchable flags. Built-in entities, events, and preferences templates include key fields plus final content in embedding input.
  • Semantic indexing reliability: Resource processing now syncs temp source trees into the target before running the semantic DAG, and diff results use target URIs. Stale semantic lock handoffs can be recovered by reacquiring tree locks, and lock acquisition failures requeue work instead of tripping the API circuit breaker.
  • Embedding input guardrails: Embedding queue input is truncated according to embedding.max_input_tokens, and oversized-input errors are classified as input_too_large to avoid repeated retries for unrecoverable payloads.
  • CLI profile preservation: CLI rendering keeps the profile section for table, scalar, filesystem, content, and system command output after response envelope unwrapping.
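The embedding input guardrail in the list above can be pictured roughly as follows. This is a minimal sketch, not OpenViking's implementation: whitespace splitting stands in for the real tokenizer, and `truncate_to_token_limit`, `is_retryable`, and the `rate_limited` code used in the usage note are hypothetical names. Only `embedding.max_input_tokens` and `input_too_large` come from the release notes.

```python
# Hypothetical sketch; only the "input_too_large" error class is named in the release notes.
NON_RETRYABLE_ERRORS = {"input_too_large"}


def truncate_to_token_limit(text: str, max_input_tokens: int) -> str:
    """Keep at most max_input_tokens tokens.

    Whitespace-separated words stand in for model tokens here; a real
    embedding backend would count tokens with its own tokenizer.
    """
    tokens = text.split()
    if len(tokens) <= max_input_tokens:
        return text
    return " ".join(tokens[:max_input_tokens])


def is_retryable(error_code: str) -> bool:
    """Oversized input cannot succeed on a retry, so it should not be requeued."""
    return error_code not in NON_RETRYABLE_ERRORS
```

With these helpers, `is_retryable("input_too_large")` is false while a transient code such as the illustrative `rate_limited` stays retryable, which matches the intent of avoiding repeated retries for unrecoverable payloads.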

New Feature Usage

Request Profiling

Enable profiling on the server:

{
  "server": {
    "profile_enabled": true
  }
}

Enable it for one HTTP request:

curl -G http://localhost:1933/health \
  --data-urlencode "profile=1"

Enable it for one CLI invocation:

ov --profile health

Enable it by default for the HTTP client or CLI:

{
  "profile": true
}

Notes: profile only applies to the current request; the server ignores it unless server.profile_enabled is true; only JSON responses receive an added profile field.
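For callers that do not use the ov CLI, the two client-side steps (adding profile=1 to the query string and pulling the profile field out of the JSON response) might look like the sketch below. It uses only the Python standard library. The helper names `with_profile` and `split_profile` are hypothetical, and treating the profile field as a list of text lines is an assumption based on the "profile lines" wording above.

```python
import json
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_profile(url: str) -> str:
    """Return url with profile=1 added to (or overwritten in) its query string."""
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    query["profile"] = "1"
    return urlunsplit(parts._replace(query=urlencode(query)))


def split_profile(body: str) -> tuple[object, list[str]]:
    """Separate the profile lines from the rest of a JSON response body.

    Assumption: the server adds a top-level "profile" field holding a list
    of text lines. If the field is absent (for example because
    server.profile_enabled is off), an empty list is returned.
    """
    payload = json.loads(body)
    lines = payload.pop("profile", None) if isinstance(payload, dict) else None
    return payload, list(lines or [])
```

A request URL built with `with_profile("http://localhost:1933/health")` can be fetched with any HTTP client, and the response text passed to `split_profile` to print the profile separately from the result.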

Batch Session Messages

HTTP:

POST /api/v1/sessions/chat-001/messages/batch
Content-Type: application/json

{
  "messages": [
    {"role": "user", "content": "Summarize the repo status."},
    {"role": "assistant", "content": "I will inspect the latest commits."}
  ]
}
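
The same request can be assembled with Python's standard library. This is a sketch under stated assumptions: `BASE_URL` reuses the local address from the curl example, `build_batch_request` is a hypothetical helper, and any authentication headers a deployment requires are left out.

```python
import json
from urllib.parse import quote
from urllib.request import Request

BASE_URL = "http://localhost:1933"  # assumption: same local server as the curl example
MAX_BATCH_SIZE = 100  # per-request limit stated in these release notes


def build_batch_request(session_id: str, messages: list[dict]) -> Request:
    """Build (but do not send) a batch message request for one session."""
    if len(messages) > MAX_BATCH_SIZE:
        raise ValueError(f"a batch may contain at most {MAX_BATCH_SIZE} messages")
    body = json.dumps({"messages": messages}).encode("utf-8")
    return Request(
        url=f"{BASE_URL}/api/v1/sessions/{quote(session_id, safe='')}/messages/batch",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` sends it. Checking the batch size on the client avoids a round trip that the server would reject anyway.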

Python async Session wrapper:

session = client.session("chat-001")
await session.batch_add_messages([
    {"role": "user", "content": "Summarize the repo status."},
    {"role": "assistant", "content": "I will inspect the latest commits."},
])
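
Because one request accepts at most 100 messages, longer histories need to be split before calling `batch_add_messages`. The sketch below shows one way to do that. `chunked` and `add_all_messages` are hypothetical helpers, and `session` is assumed to be any object exposing the async `batch_add_messages` method shown above.

```python
from typing import Any, Iterator

MAX_BATCH_SIZE = 100  # per-request limit stated in these release notes


def chunked(
    messages: list[dict[str, Any]], size: int = MAX_BATCH_SIZE
) -> Iterator[list[dict[str, Any]]]:
    """Yield consecutive slices of at most `size` messages, preserving order."""
    for start in range(0, len(messages), size):
        yield messages[start:start + size]


async def add_all_messages(session: Any, messages: list[dict[str, Any]]) -> int:
    """Write all messages in order, one batch call per chunk.

    Returns the number of batch_add_messages calls that were made.
    """
    calls = 0
    for batch in chunked(messages):
        await session.batch_add_messages(batch)
        calls += 1
    return calls
```

Batches are sent sequentially rather than concurrently so that message order within the session is preserved.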

Custom Memory Embedding Templates

Custom memory schemas should use top-level embedding_template to control the text sent to the vector store:

memory_type: preferences
embedding_template: |-
  {{ user }}

  {{ topic }}

  {{ content }}

Template variables come from the memory's fields. content is the final rendered memory body, and extract_context is also available when needed.
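
To see what text such a template produces, the following sketch substitutes field values into the placeholders. It is only an illustration: the regex-based `render_embedding_text` is a hypothetical stand-in for whatever template engine OpenViking actually uses, and it handles plain `{{ name }}` placeholders only.

```python
import re

# Same text as the YAML block scalar above ("|-" drops the trailing newline).
TEMPLATE = "{{ user }}\n\n{{ topic }}\n\n{{ content }}"

_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_embedding_text(template: str, fields: dict[str, object]) -> str:
    """Replace each {{ name }} placeholder with the matching field value.

    Unknown names render as empty strings; surrounding whitespace is trimmed.
    """
    def substitute(match: re.Match) -> str:
        return str(fields.get(match.group(1), ""))

    return _PLACEHOLDER.sub(substitute, template).strip()


text = render_embedding_text(
    TEMPLATE,
    {"user": "alice", "topic": "editor", "content": "Prefers dark mode."},
)
```

Here `text` contains the three values separated by blank lines, which is the string that would be embedded instead of the memory body alone.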

Improvements

  • LangChain OpenVikingChatMessageHistory.add_messages now uses batch ingestion to lower latency for multi-message writes.
  • The Codex VLM backend initializes its async client cache, preventing runtime issues from an uninitialized cache attribute.
  • README links for the LangChain / LangGraph integration now point to the current numbered docs page.
  • Agent integration docs now include the AstrBot plugin, including automatic capture, automatic recall, and venue-level isolation modes.
  • Usage/Audit docs now include the v0.3.19 Console BFF timezone query parameter behavior.

Fixes

  • Fixed inconsistent incremental semantic updates caused by source/target URI mixups during target sync.
  • Fixed stale semantic lock handoffs failing outright instead of recovering by reacquiring the tree lock.
  • Fixed repeated retries for oversized embedding and semantic inputs.
  • Fixed CLI profile loss during response unwrapping and output rendering.
  • Standardized role-id memory isolation config as memory.role_id_memory_isolation_enabled.

Upgrade Notes

  • If a custom memory schema still uses field-level searchable: true, migrate it to top-level embedding_template. Field-level searchable no longer contributes to embedding text generation.
  • Rename memory.enable_role_id_memory_isolate to memory.role_id_memory_isolation_enabled in custom ov.conf files.
  • Treat profile=1 as a debugging tool, not a high-traffic production default. Profile output is capped at about 16 KiB.
  • The batch message endpoint accepts up to 100 messages per request. Each message keeps the same role, content, parts, created_at, and role_id semantics as single-message add_message.
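
The config rename in the notes above can be applied mechanically. The sketch below assumes ov.conf is JSON that has already been parsed into a dict with a nested "memory" object, mirroring the "server" example earlier. `migrate_memory_config` is a hypothetical helper, not part of OpenViking.

```python
import copy

OLD_KEY = "enable_role_id_memory_isolate"
NEW_KEY = "role_id_memory_isolation_enabled"


def migrate_memory_config(conf: dict) -> dict:
    """Return a copy of conf with the legacy memory key renamed.

    The input is left untouched. If both keys are present, the value
    already stored under the new key is kept.
    """
    migrated = copy.deepcopy(conf)
    memory = migrated.get("memory")
    if isinstance(memory, dict) and OLD_KEY in memory:
        old_value = memory.pop(OLD_KEY)
        memory.setdefault(NEW_KEY, old_value)
    return migrated
```

Loading the file with `json.load`, passing the result through this function, and writing it back with `json.dump` would complete the migration. Configs that never set the old key pass through unchanged.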

Docs, Tests, and Security

  • Added tests for request profiling middleware, memory embedding templates, semantic processor lock and target-sync behavior, and HTTP client profile configuration.
  • Updated English and Chinese API/configuration docs for the profile query parameter, server.profile_enabled, and ovcli.conf.profile.
  • Added English and Chinese changelog entries for v0.3.18 and v0.3.19.
