DeepTutor v1.3.2 Release Notes
Release Date: 2026.04.29
v1.3.2 is a focused stability release after v1.3.1. It tightens the embedding
endpoint contract, makes LlamaIndex RAG recover more cleanly from stale or
invalid indexes, and prevents reasoning-model scratchpad output from leaking
into long-term memory.
Highlights
Transparent Embedding Endpoint URLs
Embedding configuration is now explicit about the exact URL DeepTutor will call.
This removes the hidden "base URL vs endpoint URL" ambiguity that could make a
successful Settings test behave differently from a Knowledge Base re-index.
- Settings now shows Endpoint URL for embeddings - the Web settings page labels embedding URLs as endpoint URLs and explains that DeepTutor posts to the visible URL exactly, without appending `/embeddings` or `/api/embed` at request time.
- Provider defaults are full endpoints - OpenAI, OpenRouter, Jina, vLLM/LM Studio, and SiliconFlow default to `/embeddings`; Ollama defaults to `/api/embed`; Cohere defaults to `/embed`; DashScope keeps its native multimodal embedding endpoint.
- Legacy base URLs are migrated safely - saved embedding profiles using old-style bases such as `https://api.openai.com/v1`, `https://openrouter.ai/api/v1`, or `http://localhost:11434` are normalized to the full endpoint form and persisted back to the model catalog. Custom OpenAI-compatible URLs are left untouched.
- Misconfigured endpoints fail early - the embedding client now rejects known-provider URLs that point to a root/base path instead of the real embedding endpoint, with an actionable message before indexing starts.
- OpenRouter embedding uses exact-URL HTTP - public embedding providers no longer route through the OpenAI SDK's hidden path-appending behavior. `custom_openai_sdk` remains available for legacy configs, but is hidden from the Settings provider dropdown.
- Connection-test diagnostics match runtime behavior - embedding tests now report "POSTed exactly as shown in Settings", matching the adapter behavior used by RAG indexing and retrieval.
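The legacy-base migration described above can be sketched as follows. This is an illustrative sketch, not DeepTutor's actual code: the `DEFAULT_ENDPOINT_PATHS` table and `normalize_embedding_url` helper are hypothetical names, and the host list is an assumption based on the providers named in these notes.

```python
# Hypothetical sketch of legacy base-URL migration: known provider bases are
# normalized to full endpoint URLs; unrecognized custom URLs are left untouched.
DEFAULT_ENDPOINT_PATHS = {
    "api.openai.com": "/embeddings",   # assumption: OpenAI-compatible default
    "openrouter.ai": "/embeddings",
    "localhost:11434": "/api/embed",   # assumption: Ollama default port
    "api.cohere.com": "/embed",
}

def normalize_embedding_url(url: str) -> str:
    """Append the provider's default embedding path to an old-style base URL."""
    stripped = url.rstrip("/")
    for host, path in DEFAULT_ENDPOINT_PATHS.items():
        if host in stripped and not stripped.endswith(path):
            return stripped + path
    return url  # custom OpenAI-compatible gateways stay exactly as configured
```

A URL that already ends in the full endpoint path passes through unchanged, which is what makes the migration safe to run on every catalog load.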
RAG Re-index and Retrieval Resilience
The LlamaIndex pipeline now refreshes embedding state more aggressively and
turns invalid persisted vectors into clear re-index guidance instead of raw
Python or NumPy errors.
- Cached pipelines pick up Settings changes - initialize, search, and incremental add paths reconfigure LlamaIndex before use, so a long-lived pipeline does not keep embedding model, dimension, or endpoint settings from an older Settings session.
- Embedding clients refresh when config changes - the shared embedding client is recreated when the resolved runtime config changes, and the `LlamaIndexCustomEmbedding` adapter fingerprints the active config before reusing a cached client.
- Persisted index vectors are validated before retrieval - LlamaIndex storage now checks the saved vector store for null, non-numeric, non-finite, dropped, or inconsistent vectors before running similarity search.
- Invalid indexes return a re-index hint - known failures such as `unsupported operand type(s) for *: 'NoneType' and 'float'`, vector shape mismatches, and newly detected invalid persisted vectors now return `needs_reindex: true` with a user-facing explanation.
- Embedding connectivity checks use the same validation path - the pre-index smoke test validates provider output with the same batch validator used during indexing and retrieval.
- RAG error logs are quieter when the fix is known - classified invalid embedding/index failures are logged as actionable warnings instead of noisy full tracebacks.
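The persisted-vector validation can be pictured with a minimal sketch. The `validate_persisted_vectors` function below is a hypothetical stand-in for the real storage check, showing the classes of bad data the notes describe (null, non-numeric, non-finite, and inconsistently sized vectors) and the `needs_reindex` signal they produce.

```python
import math

# Illustrative sketch (not DeepTutor's actual storage code): scan a persisted
# vector store and surface a re-index hint instead of a raw NumPy/Python error.
def validate_persisted_vectors(vectors: dict) -> dict:
    expected_dim = None
    for node_id, vec in vectors.items():
        if vec is None:
            return {"needs_reindex": True, "reason": f"missing vector for {node_id}"}
        if not all(isinstance(x, (int, float)) and math.isfinite(x) for x in vec):
            return {"needs_reindex": True,
                    "reason": f"non-numeric or non-finite value in {node_id}"}
        if expected_dim is None:
            expected_dim = len(vec)
        elif len(vec) != expected_dim:
            return {"needs_reindex": True, "reason": f"dimension mismatch for {node_id}"}
    return {"needs_reindex": False, "reason": ""}
```

Running a check like this before similarity search is what turns the old `NoneType * float` traceback into an actionable "please re-index" message.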
Memory Cleanup for Thinking Models
Memory refresh now strips private reasoning blocks before they can become
durable user memory.
- Thinking tags are removed before writes - profile and summary rewrites run through the shared `clean_thinking_tags()` helper after code-fence cleanup, so `<think>`/`<thinking>` blocks from reasoning models are not saved into `PROFILE.md` or `SUMMARY.md`.
- Existing memory files self-repair on read - if an older memory file already contains closed or unclosed thinking tags, reading the snapshot cleans the content and writes the repaired version back to disk when possible.
- Manual memory edits use the same cleanup - direct memory writes also pass through the cleaner, keeping UI edits, refreshes, and runtime reads aligned.
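A regex-based cleaner along these lines would cover both cases the notes call out; this is a hedged sketch of the behavior, assuming the real `clean_thinking_tags()` strips tags with regular expressions, which these notes do not confirm.

```python
import re

# Sketch of thinking-tag cleanup: remove fully closed scratchpad blocks first,
# then any unclosed trailing block left by a truncated reasoning-model response.
_CLOSED = re.compile(r"<think(?:ing)?>.*?</think(?:ing)?>", re.DOTALL | re.IGNORECASE)
_UNCLOSED = re.compile(r"<think(?:ing)?>.*\Z", re.DOTALL | re.IGNORECASE)

def clean_thinking_tags(text: str) -> str:
    text = _CLOSED.sub("", text)    # closed <think>...</think> blocks
    text = _UNCLOSED.sub("", text)  # an unclosed block running to end-of-text
    return text.strip()
```

Handling the unclosed case separately is what lets older, already-corrupted memory files self-repair on read rather than only on the next rewrite.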
Settings and Runtime Polish
- Embedding provider choices are less confusing - Settings no longer offers the legacy `custom_openai_sdk` provider in the public dropdown, while existing saved profiles continue to resolve for backwards compatibility.
- Model catalog normalization is persisted - catalog loads now save when normalization changes active profile/model IDs or embedding endpoint URLs, preventing the same migration from repeating on every startup.
- OpenAI-compatible embedding errors are clearer - non-JSON or HTML embedding responses now point to wrong endpoint/model pairings without incorrectly suggesting only one gateway-specific cause.
- Deep Solve ReAct calls are aligned again - the solver loop no longer passes a stale `attachments` keyword into `SolverAgent.process()`, avoiding a runtime `TypeError` while keeping attachment forwarding on the planner and replan calls where it is supported.
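The "persist only when normalization changed something" behavior can be sketched with a fingerprint comparison. The `fingerprint` and `load_catalog` helpers below are hypothetical names, assuming a hash-based change check; the actual implementation may compare fields directly.

```python
import hashlib
import json

# Illustrative sketch: save the model catalog only when normalization actually
# changed it, so the same migration does not repeat on every startup.
def fingerprint(catalog: dict) -> str:
    return hashlib.sha256(json.dumps(catalog, sort_keys=True).encode()).hexdigest()

def load_catalog(raw: dict, normalize, save) -> dict:
    before = fingerprint(raw)
    normalized = normalize(raw)
    if fingerprint(normalized) != before:
        save(normalized)  # write back once; later loads see no difference
    return normalized
```

On the second startup the normalized catalog hashes identically to what is on disk, so no redundant write occurs.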
Tests
- Added endpoint migration coverage for OpenAI, OpenRouter, Ollama, and custom embedding profiles.
- Added Settings API coverage for full endpoint provider choices and hidden `custom_openai_sdk`.
- Added embedding client coverage for endpoint validation, OpenRouter's raw HTTP adapter path, client refresh on config changes, and exact URL transparency.
- Added LlamaIndex coverage for stale embedding-client refresh, repeated settings reconfiguration, invalid persisted vector detection, and re-index hints for invalid indexes.
- Added memory coverage for closed and unclosed thinking tags, plus repair of existing memory files during reads.
- Ran targeted Deep Solve/RAG capability tests covering solver runtime wiring after the stale `attachments` argument fix.
Upgrade Notes
- Embedding URLs in Settings should now be full endpoint URLs. Existing known provider profiles are migrated automatically, but custom gateways should be reviewed manually if they use non-standard paths.
- If Knowledge Base search still reports invalid embedding vectors, re-index the affected KB after confirming the active embedding provider, model, dimension, and endpoint URL.
- Memory files containing old `<think>` blocks will be cleaned the next time the Memory page or memory service reads them; this read can update the underlying `PROFILE.md` or `SUMMARY.md` file to persist the cleaned version.
Full Changelog: v1.3.1...v1.3.2