github HKUDS/DeepTutor v1.3.2
DeepTutor-v1.3.2

7 hours ago

DeepTutor v1.3.2 Release Notes

Release Date: 2026.04.29

v1.3.2 is a focused stability release after v1.3.1. It tightens the embedding
endpoint contract, makes LlamaIndex RAG recover more cleanly from stale or
invalid indexes, and prevents reasoning-model scratchpad output from leaking
into long-term memory.

Highlights

Transparent Embedding Endpoint URLs

Embedding configuration is now explicit about the exact URL DeepTutor will call.
This removes the hidden "base URL vs endpoint URL" ambiguity that could make a
successful Settings test behave differently from a Knowledge Base re-index.

  • Settings now shows Endpoint URL for embeddings - the Web settings page
    labels embedding URLs as endpoint URLs and explains that DeepTutor posts to
    the visible URL exactly, without appending /embeddings or /api/embed at
    request time.
  • Provider defaults are full endpoints - OpenAI, OpenRouter, Jina, vLLM/LM
    Studio, and SiliconFlow default to /embeddings; Ollama defaults to
    /api/embed; Cohere defaults to /embed; DashScope keeps its native
    multimodal embedding endpoint.
  • Legacy base URLs are migrated safely - saved embedding profiles using
    old-style bases such as https://api.openai.com/v1,
    https://openrouter.ai/api/v1, or http://localhost:11434 are normalized to
    the full endpoint form and persisted back to the model catalog. Custom
    OpenAI-compatible URLs are left untouched.
  • Misconfigured endpoints fail early - the embedding client now rejects
    known-provider URLs that point to a root/base path instead of the real
    embedding endpoint, with an actionable message before indexing starts.
  • OpenRouter embedding uses exact-URL HTTP - public embedding providers no
    longer route through the OpenAI SDK's hidden path-appending behavior.
    custom_openai_sdk remains available for legacy configs, but is hidden from
    the Settings provider dropdown.
  • Connection-test diagnostics match runtime behavior - embedding tests now
    report "POSTed exactly as shown in Settings", matching the adapter behavior
    used by RAG indexing and retrieval.

RAG Re-index and Retrieval Resilience

The LlamaIndex pipeline now refreshes embedding state more aggressively and
turns invalid persisted vectors into clear re-index guidance instead of raw
Python or NumPy errors.

  • Cached pipelines pick up Settings changes - initialize, search, and
    incremental add paths reconfigure LlamaIndex before use, so a long-lived
    pipeline does not keep embedding model, dimension, or endpoint settings from
    an older Settings session.
  • Embedding clients refresh when config changes - the shared embedding
    client is recreated when the resolved runtime config changes, and the
    LlamaIndex CustomEmbedding adapter fingerprints the active config before
    reusing a cached client.
  • Persisted index vectors are validated before retrieval - LlamaIndex
    storage now checks the saved vector store for null, non-numeric, non-finite,
    dropped, or inconsistent vectors before running similarity search.
  • Invalid indexes return a re-index hint - known failures such as
    unsupported operand type(s) for *: 'NoneType' and 'float', vector shape
    mismatches, and newly detected invalid persisted vectors now return
    needs_reindex: true with a user-facing explanation.
  • Embedding connectivity checks use the same validation path - the
    pre-index smoke test validates provider output with the same batch validator
    used during indexing and retrieval.
  • RAG error logs are quieter when the fix is known - classified invalid
    embedding/index failures are logged as actionable warnings instead of noisy
    full tracebacks.

Memory Cleanup for Thinking Models

Memory refresh now strips private reasoning blocks before they can become
durable user memory.

  • Thinking tags are removed before writes - profile and summary rewrites run
    through the shared clean_thinking_tags() helper after code-fence cleanup, so
    <think> / <thinking> blocks from reasoning models are not saved into
    PROFILE.md or SUMMARY.md.
  • Existing memory files self-repair on read - if an older memory file
    already contains closed or unclosed thinking tags, reading the snapshot cleans
    the content and writes the repaired version back to disk when possible.
  • Manual memory edits use the same cleanup - direct memory writes also pass
    through the cleaner, keeping UI edits, refreshes, and runtime reads aligned.

Settings and Runtime Polish

  • Embedding provider choices are less confusing - Settings no longer offers
    the legacy custom_openai_sdk provider in the public dropdown, while existing
    saved profiles continue to resolve for backwards compatibility.
  • Model catalog normalization is persisted - catalog loads now save when
    normalization changes active profile/model IDs or embedding endpoint URLs,
    preventing the same migration from repeating on every startup.
  • OpenAI-compatible embedding errors are clearer - non-JSON or HTML
    embedding responses now point to wrong endpoint/model pairings without
    incorrectly suggesting only one gateway-specific cause.
  • Deep Solve ReAct calls are aligned again - the solver loop no longer
    passes a stale attachments keyword into SolverAgent.process(), avoiding a
    runtime TypeError while keeping attachment forwarding on the planner and
    replan calls where it is supported.

Tests

  • Added endpoint migration coverage for OpenAI, OpenRouter, Ollama, and custom
    embedding profiles.
  • Added Settings API coverage for full endpoint provider choices and hidden
    custom_openai_sdk.
  • Added embedding client coverage for endpoint validation, OpenRouter's raw HTTP
    adapter path, client refresh on config changes, and exact URL transparency.
  • Added LlamaIndex coverage for stale embedding-client refresh, repeated
    settings reconfiguration, invalid persisted vector detection, and re-index
    hints for invalid indexes.
  • Added memory coverage for closed and unclosed thinking tags, plus repair of
    existing memory files during reads.
  • Ran targeted Deep Solve/RAG capability tests covering solver runtime wiring
    after the stale attachments argument fix.

Upgrade Notes

  • Embedding URLs in Settings should now be full endpoint URLs. Existing known
    provider profiles are migrated automatically, but custom gateways should be
    reviewed manually if they use non-standard paths.
  • If Knowledge Base search still reports invalid embedding vectors, re-index the
    affected KB after confirming the active embedding provider, model, dimension,
    and endpoint URL.
  • Memory files containing old <think> blocks will be cleaned the next time the
    Memory page or memory service reads them; this read can update the underlying
    PROFILE.md or SUMMARY.md file to persist the cleaned version.

Full Changelog: v1.3.1...v1.3.2

Don't miss a new DeepTutor release

NewReleases is sending notifications on new releases.