DeepTutor v1.3.3 Release Notes

Release Date: 2026.04.30

v1.3.3 is a fast follow-up release after v1.3.2. It expands provider coverage
with NVIDIA NIM and Gemini embeddings, makes Space the unified place to attach
chat history, notebooks, question-bank items, skills, and memory to a turn, and
continues the stability work around RAG re-indexing, thinking-model cleanup,
TutorBot history, and persisted session context.

Highlights

Provider and Embedding Coverage

DeepTutor now covers more hosted provider setups out of the box and keeps the
runtime configuration path aligned with the Setup Tour and .env examples.

  • NVIDIA NIM is a first-class LLM provider - provider auto-detection now
    recognizes nvapi- keys and NVIDIA API bases, defaults to
    https://integrate.api.nvidia.com/v1, and avoids sending
    stream_options.include_usage because NIM can hang when that option is
    present.
  • Gemini embeddings are available end to end - embedding runtime metadata,
    endpoint validation, Setup Tour choices, model suggestions, and .env
    examples now include Gemini, with gemini-embedding-001, 3072 dimensions, and
    GEMINI_API_KEY fallback support.
  • Provider-specific embedding keys survive Settings writes - .env writes now
    preserve API keys for providers such as SiliconFlow, DashScope, Cohere,
    Jina, and Gemini, instead of only the older core provider set.
  • Dependency resolution is less fragile - the NumPy upper bound was relaxed
    to support current Manim installs in deeptutor[all], and Windows setup docs
    now call out the Visual Studio Build Tools / C++ workload prerequisite.

Space, Chat Context, Skills, and Memory

The chat composer now treats all learning context as Space context, instead of
splitting references, skills, and memory across separate controls.

  • Space opens on Chat History - the Space entry point now lands on the new
    Chat History page, where previous conversations can be searched, refreshed,
    renamed, deleted, and reopened directly from the Space workspace.
  • One Space menu powers toolbar and @ mentions - the old inline
    AtMentionPopup was replaced by a shared Space menu for chat history,
    notebooks, question-bank items, skills, and memory, whether opened from the
    toolbar or by typing @.
  • Skills selection is clearer - skills now open in a full picker with
    search, tags, explicit multi-select, and Auto mode, instead of a small inline
    dropdown beside the composer.
  • Memory can be attached per turn - users can select the running summary,
    profile, or both through the new Memory picker. The request sends
    memory_references, and the backend only injects the selected memory files.
  • Context chips show the full turn setup - selected history, notebooks,
    question-bank items, skills, and memory all appear as removable chips before
    send; sent user messages also show matching request-snapshot badges.
  • Answer Now and session hydration keep context - replayed turns and loaded
    sessions now hydrate notebooks, history references, question-bank references,
    skills, memory references, and attachments from persisted message metadata.
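
Putting the pieces above together, a single turn can carry every kind of Space
context at once. The payload below is a hypothetical sketch: memory_references
and skills are named in the notes, while the other field names and all values
are illustrative.

```python
# Hypothetical shape of one chat turn carrying Space context.
turn_payload = {
    "message": "Walk me through this proof again",
    "memory_references": ["summary", "profile"],  # per-turn Memory picker
    "skills": ["linear-algebra"],                 # explicit multi-select
    "notebook_references": ["nb_42"],             # illustrative IDs
    "history_references": ["session_abc"],
    "question_bank_references": [],
}
```

An empty memory_references list deliberately means no long-term memory is
attached to that turn.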

Session Persistence and Message Normalization

Conversation state now records more of the user's actual send-time context and
handles non-text message content more defensively.

  • Message metadata is persisted - the SQLite session store adds a
    metadata_json column and stores a request_snapshot for user messages,
    including capability, tools, selected KBs, language, config, attachments,
    Space references, skills, and memory selections.
  • WebSocket turns accept memory and skills explicitly - incoming payloads
    normalize memory_references to summary / profile, normalize skills into
    a string list, and materialize both into message metadata.
  • TutorBot history handles multimodal content - bot history and recent bot
    previews normalize string, array, object, and image-style content into safe
    display text, while internal reasoning_content is stripped from API
    responses.
  • Frontend message previews are safer - shared message-content utilities
    now accept unknown content, stringify custom objects, render image parts as
    [image], and truncate previews consistently across chat and session lists.
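
The normalization described above might look roughly like the sketch below.
The function name and the exact part shapes (type/text keys, image-prefixed
types) are assumptions for illustration; the notes only specify that string,
array, object, and image-style content become safe display text, with image
parts rendered as [image].

```python
def normalize_content(content) -> str:
    """Sketch: flatten multimodal message content into safe display text."""
    if isinstance(content, str):
        return content
    if isinstance(content, list):
        parts = []
        for part in content:
            if isinstance(part, dict):
                if part.get("type") == "text":
                    parts.append(str(part.get("text", "")))
                elif str(part.get("type", "")).startswith("image"):
                    parts.append("[image]")  # image parts become a placeholder
                else:
                    parts.append(str(part))
            else:
                parts.append(str(part))
        return " ".join(p for p in parts if p)
    # Custom objects and dicts are stringified rather than raising.
    return str(content)
```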

Memory, Notebook, and Thinking-Model Cleanup

The thinking-output cleanup introduced in v1.3.2 now covers additional durable
storage surfaces, such as notebook summaries, and rejects malformed memory
rewrites before they can corrupt profile or summary files.

  • Memory rewrites must match the expected shape - profile and summary
    refreshes now verify allowed section headings before writing. If a thinking
    model answers the user instead of returning structured memory, the write is
    rejected rather than persisted.
  • Memory context is explicit - build_memory_context() now only includes
    summary and/or profile when those files are requested, matching the new
    per-turn Memory picker and avoiding accidental default memory injection.
  • Notebook summaries are cleaned and repaired - notebook writes, streaming
    summary saves, and notebook loads strip thinking tags from summaries; older
    notebook records are repaired on read when possible.
  • Streaming summary chunks are cleaner - generated notebook summaries are
    assembled, cleaned, and emitted after cleanup, so empty or scratchpad-only
    chunks are not streamed to clients.
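
The shape check for memory rewrites can be sketched as a heading comparison.
This is a minimal illustration: the helper name and the example headings are
assumptions; the notes only state that allowed section headings are verified
before writing, and that a conversational answer from a thinking model is
rejected.

```python
def is_valid_memory_rewrite(text: str, allowed_headings: set[str]) -> bool:
    """Sketch: accept a rewrite only if its section headings match the
    expected shape; a plain answer (no recognized headings) is rejected."""
    headings = {
        line.strip()
        for line in text.splitlines()
        if line.strip().startswith("## ")
    }
    return bool(headings) and headings <= allowed_headings
```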

RAG and Knowledge Base Resilience

RAG validation now catches more invalid persisted indexes before retrieval and
returns clearer events when the user needs to re-index.

  • Stale processing KBs recover when an index is ready - if kb_config.json
    is stuck at processing or initializing but a ready LlamaIndex version is
    already on disk, Knowledge Base info reports ready and hides the stale
    progress bar instead of leaving the UI in a perpetual processing state.
  • More vector stores are validated - LlamaIndex storage now checks the
    default vector store, storage_context.vector_stores, and persisted
    *vector_store.json embedding dictionaries for null, dropped, non-numeric,
    non-finite, or inconsistent vectors.
  • Invalid-index failures emit user-facing status events - RAG search now
    sends a structured error status with needs_reindex through the tool event
    stream and avoids treating known invalid-index failures as successful
    retrieval attempts.
  • Low-level vector errors are less exposed - known invalid embedding/index
    failures are logged and surfaced as re-index guidance instead of raw
    NoneType * float style tracebacks in user-facing logs.
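
A per-vector validation pass along these lines could look like the sketch
below. The function name and dimension parameter are illustrative; the failure
classes (null, dropped, non-numeric, non-finite, inconsistent) are the ones
named in the notes.

```python
import math

def vector_is_valid(vec, expected_dim: int) -> bool:
    """Sketch: reject null, dropped, non-numeric, non-finite, or
    wrong-dimension embedding vectors before retrieval uses them."""
    if vec is None or not isinstance(vec, (list, tuple)):
        return False
    if len(vec) != expected_dim:  # dropped or inconsistent dimensions
        return False
    return all(
        isinstance(v, (int, float))
        and not isinstance(v, bool)
        and math.isfinite(v)
        for v in vec
    )
```

Catching these cases up front is what lets the search path emit a structured
needs_reindex status instead of a raw NoneType * float traceback.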

Tests

  • Added Knowledge Manager coverage for promoting stale processing /
    initializing status to ready when a valid index version already exists.
  • Added provider coverage for NVIDIA NIM registry metadata, stream-option
    behavior, Gemini embedding runtime defaults, endpoint validation, Setup Tour
    provider choices, and .env key preservation.
  • Added session and WebSocket coverage for metadata_json, request snapshots,
    skills normalization, memory reference parsing, and turn materialization.
  • Added memory and notebook coverage for thinking-tag stripping, invalid memory
    rewrite rejection, selective memory-context injection, and summary repair on
    read.
  • Added RAG/LlamaIndex coverage for multi-vector-store validation,
    disk-persisted invalid vectors, needs_reindex status events, and sanitized
    raw logs.
  • Added TutorBot and frontend message-content coverage for non-string,
    multimodal, object, image, and truncated message content.

Upgrade Notes

  • Existing SQLite session databases are migrated in place with a new
    messages.metadata_json column the first time the session store opens them.
  • Custom WebSocket clients that relied on implicit memory injection should now
    pass memory_references: ["summary"], ["profile"], or both. Empty or absent
    memory references intentionally mean "do not attach long-term memory".
  • Knowledge bases that still report invalid persisted vectors should be
    re-indexed after confirming the active embedding provider, model, dimension,
    and endpoint URL.
  • Notebook summary streaming clients should expect cleaned summary output after
    assembly rather than relying on every raw model chunk being forwarded.
  • NVIDIA NIM users should configure an OpenAI-compatible model under the new
    provider and keep stream_options.include_usage disabled for this gateway.
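
The in-place migration mentioned above can be sketched as a column-existence
check. The messages table and metadata_json column names come from the notes;
the helper itself and the PRAGMA-based check are an assumed implementation, not
necessarily how DeepTutor performs the migration.

```python
import sqlite3

def ensure_metadata_column(conn: sqlite3.Connection) -> None:
    """Sketch: add messages.metadata_json on first open if it is missing."""
    cols = {row[1] for row in conn.execute("PRAGMA table_info(messages)")}
    if "metadata_json" not in cols:
        conn.execute("ALTER TABLE messages ADD COLUMN metadata_json TEXT")
        conn.commit()
```

Running the helper twice is harmless: the second call sees the column and does
nothing, matching migrate-on-open behavior.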

Full Changelog: v1.3.2...v1.3.3
