DeepTutor v1.3.3 Release Notes

Release Date: 2026.04.30

v1.3.3 is a fast follow-up release after v1.3.2. It expands provider coverage
with NVIDIA NIM and Gemini embeddings, makes Space the unified place to attach
chat history, notebooks, question-bank items, skills, and memory to a turn, and
continues the stability work around RAG re-indexing, thinking-model cleanup,
TutorBot history, and persisted session context.

Highlights

Provider and Embedding Coverage

DeepTutor now covers more hosted provider setups out of the box and keeps the
runtime configuration path aligned with the Setup Tour and .env examples.

  • NVIDIA NIM is a first-class LLM provider - provider auto-detection now
    recognizes nvapi- keys and NVIDIA API bases, defaults to
    https://integrate.api.nvidia.com/v1, and avoids sending
    stream_options.include_usage because NIM can hang when that option is
    present.
  • Gemini embeddings are available end to end - embedding runtime metadata,
    endpoint validation, Setup Tour choices, model suggestions, and .env
    examples now include Gemini, with gemini-embedding-001, 3072 dimensions, and
    GEMINI_API_KEY fallback support.
  • Provider-specific embedding keys survive Settings writes - .env writes now
    preserve API keys for providers such as SiliconFlow, DashScope, Cohere,
    Jina, and Gemini, instead of only the older core provider set.
  • Dependency resolution is less fragile - the NumPy upper bound was relaxed
    to support current Manim installs in deeptutor[all], and Windows setup docs
    now call out the Visual Studio Build Tools / C++ workload prerequisite.

Space, Chat Context, Skills, and Memory

The chat composer now treats all learning context as Space context, instead of
splitting references, skills, and memory across separate controls.

  • Space opens on Chat History - the Space entry point now lands on the new
    Chat History page, where previous conversations can be searched, refreshed,
    renamed, deleted, and reopened directly from the Space workspace.
  • One Space menu powers toolbar and @ mentions - the old inline
    AtMentionPopup was replaced by a shared Space menu for chat history,
    notebooks, question-bank items, skills, and memory, whether opened from the
    toolbar or by typing @.
  • Skills selection is clearer - skills now open in a full picker with
    search, tags, explicit multi-select, and Auto mode, instead of a small inline
    dropdown beside the composer.
  • Memory can be attached per turn - users can select the running summary,
    profile, or both through the new Memory picker. The request sends
    memory_references, and the backend only injects the selected memory files.
  • Context chips show the full turn setup - selected history, notebooks,
    question-bank items, skills, and memory all appear as removable chips before
    send; sent user messages also show matching request-snapshot badges.
  • Answer Now and session hydration keep context - replayed turns and loaded
    sessions now hydrate notebooks, history references, question-bank references,
    skills, memory references, and attachments from persisted message metadata.
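
Putting the pieces above together, a single turn can carry every kind of Space
context at once. The payload below is a hypothetical sketch: memory_references
and skills are named in the notes, while the other field names and all values
are illustrative.

```python
# Hypothetical shape of one chat turn carrying Space context.
turn_payload = {
    "message": "Walk me through this proof again",
    "memory_references": ["summary", "profile"],  # per-turn Memory picker
    "skills": ["linear-algebra"],                 # explicit multi-select
    "notebook_references": ["nb_42"],             # illustrative IDs
    "history_references": ["session_abc"],
    "question_bank_references": [],
}
```

An empty memory_references list deliberately means no long-term memory is
attached to that turn.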

Session Persistence and Message Normalization

Conversation state now records more of the user's actual send-time context and
handles non-text message content more defensively.

  • Message metadata is persisted - the SQLite session store adds a
    metadata_json column and stores a request_snapshot for user messages,
    including capability, tools, selected KBs, language, config, attachments,
    Space references, skills, and memory selections.
  • WebSocket turns accept memory and skills explicitly - incoming payloads
    normalize memory_references to summary / profile, normalize skills into
    a string list, and materialize both into message metadata.
  • TutorBot history handles multimodal content - bot history and recent bot
    previews normalize string, array, object, and image-style content into safe
    display text, while internal reasoning_content is stripped from API
    responses.
  • Frontend message previews are safer - shared message-content utilities
    now accept unknown content, stringify custom objects, render image parts as
    [image], and truncate previews consistently across chat and session lists.
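
The normalization described above might look roughly like the sketch below.
The function name and the exact part shapes (type/text keys, image-prefixed
types) are assumptions for illustration; the notes only specify that string,
array, object, and image-style content become safe display text, with image
parts rendered as [image].

```python
def normalize_content(content) -> str:
    """Sketch: flatten multimodal message content into safe display text."""
    if isinstance(content, str):
        return content
    if isinstance(content, list):
        parts = []
        for part in content:
            if isinstance(part, dict):
                if part.get("type") == "text":
                    parts.append(str(part.get("text", "")))
                elif str(part.get("type", "")).startswith("image"):
                    parts.append("[image]")  # image parts become a placeholder
                else:
                    parts.append(str(part))
            else:
                parts.append(str(part))
        return " ".join(p for p in parts if p)
    # Custom objects and dicts are stringified rather than raising.
    return str(content)
```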

Memory, Notebook, and Thinking-Model Cleanup

The thinking-output cleanup introduced in v1.3.2 now covers additional durable
storage surfaces, such as notebook summaries, and rejects malformed memory
rewrites before they can corrupt profile or summary files.

  • Memory rewrites must match the expected shape - profile and summary
    refreshes now verify allowed section headings before writing. If a thinking
    model answers the user instead of returning structured memory, the write is
    rejected rather than persisted.
  • Memory context is explicit - build_memory_context() now only includes
    summary and/or profile when those files are requested, matching the new
    per-turn Memory picker and avoiding accidental default memory injection.
  • Notebook summaries are cleaned and repaired - notebook writes, streaming
    summary saves, and notebook loads strip thinking tags from summaries; older
    notebook records are repaired on read when possible.
  • Streaming summary chunks are cleaner - generated notebook summaries are
    assembled, cleaned, and emitted after cleanup, so empty or scratchpad-only
    chunks are not streamed to clients.
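
The shape check for memory rewrites can be sketched as a heading comparison.
This is a minimal illustration: the helper name and the example headings are
assumptions; the notes only state that allowed section headings are verified
before writing, and that a conversational answer from a thinking model is
rejected.

```python
def is_valid_memory_rewrite(text: str, allowed_headings: set[str]) -> bool:
    """Sketch: accept a rewrite only if its section headings match the
    expected shape; a plain answer (no recognized headings) is rejected."""
    headings = {
        line.strip()
        for line in text.splitlines()
        if line.strip().startswith("## ")
    }
    return bool(headings) and headings <= allowed_headings
```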

RAG and Knowledge Base Resilience

RAG validation now catches more invalid persisted indexes before retrieval and
returns clearer events when the user needs to re-index.

  • Stale processing KBs recover when an index is ready - if kb_config.json
    is stuck at processing or initializing but a ready LlamaIndex version is
    already on disk, Knowledge Base info reports ready and hides the stale
    progress bar instead of leaving the UI in a perpetual processing state.
  • More vector stores are validated - LlamaIndex storage now checks the
    default vector store, storage_context.vector_stores, and persisted
    *vector_store.json embedding dictionaries for null, dropped, non-numeric,
    non-finite, or inconsistent vectors.
  • Invalid-index failures emit user-facing status events - RAG search now
    sends a structured error status with needs_reindex through the tool event
    stream and avoids treating known invalid-index failures as successful
    retrieval attempts.
  • Low-level vector errors are less exposed - known invalid embedding/index
    failures are logged and surfaced as re-index guidance instead of raw
    NoneType * float style tracebacks in user-facing logs.
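
A per-vector validation pass along these lines could look like the sketch
below. The function name and dimension parameter are illustrative; the failure
classes (null, dropped, non-numeric, non-finite, inconsistent) are the ones
named in the notes.

```python
import math

def vector_is_valid(vec, expected_dim: int) -> bool:
    """Sketch: reject null, dropped, non-numeric, non-finite, or
    wrong-dimension embedding vectors before retrieval uses them."""
    if vec is None or not isinstance(vec, (list, tuple)):
        return False
    if len(vec) != expected_dim:  # dropped or inconsistent dimensions
        return False
    return all(
        isinstance(v, (int, float))
        and not isinstance(v, bool)
        and math.isfinite(v)
        for v in vec
    )
```

Catching these cases up front is what lets the search path emit a structured
needs_reindex status instead of a raw NoneType * float traceback.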

Tests

  • Added Knowledge Manager coverage for promoting stale processing /
    initializing status to ready when a valid index version already exists.
  • Added provider coverage for NVIDIA NIM registry metadata, stream-option
    behavior, Gemini embedding runtime defaults, endpoint validation, Setup Tour
    provider choices, and .env key preservation.
  • Added session and WebSocket coverage for metadata_json, request snapshots,
    skills normalization, memory reference parsing, and turn materialization.
  • Added memory and notebook coverage for thinking-tag stripping, invalid memory
    rewrite rejection, selective memory-context injection, and summary repair on
    read.
  • Added RAG/LlamaIndex coverage for multi-vector-store validation,
    disk-persisted invalid vectors, needs_reindex status events, and sanitized
    raw logs.
  • Added TutorBot and frontend message-content coverage for non-string,
    multimodal, object, image, and truncated message content.

Upgrade Notes

  • Existing SQLite session databases are migrated in place with a new
    messages.metadata_json column the first time the session store opens them.
  • Custom WebSocket clients that relied on implicit memory injection should now
    pass memory_references: ["summary"], ["profile"], or both. Empty or absent
    memory references intentionally mean "do not attach long-term memory".
  • Knowledge bases that still report invalid persisted vectors should be
    re-indexed after confirming the active embedding provider, model, dimension,
    and endpoint URL.
  • Notebook summary streaming clients should expect cleaned summary output after
    assembly rather than relying on every raw model chunk being forwarded.
  • NVIDIA NIM users should configure an OpenAI-compatible model under the new
    provider and keep stream_options.include_usage disabled for this gateway.
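
The in-place migration mentioned above can be sketched as a column-existence
check. The messages table and metadata_json column names come from the notes;
the helper itself and the PRAGMA-based check are an assumed implementation, not
necessarily how DeepTutor performs the migration.

```python
import sqlite3

def ensure_metadata_column(conn: sqlite3.Connection) -> None:
    """Sketch: add messages.metadata_json on first open if it is missing."""
    cols = {row[1] for row in conn.execute("PRAGMA table_info(messages)")}
    if "metadata_json" not in cols:
        conn.execute("ALTER TABLE messages ADD COLUMN metadata_json TEXT")
        conn.commit()
```

Running the helper twice is harmless: the second call sees the column and does
nothing, matching migrate-on-open behavior.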

Full Changelog: v1.3.2...v1.3.3
