github HKUDS/DeepTutor v1.3.4

DeepTutor v1.3.4 Release Notes

Release Date: 2026.05.01

v1.3.4 turns the Book Engine and chat workspace into a tighter learning loop:
book pages can now carry their own persistent chat sessions, books can be
rebuilt from an existing spine, and regular chat turns can cite selected book
pages alongside Space context. This release also improves language consistency,
DeepSeek-style reasoning output handling, document extraction for RAG, logging
infrastructure, and the public documentation around DeepTutor's arXiv paper.

Highlights

Book Engine, Page Chat, and Book References

Book generation and reading now preserve more of the user's context and make it
easier to iterate on a generated book without starting over.

  • Book page chat uses the unified stream protocol - the page chat panel now
    uses the shared WebSocket client and stream-event renderer used by the main
    chat workspace, so tool output, assistant events, attachments, and restored
    session history behave consistently.
  • Page chat sessions are persisted per book page - each page can be bound to
    a chat session_id through the new page-chat-session API, and reopening the
    reader restores the page's conversation when available.
  • Books can be rebuilt from the approved spine - the new rebuild flow clears
    generated page content and progress while keeping the confirmed outline, then
    restarts compilation from that structure.
  • Single-page regeneration keeps learner notes - forced recompilation can
    reset generated content while preserving user-authored note blocks and key
    transition metadata.
  • Regular chat can cite book pages - the chat composer can attach selected
    books and pages as request context, persist them in the turn snapshot, restore
    them when sessions hydrate, and show them as removable context chips.
  • Book context is cleaner for reasoning models - selected book pages are
    converted into bounded text references with thinking tags stripped before they
    are injected into chat or page-side conversations.
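The bounded-reference conversion described in the last bullet can be sketched roughly as follows. This is an illustrative sketch, not DeepTutor's actual code: the function name, the `<think>` tag convention, and the default character bound are all assumptions.

```python
import re

# Reasoning models often wrap hidden deliberation in <think> tags;
# strip those spans before a page is injected as chat context.
THINK_TAG_RE = re.compile(r"<think>.*?</think>", flags=re.DOTALL)

def build_page_reference(title: str, body: str, max_chars: int = 2000) -> str:
    """Turn a book page into a bounded plain-text reference block."""
    cleaned = THINK_TAG_RE.sub("", body).strip()
    if len(cleaned) > max_chars:
        cleaned = cleaned[:max_chars].rstrip() + " ..."
    return f"[Book page: {title}]\n{cleaned}"
```

Bounding the text up front keeps a handful of attached pages from crowding out the rest of the prompt when several references are selected at once.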

Chat Language and Reasoning-Model Behavior

Chat turns now follow the user's current language setting more reliably and are
more tolerant of providers that return reasoning content differently.

  • Language is part of each chat turn - WebSocket requests can carry the
    current language, and both agentic chat and classic chat append explicit
    language instructions so answers match the active UI language.
  • Regenerate and Answer Now use the current app language - new turns and
    regenerated turns read the latest stored language instead of relying only on
    older session preferences.
  • DeepSeek-style empty-content responses are handled gracefully -
    OpenAI-compatible providers can fall back to reasoning_content when a model
    returns an empty visible content field.
  • Book block writing can tune reasoning effort - LLM-backed book block
    generation now passes reasoning_effort, and structured JSON retries can
    lower effort when reasoning-heavy models fail to return parseable JSON.
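The empty-content fallback amounts to preferring the normal field and only then reaching for the reasoning field. A minimal sketch, assuming an OpenAI-compatible response message as a plain dict (the function name is illustrative; the field names follow the DeepSeek convention mentioned above):

```python
def extract_visible_content(message: dict) -> str:
    """Prefer the normal content field, but fall back to
    reasoning_content when the provider leaves content empty."""
    content = (message.get("content") or "").strip()
    if content:
        return content
    # DeepSeek-style responses may carry the full answer here
    # while reporting an empty visible content field.
    return (message.get("reasoning_content") or "").strip()
```

The `or ""` guards matter in practice: some providers return `None` rather than an empty string for absent fields.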

RAG, Documents, and Knowledge Base Recovery

Document ingestion now uses the same extraction path across more file types and
keeps re-indexing available in more recovery states.

  • Office files route through parser extraction - .xlsx and .pptx files
    now join PDF and DOCX in the parser-backed routing path, with spreadsheet and
    presentation categories available to downstream RAG logic.
  • LlamaIndex loading uses shared document extraction - parser-routed files
    are read through extract_text_from_path(), use file-type-specific size
    limits, and avoid unnecessary character truncation during indexing.
  • DOCX extraction has a safer fallback path - when python-docx cannot read
    a file, the extractor can parse OOXML content through defusedxml instead of
    failing immediately.
  • Knowledge Base re-index controls are less fragile - the web UI can expose
    re-index actions for error and mismatch states without requiring an already
    initialized RAG runtime, as long as source documents are available.
  • Scanned or empty documents fail more clearly - extraction and validation
    now distinguish byte limits, character limits, empty parsed content, and
    unsupported parser results more consistently.
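The defusedxml fallback boils down to treating the .docx as the zip archive it is and reading text runs straight out of the OOXML. A hedged sketch (the helper name is illustrative, and production code would require defusedxml rather than silently fall back to the stdlib parser):

```python
import zipfile

try:
    from defusedxml import ElementTree as ET  # hardened against XML attacks
except ImportError:
    from xml.etree import ElementTree as ET  # sketch-only fallback

# WordprocessingML namespace used by w:p (paragraph) and w:t (text run).
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"

def extract_docx_text_fallback(source) -> str:
    """Pull paragraph text from word/document.xml when python-docx
    cannot open the file. `source` is a path or file-like object."""
    with zipfile.ZipFile(source) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(f"{W_NS}p"):
        runs = [node.text or "" for node in para.iter(f"{W_NS}t")]
        if runs:
            paragraphs.append("".join(runs))
    return "\n".join(paragraphs)
```

This recovers plain paragraph text only; tables, headers, and embedded objects are out of scope for a last-resort fallback.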

Settings, Runtime State, and Logging

This release continues the infrastructure cleanup needed for long-running local
and server deployments.

  • Settings shows clearer runtime state - backend, LLM, embedding, and search
    status are displayed as service cards with online state, timestamps, runtime
    model details, and pending-apply indicators.
  • LLM_REASONING_EFFORT is configurable - reasoning effort can be supplied
    through environment configuration and is included in runtime summaries.
  • Logging uses the standard logger surface - routers, agents, providers, RAG
    code, and TutorBot integrations move away from the old custom logger module
    toward standard logging.getLogger(__name__) usage plus a focused Loguru
    bridge.
  • Raw RAG debug log forwarding is quieter - the RAG service no longer
    forwards low-level logging handler output into user-facing event streams by
    default.
  • CI and lint coverage were refreshed - workflow and test changes cover the
    logging configuration path, process-log streaming, and lint consistency.
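The logging migration is the standard per-module pattern plus a thin bridge handler. A simplified stand-in for the bridge: in DeepTutor the sink is Loguru, but here it is an arbitrary callable so the sketch stays dependency-free, and the class name is an assumption.

```python
import logging

# Standard per-module logger, replacing the old custom logger module.
logger = logging.getLogger(__name__)

class BridgeHandler(logging.Handler):
    """Forward stdlib logging records to a sink callable.
    In the real bridge, the sink would be Loguru's logger."""

    def __init__(self, sink):
        super().__init__()
        self.sink = sink

    def emit(self, record: logging.LogRecord) -> None:
        # getMessage() applies %-style arguments before forwarding.
        self.sink(record.levelname, record.getMessage())
```

Attaching one such handler at configuration time lets every `logging.getLogger(__name__)` call site stay plain stdlib while output still flows through a single formatted sink.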

Documentation and Localization

The project documentation now reflects the paper release and keeps localized
README files aligned with the latest release cadence.

  • The arXiv paper is linked from the README - the main README badge and News
    section now point to the paper's arXiv record (2604.26962).
  • Localized READMEs were refreshed - translated README files include the
    latest release list, arXiv/news updates, and the expanded language navigation.
  • Book, chat, settings, and rebuild copy is localized - English and Chinese
    app strings now cover the new Book chat, rebuild, language, attachment, and
    runtime-state surfaces.

Tests

  • Added chat-language prompt coverage for per-turn language directives and
    language-aware agentic chat behavior.
  • Added Book Engine coverage for book context extraction, page-chat session
    binding, rebuild controls, forced page recompilation, and LLM JSON writing.
  • Added RAG and document-loader coverage for parser-routed files, Office
    extraction paths, file-size limits, and re-index eligibility helpers.
  • Added provider/runtime coverage for LLM_REASONING_EFFORT, OpenAI-compatible
    reasoning fallback behavior, and provider runtime summaries.
  • Added logging tests for configuration, context propagation, Loguru bridging,
    process-log extraction, and task log streaming.
  • Updated frontend tests for document attachment handling, version reporting,
    and Knowledge Base re-index helper behavior.

Upgrade Notes

  • CLI and server installs should refresh dependencies after upgrading. The CLI
    extra and requirements/cli.txt now include defusedxml>=0.7.1 for safer
    XML parsing during Office document extraction.
  • Custom WebSocket clients can pass book_references and language on turn
    start messages. Clients that persist request snapshots should store book
    references alongside notebooks, history, skills, memory, and attachments.
  • Deployments that use reasoning models can set LLM_REASONING_EFFORT to tune
    reasoning effort globally; per-profile and per-model values remain available
    as lower-priority fallbacks.
  • Integrations that consumed raw RAG debug log events should rely on structured
    status and tool events instead of low-level forwarded logger output.
  • Book clients should call the new page-chat-session and rebuild APIs when they
    need page-level conversation persistence or spine-preserving regeneration.

Full Changelog: v1.3.3...v1.3.4
