Important Notes
- Introduced a raw data query API /query/data, enabling developers to retrieve the complete raw data recalled by LightRAG for fine-grained processing.
- Optimized the system to efficiently handle hundreds of documents and hundreds of thousands of entities and relationships in one batch job, resolving UI lag and enhancing overall system stability.
- Dropped entities with short numeric names that negatively impact performance and query results: names containing only two digits, names shorter than six characters, and names mixing digits and dots, such as 1.1, 12.3, 1.2.3, etc.
- Significantly improved the quantity and quality of entity and relation extraction for smaller-parameter models, leading to a substantial improvement in query performance.
- Optimized the prompt engineering for Qwen3-30B-A3B-Instruct and gpt-oss-120b models, incorporating targeted fault tolerance for model outputs.
- Implemented max tokens and temperature configuration to prevent excessively long or endless output loops in Large Language Model (LLM) responses during the entity-relationship extraction phase.
# Increased temperature values may mitigate infinite inference loops in certain LLMs, such as Qwen3-30B.
OPENAI_LLM_TEMPERATURE=0.9
# For vLLM/SGLang deployed models, or most OpenAI-compatible API providers
OPENAI_LLM_MAX_TOKENS=9000
# For Ollama deployed models
OLLAMA_LLM_NUM_PREDICT=9000
# For OpenAI o1-mini or newer models
OPENAI_LLM_MAX_COMPLETION_TOKENS=9000
The purpose of setting the max tokens parameter is to truncate LLM output before timeouts occur, thereby preventing document extraction failures. This addresses cases where certain text blocks (e.g., tables or citations) containing numerous entities and relationships lead to overly long or even endless output loops from the LLM. The setting is particularly important for locally deployed, smaller-parameter models. The max tokens value can be calculated with this formula:
LLM_TIMEOUT * llm_output_tokens_per_second (e.g., 9000 = 180 s * 50 tokens/s)
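As a rough illustration of that formula (the function below is a hypothetical helper, not something shipped with LightRAG), the budget can be derived from the configured timeout and the model's observed throughput:

# Hypothetical sizing helper; just a sketch of the formula above.
def estimate_max_tokens(llm_timeout_seconds: float, tokens_per_second: float) -> int:
    """Token budget that lets generation finish before the LLM timeout."""
    return int(llm_timeout_seconds * tokens_per_second)

# Example from the notes: 180 s timeout at ~50 tokens/s
print(estimate_max_tokens(180, 50))  # -> 9000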
What's New
- refact: Enhance KG Extraction with Improved Prompts and Parser Robustness by @danielaskdd in #2032
- feat: Limit Pipeline Status History Messages to Latest 1000 Entries by @danielaskdd in #2064
- feature: Enhance document status display with metadata tooltips and better icons by @danielaskdd in #2070
- refactor: Optimize Entity Extraction for Small Parameter LLMs with Enhanced Prompt Caching by @danielaskdd in #2076 #2072
- Feature: Add LLM COT Rendering support for WebUI by @danielaskdd in #2077
- feat: Add Deepseek Style CoT Support for OpenAI Compatible LLM Provider by @danielaskdd in #2086
- Add query_data function and /query/data API endpoint to LightRAG for structured response retrieval by @tongda in #2036 #2100
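A minimal sketch of calling the new endpoint (assuming a locally running LightRAG API server on its default port 9621, and that the request body mirrors the existing /query endpoint; the exact field set is an assumption here, not confirmed by this release):

import requests  # third-party HTTP client used only for this sketch

# Assumption: the server listens on the default address and /query/data accepts
# the same JSON body as /query (query text plus retrieval mode).
response = requests.post(
    "http://localhost:9621/query/data",
    json={"query": "How do the extracted entities relate to each other?", "mode": "hybrid"},
    timeout=60,
)
response.raise_for_status()
raw_data = response.json()
# raw_data holds the raw recalled context (entities, relations, chunks)
# for fine-grained downstream processing.
print(raw_data)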
What's Fixed
- Fix: Eliminate Lambda Closure Bug in Embedding Function Creation by @avchauzov in #2028
- refac: Eliminate Conditional Imports and Simplify Initialization by @danielaskdd in #2029
- Fix: Preserve Leading Spaces in Graph Label Selection by @danielaskdd in #2030
- Fix ENTITY_TYPES Environment Variable Handling by @danielaskdd in #2034
- refac: Enhanced Entity Relation Extraction Text Sanitization and Normalization by @danielaskdd in #2031
- Fix LLM output instability for <|> tuple delimiter by @danielaskdd in #2035
- Enhance KG Extraction for LLM with Small Parameters by @danielaskdd in #2051
- Add VDB error handling with retries for data consistency by @danielaskdd in #2055
- Fix incorrect variable name in NetworkXStorage file path by @danielaskdd in #2060
- refact: Smart Configuration Caching and Conditional Logging by @danielaskdd in #2068
- refactor: Improved Exception Handling with Context-Aware Error Messages by @danielaskdd in #2069
- fix env file example by @k-shlomi in #2075
- Increase default Gunicorn worker timeout from 210 to 300 seconds by @danielaskdd in #2078
- Fix assistant message display with content fallback by @danielaskdd in #2079
- Prompt Optimization: remove angle brackets from entity and relationship output formats by @danielaskdd in #2082
- Refactor PostgreSQL Graph Query by Native SQL and Standardized Parameter Passing by @Matt23-star in #2027
- Update env.example by @rolotumazi in #2091
- refactor: Optimize Prompt and Fault Tolerance for LLM with Smaller Param LLM by @danielaskdd in #2093
New Contributors
- @avchauzov made their first contribution in #2028
- @k-shlomi made their first contribution in #2075
- @rolotumazi made their first contribution in #2091
- @tongda made their first contribution in #2036
Full Changelog: v1.4.7...v1.4.8