Support RAG
Seamlessly integrates document interactions into your chat experience.
Support AI Agent
AI Agent = Prompt (Role) + Tools (Function Calling) + Knowledge (RAG). It's also known as OpenAI's GPTs.
New Platforms
- lingyiwanwu(01ai)
- voyageai
- jina
New Models
- claude:claude-3-5-sonnet-20240620
- vertexai:gemini-1.5-pro-001
- vertexai:gemini-1.5-flash-001
- vertexai-claude:claude-3-5-sonnet@20240620
- bedrock:anthropic.claude-3-5-sonnet-20240620-v1:0
- zhipuai:glm-4-0520
- lingyiwanwu:yi-large*
- lingyiwanwu:yi-medium*
- lingyiwanwu:yi-spark
Embedding and reranker models are omitted from this list.
New Configuration
repl_prelude: null # Overrides the `prelude` setting specifically for conversations started in REPL
agent_prelude: null # Set a session to use when starting an agent (e.g. temp, default)
# Regex for selecting dangerous functions
# User confirmation is required when executing these functions
# e.g. 'execute_command|execute_js_code' 'execute_.*'
dangerously_functions_filter: null
# Per-Agent configuration
agents:
  - name: todo-sh
    model: null
    temperature: null
    top_p: null
    dangerously_functions_filter: null
# Define document loaders to control how RAG and `.file`/`--file` load files of specific formats.
document_loaders:
  # You can add custom loaders using the following syntax:
  #   <file-extension>: <command-to-load-the-file>
  # Note: Use `$1` for input file and `$2` for output file. If `$2` is omitted, use stdout as output.
  pdf: 'pdftotext $1 -' # Load .pdf file, see https://poppler.freedesktop.org
  docx: 'pandoc --to plain $1' # Load .docx file
  # xlsx: 'ssconvert $1 $2' # Load .xlsx file
  # html: 'pandoc --to plain $1' # Load .html file
  recursive_url: 'rag-crawler $1 $2' # Load websites, see https://github.com/sigoden/rag-crawler
# ---- RAG ----
rag_embedding_model: null # Specifies the embedding model to use
rag_reranker_model: null # Specifies the rerank model to use
rag_top_k: 4 # Specifies the number of documents to retrieve
rag_chunk_size: null # Specifies the chunk size
rag_chunk_overlap: null # Specifies the chunk overlap
rag_min_score_vector_search: 0 # Specifies the minimum relevance score for vector-based searching
rag_min_score_keyword_search: 0 # Specifies the minimum relevance score for keyword-based searching
rag_min_score_rerank: 0 # Specifies the minimum relevance score for reranking
rag_template: ...
clients:
  - name: localai
    models:
      - name: xxxx # Embedding model
        type: embedding
        max_input_tokens: 2048
        default_chunk_size: 2000
        max_batch_size: 100
      - name: xxxx # Reranker model
        type: reranker
        max_input_tokens: 2048
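The `$1`/`$2` convention used by the `document_loaders` entries above can be illustrated with a small Python sketch. This is not the tool's own implementation: the helper names `build_loader_command` and `loader_uses_stdout` are hypothetical, and the choice to substitute placeholders after shell-style splitting (so paths containing spaces stay a single argument) is an assumption.

```python
import shlex
from typing import List, Optional


def loader_uses_stdout(template: str) -> bool:
    """A loader without `$2` is expected to write the extracted text to stdout."""
    return "$2" not in template


def build_loader_command(
    template: str, input_path: str, output_path: Optional[str] = None
) -> List[str]:
    """Expand `$1` (input file) and `$2` (output file) into an argv list.

    Hypothetical helper: the template is split first and the placeholders
    are substituted per token, so a path with spaces is never re-split.
    """
    if not loader_uses_stdout(template) and output_path is None:
        raise ValueError("template references $2 but no output path was given")
    argv = []
    for token in shlex.split(template):
        token = token.replace("$1", input_path)
        if output_path is not None:
            token = token.replace("$2", output_path)
        argv.append(token)
    return argv


# Loader that prints to stdout (no `$2` in the template).
print(build_loader_command("pdftotext $1 -", "my report.pdf"))
# Loader that writes to an explicit output file.
print(build_loader_command("ssconvert $1 $2", "sheet.xlsx", "sheet.txt"))
```

The resulting list could be passed to `subprocess.run`, reading the loader's stdout when `loader_uses_stdout` returns `True` and reading the output file otherwise.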
New REPL Commands
.edit session            Edit the current session with an editor
.rag                     Init or use the RAG
.info rag                View RAG info
.rebuild rag             Rebuild the RAG to sync document changes
.exit rag                Leave the RAG
.agent                   Use an agent
.info agent              View agent info
.starter                 Use the conversation starter
.exit agent              Leave the agent
.continue                Continue the response
.regenerate              Regenerate the last response
New CLI Options
-a, --agent <AGENT>      Start an agent
-R, --rag <RAG>          Start a RAG
    --list-agents        List all agents
    --list-rags          List all RAGs
Breaking Changes
Some client fields have changed
clients:
  - name: myclient
    patches:
      <regex>:
-       request_body:
+       chat_completions_body:
    models:
      - name: mymodel
        max_output_tokens: 4096
-       pass_max_tokens: true
+       require_max_tokens: true
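For configurations with many clients, the two renames in the diff above could be applied programmatically. The following is a minimal sketch that works on an already parsed client entry (a plain Python dict). The function name `migrate_client` and the sample values inside the patch body are hypothetical, and the sketch assumes `patches` maps a regex to a dict and `models` is a list of dicts, as the diff suggests.

```python
import copy
from typing import Any, Dict


def migrate_client(client: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a client entry with the renamed fields applied.

    request_body    -> chat_completions_body (inside each patch)
    pass_max_tokens -> require_max_tokens    (inside each model)
    """
    migrated = copy.deepcopy(client)  # leave the caller's data untouched
    for patch in (migrated.get("patches") or {}).values():
        if isinstance(patch, dict) and "request_body" in patch:
            patch["chat_completions_body"] = patch.pop("request_body")
    for model in migrated.get("models") or []:
        if isinstance(model, dict) and "pass_max_tokens" in model:
            model["require_max_tokens"] = model.pop("pass_max_tokens")
    return migrated


old_client = {
    "name": "myclient",
    # The patch body below is an illustrative placeholder, not a documented value.
    "patches": {".*": {"request_body": {"temperature": 0.5}}},
    "models": [
        {"name": "mymodel", "max_output_tokens": 4096, "pass_max_tokens": True}
    ],
}
new_client = migrate_client(old_client)
print(new_client["patches"][".*"])
print(new_client["models"][0])
```

Loading and saving the YAML file itself is left out on purpose, since that depends on which YAML library is available.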
The way to identify dangerous functions has changed
Previously, any function whose name started with `may_` was treated as an execute-type (dangerous) function. That approach required renaming functions, which was inflexible.
This is now configurable: in `config.yaml`, you can define which functions are considered dangerous and require user confirmation.
dangerously_functions_filter: 'execute_.*'
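As a rough illustration of how such a filter classifies function names, here is a minimal Python sketch. The helper `needs_confirmation` is hypothetical, and matching the pattern against the whole function name (rather than any substring) is an assumption, not confirmed behavior of the tool.

```python
import re
from typing import Optional


def needs_confirmation(function_name: str, filter_pattern: Optional[str]) -> bool:
    """Return True when the function name matches the dangerous-function filter.

    Assumption: the pattern must match the entire name. A null/empty filter
    means no function requires confirmation.
    """
    if not filter_pattern:
        return False
    return re.fullmatch(filter_pattern, function_name) is not None


pattern = "execute_.*"
for name in ["execute_command", "execute_js_code", "get_current_weather"]:
    label = "confirm first" if needs_confirmation(name, pattern) else "run directly"
    print(f"{name}: {label}")
```

An explicit alternation such as `'execute_command|execute_js_code'` works the same way and limits confirmation to exactly the listed functions.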
New Features
- support RAG (#560)
- custom more path to file/dirs with environment variables (#565)
- support agent (#579)
- add config `dangerously_functions` (#582)
- add config `repl_prelude` and `agent_prelude` (#584)
- add `.starter` repl command (#594)
- add `.edit session` repl command (#606)
- abandon `auto_copy` (#607)
- add `.continue` repl command (#608)
- add `.regenerate` repl command (#610)
- support lingyiwanwu client (#613)
- qianwen support function calling (#616)
- support rerank (#620)
- cloudflare support embeddings (#623)
- serve embeddings api (#624)
- ernie support embeddings and rerank (#630)
- ernie support function calling (#631)
- support rag-dedicated clients (jina and voyageai) (#645)
- custom rag document loaders (#650)
- rag load websites (#655)
- implement native rag url loader (#660)
- `.file`/`--file` support URLs (#665)
- support `.rebuild rag` repl command (#672)