This is version 2 of the web search beta. It contains some important fixes, including upstream llama.cpp fixes for Llama 3.1.
Fixes
- Update to the latest llama.cpp, which includes the RoPE fix
- Fix problem with only displaying one source for tool call excerpts
- Add the extra snippets to the source excerpts
- Fix the way we're injecting the context back into the model for web search
- Change the suggestion mode to turn on for tool calls by default
WARNING:
There was a synchronization problem between this beta release and the models.json. To make this release work, you have to perform the following steps:
- Rename the file
Meta-Llama-3.1-8B-Instruct-128k-Q4_0.gguf
to
Meta-Llama-3.1-8B-Instruct-Q4_0.gguf
- Copy the following into the prompt template:
<|start_header_id|>user<|end_header_id|>
%1<|eot_id|><|start_header_id|>assistant<|end_header_id|>
%2
- Copy the following into the system prompt:
<|start_header_id|>system<|end_header_id|>
Environment: ipython
Tools: brave_search
Cutting Knowledge Date: December 2023
Today Date: 25 Jul 2024
You are a helpful assistant.<|eot_id|>
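
The rename step and the placeholder substitution can be sketched in Python as follows. This is only an illustration, not part of the release: the function names `rename_model` and `fill_template` are hypothetical, the models directory must be supplied by you, and the reading of %1 as the user message and %2 as the assistant reply is an assumption based on where the placeholders sit in the template.

```python
import re
from pathlib import Path

# File names taken from the steps above.
OLD_NAME = "Meta-Llama-3.1-8B-Instruct-128k-Q4_0.gguf"
NEW_NAME = "Meta-Llama-3.1-8B-Instruct-Q4_0.gguf"

# Prompt template from the steps above, one line per source line.
PROMPT_TEMPLATE = (
    "<|start_header_id|>user<|end_header_id|>\n"
    "%1<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n"
    "%2"
)


def rename_model(models_dir: Path) -> Path:
    """Rename the model file inside models_dir and return the new path.

    models_dir is whatever directory holds your downloaded models;
    this sketch does not guess where that is.
    """
    old_path = models_dir / OLD_NAME
    new_path = models_dir / NEW_NAME
    if not old_path.exists():
        raise FileNotFoundError(f"{old_path} not found")
    if new_path.exists():
        # Refuse to overwrite a file that is already in place.
        raise FileExistsError(f"{new_path} already exists")
    old_path.rename(new_path)
    return new_path


def fill_template(user_message: str, assistant_reply: str = "") -> str:
    """Substitute %1 and %2 in a single pass.

    Assumption: %1 is the user message and %2 the assistant reply.
    A single regex pass keeps a literal "%2" typed by the user from
    being replaced a second time.
    """
    values = {"1": user_message, "2": assistant_reply}
    return re.sub(r"%([12])", lambda m: values[m.group(1)], PROMPT_TEMPLATE)
```

For example, `rename_model(Path("path/to/your/models"))` renames the file in that (placeholder) directory, and `fill_template("What is RoPE?")` returns the text that would be sent for that user message, ending at the point where the assistant's reply begins.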