Changelog
- Play sound when VOCR is launched and ready.
- Check for updates when launching
- Increased the request timeout to 10 minutes
- Announce available updates through Notification Center
- Fixed an error when encountering an Ollama model with no families.
- Realtime OCR shortcut toggles the feature.
- Autoupdater
- Implemented logger
- Ask which Ollama model to use if multiple CLIP models are found.
- You can also select a model for Ollama by simply clicking Ollama in the model menu.
- Ask for a prompt after taking a screenshot.
- New prompt for Explore
- Explore no longer generates images meant for debugging.
- Presents the same menu whether launched by shortcut or by clicking the status bar icon.
- Reports more detailed errors when a request fails.
- Cancels the previous request when making a new one
- Ollama support
- Use the original screenshot resolution instead of the window's point resolution, except in Explore mode.
- New Workflow: Use Command+Control+Shift+W/V to set the target to a window/VOCursor and perform the OCR scan. After that, features such as realtime OCR, Explore, and Ask will use that target.
- Reset shortcuts if the set of features changes after an update
- Bug fix: global shortcuts sometimes not active
- Customize shortcuts
- Token usage reported at the end of the description
- Support system prompt for GPT
- Setting to toggle reusing the last prompt without asking
- Save last screenshot
- Dismiss the menu with Command+Z instead of Escape if realtime OCR or navigation is active.
- You can just press Return to ask GPT without editing the prompt.
- Changed the diff algorithm for less verbose realtime OCR (see the sketch after this list).
- Realtime OCR remains active at its initial location, allowing you to move the VOCursor during the process. To perform realtime OCR in a different location, stop the OCR, move the VOCursor, then restart realtime OCR.
- Realtime OCR of VOCursor: Command+Control+Shift+R
- Able to toggle object detection from the settings.
- OCR Window: Command+Control+Shift+W
- OCR VOCursor: Command+Control+Shift+V
- Ask GPT about VOCursor: Command+Control+Shift+A
- Settings: Command+Control+Shift+S
- Faster screenshot of VOCursor
- Open an image file in VOCR from Finder to ask GPT
- The GPT response gets copied to the clipboard, so you can paste it somewhere if you miss it.
- Object detection through rectangles: detects any boxes without text, such as icons.
- Moved Save OCR Result to the menu.
- Moved Target Window to the Settings menu.
- Auto Scan: Thanks @vick08
- Readme improvements: Thanks @ssawczyn
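For the diff-based realtime OCR item above, the idea is to compare each OCR pass against the previous one and announce only the lines that actually changed. The Swift sketch below is purely illustrative, not VOCR's code: it assumes the OCR output is already split into an array of line strings, and the type and function names are made up for the example.

```swift
import Foundation

/// Illustrative sketch (not VOCR's implementation): remember the lines from the
/// previous OCR pass and report only the newly inserted lines, so unchanged
/// screen content is not re-announced on every pass.
struct RealtimeOCRDiffer {
    private var previousLines: [String] = []

    mutating func newLines(from currentLines: [String]) -> [String] {
        // CollectionDifference (Swift 5.1+) computes minimal insertions/removals.
        let diff = currentLines.difference(from: previousLines)
        previousLines = currentLines
        return diff.insertions.compactMap { change -> String? in
            if case let .insert(_, line, _) = change { return line }
            return nil
        }
    }
}

// Example: only the line that changed is reported on the second pass.
var differ = RealtimeOCRDiffer()
_ = differ.newLines(from: ["Downloading...", "Cancel"])
print(differ.newLines(from: ["Download complete", "Cancel"]))  // ["Download complete"]
```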
The GPT features utilize GPT-4V, and they require your own OpenAI API key.
The Explore feature only works with GPT, and location information from the model is extremely unreliable and inaccurate.
Instructions for Ollama
- Download and install Ollama.
- Open Terminal and type "ollama run llava" without the quotes.
- Wait until you see the ">>> Send a message" prompt.
- Then type /bye and press Return.
- Quit Terminal.
- Go to VOCR menu > Settings > Models and select Ollama
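If you prefer to see the whole terminal session at once, it looks roughly like this (the exact prompt text may vary by Ollama version, and the first run downloads the LLaVA model, which can take a while):

```
ollama run llava
>>> /bye
```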
Experimental
These features may not make it into the public release.
- Identify object when navigation is active: Command+Control+I
- Explore window with GPT: Command+Control+Shift+E
- An option to switch to a local model such as LLaVA via llama.cpp instead of GPT.
Warning: Setting up your own llama.cpp server is very complex.