Changelog
This is a major update focusing on flexibility, customization, and making VOCR accessible to more users around the world.
Make sure to read below to find more about exciting features introduced in this release!
There are many internal changes, so we need your help with comprehensive testing to ensure nothing is broken and the new features are working as expected.
We also need help from the community to improve the translations which are done by AI as a starting point. Please check out the translation guide for more information.
3.0.0-beta.4
Computer Use
You can now ask AI to control Apps using Mouse and Keyboard commands. Check out the demo.
- Start/Stop Computer Use: Command+Control+Shift+u
- Pause/Resume Computer Use: Command+Control+P (Only Available during the Computer Use)
- Toggle Speak Assistant Message: Command+Control+S (Only Available during the Computer Use)
If you notice Computer Use enters a loop while attempting the same task, either cancel the task or pause and resume it with a different instruction.
Token usage and trace log will be copied to the clipboard at the end.
- Use focused window instead of first window.
- When a dialog opens keyboard focus is now in the dialog.
- Made prompt boxes bigger for easy editing.
- Reorganized the settings menu: Submenus for OCR and AI
- Clearer accessible labels in the preset editor (@vic08)
- Norwegian Bokmål translation ()@SuperCliff)
- Translation scripts now report which ones are New, Needs_Review, Translated or Stale.
- Bug fixes and optimizations
3.0.0-beta.3
- Permission Wizard (@vic08)
- New Italian Translation (Giovanni lo Monaco)
- Polish translation Fix (@pitermach)
3.0.0-beta.2
- Fix the bug where it would trigger beta updates even though it was not checked.
- Added languages submenu to choose Language for UI
3.0.0-beta.1
Preset System
The old engine selection (GPT / Ollama / LlamaCpp) has been replaced with a flexible Preset system. Each preset stores its own API URL, API key, model, and prompts, so you can set up as many AI configurations as you like and switch between them instantly from the new Presets menu.
A Preset Manager window lets you create, edit, duplicate, and delete presets.
API keys are encrypted and stored securely using the macOS Keychain, and are only displayed in plain text when creating a preset.
Built-in provider URLs are included for Claude, Gemini, Ollama, OpenAI, and OpenRouter. Any service that offers an OpenAI-compatible API will work.
Follow-up Conversations
A new Follow up checkbox in the Ask dialog lets you carry on a multi-turn conversation with the AI instead of starting fresh each time.
Customizable Explore Prompts
Explore mode uses a structured json-schema response format for more reliable results, and you can now edit the system prompt and user prompt used by Explore mode via Presets > Edit Explore Prompts.
Reset Option
A Reset item in the Settings menu lets you erase all settings and presets and start fresh.
11 Languages
VOCR is now localized in English, German, Spanish, French, Japanese, Korean, Polish, Portuguese, Russian, Ukrainian, and Simplified Chinese.
The translations are done by AI, so we need help from the community to improve. See the translation guide for more information.
Changed
- "Use Last Prompt" has been replaced by "Use Preset Prompt", which sends the prompt saved in the active preset without showing the dialog.
- The Ask dialog button is now labeled "Ask" instead of "Ok".
- AI status messages now show the name of the preset being used (e.g., "Asking Claude... Please wait...").
- Error messages are now shown as alert dialogs.
- Token usage information is now copied to the clipboard along with the AI response.
Removed
- The Engine submenu (GPT / Ollama / LlamaCpp) -- replaced by presets.
- The OpenAI API Key dialog in Settings -- API keys are now managed per-preset.
- The Set System Prompt dialog in Settings -- prompts are now managed per-preset.