cua-agent v0.4.0

This update refactored the Agent SDK to make it easier to implement new features and support the release of new agent models/loops.

Changelog:

Reworked the agent loop: all agent providers now share one loop (Generate, Execute, Repeat) and differ only in how they implement the Generate function
Replaced the LLM clients with LiteLLM, so all agent providers now support any provider supported by LiteLLM
Added 2 custom LiteLLM providers for local model inference on CUDA and MLX devices: huggingface-local/, mlx/
Reworked callback system to have hooks at every step of the lifecycle
Converted logging, trajectory saving, image retention into callbacks
Added new callbacks: PII anonymization (still a work in progress) and budget management
Anthropic providers - Added support for explicit prompt caching
OpenAI providers - Added support for zero data retention
Added Agent CLI for quick testing: python -m agent.cli <model name>
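
The shared loop and the callback hooks described above can be pictured with a minimal sketch. It is not the SDK's actual code: the names Callback, BudgetCallback and run_loop, the hook names, and the dict-shaped messages with "action" and "cost" keys are all assumptions made for illustration. Only the generate function would change between providers.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

# Hypothetical message shape: a plain dict. The real SDK may differ.
Message = Dict[str, object]


class Callback:
    """Hypothetical base class with no-op hooks for each lifecycle step."""

    def on_run_start(self, messages: List[Message]) -> None:
        pass

    def on_generate_end(self, output: Message) -> None:
        pass

    def on_execute_end(self, result: Message) -> None:
        pass

    def should_continue(self) -> bool:
        return True


@dataclass
class BudgetCallback(Callback):
    """Illustrative budget manager: stops the loop once max_budget is spent."""

    max_budget: float
    spent: float = 0.0

    def on_generate_end(self, output: Message) -> None:
        # Assumes each model output reports its own cost.
        self.spent += float(output.get("cost", 0.0))

    def should_continue(self) -> bool:
        return self.spent < self.max_budget


def run_loop(
    generate: Callable[[List[Message]], Message],
    execute: Callable[[object], Message],
    messages: Sequence[Message],
    callbacks: Sequence[Callback] = (),
    max_steps: int = 10,
) -> List[Message]:
    """Generate, Execute, Repeat. Only `generate` is provider-specific."""
    history = list(messages)
    for cb in callbacks:
        cb.on_run_start(history)
    for _ in range(max_steps):
        if not all(cb.should_continue() for cb in callbacks):
            break
        output = generate(history)  # Generate
        history.append(output)
        for cb in callbacks:
            cb.on_generate_end(output)
        action = output.get("action")
        if action is None:  # no action requested: the model is done
            break
        result = execute(action)  # Execute
        history.append(result)
        for cb in callbacks:
            cb.on_execute_end(result)
    return history  # Repeat ends on budget, max_steps, or a final answer
```

In this sketch, logging, trajectory saving, or image retention would each be another Callback subclass passed in the same callbacks list, which is why the loop itself never needs to change.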

Breaking Changes

Initialization:
- ComputerAgent (v0.4.x) takes model as a string (e.g. "anthropic/claude-3-5-sonnet-20241022") instead of LLM and AgentLoop objects.
- tools is a list (it can include multiple computers and decorated functions).
- callbacks are now first-class for extensibility (image retention, budget, trajectory, logging, etc.).
No explicit loop parameter:
- Loop is inferred from the model string (e.g. anthropic/, openai/, omniparser+, ui-tars).
No explicit computer parameter:
- Computers are added to tools list.
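
One way to picture how a loop can be inferred from the model string is a simple prefix check. The sketch below is an assumption for illustration only: infer_loop and the loop names it returns are hypothetical, the matching rules are derived solely from the example strings in these notes, and the SDK's real matching logic may differ.

```python
def infer_loop(model: str) -> str:
    """Hypothetical helper: pick a loop name from a model string."""
    name = model.lower()
    # Composite strings such as "omniparser+openai/gpt-4o" are checked first,
    # because the part after "+" also looks like a plain provider prefix.
    if name.startswith("omniparser+"):
        return "omniparser"
    # UI-TARS models are served through several backends, so match on the
    # model name rather than on the provider prefix.
    if "ui-tars" in name:
        return "uitars"
    if name.startswith("anthropic/"):
        return "anthropic"
    if name.startswith("openai/"):
        return "openai"
    raise ValueError(f"No agent loop known for model string: {model!r}")
```

The order of the checks matters in this sketch: testing for the omniparser+ prefix before the provider prefixes keeps a string like "omniparser+anthropic/claude-3-5-sonnet-20241022" from being routed to the Anthropic loop.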

Install

# Before merge:
pip install --pre "cua-agent[all]==0.4.0b4"

# After merge:
pip install "cua-agent[all]"

# or install specific providers
pip install "cua-agent[openai]"        # OpenAI computer-use-preview support
pip install "cua-agent[anthropic]"     # Anthropic Claude support
pip install "cua-agent[omni]"          # Omniparser + any LLM support
pip install "cua-agent[uitars]"        # UI-TARS
pip install "cua-agent[uitars-mlx]"    # UI-TARS + MLX support
pip install "cua-agent[uitars-hf]"     # UI-TARS + Huggingface support
pip install "cua-agent[ui]"            # Gradio UI support

Supported Models

Anthropic Claude (Computer Use API)

model="anthropic/claude-3-5-sonnet-20241022"
model="anthropic/claude-3-5-sonnet-20240620"
model="anthropic/claude-opus-4-20250514"
model="anthropic/claude-sonnet-4-20250514"

OpenAI Computer Use Preview

model="openai/computer-use-preview"

UI-TARS (Local or Huggingface Inference)

model="huggingface-local/ByteDance-Seed/UI-TARS-1.5-7B"
model="ollama_chat/0000/ui-tars-1.5-7b"

Omniparser + Any LLM

model="omniparser+ollama_chat/mistral-small3.2"
model="omniparser+vertex_ai/gemini-pro"
model="omniparser+anthropic/claude-3-5-sonnet-20241022"
model="omniparser+openai/gpt-4o"

trycua/cua agent-v0.4.0 cua-agent v0.4.0 on GitHub
