OpenJarvis v1.0.0

OpenJarvis v1.0.0 stabilizes the five-primitive architecture (Intelligence, Engine, Agents, Tools & Memory, Learning), with efficiency and on-device learning as first-class capabilities alongside accuracy.

Companion blog post: From Minions to OpenJarvis: A Retrospective on Two Years in Local AI

Install

pip install --upgrade openjarvis
jarvis init

What v1.0 ships

  • Eight built-in agents across three execution modes (on-demand, scheduled, continuous), spanning a single-turn chat baseline, a deep-research agent with inline citations, a CodeAct-style coder, and a continuous monitor with memory compression for long-horizon workflows.
  • Seven starter presets that bundle an agent with a hardware-appropriate engine, connectors, and tools, installable with jarvis init --preset <name>. Variants cover Apple Silicon, Linux GPU servers, and CPU-only laptops.
  • Four local engines (Ollama, vLLM, SGLang, llama.cpp) and five cloud engines (OpenAI, Anthropic, Google Gemini, OpenRouter, MiniMax), all behind a single Engine interface, sketched below.

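A minimal sketch of what that shared contract can look like; the method and field names here are illustrative, not the shipped API:

# Illustrative Engine contract; real signatures may differ.
from abc import ABC, abstractmethod
from dataclasses import dataclass, field

@dataclass
class GenerationResult:
    text: str
    energy_joules: float = 0.0                      # telemetry field names are assumptions
    tool_calls: list = field(default_factory=list)

class Engine(ABC):
    """One interface for local (Ollama, vLLM, SGLang, llama.cpp) and
    cloud (OpenAI, Anthropic, Gemini, OpenRouter, MiniMax) backends."""

    @abstractmethod
    def generate_full(self, prompt: str, **kwargs) -> GenerationResult:
        """Run one generation and return text plus telemetry."""
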
That shared interface is what makes local-cloud collaboration expressible as a composable spec, and OpenJarvis ships with three concrete patterns from this research arc:

Per-query routing through a query-complexity analyzer (learning/routing/complexity.py). Easy queries stay local; only queries that need frontier capability escalate. This is the routing pattern that recovers the 60–80% reductions in energy, compute, and cost reported by IPW against a batched cloud baseline.
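
A sketch of the shape of that router; the scoring helper and threshold below are illustrative stand-ins, not the shipped learning/routing/complexity.py:

# Illustrative per-query router; the shipped analyzer is learned, not rule-based.
def route(query, local_engine, cloud_engine, threshold=0.5):
    score = estimate_complexity(query)              # 0.0 (trivial) .. 1.0 (frontier-only)
    engine = local_engine if score < threshold else cloud_engine
    return engine.generate_full(query)              # easy queries never leave the device

def estimate_complexity(query):
    # Placeholder heuristic standing in for the real complexity analyzer.
    signals = [len(query) > 500,
               "prove" in query.lower(),
               "step by step" in query.lower()]
    return sum(signals) / len(signals)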

LLM-guided spec search (learning/spec_search/). A frontier model reads traces, proposes coordinated edits across all five primitives, and a held-out gate accepts only non-regressing edits. The resulting spec runs entirely on-device.
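
The accept-only-non-regressing loop, in a hedged sketch where propose_edit and evaluate stand in for the frontier proposer and the held-out scorer:

# Illustrative gate for LLM-guided spec search; not the shipped implementation.
def spec_search(spec, propose_edit, evaluate, rounds=10):
    best = evaluate(spec)                  # score on a held-out set; higher is better
    for _ in range(rounds):
        candidate = propose_edit(spec)     # frontier model reads traces, edits primitives
        score = evaluate(candidate)        # held-out gate
        if score >= best:                  # accept only non-regressing edits
            spec, best = candidate, score
    return spec                            # winning spec runs entirely on-device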

Minions-style decompose-and-execute (agents/hybrid/minions.py). A frontier model plans, a local model executes subtasks in parallel, and the frontier model aggregates the results.
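
In sketch form, with plan, execute, and aggregate as assumed method names rather than the shipped agent API:

# Illustrative decompose-and-execute loop; the shipped agent handles retries, context, etc.
from concurrent.futures import ThreadPoolExecutor

def minions(task, frontier, local):
    subtasks = frontier.plan(task)                         # frontier decomposes the task
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(local.execute, subtasks))  # local model runs subtasks in parallel
    return frontier.aggregate(task, results)               # frontier merges partial answers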

Local-cloud collaboration as a research surface

Every project in this arc is an instance of the same underlying question: how should a frontier cloud model and a local model divide labor on a single query, task, or workflow? Minions answered it one way for long-context QA; Archon, Advisor Models, Conductor, ToolOrchestra, and SkillOrchestra answer it in other ways for other workloads. OpenJarvis ships all six as composable LocalCloudAgent subclasses under agents/hybrid/, with a runner CLI and an experiment registry so comparisons stay apples-to-apples. The taxonomy mapping workload characteristics to hybrid paradigms — coding wants one thing, long-context retrieval wants another, agentic tool use wants a third — is the thread we are most excited to coordinate on with the broader community.
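
Under assumed class names (only LocalCloudAgent is confirmed above), an apples-to-apples comparison might be driven like this:

# Hypothetical comparison harness; the concrete agent class names are assumptions.
from openjarvis.agents.hybrid import MinionsAgent, ConductorAgent  # assumed exports

def compare(task, local_engine, cloud_engine):
    for cls in (MinionsAgent, ConductorAgent):
        agent = cls(local_engine=local_engine, cloud_engine=cloud_engine)
        print(cls.__name__, agent.run(task))   # assumed entry point on LocalCloudAgent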

Efficiency as a first-class constraint

Hardware-agnostic energy telemetry samples at 50 ms resolution across NVIDIA, AMD, Apple Silicon, and Intel RAPL backends. Energy, dollar cost, FLOPs, and latency are treated as evaluation targets alongside accuracy.
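
As a sketch of how 50 ms sampling turns into an evaluation target; the class below and its backend callable are illustrative, not the shipped telemetry API:

# Illustrative energy meter: polls a power backend (NVML, ROCm SMI,
# powermetrics, RAPL) on a fixed interval and integrates watts into joules.
import threading
import time

class EnergyMeter:
    def __init__(self, read_watts, interval_s=0.05):
        self.read_watts = read_watts   # backend-specific callable returning watts
        self.interval_s = interval_s   # 0.05 s = the 50 ms resolution above

    def measure(self, fn, *args, **kwargs):
        joules = 0.0
        done = threading.Event()

        def poll():
            nonlocal joules
            while not done.is_set():
                joules += self.read_watts() * self.interval_s
                time.sleep(self.interval_s)

        t = threading.Thread(target=poll)
        t.start()
        try:
            result = fn(*args, **kwargs)   # the workload being measured
        finally:
            done.set()
            t.join()
        return result, joules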

Local learning loop

Closed-loop optimization across the stack — model weights via SFT and GRPO, prompts via DSPy, agent logic via GEPA, and engine + stack configuration via LLM-guided spec search. LearningOrchestrator coordinates triggers and applies optimizer overlays at discovery time so improvements compound across primitives.
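
Conceptually, and assuming illustrative method names on LearningOrchestrator:

# Conceptual sketch of the closed loop; everything but the class name is illustrative.
class LearningOrchestrator:
    def __init__(self):
        self.optimizers = {}   # primitive -> optimizer (SFT/GRPO, DSPy, GEPA, spec search)
        self.overlays = {}     # primitive -> best overlay discovered so far

    def register(self, primitive, optimizer):
        self.optimizers[primitive] = optimizer

    def step(self, spec, traces):
        # Each optimizer sees the overlays already applied,
        # so gains compound across primitives.
        for primitive, optimize in self.optimizers.items():
            self.overlays[primitive] = optimize(spec, traces, self.overlays)
        return self.overlays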

Migration from 0.x

  • learning/distillation/ is now learning/spec_search/. Update any imports. The jarvis distillation CLI command is removed.
  • The generate_full return shape is extended with energy and tool-call telemetry fields; existing callers that didn't read these fields are unaffected. See the sketch below.
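
Both changes in sketch form; the package path prefix, class name, and telemetry field names are assumptions:

# Illustrative migration; only the module rename itself is confirmed above.
# from openjarvis.learning.distillation import SpecSearch   # 0.x import (removed)
from openjarvis.learning.spec_search import SpecSearch      # 1.0 path; class name assumed

result = engine.generate_full("hello")   # `engine`: any Engine instance
print(result.text)                       # unchanged: pre-1.0 callers keep working
print(result.energy_joules)              # new energy telemetry; field name illustrative
print(result.tool_calls)                 # new tool-call telemetry; field name illustrative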

Full release notes: CHANGELOG.md [1.0.0]
