evo

Name: evo
Availability: OnlineOnly
Author: evo-hq

An open-source autoresearch orchestrator that runs parallel coding agent experiments on your repo, scores every patch, and keeps only changes that improve the target metric.

At a Glance

Pricing

Open Source

Fully open-source CLI and plugin available on PyPI and GitHub under Apache 2.0.

Available On

CLI

Web

API

evo-hq | Delhi, India | Est. 2026

Listed Jun 2026

About evo

Evo is an open-source autoresearch orchestrator for codebases, built by evo-hq and released under the Apache 2.0 license. It plugs into popular agentic coding frameworks and runs a structured tree search over your repository — discovering what to measure, instrumenting the benchmark, and iterating with parallel subagents until the score stops improving. The hosted platform is currently in waitlisted beta, while the CLI and plugin are freely installable from PyPI and GitHub.

What It Is

Evo sits in the category of autonomous code-optimization systems. Rather than asking a developer to manually direct an AI agent, evo sets up an automated loop: it discovers metrics, runs experiments in isolated git worktrees, scores each patch, and commits only the changes that pass both the metric threshold and any registered gates. The project describes itself as inspired by Karpathy's autoresearch concept — a pure hill-climb where an LLM runs experiments autonomously to beat its own best score — but adds tree search, parallelism, shared state, and gating on top of that baseline idea.
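The keep-or-discard rule described above can be sketched as a small tree of committed experiments. This is an illustrative sketch only, not evo's code: the `Node` class, the `try_commit` helper, and the experiment names are hypothetical, and it assumes a higher score is better and that the "metric threshold" means beating the parent's score.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """One committed experiment in the search tree (hypothetical structure)."""
    name: str
    score: float
    parent: Optional["Node"] = None
    children: list["Node"] = field(default_factory=list)


def try_commit(parent: Node, name: str, score: float, gates_passed: bool) -> Optional[Node]:
    """Attach a child only if every gate passed and it beats its parent.

    Assumption: higher scores are better and the threshold is the parent's score.
    """
    if not gates_passed or score <= parent.score:
        return None  # discarded: the patch is not kept
    child = Node(name=name, score=score, parent=parent)
    parent.children.append(child)
    return child


root = Node("baseline", score=0.60)
a = try_commit(root, "cache-lookups", 0.66, gates_passed=True)
b = try_commit(root, "return-constant", 0.99, gates_passed=False)  # high score, failed gate
c = try_commit(root, "vectorize-loop", 0.63, gates_passed=True)    # kept as a sibling branch
d = try_commit(c, "vectorize-and-batch", 0.71, gates_passed=True)  # extends a non-best branch
```

In this toy run the best final score comes from extending `c`, which a pure hill climb would have abandoned once `a` scored higher. That is the practical difference between a tree search and a single greedy chain.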

How the Optimization Loop Works

The core workflow involves two commands:

/evo:discover — explores the repo, identifies what to measure, instruments the evaluation, and attaches a held-out-slice score-floor gate automatically when building a benchmark from scratch.
/evo:optimize — runs the experiment loop, dispatching parallel subagents each in its own isolated workspace.
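For the default local backend, one plausible way to give each subagent its own workspace is a dedicated git worktree on its own branch. The sketch below only builds the `git worktree add` command rather than running it. The helper name, the `exp/` branch prefix, and the directory layout are assumptions made for illustration and do not describe how evo names things.

```python
from pathlib import Path


def worktree_add_command(workspace_root: Path, experiment_id: str, base_ref: str) -> list[str]:
    """Build a `git worktree add` command for one isolated experiment.

    Uses the standard form: git worktree add -b <new-branch> <path> <commit-ish>.
    The branch prefix and path layout here are hypothetical.
    """
    branch = f"exp/{experiment_id}"
    path = workspace_root / experiment_id
    return ["git", "worktree", "add", "-b", branch, str(path), base_ref]


# Three experiments branching from the same parent commit, each in its own directory.
commands = [
    worktree_add_command(Path("worktrees"), f"run-{i}", "HEAD")
    for i in range(3)
]
```

Each resulting command could be passed to `subprocess.run` inside a repository. Because every worktree has its own checkout and branch, parallel edits do not touch each other's files.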

Each subagent reads shared state (failure traces, annotations, discarded hypotheses), forms a hypothesis, edits code, and runs the benchmark. The orchestrator then selects which committed branch to extend next using a configurable frontier strategy: argmax, top-k, epsilon-greedy, softmax, or pareto-per-task. Between rounds, scan subagents read trace batches in parallel and surface compound failure patterns, feeding findings back into shared state for the next round.
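The frontier strategies named above have well-known textbook forms, sketched here over a mapping from branch name to score. These are generic definitions, not evo's implementation: the function names and parameters are assumptions, and the reading of "pareto-per-task" as a Pareto front over per-task scores is an interpretation.

```python
import math
import random


def argmax(scores: dict[str, float]) -> str:
    """Always extend the single best-scoring branch."""
    return max(scores, key=lambda name: scores[name])


def top_k(scores: dict[str, float], k: int) -> list[str]:
    """Extend the k best branches, best first."""
    return sorted(scores, key=lambda name: scores[name], reverse=True)[:k]


def epsilon_greedy(scores: dict[str, float], epsilon: float, rng: random.Random) -> str:
    """With probability epsilon explore a uniformly random branch, else exploit the best."""
    if rng.random() < epsilon:
        return rng.choice(sorted(scores))
    return argmax(scores)


def softmax_probs(scores: dict[str, float], temperature: float = 1.0) -> dict[str, float]:
    """Selection probabilities proportional to exp(score / temperature)."""
    peak = max(scores.values())  # subtract the max for numerical stability
    weights = {n: math.exp((s - peak) / temperature) for n, s in scores.items()}
    total = sum(weights.values())
    return {n: w / total for n, w in weights.items()}


def softmax_pick(scores: dict[str, float], temperature: float, rng: random.Random) -> str:
    """Sample one branch according to the softmax probabilities."""
    probs = softmax_probs(scores, temperature)
    names = list(probs)
    return rng.choices(names, weights=[probs[n] for n in names], k=1)[0]


def pareto_front(task_scores: dict[str, dict[str, float]]) -> list[str]:
    """Branches that no other branch matches or beats on every task."""
    def dominates(a: dict[str, float], b: dict[str, float]) -> bool:
        return all(a[t] >= b[t] for t in a) and any(a[t] > b[t] for t in a)

    return [
        name for name, s in task_scores.items()
        if not any(dominates(o, s) for other, o in task_scores.items() if other != name)
    ]
```

The trade-off is between exploitation and exploration. argmax commits to the current leader, epsilon-greedy and softmax occasionally revisit weaker branches, and a Pareto front keeps branches that are best on at least one task even when their aggregate score is lower.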

Agent and Infrastructure Compatibility

Evo is designed as a plugin for existing agentic frameworks rather than a standalone agent. It currently supports:

Agents: Claude Code, Codex, Cursor, OpenClaw, Hermes, Opencode, Pi
Sandboxes & infra: Local git worktrees (default), SSH, Modal, E2B, Daytona, AWS EC2, Azure VMs

Installation is handled via evo install <host>, which places the plugin into the host's marketplace and stages the hooks evo needs to communicate with in-flight subagents.

Gates and Safety Checks

A key design element is the gates system: pass/fail checks that run on every experiment. Any command that exits zero on pass and non-zero on fail qualifies — a test suite, an invariant script, or a score floor on a held-out benchmark slice. Gates inherit down the experiment tree, so a gate registered at the root runs on every descendant. The README notes that without gates, search will find ways to return a constant, skip work, or trade correctness for speed.
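A minimal sketch of the gate contract and its inheritance, under stated assumptions: a gate is any command whose exit code is zero on pass, and a node's effective gates are its ancestors' gates plus its own. The `Experiment` class and helper names are hypothetical and are not evo's API. The gate commands are trivial Python one-liners standing in for a real test suite or score-floor script.

```python
import subprocess
import sys
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Experiment:
    """A node in the experiment tree carrying its own gate commands (hypothetical)."""
    name: str
    parent: Optional["Experiment"] = None
    gates: list[list[str]] = field(default_factory=list)

    def effective_gates(self) -> list[list[str]]:
        """Gates inherited from every ancestor, root first, followed by this node's own."""
        inherited = self.parent.effective_gates() if self.parent else []
        return inherited + self.gates


def gate_passes(command: list[str]) -> bool:
    """A gate passes if and only if its command exits with status zero."""
    return subprocess.run(command).returncode == 0


def all_gates_pass(experiment: Experiment) -> bool:
    """Run every effective gate; stop at the first failure."""
    return all(gate_passes(cmd) for cmd in experiment.effective_gates())


# Stand-in gates: one that always passes and one that always fails.
passing = [sys.executable, "-c", "raise SystemExit(0)"]
failing = [sys.executable, "-c", "raise SystemExit(1)"]

root = Experiment("root", gates=[passing])
child = Experiment("child", parent=root, gates=[failing])
grandchild = Experiment("grandchild", parent=child)  # registers nothing, inherits both

root_ok = all_gates_pass(root)
grandchild_ok = all_gates_pass(grandchild)
```

Because the grandchild inherits the failing gate registered on its parent, it is rejected even though it registers no gates of its own.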

Update: evo 0.5.0

The latest stable release is v0.5.0, published on June 6, 2026. The project has been active since April 2026 and shows a regular release cadence; recent patch releases (0.4.4, 0.4.5) addressed issues such as Codex hook trust and plugin cache bugs. The changelog documents migration paths for pre-0.4.4 installs and alpha testing procedures. As of the latest available information, the hosted platform remains in waitlisted beta, while the open-source CLI is available on PyPI as evo-hq-cli.

Pricing

Open Source

Fully open-source CLI and plugin available on PyPI and GitHub under Apache 2.0.

evo CLI (evo-hq-cli on PyPI)
Plugin for Claude Code, Codex, Cursor, OpenClaw, Hermes, Opencode, Pi
Local git worktree backend
SSH backend
Remote backends via provider extras (Modal, E2B, Daytona, AWS, Azure)

Capabilities

Key Features

Autoresearch orchestration loop for codebases
Benchmark discovery via /evo:discover command
Parallel subagents running in isolated git worktrees
Tree search rather than a purely greedy hill climb
Shared state across agents (failure traces, annotations, discarded hypotheses)
Configurable frontier strategies: argmax, top-k, epsilon-greedy, softmax, pareto-per-task
Gates system for pass/fail regression and safety checks
Cross-cutting scan subagents for compound failure pattern detection
Local and remote sandbox backends (Modal, E2B, Daytona, AWS, Azure, SSH)
Web dashboard for monitoring experiments
Plugin compatibility with Claude Code, Codex, Cursor, OpenClaw, Hermes, Opencode, Pi
evo update command for CLI and host plugin version management

Integrations

Claude Code

Codex

Cursor

OpenClaw

Hermes

Opencode

Modal

E2B

Daytona

AWS EC2

Azure VMs

PyPI

API Available
