# Canary

> A QA harness built for Claude Code that reads code diffs, identifies affected UI flows, and tests them in real browser instances with full session recordings.

Canary is an open-source QA harness purpose-built for coding agents like Claude Code, Cursor, and Codex. It reads code diffs, identifies affected UI flows, and drives real browser instances through a QuickJS WASM sandbox that exposes the full Playwright API. The project is MIT-licensed and hosted on GitHub in the `wizenheimer/canary` repository.

## What It Is

Canary sits at the intersection of browser automation and AI coding agents. Agent-driven QA usually forces a choice between opaque agent runs you can't reproduce and raw Playwright scripts you have to write and maintain by hand. Canary combines the two: the agent performs the QA and hands back a reproducible Playwright script. Every session captures screen recordings, console logs, network requests, HAR files, and Playwright traces in a self-contained `report.html` that requires no server or build step to open.

## How the Agent Workflow Works

The core loop is straightforward: describe a UI flow in plain language, and the agent drives a real browser, then returns both a human-readable report and the exact Playwright script behind it. In Claude Code, Canary registers as a first-class plugin with slash commands (`/canary:verify`, `/canary:session`, `/canary:run`, `/canary:review`) and subagents. Cursor and Codex integrations are also available via their respective plugin marketplaces, and all three point at the same `skills/`, `agents/`, and `commands/` directories.
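To give a rough idea of the kind of script such a run could hand back, here is a minimal sketch built only from standard Playwright page calls (`goto`, `waitForSelector`, `evaluate`, `screenshot`). The `PageLike` interface, the `verifyLoginPage` function, the `/login` route, the `form#login` selector, and the screenshot file name are all hypothetical placeholders, not Canary's actual output format.

```typescript
// Minimal structural type listing only the page methods this sketch calls.
// Assumption: a real Playwright Page offers methods with these names; the
// narrow interface also lets the flow run against a stub.
interface PageLike {
  goto(url: string): Promise<unknown>;
  waitForSelector(selector: string): Promise<unknown>;
  evaluate(expression: string): Promise<unknown>;
  screenshot(options: { path: string }): Promise<unknown>;
}

// Hypothetical flow: open a login page, wait for the form to appear,
// read the document title, and save a screenshot as evidence.
async function verifyLoginPage(page: PageLike, baseUrl: string): Promise<string> {
  await page.goto(`${baseUrl}/login`);
  await page.waitForSelector("form#login");
  const title = await page.evaluate("document.title");
  await page.screenshot({ path: "login.png" });
  return String(title);
}
```

Because the function only depends on the four methods it calls, the same flow can be replayed against a real browser page or checked quickly against a stub that records the calls.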

## Architecture: Three Tools, One Runtime

Canary ships as a pnpm + Turborepo monorepo with three user-facing tools sharing a single background daemon:

- **`@usecanary/cli` (`canary`)** — the main session orchestrator; records capture-enabled QA sessions and renders reports
- **`@usecanary/browser` (`canary-browser`)** — a lightweight engine for quick, one-off browser automation with no recording overhead
- **`@usecanary/ui` (`canary-viewer`)** — a local Astro-based viewer for browsing, searching, and replaying recorded sessions

The daemon runs Playwright and a QuickJS WASM sandbox. Scripts execute in a sandboxed environment with no arbitrary host access — no Node.js module system, no direct filesystem or network access from script context, and enforced memory and CPU limits.

## What Gets Captured

Every Canary session records a comprehensive evidence trail by default:

- **Video replay** with a per-step filmstrip and scrubbing
- **Playwright trace** (`trace.zip`) decodable with `npx playwright show-trace`
- **Network HAR** with per-request headers, payloads, and response inspection
- **Console log** filterable by level (errors, warnings, info, logs) with source URLs
- **Reproducible Playwright script** — the exact calls (`goto`, `waitForSelector`, `evaluate`, `screenshot`) with params and timing
- **Self-contained `report.html`** — one file, no server, committable and shareable

Individual capture streams can be disabled with `--no-trace`, `--no-video`, `--no-har`, or `--no-console`.
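Since the network capture is a HAR file, it can be inspected with ordinary JSON tooling outside the report as well. The sketch below assumes only the standard HAR 1.2 layout (`log.entries[]` with `request`, `response`, and `time`); the `failedRequests` helper and the sample data are hypothetical, and where a Canary session stores its HAR file is not assumed here.

```typescript
// Subset of the HAR 1.2 structure that this sketch reads.
interface HarEntry {
  request: { method: string; url: string };
  response: { status: number };
  time: number; // total elapsed time for the request, in milliseconds
}

interface Har {
  log: { entries: HarEntry[] };
}

// Return one summary line per request whose response status is 4xx or 5xx.
function failedRequests(har: Har): string[] {
  return har.log.entries
    .filter((entry) => entry.response.status >= 400)
    .map(
      (entry) =>
        `${entry.response.status} ${entry.request.method} ${entry.request.url} (${Math.round(entry.time)} ms)`
    );
}

// Inline sample data; a real run would load a HAR file and JSON.parse it.
const sample: Har = {
  log: {
    entries: [
      { request: { method: "GET", url: "https://example.test/" }, response: { status: 200 }, time: 12.4 },
      { request: { method: "POST", url: "https://example.test/api/login" }, response: { status: 500 }, time: 87.6 },
    ],
  },
};

console.log(failedRequests(sample));
// prints [ '500 POST https://example.test/api/login (88 ms)' ]
```

Requests that never received a response are typically recorded with status `0` in HAR output, so a stricter check might include those as failures too.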

## Setup Path

Installation requires Node 20+ and pnpm. The quickest path is:

```
npm i -g @usecanary/cli @usecanary/ui
canary install   # one-time: downloads Chromium + runtime (~150 MB) into ~/.canary
```

A guided wizard (`npm create canary@latest`) handles the full setup interactively. All commands also run one-off via `npx` without a global install. Agent plugin installation uses each agent's own marketplace mechanism — Claude Code via `/plugin marketplace add wizenheimer/canary`, Cursor via its Marketplace UI, and Codex via `codex marketplace add wizenheimer/canary`.

## Current Status

The repository was created in June 2026 and had 344 stars and 19 forks as of mid-June 2026, with active development reflected in recent pushes. The project is MIT-licensed, with portions derived from MIT-licensed work by Sawyer Hood. GitHub's metadata reports the license as "NOASSERTION", although the `LICENSE` file itself contains the standard MIT license text.

## Features
- Reads code diffs and identifies affected UI flows
- Drives real browser instances via Playwright
- QuickJS WASM sandbox with full Playwright Page API
- Full session recordings with video replay
- Playwright trace capture and decoding
- Network HAR capture with per-request inspection
- Console log capture filterable by level
- Reproducible Playwright scripts generated from every run
- Self-contained report.html with no server required
- Claude Code plugin with slash commands and subagents
- Cursor and Codex plugin integrations
- Background daemon with automatic lifecycle management
- One-off browser automation via canary-browser
- Local session viewer via canary-viewer
- Sandboxed script execution with memory and CPU limits
- CI-ready script replay with zero inference cost
- Attach to existing Chrome via remote debugging port

## Integrations
Claude Code, Cursor, Codex, Playwright, Chromium, QuickJS WASM, pnpm, Turborepo, Biome (Ultracite), pino (structured logging)

## Platforms
CLI, API

## Pricing
Open Source

## Links
- Website: https://github.com/wizenheimer/canary
- Documentation: https://github.com/wizenheimer/canary/blob/main/AGENTS.md
- Repository: https://github.com/wizenheimer/canary
- EveryDev.ai: https://www.everydev.ai/tools/canary-qa-harness