Weave Router
A prompt router that routes each request to the best quality-per-token AI model, working as a drop-in proxy for Claude Code, Codex, and Cursor.
At a Glance
Run the full router stack on your own infrastructure using your own provider keys. No managed service fees.
Engagement
Available On
Alternatives
Listed Jun 2026
About Weave Router
Weave Router is a model routing proxy built by Workweave that sits between AI coding tools and upstream LLM providers, automatically selecting the best model for each prompt based on quality-per-token scoring. It works as a transparent drop-in for Claude Code, OpenAI Codex, Cursor, and opencode, requiring no changes to how developers use those tools. The project is source-available on GitHub under the Elastic License 2.0 and can be run as a managed hosted service or self-hosted on your own infrastructure.
What It Is
Weave Router is an AI model routing layer that intercepts prompts from agentic coding tools and routes them to the most appropriate model across Anthropic, OpenAI, Google Gemini, and OpenRouter (for open-source models like DeepSeek, Kimi, Llama, Mistral, and others). Rather than using a "vibes-based prompt" to pick models, the router uses a small on-box ONNX embedder and a cluster scorer derived from the Avengers-Pro research paper (arXiv:2508.12631) to make routing decisions in under 50 milliseconds. The core claim, as stated on the product page, is that routing easy requests to cheaper open-source models at parity quality can roughly double a team's token runway on the same budget.
How the Routing Works
The router embeds each incoming prompt using a lightweight in-process ONNX model, then scores it against frozen cluster centroids to classify the request complexity. Based on that classification, it selects the model with the best quality-per-token ratio from the enabled provider pool. Key architectural properties include:
- Cache awareness: Model selection accounts for prompt caching so the router only switches models when the cost/quality tradeoff justifies losing cache hits.
- Latency overhead: The routing layer adds only low single-digit milliseconds on top of the upstream call.
- Zero-retention by default: Prompts and completions are not stored on Weave infrastructure; only routing metadata (classification label, chosen model, latency, cost) is retained.
- BYOK: Provider keys stay on the user's machine, encrypted at rest, for self-hosted deployments.
Supported Providers and Tools
The router speaks multiple API formats natively — Anthropic Messages, OpenAI Chat Completions, and Gemini native — including streaming, tool use, and vision. Supported upstream providers include Anthropic, OpenAI, Google Gemini, and OpenRouter for open-source models. On the client side, it integrates with:
- Claude Code via automatic config wiring
- OpenAI Codex CLI via
~/.codex/config.tomlpatching - opencode via
opencode.jsonprovider merging - Cursor via the OpenAI Base URL override in settings (noted as early beta)
Setup Path
The fastest path to the hosted managed service is a single command: npx @workweave/router. The installer detects installed clients, asks for scope (user vs. project), generates a router key, and wires the appropriate config files automatically. Node ≥ 18 is required. For self-hosting, the stack runs via make full-setup, which boots Postgres and the router on port 8080 with a local dashboard. The router exposes standard endpoints: POST /v1/messages (Anthropic format), POST /v1/chat/completions (OpenAI format), POST /v1beta/models/:action (Gemini format), and POST /v1/route for inspecting routing decisions without making an upstream call. OTLP traces are emitted out of the box and can be sent to Honeycomb, Datadog, Grafana, or the built-in Weave dashboard.
Current Status
The GitHub repository was created in April 2026 and last pushed in late June 2026, with 601 stars and 28 forks as of that date. The project is written in Go (requiring Go 1.25+) and is actively maintained with 41 open issues. The license is Elastic License 2.0, which permits self-hosting and modification but prohibits offering the software as a managed service to third parties. Workweave describes itself as "The #1 engineering intelligence platform" and lists the router as part of a broader platform that includes code intelligence, agent observability, and engineering analytics products.
Community Discussions
Be the first to start a conversation about Weave Router
Share your experience with Weave Router, ask questions, or help others learn from your insights.
Pricing
Self-Hosted
Run the full router stack on your own infrastructure using your own provider keys. No managed service fees.
- Full router source code on GitHub
- BYOK (Bring Your Own Key) for all providers
- Anthropic, OpenAI, Gemini, and OpenRouter support
- Built-in dashboard
- OTLP observability
Managed Service
Hosted Weave Router with a single unified usage-based credit balance. No juggling separate provider invoices.
- Single unified usage-based credit balance
- No separate Anthropic, OpenAI, OpenRouter, and Gemini invoices
- Hosted router endpoint
- Per-request model routing
- Zero-retention proxy by default
- SOC 2 Type II available on request under NDA
Capabilities
Key Features
- Per-request model routing using on-box ONNX embedder and cluster scoring
- Drop-in proxy for Claude Code, Codex, Cursor, and opencode
- Supports Anthropic Messages, OpenAI Chat Completions, and Gemini native APIs
- Streaming, tool use, and vision support across all providers
- OpenRouter integration for open-source models (DeepSeek, Kimi, Llama, Mistral, etc.)
- BYOK (Bring Your Own Key) with provider keys encrypted at rest
- Zero-retention proxy by default — only routing metadata stored
- Prompt cache-aware model selection
- OTLP traces out of the box (Honeycomb, Datadog, Grafana compatible)
- Built-in dashboard at /ui/dashboard
- Managed hosted service or self-hosted deployment options
- Single npx command installer with automatic client config wiring
- Per-repo (project) or user-level scope for config
- Toggle routing on/off per client without discarding config
- SOC 2 Type II available on request (enterprise)
