# Crux

> Open-source TypeScript toolkit for harness engineering — typed building blocks for prompts, context, memory, retrieval, guardrails, and observability around your LLM calls.

Crux is an open-source TypeScript toolkit for what its authors call "harness engineering" — the discipline of deliberately assembling, inspecting, and testing everything around a model call. It sits alongside your existing SDK (Vercel AI SDK, OpenAI, Anthropic, Google GenAI) rather than replacing it, providing typed building blocks for the pieces that most commonly cause AI feature failures: stale context, missing memory, dropped instructions, unsafe inputs, and untested regressions. The project is licensed under Apache-2.0 and is currently in public alpha.

## What It Is

Crux is a modular TypeScript library that structures the "harness" around LLM calls. The core idea is that bad model output is rarely a model problem — it is usually a problem with what gets sent to the model. Crux makes those surrounding pieces explicit: typed `prompt()` definitions with Zod input/output schemas, composable `context()` blocks, `memory()` for recent messages and facts, `retriever()` for RAG pipelines, `guardrail()` for PII and injection filtering, `constraint()` for semantic output validation with retry, and `evaluate()` for quality suites and CI-friendly baselines. All of these plug into a single `use:` array on a prompt definition, and the SDK you already use still makes the actual model call.

## Architecture: Define → Resolve → Adapt → Observe

Every Crux execution follows a four-stage pipeline:

- **Define** — Author pure TypeScript definitions (prompts, contexts, memory blocks, tools, agents, flows, tests) that do not import a provider SDK.
- **Resolve** — At call time, Crux validates input, filters conditional blocks, merges tools and settings, applies token budgets, and produces a provider-agnostic resolved prompt.
- **Adapt** — An adapter maps the resolved prompt to Vercel AI SDK, OpenAI, Anthropic, Google GenAI, Convex Agent, or another runner.
- **Observe** — Hooks emit structured events for generations, context resolution, memory reads/writes, retrieval, tools, evals, judge scores, artifacts, errors, and cost.

This separation means you can inspect what the model will see before the call runs, execute the same prompt through multiple providers, and keep quality checks tied to the definitions they protect.

## Package Ecosystem

Crux ships as a family of focused packages rather than a monolithic framework:

- `@crux/core` — SDK-agnostic primitives for prompts, contexts, memory, retrieval, safety, routing, quality, agents, and observability
- `@crux/ai` — Vercel AI SDK adapter for `generate`, `stream`, and structured output
- `@crux/openai`, `@crux/anthropic`, `@crux/google` — Provider-specific adapters
- `@crux/convex` — Convex storage, server boundaries, agent bridge, and swarm integration
- `@crux/upstash` — Upstash Vector and Redis-backed storage adapters
- `@crux/otel` — OpenTelemetry integration for production traces (Datadog, Honeycomb, Grafana, New Relic)
- `@crux/local` — Native local runtime, CLI, TUI, HTTP/WS server, embedded devtools, eval runner, and catalog
- `@crux/devtools` — React devtools UI for traces, evals, source catalog, memory, plans, and runtime inspection

## Observability and Evaluation

Crux provides two observability surfaces. In development, `crux dev` and `crux traces` open a visual devtools UI and terminal dashboard showing live trace timelines, resolved system previews, memory operations, and rolling quality averages. In production, `@crux/otel` exports OpenTelemetry spans to any compatible platform and is documented to work in Lambda, Convex, and Cloudflare Workers. For evaluation, Crux supports local quality suites with built-in judges (`faithfulness`, `relevance`, `safety`), prompt tests, variants, cassettes, baselines, and CI-friendly runs via `crux quality run`.

## Current Status: Alpha

The GitHub README explicitly labels Crux as alpha software: "APIs may change, things may break, and examples may lag behind the implementation until the first stable release." The repository was created in May 2026 and last pushed in June 2026. The shipped foundation includes typed prompts and contexts, conditional and budgeted context resolution, memory blocks, retrieval and grounding, guardrails and constraints, routing and fallback, quality suites, a canonical observability graph, local devtools/runtime, and OpenTelemetry export. The README notes that a deeper "proof layer" — whole-call decision reports, richer rationale artifacts, a unified freshness vocabulary, and a polished harness-decision matcher library — is still in progress. Public npm packages are listed as "pending" on the homepage. TypeScript compatibility is verified against `>=5.5 <7`, with TypeScript 7 tracked as a preview lane.

## Features
- Typed prompt() definitions with Zod input/output schemas
- Composable context() blocks for brand voice, policies, and shared tools
- Memory blocks: recent messages, facts, episodes, procedures, and policies
- Retrieval: indexers, corpora, retrievers, rerankers, grounding, and citations
- Guardrails for PII detection, prompt injection, and safety filtering
- Constraints for semantic output validation with retry and feedback
- Model routing, fallback, semantic cache, pricing tables, and budgets
- Quality suites with built-in judges (faithfulness, relevance, safety)
- CI-friendly evaluation runner with baselines and variants
- Local devtools with live trace timeline and terminal dashboard
- OpenTelemetry export for Datadog, Honeycomb, Grafana, and New Relic
- Agent composition: pipelines, parallel runs, consensus, swarms, blackboards, handoffs
- SDK-agnostic prompt definitions with provider adapters
- Single use: array composition model for all blocks
- TypeScript >=5.5 <7 compatibility

## Integrations
Vercel AI SDK, OpenAI SDK, Anthropic SDK, Google GenAI SDK, Convex, Upstash Vector, Upstash Redis, Datadog, Honeycomb, Grafana, New Relic, OpenTelemetry, Next.js, Node.js, Expo / React Native, Cloudflare Workers, AWS Lambda, Zod

## Platforms
LINUX, WEB, API, DEVELOPER_SDK, CLI

## Pricing
Open Source

## Version
alpha

## Links
- Website: https://cruxjs.dev/
- Documentation: https://cruxjs.dev/docs
- Repository: https://github.com/use-crux/crux
- EveryDev.ai: https://www.everydev.ai/tools/crux
