    EveryDev.ai
    autocontext

    Agent Harness

    A recursive self-improving agent harness that runs LLM agents through structured scenarios, evaluates outputs, and accumulates validated knowledge so repeated runs get better over time.

    At a Glance

    Pricing
    Open Source

    Fully free and open-source under the Apache License 2.0. Install via pip or npm and use without restrictions.

    Available On

    macOS
    Linux
    API
    SDK
    CLI

    Resources

    Website, Docs, GitHub, llms.txt

    Topics

    Agent Harness, Multi-agent Systems, LLM Evaluations

    Alternatives

    AlphaClaw, Kelos, Oh My OpenAgent
    Developer
    Greyhaven AI
    Greyhaven AI builds autocontext, a recursive self-improving…

    Listed Apr 2026

    About autocontext

    autocontext is an open-source recursive self-improving harness for LLM agents that closes the loop between execution, evaluation, and knowledge accumulation. Instead of starting every agent run cold, autocontext persists what worked — traces, playbooks, artifacts, and distilled models — so each subsequent run builds on validated prior success. It supports Python and TypeScript surfaces, multiple LLM providers, and a structured multi-agent loop with roles for proposing, analyzing, coaching, and curating knowledge.

    • Scenario Families: 11 reusable scenario families (game, agent_task, simulation, investigation, workflow, negotiation, coordination, and more) executable in both Python and TypeScript.
    • Multi-Agent Loop: A structured internal loop with competitor, analyst, coach, architect, and curator roles that propose, evaluate, and gate knowledge persistence.
    • Persistent Knowledge: Validated playbooks, hints, tools, reports, and progress snapshots accumulate across runs rather than being discarded.
    • Multiple Surfaces: Access via CLI (autoctx), REST API server, MCP server, TypeScript/TUI operator surfaces, and external agent integration.
    • Provider Routing: Supports Anthropic, OpenAI-compatible endpoints, Gemini, Mistral, Groq, OpenRouter, Azure OpenAI, MLX (Apple Silicon), Pi, and deterministic testing backends.
    • Frontier-to-Local Distillation: Export stable training data and distill it into cheaper local runtimes using MLX on Apple Silicon.
    • Replay and Analysis: Inspect, compare, and replay runs, simulations, investigations, and missions to understand regressions and stable wins.
    • Notification Hooks: Route notifications via Slack, HTTP webhooks, stdout, or composite routing using AUTOCONTEXT_NOTIFY_* env vars.
    • Quick Start: Install via pip install autocontext or npm install autoctx, then run autoctx solve --description "your task" --gens 3 to hand the harness a plain-language task and let it iterate.
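The execute, evaluate, and persist cycle described above can be pictured with a small sketch. This is not autocontext's API: `KnowledgeStore`, `run_generations`, and the `attempt`, `evaluate`, and `distill` callables are hypothetical names chosen for illustration, and the gating rule (keep a hint only when the score improves) is an assumed simplification of the validation the listing describes.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class KnowledgeStore:
    """Hypothetical stand-in for persisted playbooks and hints."""
    hints: list[str] = field(default_factory=list)
    best_score: float = float("-inf")


def run_generations(
    attempt: Callable[[list[str]], str],
    evaluate: Callable[[str], float],
    distill: Callable[[str], str],
    store: KnowledgeStore,
    gens: int = 3,
) -> KnowledgeStore:
    """Run `gens` rounds and persist a hint only when the score improves."""
    for _ in range(gens):
        output = attempt(store.hints)   # execution, seeded with prior knowledge
        score = evaluate(output)        # evaluation of this round's output
        if score > store.best_score:    # gate: only validated gains are kept
            store.best_score = score
            store.hints.append(distill(output))
    return store
```

With a toy `attempt` whose output improves as hints accumulate, each round builds on the last; with an `evaluate` that never improves, only the first round's hint is stored. The `gens` parameter mirrors the `--gens 3` flag in the quick start only by name.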
    Pricing

    Open Source

    • Full Python control-plane CLI
    • TypeScript package with CLI and library surface
    • All 11 scenario families
    • Multi-agent loop execution
    • Persistent knowledge and artifacts

    Capabilities

    Key Features

    • Recursive self-improving agent harness
    • 11 reusable scenario families
    • Structured multi-agent loop (competitor, analyst, coach, architect, curator)
    • Persistent playbooks, hints, and knowledge across runs
    • Staged validation and harness-aware execution
    • Replays, checkpoints, and exported artifacts
    • Frontier-to-local distillation with MLX on Apple Silicon
    • CLI, API server, MCP, and TypeScript/TUI surfaces
    • Multi-provider LLM routing (Anthropic, OpenAI-compatible, Gemini, Mistral, Groq, OpenRouter, Azure, MLX, Pi)
    • Notification hooks via Slack, HTTP webhooks, stdout
    • Export training data for downstream systems
    • Verifier-driven missions with checkpoints and completion criteria
    • Campaign coordination for multi-mission workflows
    • V8 isolate codegen for secure TypeScript execution
    • Subprocess-based executors for Python with SSH and sandboxed options
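As a rough illustration of the notification hooks listed above, the sketch below shows one way composite routing could work: a single message is fanned out to several sinks. Only the `AUTOCONTEXT_NOTIFY_` prefix comes from the listing; the function names, the `Notifier` type, and the idea of stripping the prefix are assumptions, and the individual variable names autocontext actually reads are not given here.

```python
import os
from typing import Callable, Mapping

Notifier = Callable[[str], None]

PREFIX = "AUTOCONTEXT_NOTIFY_"  # prefix taken from the listing; suffixes are not specified


def notify_settings(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Collect every variable that starts with the prefix, keyed by its suffix."""
    return {k[len(PREFIX):]: v for k, v in env.items() if k.startswith(PREFIX)}


def composite(*notifiers: Notifier) -> Notifier:
    """Fan one message out to all notifiers; a failing sink does not block the rest."""
    def send(message: str) -> None:
        for notifier in notifiers:
            try:
                notifier(message)
            except Exception as exc:  # report and keep delivering to other sinks
                print(f"notifier failed: {exc}")
    return send
```

A stdout sink is simply `print`, so `composite(print, some_webhook_sender)` would deliver to both, where `some_webhook_sender` is whatever callable posts to Slack or an HTTP endpoint.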

    Integrations

    Anthropic Claude
    OpenAI
    Gemini
    Mistral
    Groq
    OpenRouter
    Azure OpenAI
    MLX
    Pi
    Claude CLI
    Codex CLI
    Hermes CLI
    vLLM
    Ollama
    Slack
    PrimeIntellect
    MCP (Model Context Protocol)
    OpenClaw
    API Available
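Multi-provider routing with a deterministic testing backend, as listed above, can be sketched as a name-to-callable registry. Everything here is hypothetical (`ProviderRouter`, `deterministic_backend`, the `"deterministic"` key); it shows the general pattern, not how autocontext selects providers.

```python
from typing import Callable

Backend = Callable[[str], str]


def deterministic_backend(prompt: str) -> str:
    """Return the same reply for the same prompt so tests are repeatable."""
    return f"echo:{prompt}"


class ProviderRouter:
    """Hypothetical registry mapping a provider name to a completion callable."""

    def __init__(self) -> None:
        self._backends: dict[str, Backend] = {"deterministic": deterministic_backend}

    def register(self, name: str, backend: Backend) -> None:
        """Add or replace the backend used for `name`."""
        self._backends[name] = backend

    def complete(self, provider: str, prompt: str) -> str:
        """Dispatch the prompt to the named backend, or fail with the known names."""
        try:
            backend = self._backends[provider]
        except KeyError:
            known = ", ".join(sorted(self._backends))
            raise ValueError(f"unknown provider {provider!r}; known: {known}") from None
        return backend(prompt)
```

A real backend would wrap a provider SDK call behind the same `Backend` signature, which lets tests swap in the deterministic one without touching the calling code.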

    Developer

    Greyhaven AI

    Greyhaven AI builds autocontext, a recursive self-improving harness for LLM agents that accumulates validated knowledge across runs. The project is open-source under the Apache License 2.0 and supports Python and TypeScript surfaces. It integrates with major LLM providers including Anthropic, OpenAI-compatible endpoints, Gemini, Mistral, and local runtimes via MLX.


    Similar Tools

    AlphaClaw

    AlphaClaw is an open-source OpenClaw harness and fleet manager that lets you deploy, monitor, and manage AI agents with self-healing watchdog, auto git backup, and a browser dashboard — no SSH required.

    Kelos

    Kelos is an open-source Kubernetes-native framework for orchestrating autonomous AI coding agents like Claude Code, Codex, Gemini, and OpenCode.

    Oh My OpenAgent

    An open-source agent harness for CLI-based AI coding workflows, featuring specialized agents, parallel execution, session continuity, and 40+ lifecycle hooks.

    Related Topics

    Agent Harness

    Infrastructure, orchestrators, and task runners that wrap around LLM coding agents — covering session management, context delivery, worktree isolation, architecture enforcement, and issue-to-PR pipelines.

    53 tools

    Multi-agent Systems

    Platforms for creating and managing teams of AI agents that can collaborate.

    106 tools

    LLM Evaluations

    Platforms and frameworks for evaluating, testing, and benchmarking LLM systems and AI applications. These tools provide evaluators and evaluation models to score AI outputs, measure hallucinations, assess RAG quality, detect failures, and optimize model performance. Features include automated testing with LLM-as-a-judge metrics, component-level evaluation with tracing, regression testing in CI/CD pipelines, custom evaluator creation, dataset curation, and real-time monitoring of production systems. Teams use these solutions to validate prompt effectiveness, compare models side-by-side, ensure answer correctness and relevance, identify bias and toxicity, prevent PII leakage, and continuously improve AI product quality through experiments, benchmarks, and performance analytics.

    59 tools
    With AI, Everyone is a Dev. EveryDev.ai © 2026