AgentMemory

Name: AgentMemory
Availability: OnlineOnly
Author: Rohit Ghumare

Persistent memory engine for AI coding agents that captures every session, recalls context in milliseconds, and runs locally with zero external databases.

At a Glance

Pricing

Open Source

Fully open-source under Apache-2.0. Free to use, modify, and distribute.

Available On

Windows

macOS

Linux

Web

API

Rohit Ghumare, London, United Kingdom (Est. 2026)

Listed Jun 2026

About AgentMemory

AgentMemory is an open-source, self-hosted memory runtime for AI coding agents, published under the Apache-2.0 license by Rohit Ghumare. It runs as a single Node.js process on your machine, capturing everything your agent observes across sessions and injecting the relevant context at the start of each new one. The project has reached v0.9.27 and reports over 23,000 GitHub stars as of mid-2026.

What It Is

AgentMemory is a complete memory layer — not a library or a vector store — that sits between your coding agent and its context window. It silently intercepts agent lifecycle events via hooks, compresses raw observations into structured memories, and retrieves them using a triple-stream hybrid search (BM25 + vector + knowledge graph). The entire runtime is one process with no Redis, Kafka, Postgres, Qdrant, or Neo4j required; state lives on disk as SQLite/JSON.

How the Memory Pipeline Works

The system operates in three stages that the project calls Hooks, Recall, and Consolidate:

Hooks — 12 auto-capture hooks fire on every PreToolUse, PostToolUse, SessionStart, Stop, and related events, piping observations into the memory pipeline without any manual glue code.
Recall — Triple-stream retrieval fuses BM25 lexical matching, dense vector cosine similarity, and knowledge graph traversal via Reciprocal Rank Fusion (RRF). The project reports a p50 latency under 20 ms on a laptop and a 95.2% R@5 score on the LongMemEval-S benchmark (500 questions).
Consolidate — Hourly sweeps compress raw observations into a 4-tier memory hierarchy (Working → Episodic → Semantic → Procedural), merge duplicates, apply Ebbinghaus-curve decay to stale rows, and emit a batched audit row on every delete.
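
The Recall stage above fuses three ranked lists with Reciprocal Rank Fusion. A minimal sketch of that technique follows, assuming each stream returns memory ids ordered best-first. The function name, the sample ids, and the constant k = 60 (a common default for RRF) are illustrative assumptions, not AgentMemory's actual code or settings.

```typescript
// Reciprocal Rank Fusion (RRF): each list contributes 1 / (k + rank) per id,
// and ids are ordered by the sum of their contributions.
// Assumption: k = 60 is a commonly used default; the value AgentMemory
// uses is not stated in this listing.
function reciprocalRankFusion(rankings: string[][], k = 60): [string, number][] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first.
  return Array.from(scores.entries()).sort((a, b) => b[1] - a[1]);
}

// Hypothetical memory ids returned by each of the three streams.
const bm25Hits = ["m1", "m2", "m3"];
const vectorHits = ["m2", "m1", "m4"];
const graphHits = ["m2", "m5"];

const fused = reciprocalRankFusion([bm25Hits, vectorHits, graphHits]);
console.log(fused.map(([id]) => id));
```

In this example m2 ranks first because it appears near the top of all three lists. RRF works on ranks rather than raw scores, so BM25 scores and cosine similarities never need to be put on a common scale.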

Agent Compatibility and MCP Surface

AgentMemory ships native plugins for Claude Code (12 hooks + MCP + skills), Codex CLI (6 hooks + MCP), GitHub Copilot CLI, OpenClaw, Hermes, pi, and OpenHuman. Any other MCP-compatible agent — Cursor, Windsurf, Cline, Roo Code, Gemini CLI, Warp, Continue, Zed, Aider, Goose, and more — connects via a universal mcpServers JSON block. The MCP surface exposes 53 tools, 6 resources, and 3 prompts; every MCP tool also has a REST twin under /agentmemory/* across 128 total endpoints on port 3111.
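
The mcpServers block mentioned above could be generated along these lines. This is a sketch under assumptions: the package name @agentmemory/mcp appears in this listing, but launching it with npx -y, the server key "agentmemory", and the helper name buildMcpConfig are illustrative guesses. Check the project's documentation for the exact command.

```typescript
// Shape of one entry in an mcpServers block, as accepted by many MCP clients.
interface McpServerEntry {
  command: string;
  args: string[];
}

// Assumption: the shim is launched with `npx -y @agentmemory/mcp`.
// Only the package name comes from this listing; the command and the
// "agentmemory" key are hypothetical.
function buildMcpConfig(): { mcpServers: Record<string, McpServerEntry> } {
  return {
    mcpServers: {
      agentmemory: { command: "npx", args: ["-y", "@agentmemory/mcp"] },
    },
  };
}

// Print the JSON to paste into an MCP client's configuration file.
const mcpConfigJson = JSON.stringify(buildMcpConfig(), null, 2);
console.log(mcpConfigJson);
```

Where the printed JSON goes depends on the client, since each MCP-compatible agent reads its server list from its own configuration file.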

Architecture: Built on the iii Engine

AgentMemory runs on the iii engine, a worker/function/trigger runtime that replaces Express, SQLite migrations, pm2, and Prometheus with three primitives. This means the entire stack is one process, and capabilities like durable queues, pub/sub federation, OTEL observability, and sandboxed code execution can be added with a single iii worker add command. The project ships two UIs: a real-time memory viewer on port 3113 (live observation stream, session explorer, knowledge graph visualization) and the iii console on port 3114 (OpenTelemetry waterfall, KV browser, function invocation).

Update: v0.9.27

The latest release is v0.9.27, published on 2026-06-07. The GitHub repository was last pushed on 2026-06-15, after that release, and lists 322 open issues. The project pins iii-engine to v0.11.2 while a refactor for the v0.11.6 sandbox model is in progress. Recent changelog activity includes fixes for Claude Code hook path resolution after upgrades (#508) and Codex Desktop plugin hook dispatch (#16430 upstream), along with a new agentmemory connect antigravity adapter for the post-Gemini-CLI-sunset Antigravity agent. The project self-reports 1,428 passing tests across 174 source files and approximately 37,800 lines of TypeScript.

Tradeoffs to Know

Running without external databases is a design constraint, not just a feature: because state is held in-process as SQLite/JSON, horizontal scaling requires the optional iii-pubsub worker for P2P mesh federation.
LLM calls are opt-in: by default, no LLM provider is configured and compression falls back to synthetic BM25 summarization. Full semantic compression requires setting an API key or pointing at a local Ollama/LM Studio server.
Windows support requires a separately installed iii-engine binary (no PowerShell installer or scoop/winget package exists); WSL2 is the recommended fast path.
The @agentmemory/mcp shim exposes only 7 tools when no running agentmemory server is reachable; the full 53-tool surface requires the server process to be running.

Pricing

Open Source

53 MCP tools
12 auto-capture hooks
128 REST endpoints
Triple-stream hybrid retrieval (BM25 + vector + knowledge graph)
4-tier memory consolidation

Capabilities

Key Features

12 auto-capture hooks for every agent lifecycle event
Triple-stream hybrid retrieval: BM25 + vector + knowledge graph
53 MCP tools with REST API twins (128 endpoints)
4-tier memory consolidation pipeline (Working, Episodic, Semantic, Procedural)
Zero external databases — runs as a single Node.js process on SQLite
Real-time memory viewer on port 3113
iii engine console with OpenTelemetry traces on port 3114
Knowledge graph extraction with entity and relation support
Hourly consolidation sweeps with Ebbinghaus-curve memory decay
Session replay with scrubbing, play/pause, and speed control
JSONL transcript import for Claude Code session backfill
P2P mesh federation via iii-pubsub worker
Obsidian vault export with frontmatter-tagged markdown
Multi-agent scoping with AGENT_ID tagging and isolated/shared modes
Git-versioned memory snapshots
Privacy filter strips API keys and secrets before storage
Support for 5+ LLM providers including local Ollama/LM Studio
Local embeddings via all-MiniLM-L6-v2 (no API key required)
One-click deploy templates for fly.io, Railway, Render, and Coolify
15 native skills installable via npx skills add
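
The Ebbinghaus-curve decay listed among the features can be illustrated with the classic forgetting-curve form R = exp(-t / S). This is a sketch under assumptions: the row shape, the threshold of 0.2, the function names, and the audit record format are all hypothetical, and the listing does not state which formula or parameters AgentMemory actually uses.

```typescript
// Ebbinghaus-style retention: R = exp(-t / S), where t is the time since the
// memory was last accessed and S is a stability constant in the same unit.
function retention(hoursSinceAccess: number, stabilityHours: number): number {
  if (stabilityHours <= 0) throw new RangeError("stabilityHours must be positive");
  return Math.exp(-Math.max(0, hoursSinceAccess) / stabilityHours);
}

// Hypothetical row shape, for illustration only.
interface MemoryRow {
  id: string;
  hoursSinceAccess: number;
  stabilityHours: number;
}

// Hypothetical sweep: keep rows whose retention is at or above the threshold,
// and describe all deletions in a single batched audit record.
function sweep(rows: MemoryRow[], threshold = 0.2) {
  const kept: MemoryRow[] = [];
  const deletedIds: string[] = [];
  for (const row of rows) {
    if (retention(row.hoursSinceAccess, row.stabilityHours) >= threshold) {
      kept.push(row);
    } else {
      deletedIds.push(row.id);
    }
  }
  const audit = deletedIds.length > 0 ? { action: "delete", ids: deletedIds } : null;
  return { kept, audit };
}

const sweepResult = sweep([
  { id: "fresh", hoursSinceAccess: 2, stabilityHours: 24 },
  { id: "aging", hoursSinceAccess: 30, stabilityHours: 24 },
  { id: "stale", hoursSinceAccess: 72, stabilityHours: 24 },
]);
console.log(sweepResult.kept.map((row) => row.id), sweepResult.audit);
```

With a stability of 24 hours, a row untouched for 72 hours has a retention of about 0.05 and falls below the threshold, while rows accessed 2 or 30 hours ago are kept.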

Integrations

Claude Code

GitHub Copilot CLI

Codex CLI

Cursor

Windsurf

Cline

Roo Code

Kilo Code

Continue

Zed

Warp

Gemini CLI

OpenCode

Goose

Aider

Claude Desktop

OpenClaw

Hermes

OpenHuman

Droid (Factory.ai)

Antigravity

Kiro (AWS)

Qwen Code

Anthropic API

OpenAI API

Gemini API

OpenRouter

MiniMax

Ollama

LM Studio

vLLM

Voyage AI

Cohere

Jaeger

Honeycomb

Grafana Tempo

Obsidian

fly.io

Railway

Render

Coolify

API Available
