# PMB

> Local-first persistent memory for AI coding agents (Claude Code, Cursor, Codex) over MCP, storing decisions, lessons, and facts in a single SQLite file on your disk with no cloud or API keys required.

PMB is an open-source, local-first memory layer for AI coding agents, published under the Apache 2.0 license and installable via `pip install pmb-ai`. It solves the session-amnesia problem: every time you start a new chat with Claude Code, Cursor, Codex, or another MCP-aware agent, the agent forgets everything from the previous session. PMB captures decisions, lessons, personal facts, project structure, and commit intent in a single SQLite file on your disk, then feeds the right context back to the agent through the Model Context Protocol before each response.

## What It Is

PMB is a CLI tool and MCP server that gives AI coding agents persistent, queryable memory without any cloud dependency. It sits between your agent and a local SQLite + LanceDB store, exposing 29 MCP tools (including the core `prepare(message)` call) over stdio. The agent never needs to remember to call a tool — lifecycle hooks inject the right memory before the model thinks and journal the agent's work after, automatically. The entire workspace is a directory under `~/.pmb/<name>/` that you own, can copy, export, or back up with a single `cp`.

## How Recall Works

PMB uses a four-layer hybrid retrieval pipeline whose rankings are fused with Reciprocal Rank Fusion (RRF):

- **BM25** — lexical ranking, self-compiles a lexicon from your own traffic
- **Dense vectors** — `paraphrase-multilingual-MiniLM-L12-v2` covers 50+ languages; a query in one language finds facts stored in another
- **Entity graph** — multi-hop Personalized PageRank diffusion, gated by intent
- **Optional cross-encoder rerank** — off by default (the project notes it regresses LoCoMo benchmark scores)
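
To make the fusion step concrete, here is a minimal sketch of Reciprocal Rank Fusion over three ranked lists. It is not PMB's implementation: the function name, the sample fact ids, and the constant `k=60` (a common default for RRF) are assumptions chosen for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of ids into one ranking.

    Each list contributes 1 / (k + rank) for every id it contains,
    where rank starts at 1. k=60 is an assumed default, not a PMB setting.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical outputs of the lexical, dense, and graph layers.
bm25 = ["fact_a", "fact_b", "fact_c"]
dense = ["fact_b", "fact_a", "fact_d"]
graph = ["fact_b", "fact_d"]

fused = reciprocal_rank_fusion([bm25, dense, graph])
print(fused)
```

An id that ranks well in several layers (here `fact_b`) rises above one that tops a single layer, which is the property that makes RRF a convenient way to combine rankers with incomparable raw scores.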

Writes are asynchronous: the MCP tool returns in under a millisecond, while the embedding and the LanceDB vector insert happen on a background thread. The project publishes benchmark numbers measured on its own engine: recall@10 of 94.5% on the LoCoMo-10 dataset (997 questions, no LLM grader), p50 warm recall latency of ~35 ms, and `prepare(message)` returning in 4–16 ms.
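
The write path described above follows a familiar pattern: enqueue the work, return immediately, and let a background thread do the slow part. The sketch below illustrates that pattern only. `AsyncWriter`, `record`, and `flush` are hypothetical names, and the `index_fn` callback stands in for the embedding and vector insert.

```python
import queue
import threading


class AsyncWriter:
    """Accept writes instantly and index them on a background thread."""

    def __init__(self, index_fn):
        self._queue = queue.Queue()
        self._index_fn = index_fn  # stand-in for embed + vector insert
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def record(self, text):
        # Only enqueues; the caller never waits for indexing to finish.
        self._queue.put(text)
        return "queued"

    def _run(self):
        while True:
            text = self._queue.get()
            try:
                self._index_fn(text)
            finally:
                # Mark the item done even if indexing raised.
                self._queue.task_done()

    def flush(self):
        # Block until every queued item has been processed.
        self._queue.join()


indexed = []
writer = AsyncWriter(indexed.append)
status = writer.record("decision: use SQLite for storage")
writer.flush()
print(status, indexed)
```

The trade-off of this design is that a freshly written item is not searchable until the worker has processed it, which is why the sketch exposes an explicit `flush`.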

## Hooks and Ambient Memory

The hooks system is what separates PMB from a plain RAG pipeline. Four lifecycle hooks are wired in at the protocol level via `pmb hooks install claude-code`:

- **UserPromptSubmit → auto-recall**: every message is classified in under a millisecond; matching lessons, decisions, and the project overview are injected before the model reasons
- **PostToolUse → ambient observe**: every tool the agent runs is appended to a lightweight action journal (a single SQLite INSERT, no model call)
- **SessionStart → session-restore**: after a context compaction the agent rebuilds "where you left off" from what the session recorded
- **Stop → follow-through + ambient auto-write**: checks which surfaced lessons appeared in what the agent did, marks them followed deterministically, and synthesizes one activity entry if the agent didn't call a `record_*` tool
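
The ambient observe step is described as a single SQLite INSERT with no model call. A minimal sketch of such a journal, using Python's built-in `sqlite3` module, might look like the following. The table name, column names, and the `observe` helper are assumptions for illustration and do not reflect PMB's actual schema.

```python
import json
import sqlite3
import time

# In-memory database for the example; a real journal would live on disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE action_journal ("
    "id INTEGER PRIMARY KEY, "
    "ts REAL NOT NULL, "
    "tool TEXT NOT NULL, "
    "args TEXT NOT NULL)"
)


def observe(conn, tool, args):
    """Append one tool call to the journal with a single INSERT."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO action_journal (ts, tool, args) VALUES (?, ?, ?)",
            (time.time(), tool, json.dumps(args)),
        )


observe(conn, "Edit", {"file": "app.py"})
observe(conn, "Bash", {"command": "pytest"})

rows = conn.execute("SELECT tool FROM action_journal ORDER BY id").fetchall()
print(rows)
```

Because each observation is one parameterized insert, the cost per tool call stays small and predictable, and a later step can read the journal back in order to summarize the session.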

The ambient write side is template-based by default (instant, no model) and can be pointed at a local Ollama instance for richer summaries.

## Self-Improvement and Memory Hygiene

Every surfaced lesson carries a `surface_id`. PMB tracks whether the agent actually followed it, either confirmed by the agent or auto-detected by the Stop hook, and surfaces per-rule stats in the Lessons dashboard tab: `★ USEFUL` (followed ≥ 2×), `? UNVERIFIED`, and `💀 DEAD` (ignored ≥ 2×). Memory also decays, archives, and deduplicates on its own. Deduplication runs across four layers (exact match, cosine ≥ 0.92 auto-merge, cosine 0.80–0.92 borderline, manual review). Old values are archived, never deleted; full history is available via `keyed_fact_as_of(t)`.
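
The deduplication thresholds above can be read as a simple tiering rule. The sketch below is one possible interpretation, not PMB's code: the function names are hypothetical, exact matching is assumed to ignore case and surrounding whitespace, and a similarity of exactly 0.92 is assumed to fall into the auto-merge tier. How borderline pairs reach manual review is not described in the source, so the sketch only labels them.

```python
import math

AUTO_MERGE = 0.92  # cosine >= 0.92: merge automatically
BORDERLINE = 0.80  # 0.80 <= cosine < 0.92: flag as borderline


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def dedup_tier(new_text, old_text, new_vec, old_vec):
    """Classify a candidate pair into a deduplication tier."""
    # Assumed normalization for the exact-match layer.
    if new_text.strip().lower() == old_text.strip().lower():
        return "exact"
    similarity = cosine(new_vec, old_vec)
    if similarity >= AUTO_MERGE:
        return "auto_merge"
    if similarity >= BORDERLINE:
        return "borderline"
    return "distinct"


# Toy 2-D vectors stand in for real sentence embeddings.
print(dedup_tier("Use SQLite", "use sqlite ", [1.0, 0.0], [0.0, 1.0]))
print(dedup_tier("Use SQLite", "Store in SQLite", [1.0, 0.0], [1.0, 0.1]))
print(dedup_tier("Use SQLite", "Prefer a local DB", [1.0, 0.0], [1.0, 0.6]))
print(dedup_tier("Use SQLite", "Deploy on Fridays", [1.0, 0.0], [0.0, 1.0]))
```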

## Update: v1.2.2

The latest release is v1.2.2, published on 2026-06-30, according to the GitHub repository. The project was created in May 2026 and has seen active development, with the last push on the same date as the release. The roadmap (per `docs/ROADMAP.md`) lists litestream backup, optional cloud-sync (BYO bucket), tree-sitter project indexing, and image OCR as upcoming directions. The project is maintained by one full-time maintainer and accepts community contributions via GitHub issues and PRs.

## Features
- Local-first SQLite + LanceDB storage, no cloud or API keys
- MCP-native: 29 tools over stdio, wires to Claude Code, Cursor, Codex, Zed, Windsurf, VS Code, and more
- Hybrid recall: BM25 + dense vectors + entity graph + optional rerank, fused with RRF
- Auto-recall hooks inject memory before the model thinks on every prompt
- Ambient auto-write journals agent work even when the agent forgets to call `record_*`
- Lesson follow-through scoring: marks rules USEFUL, UNVERIFIED, or DEAD
- Local web dashboard with Map (entity graph), Timeline, Lessons, Duplicates, and Performance tabs
- Multilingual embeddings (50+ languages) via `paraphrase-multilingual-MiniLM-L12-v2`
- PDF, codebase, Markdown, and ChatGPT export ingestion
- Memory decay, dedup, and archival — nothing deleted behind your back
- Optional local Ollama integration for graph extraction and summaries
- Team/shared workspace via optional HTTP mode with bearer-token auth
- Export to Markdown/JSON with `pmb export`
- Secret auto-redaction at write time (API keys, tokens)
- 105 tunables via `pmb config`: 25 default-tier and 80 advanced

## Integrations
Claude Code, Cursor, Codex, Zed, Windsurf, Gemini, GitHub Copilot, VS Code, OpenCode, Continue, Ollama, LanceDB, SQLite, MCP (Model Context Protocol), PyPI, npm (`npx pmb-ai setup`)

## Platforms
Windows, macOS, Linux, Web, API, VS Code extension, CLI

## Pricing
Open Source

## Version
v1.2.2

## Links
- Website: https://pmbai.dev
- Documentation: https://docs.pmbai.dev/
- Repository: https://github.com/oleksiijko/pmb
- EveryDev.ai: https://www.everydev.ai/tools/pmb