# codebase-memory-mcp

> High-performance MCP server that indexes codebases into a persistent knowledge graph with 158-language support, sub-millisecond queries, and zero dependencies.

codebase-memory-mcp is an open-source code intelligence engine built by DeusData that ships as a single static binary for macOS, Linux, and Windows. It indexes repositories into a persistent knowledge graph using tree-sitter AST analysis across 158 languages, enhanced with Hybrid LSP semantic type resolution for 9 language families. The project is MIT-licensed and available on GitHub, with the latest release at v0.8.1.

## What It Is

codebase-memory-mcp is a structural analysis backend that connects to AI coding agents via the Model Context Protocol (MCP). It builds a knowledge graph of functions, classes, call chains, HTTP routes, and cross-service links from your codebase, then exposes 14 MCP tools so agents like Claude Code, Codex CLI, Gemini CLI, and others can query that graph instead of reading files one by one. The tool does not include an LLM — it relies on the connected agent as the intelligence layer, acting purely as a fast, queryable structural index.

## Architecture and Performance

The indexing pipeline is RAM-first: it builds the graph in an in-memory SQLite database with LZ4 compression, writes it to disk in a single dump at the end, and releases the memory once indexing completes. According to the project's README and an accompanying arXiv preprint (arXiv:2603.27277), the tool fully indexes the Linux kernel (28M LOC, 75K files) in 3 minutes on Apple M3 Pro hardware, producing 4.81M nodes and 7.72M edges. Cypher-like graph queries return in under 1 ms. Across 31 real-world repositories, the preprint reports 83% answer quality with 10× fewer tokens and 2.1× fewer tool calls than file-by-file exploration.
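
The RAM-first flow described above can be sketched with Python's standard `sqlite3` module. This is a minimal illustration, not the project's implementation: the `nodes`/`edges` schema, the function names, and the file name `graph.db` are all assumptions, and the LZ4 compression step is omitted because it is not available in the standard library.

```python
import os
import sqlite3
import tempfile

# Hypothetical two-table graph schema; the real on-disk schema is not documented here.
SCHEMA = """
CREATE TABLE nodes (id INTEGER PRIMARY KEY, name TEXT NOT NULL, kind TEXT NOT NULL);
CREATE TABLE edges (src INTEGER NOT NULL, dst INTEGER NOT NULL, type TEXT NOT NULL);
"""


def index_in_memory(nodes, edges):
    """Build the whole graph in an in-memory SQLite database."""
    mem = sqlite3.connect(":memory:")
    mem.executescript(SCHEMA)
    mem.executemany("INSERT INTO nodes (id, name, kind) VALUES (?, ?, ?)", nodes)
    mem.executemany("INSERT INTO edges (src, dst, type) VALUES (?, ?, ?)", edges)
    mem.commit()
    return mem


def dump_to_disk(mem, path):
    """Copy the finished graph to disk in one pass, then free the in-memory copy."""
    disk = sqlite3.connect(path)
    mem.backup(disk)  # single bulk copy of every page
    mem.close()       # releases the memory held by the in-memory database
    disk.execute("PRAGMA journal_mode=WAL")  # persistent store runs in WAL mode
    return disk


def demo():
    """Index a two-function toy graph and return (node_count, edge_count) read from disk."""
    nodes = [(1, "main", "Function"), (2, "helper", "Function")]
    edges = [(1, 2, "CALLS")]
    with tempfile.TemporaryDirectory() as tmp:
        mem = index_in_memory(nodes, edges)
        disk = dump_to_disk(mem, os.path.join(tmp, "graph.db"))
        node_count = disk.execute("SELECT COUNT(*) FROM nodes").fetchone()[0]
        edge_count = disk.execute("SELECT COUNT(*) FROM edges").fetchone()[0]
        disk.close()
    return node_count, edge_count
```

Deferring all disk writes to one `backup()` call keeps the many small inserts off the filesystem, which is the general idea behind a RAM-first design.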

Key architectural components:
- **158 vendored tree-sitter grammars** compiled directly into the binary — nothing to install
- **Hybrid LSP layer** — a lightweight C implementation of type-resolution algorithms structurally inspired by tsserver, pyright, gopls, Roslyn, Eclipse JDT, and rust-analyzer, embedded in the binary with no language server process required
- **SQLite-backed persistence** in WAL mode at `~/.cache/codebase-memory-mcp/`
- **openCypher read subset** for graph queries including `MATCH`, `WHERE`, `WITH`, variable-length paths, and aggregates
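
To make the variable-length path queries in the last item concrete, the sketch below emulates one with a SQLite recursive CTE. The Cypher text in the comment, the `Function` label, the `name` property, and the `nodes`/`edges` tables are assumptions made for illustration; the tool's actual schema and query planner may differ.

```python
import sqlite3

# Hypothetical openCypher-style query that this sketch emulates:
#   MATCH (f:Function {name: 'handle_request'})-[:CALLS*1..3]->(g:Function)
#   RETURN g.name


def sample_graph():
    """Toy call graph: handle_request -> parse -> validate -> log, plus an unrelated node."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE nodes (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE edges (src INTEGER NOT NULL, dst INTEGER NOT NULL, type TEXT NOT NULL);
        """
    )
    conn.executemany(
        "INSERT INTO nodes (id, name) VALUES (?, ?)",
        [(1, "handle_request"), (2, "parse"), (3, "validate"), (4, "log"), (5, "unused")],
    )
    conn.executemany(
        "INSERT INTO edges (src, dst, type) VALUES (?, ?, ?)",
        [(1, 2, "CALLS"), (2, 3, "CALLS"), (3, 4, "CALLS")],
    )
    conn.commit()
    return conn


def reachable_callees(conn, start_name, max_depth):
    """Return names reachable from start_name over 1..max_depth CALLS edges, sorted."""
    rows = conn.execute(
        """
        WITH RECURSIVE walk(id, depth) AS (
            SELECT id, 0 FROM nodes WHERE name = ?
            UNION
            SELECT e.dst, w.depth + 1
            FROM walk AS w
            JOIN edges AS e ON e.src = w.id AND e.type = 'CALLS'
            WHERE w.depth < ?
        )
        SELECT DISTINCT n.name
        FROM walk AS w
        JOIN nodes AS n ON n.id = w.id
        WHERE w.depth >= 1
        ORDER BY n.name
        """,
        (start_name, max_depth),
    ).fetchall()
    return [name for (name,) in rows]
```

With the toy graph, `reachable_callees(sample_graph(), "handle_request", 2)` returns `["parse", "validate"]`. The depth bound also guarantees termination if the call graph contains cycles.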

## Hybrid LSP: Semantic Type Resolution

codebase-memory-mcp analyzes code in two layers. The tree-sitter pass runs for all 158 languages and extracts definitions, calls, and imports. The Hybrid LSP pass then refines `CALLS`, `USAGE`, and `RESOLVED_CALLS` edges with type information for 9 language families: Python, TypeScript/JavaScript/JSX/TSX, PHP, C#, Go, C/C++, Java, Kotlin, and Rust. Languages without a Hybrid LSP pass fall back to textual resolution. v0.7.0 added Python, PHP, and C#; v0.8.0 added Java, Kotlin, and Rust.
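
The difference between textual and type-aware resolution can be shown with a toy example. The sketch below is not the project's C implementation: `Definition`, `resolve_textual`, `resolve_typed`, and `resolve_call` are hypothetical names, and real type resolution also has to handle inheritance, generics, and imports.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Definition:
    owner: str  # class that declares the method, e.g. "Invoice"
    name: str   # method name, e.g. "save"

    @property
    def qualified_name(self) -> str:
        return f"{self.owner}.{self.name}"


def resolve_textual(method_name: str, definitions: List[Definition]) -> List[str]:
    """Name-only matching: every method with this name is a candidate target."""
    return [d.qualified_name for d in definitions if d.name == method_name]


def resolve_typed(receiver_type: str, method_name: str,
                  definitions: List[Definition]) -> List[str]:
    """Type-aware matching: keep only the method declared on the receiver's type."""
    return [
        d.qualified_name
        for d in definitions
        if d.name == method_name and d.owner == receiver_type
    ]


def resolve_call(receiver_type: Optional[str], method_name: str,
                 definitions: List[Definition]) -> List[str]:
    """Prefer the typed result; fall back to textual matching when no type is known."""
    if receiver_type is not None:
        typed = resolve_typed(receiver_type, method_name, definitions)
        if typed:
            return typed
    return resolve_textual(method_name, definitions)
```

Given definitions for `Invoice.save` and `User.save`, a call `user.save()` with a known receiver type of `User` resolves to one target, while the same call without type information yields both candidates.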

## Multi-Agent Support and Distribution

The `install` command auto-detects 11 coding agents (Claude Code, Codex CLI, Gemini CLI, Zed, OpenCode, Antigravity, Aider, KiloCode, VS Code, OpenClaw, and Kiro) and configures MCP server entries, instruction files, and pre-tool hooks for each. For Claude Code, a `PreToolUse` hook intercepts `Grep`/`Glob` calls and injects structured graph context alongside the normal search results without breaking the read-before-edit invariant.

The binary is distributed via npm, PyPI, Homebrew, Scoop, Winget, Chocolatey, AUR, and `go install`, as well as direct GitHub release downloads. Every release binary is statically linked with zero shared library dependencies, signed with Sigstore cosign, attested with SLSA Level 3 build provenance, and scanned by 70+ antivirus engines via VirusTotal before publication.

## Update: v0.8.1

The latest release is v0.8.1, published June 12, 2026. The v0.8.0 milestone added Hybrid LSP support for Java, Kotlin, and Rust, bringing semantic type resolution to 9 language families. Before that, v0.7.0 added Hybrid LSP for Python, PHP, and C# and sharpened Go and C/C++ resolution. The repository was created in February 2026 and has been actively maintained since; GitHub metadata showed 11,860 stars and 871 forks when this listing was compiled.

## Features
- 158-language codebase indexing via vendored tree-sitter grammars
- Hybrid LSP semantic type resolution for 9 language families: Python, TypeScript/JavaScript, PHP, C#, Go, C/C++, Java, Kotlin, and Rust
- 14 MCP tools including search, trace, architecture overview, impact analysis, and dead code detection
- Cypher-like graph query language (openCypher read subset)
- Single static binary with zero runtime dependencies
- Auto-detects and configures 11 coding agents on install
- Built-in 3D interactive graph visualization UI at `localhost:9749`
- RAM-first indexing pipeline with LZ4 compression and in-memory SQLite
- Background file watcher for automatic re-indexing on changes
- Cross-service HTTP/gRPC/GraphQL/tRPC route linking
- Team-shared graph artifact (`.codebase-memory/graph.db.zst`) that lets teammates skip re-indexing
- Infrastructure-as-code indexing for Dockerfiles, Kubernetes manifests, and Kustomize overlays
- BM25 full-text search with camelCase/snake_case-aware tokenizer
- Semantic vector search using bundled Nomic `nomic-embed-code` embeddings
- Architecture Decision Records (ADR) management
- Git diff impact mapping with risk classification
- CLI mode for invoking any MCP tool from the command line
- SLSA Level 3 build provenance and Sigstore cosign signatures on all release binaries
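
As an illustration of the camelCase/snake_case-aware tokenization listed above, the sketch below splits identifiers into lowercase search terms before they would be fed to a ranking function such as BM25. The regular expression and the function name `tokenize_identifier` are assumptions, not the project's tokenizer, and the scoring step itself is left out.

```python
import re

# Alternatives, tried in order:
#   1. an acronym followed by a capitalized word ("HTTP" in "HTTPResponse")
#   2. an optionally capitalized lowercase word ("parse", "Response")
#   3. a trailing run of capitals ("XML")
#   4. a run of digits
_WORD = re.compile(r"[A-Z]+(?=[A-Z][a-z])|[A-Z]?[a-z]+|[A-Z]+|[0-9]+")


def tokenize_identifier(identifier):
    """Split a camelCase or snake_case identifier into lowercase tokens."""
    return [match.group(0).lower() for match in _WORD.finditer(identifier)]
```

For example, `tokenize_identifier("parseHTTPResponse")` yields `["parse", "http", "response"]`, and `tokenize_identifier("get_user_id")` yields `["get", "user", "id"]`, so a search for "response" can match either naming style.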

## Integrations
Claude Code, Codex CLI, Gemini CLI, Zed, OpenCode, Antigravity, Aider, KiloCode, VS Code, OpenClaw, Kiro, npm, PyPI, Homebrew, Scoop, Winget, Chocolatey, AUR

## Platforms
Windows, macOS, Linux, Web, API, VS Code extension, CLI

## Pricing
Open Source

## Version
v0.8.1

## Links
- Website: https://deusdata.github.io/codebase-memory-mcp/
- Documentation: https://github.com/DeusData/codebase-memory-mcp
- Repository: https://github.com/DeusData/codebase-memory-mcp
- EveryDev.ai: https://www.everydev.ai/tools/codebase-memory-mcp
