codebase-memory-mcp
High-performance MCP server that indexes codebases into a persistent knowledge graph with 158-language support, sub-millisecond queries, and zero dependencies.
At a Glance
Fully free and open source under the MIT License. Download, use, modify, and distribute without restriction.
Listed Jun 2026
About codebase-memory-mcp
codebase-memory-mcp is an open-source code intelligence engine built by DeusData that ships as a single static binary for macOS, Linux, and Windows. It indexes repositories into a persistent knowledge graph using tree-sitter AST analysis across 158 languages, enhanced with Hybrid LSP semantic type resolution for 9 language families. The project is MIT-licensed and available on GitHub, with the latest release at v0.8.1.
What It Is
codebase-memory-mcp is a structural analysis backend that connects to AI coding agents via the Model Context Protocol (MCP). It builds a knowledge graph of functions, classes, call chains, HTTP routes, and cross-service links from your codebase, then exposes 14 MCP tools so agents like Claude Code, Codex CLI, Gemini CLI, and others can query that graph instead of reading files one by one. The tool does not include an LLM — it relies on the connected agent as the intelligence layer, acting purely as a fast, queryable structural index.
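As a rough illustration of what "querying the graph over MCP" looks like on the wire, the sketch below builds a JSON-RPC 2.0 `tools/call` request, the message shape that MCP defines for invoking a server tool. The tool name `search_graph` and its argument keys are hypothetical placeholders, not names confirmed by the project; a real client would first discover the actual tool names from the server.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message (a plain dict)."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical tool name and argument keys, used only for illustration.
request = build_tool_call(
    1,
    "search_graph",
    {"name_pattern": "handle_.*", "label": "Function"},
)

# Serialized form that an agent would send to the server.
wire_message = json.dumps(request)
```

The agent, not the server, decides which tool to call and how to interpret the result, which matches the description of the server as a structural index without an LLM of its own.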
Architecture and Performance
The indexing pipeline is RAM-first: LZ4 compression, in-memory SQLite, and a single dump at the end, with memory released after indexing completes. According to the project's README and an accompanying arXiv preprint (arXiv:2603.27277), the tool full-indexes the Linux kernel (28M LOC, 75K files) in 3 minutes on Apple M3 Pro hardware, producing 4.81M nodes and 7.72M edges. Cypher-like graph queries return in under 1ms. The preprint reports 83% answer quality, 10× fewer tokens, and 2.1× fewer tool calls versus file-by-file exploration across 31 real-world repositories.
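The RAM-first idea (build everything in an in-memory SQLite database, then write it to disk once) can be sketched with Python's standard `sqlite3` module. This is a minimal illustration under assumptions, not the project's implementation: the `nodes` table and the `index_ram_first` helper are invented for the example, and the LZ4 compression step is omitted because it is not in the standard library.

```python
import sqlite3


def index_ram_first(rows, db_path):
    """Build a node table in memory, dump it to disk once, then enable WAL mode.

    `rows` is an iterable of (kind, name) pairs. The schema is hypothetical.
    Returns the journal mode reported by SQLite for the on-disk database.
    """
    mem = sqlite3.connect(":memory:")
    mem.execute("CREATE TABLE nodes (id INTEGER PRIMARY KEY, kind TEXT, name TEXT)")
    mem.executemany("INSERT INTO nodes (kind, name) VALUES (?, ?)", rows)
    mem.commit()

    disk = sqlite3.connect(db_path)
    mem.backup(disk)  # single dump of the whole in-memory database
    mem.close()       # memory is released once indexing is done

    # Switch the persisted database to write-ahead logging.
    mode = disk.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    disk.close()
    return mode
```

Keeping all writes in memory until the end avoids per-insert disk syncs during indexing; the on-disk file is only touched by the final copy.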
Key architectural components:
- 158 vendored tree-sitter grammars compiled directly into the binary — nothing to install
- Hybrid LSP layer — a lightweight C implementation of type-resolution algorithms structurally inspired by tsserver, pyright, gopls, Roslyn, Eclipse JDT, and rust-analyzer, embedded in the binary with no language server process required
- SQLite-backed persistence in WAL mode at ~/.cache/codebase-memory-mcp/
- openCypher read subset for graph queries, including MATCH, WHERE, WITH, variable-length paths, and aggregates
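To make the variable-length path feature concrete, the sketch below pairs an illustrative openCypher query with a tiny Python function that computes the same kind of answer over a toy call graph. The `Function` label, the `name` property, and the toy graph are assumptions made for the example; only the CALLS edge type appears in the project description.

```python
# Illustrative query text; the Function label and name property are assumed.
QUERY = "MATCH (a:Function {name: 'main'})-[:CALLS*1..2]->(b) RETURN DISTINCT b.name"


def reachable_within(calls, start, max_hops):
    """Return the names reachable from `start` by following 1..max_hops CALLS edges.

    `calls` maps a caller name to a list of callee names.
    """
    found = set()
    frontier = {start}
    for _ in range(max_hops):
        # Advance one hop: every callee of every function in the frontier.
        frontier = {callee for caller in frontier for callee in calls.get(caller, ())}
        found |= frontier
    return found


# Toy call graph standing in for the indexed knowledge graph.
toy_calls = {
    "main": ["parse", "run"],
    "parse": ["lex"],
    "lex": ["read_char"],
    "run": [],
}
two_hops = reachable_within(toy_calls, "main", 2)
```

Here `two_hops` contains `parse`, `run`, and `lex`, but not `read_char`, which is three hops away from `main`.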
Hybrid LSP: Semantic Type Resolution
Beyond syntactic tree-sitter parsing, codebase-memory-mcp ships a two-layer architecture. The tree-sitter pass runs for all 158 languages and extracts definitions, calls, and imports. The Hybrid LSP pass then refines CALLS, USAGE, and RESOLVED_CALLS edges with type information for 9 language families: Python, TypeScript/JavaScript/JSX/TSX, PHP, C#, Go, C/C++, Java, Kotlin, and Rust. Languages without a Hybrid LSP pass fall back to textual resolution. The v0.8.0 release added Java, Kotlin, and Rust; v0.7.0 added Python, PHP, and C#.
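The difference between textual resolution and type-aware resolution can be shown with a toy example. The sketch below is not the project's algorithm; `MethodDef`, `resolve_textual`, and `resolve_typed` are hypothetical names. It only illustrates why knowing the receiver's type turns an ambiguous name match into a single call target.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MethodDef:
    owner: str  # class that defines the method
    name: str   # method name


def resolve_textual(defs, method_name):
    """Name-only resolution: every definition with a matching name is a candidate."""
    return [d for d in defs if d.name == method_name]


def resolve_typed(defs, receiver_type, method_name):
    """Type-aware resolution: use the receiver's type to pick one definition."""
    for d in defs:
        if d.owner == receiver_type and d.name == method_name:
            return d
    return None


defs = [MethodDef("FileStore", "save"), MethodDef("HttpClient", "save")]

# A call such as `store.save()` where `store` is known to be a FileStore.
textual_candidates = resolve_textual(defs, "save")      # two candidates
typed_target = resolve_typed(defs, "FileStore", "save")  # exactly one target
```

With only the name, both `save` methods remain candidates; with the receiver type, the call edge can point at `FileStore.save` alone.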
Multi-Agent Support and Distribution
The install command auto-detects 11 coding agents (Claude Code, Codex CLI, Gemini CLI, Zed, OpenCode, Antigravity, Aider, KiloCode, VS Code, OpenClaw, and Kiro) and configures MCP server entries, instruction files, and pre-tool hooks for each. For Claude Code, a PreToolUse hook intercepts Grep/Glob calls and injects structured graph context alongside the normal search results, without interfering with the read-before-edit invariant.
The binary is distributed via npm, PyPI, Homebrew, Scoop, Winget, Chocolatey, AUR, and go install, in addition to direct GitHub release downloads. Every release binary is statically linked with zero shared library dependencies, signed with Sigstore cosign, verified with SLSA Level 3 build provenance, and scanned by 70+ antivirus engines via VirusTotal before publication.
Update: v0.8.1
The latest release is v0.8.1, published June 12, 2026. The v0.8.0 milestone added Hybrid LSP support for Java, Kotlin, and Rust, bringing semantic type resolution to 9 language families. v0.7.0 had previously added Hybrid LSP support for Python, PHP, and C#, along with improved Go and C/C++ resolution. The project was created in February 2026 and has been actively maintained since; GitHub metadata showed 11,860 stars and 871 forks at the time of indexing.
Pricing
Open Source
- Full codebase indexing across 158 languages
- 14 MCP tools
- Hybrid LSP semantic type resolution for 9 language families
- Single static binary for macOS, Linux, and Windows
- Auto-configuration for 11 coding agents
Capabilities
Key Features
- 158-language codebase indexing via vendored tree-sitter grammars
- Hybrid LSP semantic type resolution for Python, TypeScript, JavaScript, PHP, C#, Go, C/C++, Java, Kotlin, and Rust
- 14 MCP tools including search, trace, architecture overview, impact analysis, and dead code detection
- Cypher-like graph query language (openCypher read subset)
- Single static binary with zero runtime dependencies
- Auto-detects and configures 11 coding agents on install
- Built-in 3D interactive graph visualization UI at localhost:9749
- RAM-first indexing pipeline with LZ4 compression and in-memory SQLite
- Background file watcher for automatic re-indexing on changes
- Cross-service HTTP/gRPC/GraphQL/tRPC route linking
- Team-shared graph artifact (.codebase-memory/graph.db.zst) for skipping reindex
- Infrastructure-as-code indexing for Dockerfiles, Kubernetes manifests, and Kustomize overlays
- BM25 full-text search with camelCase/snake_case-aware tokenizer
- Semantic vector search using bundled Nomic nomic-embed-code embeddings
- Architecture Decision Records (ADR) management
- Git diff impact mapping with risk classification
- CLI mode for invoking any MCP tool from the command line
- SLSA Level 3 build provenance and Sigstore cosign signatures on all release binaries
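As an illustration of the camelCase/snake_case-aware tokenization mentioned in the feature list, the sketch below splits identifiers into lowercase terms of the kind a BM25 index could score. It is a minimal sketch under assumptions: the regular expression and the `tokenize` function are invented for this example and are not taken from the project.

```python
import re

# Matches, in order: an acronym followed by a capitalized word (HTTP in HTTPRequest),
# a word with an optional leading capital, a remaining run of capitals, or digits.
_TERM = re.compile(r"[A-Z]+(?=[A-Z][a-z])|[A-Z]?[a-z]+|[A-Z]+|\d+")


def tokenize(text):
    """Split source identifiers into lowercase search terms.

    Underscores, whitespace, and punctuation act as separators because
    they are never matched by the pattern.
    """
    return [term.lower() for term in _TERM.findall(text)]


terms = tokenize("parseHTTPRequest get_user_id")
```

With this splitting, a search for `request` or `user` can match `parseHTTPRequest` and `get_user_id` even though neither identifier contains the query as a standalone word.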
