# QMD

> QMD is a local search engine that indexes Markdown files and combines BM25 keyword search, vector semantic search, and LLM re-ranking for AI agent memory and retrieval.

QMD (Query Markup Documents) is an open-source, on-device search engine that indexes Markdown files and provides hybrid retrieval combining BM25 full-text search, vector semantic search, and LLM re-ranking. Built by Tobi Lütke, it runs entirely locally using GGUF models via node-llama-cpp, with no API keys or cloud dependencies. QMD is used as a memory backend for AI coding agents such as Claude Code and OpenClaw, where it replaces basic keyword search with hybrid, context-aware retrieval.

- **Hybrid search pipeline** - Combines BM25 keyword matching via SQLite FTS5 with vector semantic search and LLM-based re-ranking, so that both exact-term and loosely phrased queries can return relevant results.
- **Query expansion** - Uses a fine-tuned 1.7B parameter model to generate alternative phrasings of the search query in order to broaden recall.
- **LLM re-ranking** - A local Qwen3 reranker model re-scores the top candidates using yes/no classification with log-probability confidence, improving result ordering.
- **Collection and context management** - Organize documents into named collections with glob patterns and attach hierarchical context descriptions that are returned alongside search results, giving LLMs richer information for decision-making.
- **MCP server integration** - Exposes search, retrieval, and status tools via Model Context Protocol over stdio or HTTP transport, enabling direct integration with Claude Desktop, Claude Code, and other MCP-compatible agents.
- **Smart document chunking** - Splits documents into approximately 900-token chunks with 15 percent overlap using a scoring algorithm that finds natural Markdown break points rather than cutting at arbitrary token boundaries.
- **Multiple output formats** - Supports JSON, CSV, Markdown, XML, and file-list output modes designed for agentic workflows where structured data is needed.
- **Document retrieval by ID** - Each indexed document receives a six-character hash identifier, enabling fast retrieval by docid, file path with optional line offset, or glob pattern via multi-get.
- **Fully local and private** - All three GGUF models (embedding, reranker, and query expansion), totaling approximately 2 GB, run on-device. No data leaves the machine.
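
The smart chunking idea from the list above can be illustrated with a minimal Python sketch. This is not QMD's implementation: the break-point scores, the 20 percent search window, and the use of whitespace-separated words as a stand-in for tokens are all assumptions made for illustration. Only the roughly 900-token target and the 15 percent overlap come from the description.

```python
import re

TARGET_TOKENS = 900    # approximate chunk size named in the description
OVERLAP_RATIO = 0.15   # 15 percent overlap named in the description


def break_score(line: str) -> int:
    """Hypothetical scoring: higher means a better line to cut in front of."""
    if re.match(r"#{1,6}\s", line):
        return 3                      # Markdown heading
    if line.strip() == "":
        return 2                      # blank line (paragraph boundary)
    if re.match(r"\s*([-*+]|\d+\.)\s", line):
        return 1                      # list item
    return 0                          # ordinary text line


def count_tokens(line: str) -> int:
    """Crude proxy (assumption): one whitespace-separated word = one token."""
    return len(line.split())


def chunk_markdown(text: str, target: int = TARGET_TOKENS,
                   overlap: float = OVERLAP_RATIO, window: float = 0.2) -> list[str]:
    lines = text.splitlines()
    tokens = [count_tokens(line) for line in lines]
    chunks: list[str] = []
    start = 0
    while start < len(lines):
        # Walk forward until the token budget is reached, remembering where
        # the search window (the last `window` share of the budget) begins.
        total, end, soft_start = 0, start, None
        while end < len(lines) and total < target:
            total += tokens[end]
            end += 1
            if soft_start is None and total >= target * (1 - window):
                soft_start = end
        if end == len(lines):
            chunks.append("\n".join(lines[start:]))
            break
        # Cut in front of the best-scoring line in the window (later line wins ties).
        best = max(range(soft_start, end + 1),
                   key=lambda i: (break_score(lines[i]), i))
        chunks.append("\n".join(lines[start:best]))
        # Step back so roughly `overlap * target` tokens repeat in the next
        # chunk, while always moving forward by at least one line.
        back, nxt = 0, best
        while nxt > start + 1 and back < overlap * target:
            nxt -= 1
            back += tokens[nxt]
        start = nxt
    return chunks
```

Calling `chunk_markdown(open("notes.md").read())` on a hypothetical `notes.md` returns a list of overlapping chunk strings. A real tokenizer would replace `count_tokens` in practice, since word counts only approximate token counts.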

To get started, install with `npm install -g @tobilu/qmd` or `bun install -g @tobilu/qmd`. Then add collections pointing to your Markdown directories, run `qmd embed` to generate vector embeddings, and search with `qmd search`, `qmd vsearch`, or `qmd query` for the full hybrid pipeline.

## Features
- Hybrid search combining BM25, vector, and LLM re-ranking
- Local vector embeddings via embeddinggemma-300M GGUF model
- LLM re-ranking with qwen3-reranker-0.6b
- Fine-tuned query expansion model for broader recall
- Reciprocal Rank Fusion with position-aware blending
- MCP server for Claude Desktop and Claude Code integration
- HTTP transport mode with daemon support for a shared server
- Collection-based document organization with glob patterns
- Hierarchical context annotations for search results
- Smart ~900-token chunking with natural Markdown break points
- Document retrieval by path, docid hash, or glob pattern
- Multi-get for batch document retrieval
- JSON, CSV, XML, Markdown, and file-list output formats
- Runs fully on-device with no API keys or cloud services
- Auto-downloads GGUF models from HuggingFace on first use
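
Reciprocal Rank Fusion, mentioned in the list above, merges several ranked result lists by summing `1 / (k + rank)` for each document across the lists. The Python sketch below shows the standard formula only. The constant `k = 60` is the conventional default for RRF and is an assumption here, as are the function name and the sample document IDs. The position-aware blending step is not described in this listing and is left out.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse ranked lists of document IDs into one list sorted by RRF score."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based: the top hit in each list contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; ties are broken by document ID for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical result lists from a keyword search and a vector search.
bm25_hits = ["a", "b", "c"]
vector_hits = ["b", "c", "d"]
fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
```

Documents that appear in both lists (`b` and `c` here) accumulate score from each and rise above documents found by only one retriever, which is why RRF suits combining BM25 and vector results whose raw scores are not directly comparable.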

## Integrations
Claude Desktop, Claude Code, OpenClaw, MCP (Model Context Protocol), node-llama-cpp, SQLite FTS5, HuggingFace GGUF models, Obsidian, Git

## Platforms
macOS, Linux, Developer SDK

## Pricing
Open Source

## Links
- Website: https://github.com/tobi/qmd
- Documentation: https://github.com/tobi/qmd
- Repository: https://github.com/tobi/qmd
- EveryDev.ai: https://www.everydev.ai/tools/qmd