EveryDev.ai

QMD

Agent Memory

QMD is a local search engine that indexes Markdown files and combines BM25 keyword search, vector semantic search, and LLM re-ranking for AI agent memory and retrieval.


At a Glance

Pricing

Open Source

Fully free and open-source CLI tool available on GitHub under the MIT license.

Available On

macOS
Linux
SDK

Resources

Website
Docs
GitHub
llms.txt

Topics

Agent Memory
Local Inference
MCP Tools

About QMD

QMD (Query Markup Documents) is an open-source, on-device search engine that indexes Markdown files and provides hybrid retrieval combining BM25 full-text search, vector semantic search, and LLM re-ranking. Built by Tobi Lütke, it runs entirely locally using GGUF models via node-llama-cpp with no API keys or cloud dependencies required. QMD is widely adopted as a memory backend for AI coding agents such as Claude Code and OpenClaw, replacing basic keyword search with intelligent, context-aware retrieval.

  • Hybrid search pipeline - Combines BM25 keyword matching via SQLite FTS5 with vector semantic search and LLM-based re-ranking for high-quality results across different query types.
  • Query expansion - Uses a fine-tuned 1.7B parameter model to generate alternative phrasings of your search query, broadening recall without sacrificing precision.
  • LLM re-ranking - A local Qwen3 reranker model re-scores the top candidates using yes/no classification with log-probability confidence, improving result ordering.
  • Collection and context management - Organize documents into named collections with glob patterns and attach hierarchical context descriptions that are returned alongside search results, giving LLMs richer information for decision-making.
  • MCP server integration - Exposes search, retrieval, and status tools via Model Context Protocol over stdio or HTTP transport, enabling direct integration with Claude Desktop, Claude Code, and other MCP-compatible agents.
  • Smart document chunking - Splits documents into approximately 900-token chunks with 15 percent overlap using a scoring algorithm that finds natural Markdown break points rather than cutting at arbitrary token boundaries.
  • Multiple output formats - Supports JSON, CSV, Markdown, XML, and file-list output modes designed for agentic workflows where structured data is needed.
  • Document retrieval by ID - Each indexed document receives a six-character hash identifier, enabling fast retrieval by docid, file path with optional line offset, or glob pattern via multi-get.
  • Fully local and private - All three GGUF models (embedding, reranker, query expansion) totaling approximately 2 GB run on-device. No data leaves the machine.
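The fusion step in the pipeline above can be illustrated with plain Reciprocal Rank Fusion (RRF), the standard technique for blending keyword and vector rankings. This is a sketch only: QMD's position-aware blending weights are not documented on this page, and the `k` constant below is the conventional RRF default, not necessarily QMD's.

```python
def rrf_fuse(rankings, k=60):
    """Merge several ranked lists of doc ids into one fused ranking.

    Each list contributes 1 / (k + rank) per document, so items that
    appear near the top of multiple lists accumulate the highest score.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical result lists from a BM25 pass and a vector pass:
bm25_hits = ["notes.md", "todo.md", "plan.md"]
vector_hits = ["plan.md", "notes.md", "ideas.md"]
fused = rrf_fuse([bm25_hits, vector_hits])
# "notes.md" and "plan.md" appear in both lists, so they rise to the top.
```

A re-ranker (in QMD's case, a local Qwen3 model) would then re-score only the top of this fused list, which keeps the expensive LLM step cheap.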

To get started, install with npm install -g @tobilu/qmd or bun install -g @tobilu/qmd, add collections pointing at your Markdown directories, and run qmd embed to generate vector embeddings. Then search with qmd search or qmd vsearch, or use qmd query for the full hybrid pipeline.
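The steps above, as a shell session. Only the commands named on this page are used; the exact subcommand for adding a collection is not shown here, so check qmd --help for the current syntax before relying on it.

```shell
# Install the CLI globally (either package manager works)
npm install -g @tobilu/qmd    # or: bun install -g @tobilu/qmd

# List available subcommands, including how to add a collection
# pointing at your Markdown directories
qmd --help

# Generate vector embeddings for the indexed documents
# (GGUF models are auto-downloaded from HuggingFace on first use)
qmd embed

# Search: keyword, vector, or the full hybrid pipeline
qmd search "retry logic"
qmd vsearch "how do we handle rate limits"
qmd query "notes about agent memory design"
```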



Pricing


  • BM25 full-text search via SQLite FTS5
  • Vector semantic search with local embeddings
  • LLM re-ranking with Qwen3 reranker
  • Query expansion with fine-tuned model
  • MCP server with stdio and HTTP transport

Capabilities

Key Features

  • Hybrid search combining BM25, vector, and LLM re-ranking
  • Local vector embeddings via embeddinggemma-300M GGUF model
  • LLM re-ranking with qwen3-reranker-0.6b
  • Fine-tuned query expansion model for broader recall
  • Reciprocal Rank Fusion with position-aware blending
  • MCP server for Claude Desktop and Claude Code integration
  • HTTP transport mode with daemon support for shared server
  • Collection-based document organization with glob patterns
  • Hierarchical context annotations for search results
  • Smart ~900-token chunking with natural Markdown break points
  • Document retrieval by path, docid hash, or glob pattern
  • Multi-get for batch document retrieval
  • JSON, CSV, XML, Markdown, and file-list output formats
  • Runs fully on-device with no API keys or cloud services
  • Auto-downloads GGUF models from HuggingFace on first use
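The chunking feature listed above (~900-token windows with 15 percent overlap) can be sketched as a simple sliding window. This is illustrative only: QMD's actual scoring algorithm also hunts for natural Markdown break points, which is omitted here, and tokens are approximated by whitespace-split words.

```python
def chunk(text, size=900, overlap=0.15):
    """Split text into ~size-word chunks with the given fractional overlap."""
    words = text.split()
    step = max(1, int(size * (1 - overlap)))  # advance ~765 words per chunk
    chunks = []
    for start in range(0, len(words), step):
        piece = words[start:start + size]
        if piece:
            chunks.append(" ".join(piece))
        if start + size >= len(words):
            break
    return chunks

doc = " ".join(f"tok{i}" for i in range(2000))
pieces = chunk(doc)
# 2000 words -> 3 chunks, each consecutive pair sharing 135 words of overlap
```

The overlap ensures that a passage falling on a chunk boundary still appears intact in at least one chunk, at the cost of a modest amount of duplicated index content.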

Integrations

Claude Desktop
Claude Code
OpenClaw
MCP (Model Context Protocol)
node-llama-cpp
SQLite FTS5
HuggingFace GGUF models
Obsidian
Git


Developer

Tobi Lütke

Tobi Lütke is the founder and CEO of Shopify. He builds open-source developer tools including QMD, a local hybrid search engine for Markdown files designed for AI agent workflows. His projects focus on local-first, privacy-respecting tooling that runs entirely on-device.

Website
GitHub
X / Twitter

Similar Tools


TIMPs

Open source AI memory agent that stores facts, preferences, goals, and reflections with persistent memory across sessions using PostgreSQL and Qdrant.


Pieces

AI-powered desktop app for developers that captures workflow context, builds on-device long-term memory, and integrates with IDEs, browsers, CLIs, and local LLMs for context-aware coding.


Tacnode

PostgreSQL-compatible context lake that gives AI agents and automated systems shared, live, and semantically consistent context at decision time.


Related Topics

Agent Memory

Memory layers, frameworks, and services that enable AI agents to store, recall, and manage information across sessions. These tools provide persistent, semantic, and contextual memory for agents, supporting personalization, long-term context retention, graph-based relationships, and hybrid RAG + memory workflows.

21 tools

Local Inference

Tools and platforms for running AI inference locally without cloud dependence.

43 tools

MCP Tools

Tools built with the Model Context Protocol for specific tasks.

22 tools
With AI, Everyone is a Dev. EveryDev.ai © 2026