# crawlie

> Free, open-source technical SEO and GEO crawler with 46 checks, plain-English fixes, a CLI, and an MCP server for AI agent-driven audits.

crawlie is a free, MIT-licensed technical SEO and Generative Engine Optimization (GEO) crawler built by Sean Ryan of Spronta Ltd. It runs locally as a tiny async Rust binary, requires no account or cloud connection, and ships both a CLI and a Model Context Protocol (MCP) server so AI agents can drive audits end-to-end. The project reached v0.4.0 in June 2026 and has 54 GitHub stars.

## What It Is

crawlie is a command-line site auditor that crawls any website and runs 46 checks across two scoring dimensions: a **Health score** for classic technical SEO (broken links, redirects, metadata, canonicals, robots, performance, security, mobile/international signals) and a **GEO score** for AI-search readiness (structured data, semantic HTML, answer-ready content, authorship/E-E-A-T, dated content). Every finding comes with plain-English guidance, not just an error code: why it matters, how to fix it, and what happens if it's ignored. The tool is written in Rust and MIT licensed, and its core crate targets `wasm32` for future cloud-worker deployment.

## Agent-Native Architecture

crawlie's standout design choice is its first-class MCP server (`crawlie-mcp`). Any MCP-compatible agent — Claude, Cursor, Cline, or a custom agent — can call tools like `crawl_site`, `audit_url`, `audit_urls`, `explain_issue`, `list_rules`, and `list_reports` over JSON-RPC via stdio. A one-step Claude Code plugin bundles the MCP server and a set of pre-built audit skills (full-site SEO + GEO, broken-link fixes, pre-launch gates, AI-search readiness) that run on demand via `npx` without requiring a pre-installed binary. The skills folder also works with any agent even without the MCP server.
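
To make the JSON-RPC exchange concrete, here is a minimal Python sketch of how a client might frame a `tools/call` request for the `crawl_site` tool. The `tools/call` method and its `name`/`arguments` parameters come from the MCP specification; the `url` argument name and the helper function names are assumptions for illustration, not crawlie's documented schema. A real session would also start with the MCP `initialize` handshake, which is omitted here.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


def encode_for_stdio(message):
    """Serialize one message as a single newline-terminated line of JSON.

    The MCP stdio transport delimits messages with newlines, so the
    payload itself must not contain a raw newline; json.dumps escapes
    any newline characters inside strings.
    """
    return json.dumps(message, separators=(",", ":")) + "\n"


# Assumption: the "url" argument name is hypothetical; check the tool's
# input schema (via the MCP `tools/list` method) for the real parameters.
request = build_tool_call(1, "crawl_site", {"url": "https://example.com"})
line = encode_for_stdio(request)
```

An MCP client would write `line` to the server's stdin and read the JSON-RPC response from its stdout. Calling `list_rules` or `explain_issue` would reuse the same envelope with a different `name`.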

## What It Checks

The 46 rules span four categories:

- **Technical SEO**: broken links, 4xx/5xx, redirect chains, titles and meta descriptions (missing/duplicate/length), H1s, canonicals, noindex/nofollow/X-Robots-Tag, robots.txt blocking, missing alt text, thin and duplicate content, orphan and deep pages
- **Performance & security**: slow responses, large pages, missing compression, HTTPS, mixed content, HSTS
- **Mobile, international & social**: viewport, `lang`, hreflang, Open Graph, Twitter cards, structured data
- **GEO**: structured data, semantic HTML, answer-readiness, authorship/E-E-A-T, dated content, question-style headings, extractable blocks
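
As a rough illustration of what two of the simpler checks above involve (a missing title and missing alt text), here is a small Python sketch built on the standard-library HTML parser. It is not crawlie's implementation, which is written in Rust; the rule identifiers, severities, and fix wording are hypothetical.

```python
from html.parser import HTMLParser


class BasicPageAudit(HTMLParser):
    """Collects the page title and any <img> tags lacking an alt attribute."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""
        self.images_missing_alt = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self.in_title = True
        elif tag == "img" and "alt" not in attrs:
            # An empty alt="" is valid for decorative images, so only a
            # completely absent attribute is flagged here.
            self.images_missing_alt.append(attrs.get("src", ""))

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def audit_html(html):
    """Return a list of findings, each with a plain-English fix."""
    parser = BasicPageAudit()
    parser.feed(html)
    parser.close()
    findings = []
    if not parser.title.strip():
        findings.append({
            "rule": "missing-title",  # hypothetical rule id
            "severity": "error",
            "fix": "Add a descriptive <title> element to the page head.",
        })
    for src in parser.images_missing_alt:
        findings.append({
            "rule": "missing-alt-text",  # hypothetical rule id
            "severity": "warning",
            "fix": f"Add an alt attribute to the image {src!r}.",
        })
    return findings
```

Calling `audit_html(page_source)` on a fetched page returns one finding per problem, each pairing a rule with a suggested fix, which mirrors the "finding plus guidance" shape described earlier.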

## Setup Path and Output Formats

Installation is a single npm command (`npm i -g crawlie`), which automatically installs the correct native Rust binary as a platform package. A signed, notarized macOS `.dmg` desktop app (Tauri v2 + React) is available from GitHub Releases. Building from source requires Rust and, for the desktop app, pnpm and Node. Output formats include `pretty` (terminal), `json` (machine-readable default), `csv` (issues only), and `html` (self-contained shareable report). CI/CD gating is supported via `--fail-on error|warning` for non-zero exit codes.
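
The gating behaviour of `--fail-on error|warning` can be sketched in a few lines of Python. This is an assumption-laden model, not crawlie's code: it assumes that a `warning` threshold also fails on errors, that an `info` level exists below `warning`, and that each issue carries a `severity` field. None of these are confirmed by the source.

```python
# Hypothetical severity ordering; only "error" and "warning" appear
# in the documented --fail-on values.
SEVERITY_RANK = {"info": 0, "warning": 1, "error": 2}


def exit_code(issues, fail_on):
    """Return 1 if any issue is at or above the `fail_on` severity, else 0."""
    threshold = SEVERITY_RANK[fail_on]
    # default=-1 makes an empty issue list always pass.
    worst = max((SEVERITY_RANK[i["severity"]] for i in issues), default=-1)
    return 1 if worst >= threshold else 0
```

In a CI pipeline the non-zero exit code is what stops the build, so under this model a stricter threshold (`warning`) blocks more merges than a looser one (`error`).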

## Architecture and Codebase

The repository is organized into a `crates/` directory with three crates — `crawlie-core` (engine, audit, scoring, knowledge base, reports), `crawlie-cli` (the `crawlie` command), and `crawlie-mcp` (the MCP server) — plus an `apps/desktop` Tauri app. The core crate has zero host dependencies and already targets `wasm32`, enabling the same audited engine to run in a future cloud worker. The roadmap lists JavaScript rendering for SPAs, crawl-to-crawl comparison and regression alerts, and an internal-link graph visualization.

## Update: v0.4.0

The latest release is v0.4.0, published on 2026-06-22. The repository was created on 2026-06-18 and last pushed on 2026-06-22, indicating rapid early development. The project's GitHub topics include `aeo`, `crawler`, `geo`, `marketing-tools`, `mcp`, and `seo`, reflecting its dual focus on traditional SEO and AI-answer-engine optimization.

## Features
- 46 SEO and GEO checks
- Health score for technical SEO
- GEO score for AI-search readiness
- Plain-English fix guidance for every finding
- CLI with JSON, CSV, HTML, and pretty output formats
- MCP server for agent-driven audits
- Local-first, no account or cloud required
- Shareable self-contained HTML reports
- CI/CD gating with the `--fail-on` flag
- Broken link and redirect detection
- Structured data and schema validation
- E-E-A-T and authorship checks
- Robots.txt and canonical checks
- Hreflang and international SEO checks
- macOS desktop app (Tauri + React)
- Claude Code plugin with pre-built audit skills
- Report history and saved crawls
- Single-page and URL-list audit modes

## Integrations
Claude Desktop, Claude Code, Cursor, Cline, npm, Model Context Protocol (MCP), GitHub Actions (CI/CD)

## Platforms
CLI, macOS, Windows, Linux, API

## Pricing
Open Source

## Version
v0.4.0

## Links
- Website: https://crawlie.dev
- Documentation: https://crawlie.dev/docs/getting-started
- Repository: https://github.com/spronta/crawlie
- EveryDev.ai: https://www.everydev.ai/tools/crawlie
