EveryDev.ai
With AI, Everyone is a Dev. EveryDev.ai © 2026
    webclaw

    Browser Automation
    Featured

    webclaw is an open-source web extraction engine built in Rust that turns any website into clean markdown, JSON, or LLM-ready structured data via CLI, REST API, and MCP server.

    At a Glance

    Pricing
    Open Source
    Free tier available

    Self-host forever under AGPL-3.0. CLI, server, and MCP server with no usage limits on your own hardware.

    Starter: $15/mo
    Growth: $39/mo
    Pro: $79/mo
    +2 more plans

    Available On

    macOS
    Linux
    Web
    API
    SDK

    Resources

    Website, Docs, GitHub, llms.txt

    Topics

    Browser Automation, MCP Servers, Retrieval-Augmented Generation

    Alternatives

    Firecrawl, Apify, Crawl4AI

    Developer

    webclaw, Italy, Est. 2024

    Listed May 2026

    About webclaw

    webclaw is a web extraction toolkit built in Rust and licensed under AGPL-3.0. It converts any URL into clean markdown, JSON, plain text, or token-optimized output without requiring a headless browser, using browser-grade TLS fingerprint impersonation instead. The project ships as three standalone binaries — a CLI, a REST API server, and an MCP server — all powered by the same extraction core. A hosted cloud API at webclaw.io complements the open-source self-hosted path.

    What It Is

    webclaw is a web scraping and data extraction tool designed for AI agent and RAG pipeline workflows. Rather than spinning up Playwright or Puppeteer, it fetches pages over raw HTTP using Chrome and Firefox TLS fingerprint profiles, which keeps requests fast and lightweight. The extraction engine (webclaw-core) is a pure Rust crate with no network I/O: it takes raw HTML and returns structured output, which makes it WASM-compatible and usable on its own. The hosted API adds protected-site access, JavaScript rendering, async crawl jobs, web search, and production usage tracking on top of the open-source core.

    Architecture and Deployment Model

    The project is a Rust workspace split into focused crates:

    • webclaw-core — pure extraction engine: readability scoring, noise filtering, markdown conversion, LLM optimization, CSS selector filtering, diff engine, brand extraction
    • webclaw-fetch — HTTP client with browser TLS impersonation, BFS crawler, sitemap discovery, batch operations, proxy pool rotation
    • webclaw-llm — LLM provider chain (Ollama → OpenAI → Anthropic) for JSON schema extraction, prompt extraction, and summarization
    • webclaw-pdf — PDF text extraction
    • webclaw-server — axum-based REST API with auth, CORS, gzip, and async job management
    • webclaw-mcp — MCP server over stdio transport exposing tools for AI agents
    • webclaw-cli — command-line interface
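
A provider chain like the one listed for webclaw-llm can be sketched as ordered fallback. This is a minimal illustration that assumes the arrows denote fallback order; the provider functions, the ProviderUnavailable exception, and the complete helper are hypothetical stand-ins and not webclaw's actual API.

```python
class ProviderUnavailable(Exception):
    """Raised by a (hypothetical) provider that cannot serve the request."""


# Stand-in providers; real ones would call Ollama, OpenAI, or Anthropic.
def ollama(prompt):
    raise ProviderUnavailable("no local Ollama server")


def openai(prompt):
    return f"[openai] {prompt}"


def anthropic(prompt):
    return f"[anthropic] {prompt}"


# Assumed fallback order, mirroring "Ollama -> OpenAI -> Anthropic".
CHAIN = [("ollama", ollama), ("openai", openai), ("anthropic", anthropic)]


def complete(prompt, chain=CHAIN):
    """Try each provider in order and return (name, output) from the first that succeeds."""
    errors = []
    for name, call in chain:
        try:
            return name, call(prompt)
        except ProviderUnavailable as exc:
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


used, text = complete("Summarize this page")
print(used, text)
```

In this sketch the local provider is tried first and the hosted ones act as fallbacks, so the call above is served by the second entry in the chain.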

    Users can self-host the entire stack on their own hardware with no usage limits, or use the hosted cloud API with an API key.
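
The BFS same-origin crawling attributed to webclaw-fetch can be illustrated with a short sketch. It runs over a hypothetical in-memory link graph instead of the network, and the names LINKS, same_origin, and bfs_crawl are assumptions made for this example, not webclaw identifiers.

```python
from collections import deque
from urllib.parse import urljoin, urlparse

# Hypothetical site: page URL -> hrefs found on that page.
LINKS = {
    "https://example.com/": ["/docs", "/blog", "https://other.example.org/"],
    "https://example.com/docs": ["/docs/api", "/"],
    "https://example.com/blog": ["/docs"],
    "https://example.com/docs/api": ["/docs/api/v1"],
}


def same_origin(a, b):
    """Two URLs share an origin when scheme and host:port match."""
    pa, pb = urlparse(a), urlparse(b)
    return (pa.scheme, pa.netloc) == (pb.scheme, pb.netloc)


def bfs_crawl(start, max_depth, get_links=lambda u: LINKS.get(u, [])):
    """Visit pages breadth-first, staying on the start origin, up to max_depth."""
    seen = {start}
    order = []
    queue = deque([(start, 0)])
    while queue:
        url, depth = queue.popleft()
        order.append(url)
        if depth == max_depth:
            continue  # do not expand links beyond the depth limit
        for href in get_links(url):
            target = urljoin(url, href)  # resolve relative links
            if same_origin(start, target) and target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return order


visited = bfs_crawl("https://example.com/", max_depth=2)
print(visited)
```

With max_depth=2 the sketch visits the start page, its two same-origin children, and one grandchild; the off-origin link and the depth-3 page are skipped. A real crawler would also apply the concurrency and delay settings the listing mentions.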

    Ten Extraction Endpoints

    The hosted API and self-hosted server expose ten endpoints covering the full extraction surface:

    • /v1/scrape: single-page extraction
    • /v1/crawl: BFS same-origin crawling
    • /v1/search: web search
    • /v1/map: URL discovery without full extraction
    • /v1/batch: parallel multi-URL scraping
    • /v1/extract: LLM-powered structured JSON extraction
    • /v1/summarize: page summarization
    • /v1/brand: colors, fonts, logos, favicon
    • /v1/diff: content change tracking
    • /v1/research: multi-source research workflow

    According to the site, the LLM-optimized output format runs a 9-step pipeline that strips navigation, ads, and boilerplate, with a claimed median token reduction of 95% measured on 18 production sites.
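
As a rough illustration of calling one of these endpoints, the sketch below builds (but does not send) a POST request to /v1/scrape. Only the endpoint path comes from the listing; the base URL, the JSON field names url and formats, and the Bearer-token header are assumptions, so check the official docs for the real request schema.

```python
import json
import urllib.request

# Assumed address of a self-hosted server; not taken from the listing.
BASE_URL = "http://localhost:3000"


def build_scrape_request(page_url, fmt="markdown", api_key=None):
    """Build a POST request for /v1/scrape (body field names are assumed)."""
    payload = {"url": page_url, "formats": [fmt]}
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Bearer auth is an assumption about the hosted API.
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(
        BASE_URL + "/v1/scrape",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


req = build_scrape_request("https://example.com")
print(req.get_method(), req.full_url)

# To actually send it against a running server:
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```

Keeping request construction separate from sending makes the payload easy to inspect before any credits are spent on the hosted plan.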

    MCP Integration and AI Agent Workflow

    webclaw ships an MCP server binary that exposes tools over the Model Context Protocol stdio transport, compatible with Claude Desktop, Claude Code, Cursor, Windsurf, OpenCode, Codex, and Antigravity. The one-command setup, npx create-webclaw, auto-detects supported MCP clients and configures the server for them. The docs list 8 tools available locally (scrape, crawl, map, batch, extract, summarize, diff, brand) and 2 that require the hosted API (search, research). SDKs are available for TypeScript, Python, and Go, and the API is documented as a drop-in Firecrawl replacement with compatible /v2 endpoints.
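
For readers who prefer to wire up a client by hand, a stdio MCP server is typically registered under an mcpServers key in the client's JSON config, as Claude Desktop and Cursor do. The sketch below generates such an entry; the binary name webclaw-mcp is an assumption based on the crate name, and mcp_server_entry is a hypothetical helper, so treat the output as a template and not as the configuration create-webclaw writes.

```python
import json


def mcp_server_entry(command, args=(), env=None):
    """Return one stdio server entry in the common mcpServers layout."""
    entry = {"command": command, "args": list(args)}
    if env:
        entry["env"] = dict(env)
    return entry


# "webclaw-mcp" is an assumed binary name; adjust to the installed path.
config = {"mcpServers": {"webclaw": mcp_server_entry("webclaw-mcp")}}
config_json = json.dumps(config, indent=2)
print(config_json)
```

The printed JSON can be merged into the client's existing config file; any environment variables the hosted-only tools might need would go in the optional env mapping.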

    Update: v0.6.4

    The latest release is v0.6.4, published on May 21, 2026, according to the GitHub repository. The repository was created in March 2026, and its most recent push was also on May 21, 2026. It reports 1,184 stars and 141 forks. Blog posts from May 2026 cover JavaScript rendering fallback strategies, anti-bot signal detection, and evaluation frameworks for scraping APIs in AI agent workflows, which suggests ongoing development focused on the AI agent use case.

    Pricing

    Open Source

    Self-host forever under AGPL-3.0. CLI, server, and MCP server with no usage limits on your own hardware.

    • CLI tool
    • REST API server (self-hosted)
    • MCP server
    • No usage limits on your hardware
    • AGPL-3.0 license

    Starter

    Entry-level hosted plan with 10,000 credits/month and 3 research runs.

    $15/mo billed annually, or $19/mo billed monthly
    • 10,000 credits/month
    • 3 research runs/month
    • Max 10 sources per research
    • 5 concurrent requests
    • Email support

    Growth

    Popular mid-tier plan with 100,000 credits/month and 10 research runs.

    $39/mo billed annually, or $49/mo billed monthly
    • 100,000 credits/month
    • 10 research runs/month
    • Max 20 sources per research
    • 20 concurrent requests
    • Priority support

    Pro

    High-volume plan with 250,000 credits/month and 20 research runs.

    $79/mo billed annually, or $99/mo billed monthly
    • 250,000 credits/month
    • 20 research runs/month
    • Max 30 sources per research
    • 50 concurrent requests
    • Priority support

    Scale

    Large-scale plan with 1,000,000 credits/month and 60 research runs.

    $319/mo billed annually, or $399/mo billed monthly
    • 1,000,000 credits/month
    • 60 research runs/month
    • Max 100 sources per research
    • 100 concurrent requests
    • Priority + Slack support

    Dedicated

    Single-tenant deployment on your cloud with unlimited pages, unlimited research, and 200 concurrent requests.

    Custom pricing (contact sales)
    • Unlimited pages
    • Unlimited research
    • 200 concurrent requests
    • Single-tenant on your cloud
    • Your proxies, your rules
    • Dedicated Slack channel
    • SLA
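
To compare the hosted plans on a like-for-like basis, the listed annual-billing prices can be converted to a cost per 1,000 credits. The sketch below uses only the prices and credit allowances quoted above; the PLANS mapping and usd_per_1k helper are names invented for this example, and what a credit buys per endpoint is not stated in the listing.

```python
# (monthly price in USD on annual billing, credits per month), from the listing.
PLANS = {
    "Starter": (15, 10_000),
    "Growth": (39, 100_000),
    "Pro": (79, 250_000),
    "Scale": (319, 1_000_000),
}


def usd_per_1k(price, credits):
    """Cost in USD for 1,000 credits, rounded to three decimals."""
    return round(price * 1000 / credits, 3)


rates = {name: usd_per_1k(price, credits) for name, (price, credits) in PLANS.items()}
for name, rate in rates.items():
    print(f"{name}: ${rate:.3f} per 1,000 credits")
```

On these figures Starter works out to $1.50 per 1,000 credits, Growth to $0.39, Pro to about $0.316, and Scale to about $0.319, so the larger plans differ mainly in research runs, source limits, and concurrency and hardly at all in per-credit price.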

    Capabilities

    Key Features

    • Single-page scraping with clean markdown, JSON, HTML, plain text, and LLM-optimized output
    • BFS same-origin crawler with configurable depth, concurrency, and delay
    • Sitemap.xml and robots.txt discovery
    • Batch multi-URL scraping in parallel
    • LLM-powered structured JSON extraction via schema or prompt
    • Page summarization
    • Content diff and change tracking
    • Brand identity extraction (colors, fonts, logos, favicon)
    • Web search with scraped results
    • Multi-source deep research workflow
    • MCP server with 8+ tools for Claude, Cursor, Windsurf, and other MCP clients
    • Browser-grade TLS fingerprint impersonation (Chrome and Firefox profiles)
    • Anti-bot and CAPTCHA handling
    • CSS selector include/exclude filtering
    • 9-step LLM optimization pipeline for token reduction
    • PDF and DOCX auto-detection and extraction
    • YouTube transcript extraction
    • Proxy pool rotation
    • Drop-in Firecrawl /v2 API compatibility
    • Self-hostable under AGPL-3.0
    • TypeScript, Python, and Go SDKs

    Integrations

    Claude Desktop
    Claude Code
    Cursor
    Windsurf
    OpenCode
    Codex
    Antigravity
    LangChain
    Ollama
    OpenAI
    Anthropic
    Docker
    Homebrew
    npm (create-webclaw)
    OpenClaw
    Hermes Agent

    Developer

    webclaw Team

    webclaw builds a fast, open-source web extraction engine in Rust for AI agents and RAG pipelines. The project ships a CLI, REST API server, and MCP server — all powered by the same extraction core — alongside a hosted cloud API at webclaw.io. webclaw uses browser-grade TLS fingerprint impersonation instead of headless browsers, delivering sub-200ms response times with no Playwright or Puppeteer dependency. The codebase is AGPL-3.0 licensed and self-hostable, with SDKs for TypeScript, Python, and Go.

    Founded 2024
    Italy
    1 employee

    Similar Tools

    Firecrawl

    An open-source API to search, scrape, crawl, and interact with the web, converting any website into clean, LLM-ready markdown or structured JSON for AI agents and applications.

    Apify

    Apify is a web scraping and automation platform that provides 30,000+ ready-made Actors, cloud infrastructure, and open-source tools to extract real-time web data for AI apps, agents, and business intelligence.

    Crawl4AI

    Open-source, LLM-friendly async web crawler and scraper designed for AI agents, RAG pipelines, and data extraction at scale.
