Firecrawl
An open-source API to search, scrape, crawl, and interact with the web, converting any website into clean, LLM-ready markdown or structured JSON for AI agents and applications.
Updated May 2026
About Firecrawl
Firecrawl is a web data infrastructure platform built by Mendable/Sideguide Technologies, backed by Y Combinator (S22), that turns the live web into clean, structured data for AI systems. The core platform is open source under the AGPL-3.0 license, with a hosted cloud service at firecrawl.dev that adds proprietary infrastructure for proxies, rendering, and browser interaction. The repository has over 121,000 GitHub stars, and the team describes it as one of the fastest-growing open-source projects in the space.
What It Is
Firecrawl is a web context API — a developer-facing infrastructure layer that lets AI agents, RAG pipelines, and LLM applications reliably find, read, and act on live web content. Rather than requiring developers to stitch together proxies, headless browsers, and post-processing scripts, Firecrawl exposes three core capabilities through a single REST API: Search (query the web and receive full-page markdown from results), Scrape (convert any URL into clean markdown, HTML, screenshots, or structured JSON), and Interact (click, scroll, fill forms, and navigate dynamic pages using AI prompts or code). Additional endpoints include Crawl (follow links across an entire site), Map (discover all URLs on a domain), Batch Scrape, and an autonomous Agent endpoint that accepts a natural-language prompt and retrieves data without requiring a known URL.
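As a rough illustration of the single-REST-API idea, the sketch below builds (but does not send) a Scrape request using only the Python standard library. The base URL, the /scrape path, the bearer-token header, and the url and formats body fields are assumptions based on the description above, not confirmed details; check the official API reference before relying on them. The helper name build_scrape_request and the placeholder API key are hypothetical.

```python
import json
import urllib.request

# Assumed base URL and version prefix; verify against the official API reference.
API_BASE = "https://api.firecrawl.dev/v2"


def build_scrape_request(url: str, api_key: str, formats=("markdown",)) -> urllib.request.Request:
    """Build a POST request asking the Scrape endpoint for the given output formats."""
    payload = {"url": url, "formats": list(formats)}  # assumed body field names
    return urllib.request.Request(
        f"{API_BASE}/scrape",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_scrape_request("https://example.com", "fc-YOUR_API_KEY")
print(req.get_method(), req.full_url)
# urllib.request.urlopen(req) would send the request; it is left out so the sketch runs offline.
```

Keeping request construction separate from sending makes it easy to inspect or log the payload before spending credits on a live call.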
Core Architecture and Output Formats
Firecrawl's hosted version runs on Fire-engine, a proprietary infrastructure layer that handles rotating proxies, JavaScript rendering, smart wait logic, and rate-limit management automatically. The platform claims a P95 latency of 3.4 seconds across millions of pages and states it covers 96% of the web including JavaScript-heavy single-page applications. Output formats include:
- Clean markdown optimized for LLM context windows
- Structured JSON via user-defined schemas (pass a schema to /scrape and receive matching structured data)
- Raw HTML, page screenshots, and metadata
- Parsed content from web-hosted PDFs and DOCX files
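The schema-driven output described in the list above can be sketched as follows. This is a minimal example under stated assumptions: the shape of the formats entry (a type of json plus a schema) is an assumption about the request body, the product schema is hypothetical, and the sample data is made up purely to exercise the local check. The missing_required helper is not part of any SDK; it only shows how a caller might confirm that returned data contains the fields the schema marks as required.

```python
import json

# Hypothetical schema for a product page; the field names are illustrative only.
PRODUCT_SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "price": {"type": "number"},
        "in_stock": {"type": "boolean"},
    },
    "required": ["name", "price"],
}


def build_extract_payload(url: str, schema: dict) -> dict:
    """Assemble a /scrape body requesting schema-shaped JSON (assumed field names)."""
    return {"url": url, "formats": [{"type": "json", "schema": schema}]}


def missing_required(data: dict, schema: dict) -> list[str]:
    """Return the required top-level keys that are absent from the extracted data."""
    return [key for key in schema.get("required", []) if key not in data]


payload = build_extract_payload("https://example.com/product", PRODUCT_SCHEMA)
body = json.dumps(payload)  # the string that would be sent as the POST body

# Made-up extraction result, used only to exercise the check.
sample = {"name": "Widget", "in_stock": True}
print(missing_required(sample, PRODUCT_SCHEMA))  # prints ['price']
```

A check like this is a cheap guard before passing extracted records downstream, since a page may simply not contain every field the schema asks for.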
The open-source repository is written primarily in TypeScript and licensed under AGPL-3.0, while the SDKs and some UI components use the MIT license.
Agent and MCP Integration
Firecrawl ships an official MCP (Model Context Protocol) server, enabling AI coding tools like Cursor, Claude, and Windsurf to search and scrape the web directly from within the editor. A CLI tool (firecrawl-cli) installs agent skills with a single command, and the homepage FAQ reports over 400,000 MCP server installations. Official SDKs are available for Python, Node.js, Go, Rust, Java, and Elixir, and the REST API can be called from any language. Platform integrations include Zapier, n8n, and Lovable.
Use Cases and Audience
The platform targets developers and AI teams building:
- Deep research agents — autonomous loops that search, scrape, and synthesize information
- RAG pipelines — feeding real-time web content into retrieval-augmented generation systems
- Lead enrichment — extracting and filtering contact and company data from directories
- Competitive intelligence — monitoring competitor pages for pricing, feature, and content changes
- E-commerce monitoring — tracking product pricing and inventory across sites
- AI model training — collecting high-quality web data for pre-training and fine-tuning pipelines
- Content generation — pulling structured web data to power AI writing workflows
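For the monitoring use cases in the list above, one simple approach is to fingerprint the markdown returned for a page and compare it with the previous fingerprint. The sketch below is an illustration of that idea, not Firecrawl functionality: the function names are hypothetical, and it assumes the caller has already obtained the page markdown as a string.

```python
import hashlib


def fingerprint(markdown: str) -> str:
    """Hash page markdown after collapsing whitespace, so reflowed text is not treated as a change."""
    normalized = " ".join(markdown.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def has_changed(previous_markdown: str, current_markdown: str) -> bool:
    """Report whether two snapshots of a page differ in anything other than whitespace."""
    return fingerprint(previous_markdown) != fingerprint(current_markdown)


# Whitespace-only differences are ignored; a price edit is detected.
print(has_changed("# Pricing\n\nPro: $20", "# Pricing\nPro:   $20"))
print(has_changed("Pro: $20", "Pro: $25"))
```

Storing only the hash per URL keeps the monitoring state small; the full markdown is needed only when a change is detected and a diff is wanted.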
The about page states over 500,000 developers have signed up for the hosted service.
Update: Firecrawl v2.10
The latest GitHub release is v2.10, published May 15, 2026. The repository shows active development with recent pull requests covering Python SDK async improvements, an o4-mini crawler feature, Extract v2 reranking improvements, and cost-limit controls for the extract endpoint. The homepage highlights two recently launched output formats — Highlights and Question — which return grounded answers or verbatim excerpts from any page in a single API call. The project also announced a Wikipedia data partnership, providing fair-access sourcing for Wikipedia content within the platform.
Open Source vs. Hosted Tradeoffs
The self-hostable open-source version provides the core scraping and crawling engine but lacks Fire-engine's proprietary proxy infrastructure, the Interact capability for browser automation, the hosted dashboard and analytics, and the managed scaling layer. The hosted cloud version handles all infrastructure concerns and is accessible via API key with no local setup required. Self-hosting instructions are available in the contributing guide and a dedicated self-host documentation page.
Pricing
Free Plan
A lightweight way to get started. No cost, no card, no hassle.
- 1,000 credits per month
- Scrape 1,000 pages
- 2 concurrent requests
- Low rate limits
Hobby
Great for side projects and small tools.
- 5,000 credits per month
- Scrape 5,000 pages
- 5 concurrent requests
- Basic support
- $9 per extra 1.5k credits
Standard
Perfect for scaling with less effort. Simple, solid, dependable.
- 100,000 credits per month
- Scrape 100,000 pages
- 50 concurrent requests
- Standard support
- $47 per extra 35k credits
Growth
Built for high volume and speed. Firecrawl at full force.
- 500,000 credits per month
- Scrape 500,000 pages
- 100 concurrent requests
- Priority support
- $177 per extra 175k credits
Scale
For teams scaling their data pipelines.
- 1,000,000 credits per month
- Scrape 1,000,000 pages
- 150 concurrent requests
- Priority support
- $397 per extra 350k credits
Enterprise
Power at your pace with custom solutions.
- Custom credits
- Unlimited pages
- Custom concurrent requests
- Dedicated support & SLA
- Bulk discounts
- Zero-data retention
- SSO & advanced security
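The extra-credit packs listed for the paid plans can be compared with a little arithmetic. The sketch below uses only the credit counts and pack prices from the plans above. It assumes one credit per scraped page (as the "Scrape N pages" lines imply) and that packs are bought whole; base subscription prices are not listed here, so only overage is computed. The function names are hypothetical.

```python
import math

# plan -> (included monthly credits, pack price in USD, credits per pack), taken from the plan list
PLANS = {
    "Hobby": (5_000, 9, 1_500),
    "Standard": (100_000, 47, 35_000),
    "Growth": (500_000, 177, 175_000),
    "Scale": (1_000_000, 397, 350_000),
}


def overage_cost(plan: str, pages: int) -> int:
    """USD spent on extra-credit packs for a month, assuming 1 credit per page and whole packs."""
    included, pack_price, pack_size = PLANS[plan]
    extra = max(0, pages - included)
    return math.ceil(extra / pack_size) * pack_price


def price_per_thousand(plan: str) -> float:
    """Effective pack price per 1,000 extra credits."""
    _, pack_price, pack_size = PLANS[plan]
    return pack_price * 1000 / pack_size


for name in PLANS:
    print(f"{name}: ${price_per_thousand(name):.2f} per 1k extra credits")

print(overage_cost("Hobby", 8_000))  # 3,000 extra credits -> 2 packs -> prints 18
```

On these figures, Hobby packs work out to $6.00 per 1,000 extra credits, while packs on the larger plans cost between roughly $1.01 and $1.34 per 1,000.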
Capabilities
Key Features
- Web scraping to clean markdown or structured JSON
- Web search with full-page content from results
- Browser interaction (click, scroll, fill forms, navigate)
- Full-site crawling with depth and path controls
- URL mapping and discovery
- Batch scraping for thousands of URLs
- Autonomous AI agent endpoint (natural language prompts)
- JavaScript rendering for SPAs and dynamic pages
- Media parsing for PDFs and DOCX files
- Screenshot capture
- Structured data extraction via JSON schema
- Smart wait for dynamic content loading
- Caching with configurable patterns
- MCP server for AI coding tools
- CLI with agent skills
- Official SDKs for Python, Node.js, Go, Rust, Java, Elixir
- SOC 2 Type II certified
- Zero-data retention option
- Auto-recharge credit packs
- Highlights and Question output formats

