# Crawl4AI

> Open-source, LLM-friendly async web crawler and scraper designed for AI agents, RAG pipelines, and data extraction at scale.

Crawl4AI is the #1 trending open-source web crawler and scraper built specifically for large language models, AI agents, and data pipelines. It delivers blazing-fast, AI-ready content extraction with clean Markdown output, structured data parsing, and advanced browser control — all without forced API keys or paywalls. Actively maintained by a vibrant community with 61.7k+ GitHub stars, it supports everything from simple single-page crawls to complex adaptive multi-URL pipelines.

- **Clean Markdown Generation** — *Produces minimally processed, well-structured Markdown output perfect for direct ingestion into LLMs or RAG pipelines.*
- **Structured Extraction** — *Parses repeated patterns using CSS selectors, XPath, or LLM-based extraction strategies for precise data retrieval.*
- **Adaptive Web Crawling** — *Uses advanced information foraging algorithms to intelligently determine when sufficient data has been gathered to answer a query, stopping automatically.*
- **Advanced Browser Control** — *Fine-grained control over hooks, proxies, stealth/undetected modes, session reuse, and anti-bot fallback mechanisms.*
- **High-Performance Parallel Crawling** — *Supports multi-URL crawling, crawl dispatching, chunk-based extraction, and real-time use cases for large-scale pipelines.*
- **Deep & URL-Seeded Crawling** — *Supports deep crawling, virtual scroll handling, lazy loading, and identity-based crawling for comprehensive site coverage.*
- **C4A-Script** — *A custom scripting language for defining complex crawl and interaction workflows, with a dedicated editor app.*
- **LLM Context Builder** — *Built-in app to generate LLM-ready context files (llms.txt) from crawled content.*
- **PDF Parsing & File Downloading** — *Handles PDF documents and file downloads natively as part of the crawl pipeline.*
- **Self-Hosting & Docker Support** — *Easily deploy via pip or Docker for full control over your crawling infrastructure.*
- **AI Assistant Skill Package** — *Downloadable skill package (23K+ word SDK reference) compatible with Claude, Cursor, Windsurf, and other AI coding assistants.*
- **Open Source & Free** — *No API keys required, no paywalls — fully transparent and configurable for everyone.*

## Features

- Async web crawling with AsyncWebCrawler
- Clean Markdown generation for LLMs
- Structured extraction via CSS, XPath, and LLM strategies
- Adaptive crawling with information foraging algorithms
- Deep crawling and URL seeding
- Multi-URL parallel crawling
- Advanced browser control (hooks, proxies, stealth mode)
- Anti-bot and fallback mechanisms
- Session management and identity-based crawling
- Virtual scroll and lazy loading support
- PDF parsing
- File downloading
- C4A-Script custom scripting language
- LLM Context Builder app
- Cache modes
- Network and console capture
- SSL certificate handling
- Self-hosting via Docker
- AI assistant skill package for Claude/Cursor/Windsurf
- Command-line interface (CLI)

## Integrations

Claude, Cursor, Windsurf, Docker, PyPI, LLM pipelines, RAG pipelines, AI agents

## Platforms

Windows, macOS, Linux, Web, API, Developer SDK

## Pricing

Open Source

## Version

0.8.x

## Links

- Website: https://docs.crawl4ai.com
- Documentation: https://docs.crawl4ai.com/core/quickstart/
- Repository: https://github.com/unclecode/crawl4ai
- EveryDev.ai: https://www.everydev.ai/tools/crawl4ai