Scrapling
An adaptive Python web scraping framework that handles everything from single HTTP requests to full-scale concurrent crawls, with built-in anti-bot bypass and smart element tracking.
At a Glance
Fully free and open-source under the BSD-3-Clause license. Free to use, modify, and distribute.
Listed May 2026
About Scrapling
Scrapling is an adaptive web scraping framework for Python that handles everything from a single request to a full-scale crawl. Its parser learns from website changes and automatically relocates elements when pages update, while its fetchers bypass anti-bot systems like Cloudflare Turnstile out of the box. The spider framework enables concurrent, multi-session crawls with pause/resume and automatic proxy rotation — all in a few lines of Python.
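The adaptive-relocation idea can be pictured in plain Python. The sketch below is a toy illustration of the concept only, not Scrapling's actual similarity algorithm: remember a "fingerprint" of an element you scraped (tag name, classes, text), and after a redesign pick the candidate element that scores most similar to it.

```python
# Toy illustration of adaptive element relocation (not Scrapling's actual
# algorithm): remember a "fingerprint" of a scraped element, then after a
# redesign pick the candidate that scores most similar to it.

def fingerprint(tag, attrs, text):
    """Collapse an element into comparable features."""
    return {"tag": tag, "classes": set(attrs.get("class", [])), "text": text}

def similarity(fp, candidate):
    """Score a candidate element against a stored fingerprint (0..3)."""
    score = 0.0
    if fp["tag"] == candidate["tag"]:
        score += 1.0
    union = fp["classes"] | candidate["classes"]
    if union:
        score += len(fp["classes"] & candidate["classes"]) / len(union)
    if fp["text"] and fp["text"] in candidate["text"]:
        score += 1.0
    return score

def relocate(fp, candidates):
    """Return the candidate most similar to the remembered element."""
    return max(candidates, key=lambda c: similarity(fp, c))

# The price element used to be <span class="price product-price">$9.99</span>.
saved = fingerprint("span", {"class": ["price", "product-price"]}, "$9.99")

# After a redesign, the page offers these candidates:
redesigned = [
    {"tag": "div", "classes": {"nav"}, "text": "Home"},
    {"tag": "span", "classes": {"price", "new-price"}, "text": "$9.99 USD"},
    {"tag": "span", "classes": {"title"}, "text": "Widget"},
]
best = relocate(saved, redesigned)
print(best["text"])  # the renamed price element is found again
```

The same intuition underlies passing `adaptive=True` in Scrapling: elements are matched by similarity rather than by a brittle, exact selector.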
- Scrapy-like Spider API: Define spiders with `start_urls`, async `parse` callbacks, and `Request`/`Response` objects for full crawling workflows.
- Anti-bot Bypass: `StealthyFetcher` and `DynamicFetcher` classes bypass Cloudflare Turnstile/Interstitial with fingerprint spoofing and headless browser automation via Playwright.
- Adaptive Element Tracking: Smart similarity algorithms relocate scraped elements automatically after website redesigns; pass `adaptive=True` to find them again.
- Multiple Fetcher Types: `Fetcher` for fast HTTP requests with TLS fingerprint impersonation, `StealthyFetcher` for stealth mode, and `DynamicFetcher` for full browser automation.
- Session Management: Persistent sessions (`FetcherSession`, `StealthySession`, `DynamicSession`) with cookie and state management across requests, including async variants.
- Proxy Rotation: Built-in `ProxyRotator` with cyclic or custom rotation strategies across all session types, plus per-request proxy overrides.
- Pause & Resume Crawls: Checkpoint-based crawl persistence; press Ctrl+C for a graceful shutdown and restart to resume from where you left off.
- Streaming Mode: Stream scraped items in real time via `async for item in spider.stream()` with live stats, ideal for pipelines and long-running crawls.
- MCP Server: Built-in MCP server for AI-assisted web scraping with Claude, Cursor, and other AI tools, minimizing token usage by extracting targeted content first.
- CLI & Interactive Shell: Scrape URLs directly from the terminal without writing code, or launch an IPython-based interactive shell with Scrapling integration.
- Rich Selection API: CSS selectors, XPath, filter-based search, text search, regex search, and BeautifulSoup-style `find_all`, all chainable.
- High Performance: Benchmarked faster than Parsel/Scrapy, PyQuery, and BeautifulSoup for text extraction and element similarity search.
- Docker Support: Ready-to-use Docker image with all browsers pre-installed, automatically built and pushed with each release.
- Install via pip: Run `pip install scrapling` for the parser, or `pip install "scrapling[all]"` for all features including fetchers, MCP server, and shell.
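The cyclic proxy-rotation strategy is easy to picture. The following is a minimal stand-in in plain Python (a toy sketch of the idea, not Scrapling's `ProxyRotator` implementation): proxies are handed out round-robin, and a per-request override wins without advancing the cycle.

```python
from itertools import cycle

class CyclicProxyRotator:
    """Toy stand-in for a cyclic proxy rotator: hands out proxies
    round-robin, with an optional per-request override."""

    def __init__(self, proxies):
        self._pool = cycle(proxies)

    def next_proxy(self, override=None):
        # A per-request override wins; otherwise advance the cycle.
        return override if override is not None else next(self._pool)

rotator = CyclicProxyRotator([
    "http://proxy-a:8080",
    "http://proxy-b:8080",
])
print(rotator.next_proxy())                       # http://proxy-a:8080
print(rotator.next_proxy())                       # http://proxy-b:8080
print(rotator.next_proxy())                       # http://proxy-a:8080 (wraps)
print(rotator.next_proxy("http://special:9090"))  # override, cycle untouched
```

A custom strategy would replace the round-robin pool with any callable that picks the next proxy, e.g. weighted by observed latency or ban rate.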
Pricing
Open Source
- Full parser engine
- HTTP, Stealthy, and Dynamic fetchers
- Spider framework with concurrent crawling
- MCP server for AI integration
- CLI and interactive shell
Capabilities
Key Features
- Adaptive element tracking after website changes
- Anti-bot bypass (Cloudflare Turnstile/Interstitial)
- Scrapy-like Spider API with async parse callbacks
- Concurrent crawling with configurable concurrency limits
- Pause and resume crawls with checkpoint persistence
- Streaming mode with real-time stats
- Multiple fetcher types: HTTP, Stealthy, Dynamic
- Session management with cookie/state persistence
- Proxy rotation with cyclic or custom strategies
- MCP server for AI-assisted web scraping
- CLI and interactive IPython shell
- CSS, XPath, regex, text, and filter-based selectors
- BeautifulSoup-style find_all API
- Auto CSS/XPath selector generation
- DNS-over-HTTPS support for DNS leak prevention
- Domain and ad blocking in browser-based fetchers
- Built-in JSON/JSONL export
- Docker image with all browsers pre-installed
- Full async support across all fetchers
- 92% test coverage and full type hints
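Checkpoint-based pause and resume boils down to persisting the crawl frontier. The sketch below illustrates the mechanism with a toy JSON checkpoint (an illustration of the idea only; file layout, names, and URLs are invented and not Scrapling's on-disk format): pending and visited URLs are flushed to disk on shutdown, and a restart reloads them instead of starting over.

```python
import json
import os
import tempfile

# Toy checkpoint-based crawl persistence: the frontier of pending URLs and
# the set of visited URLs are flushed to JSON so an interrupted crawl can
# restart from where it left off. Paths and URLs here are illustrative.

def save_checkpoint(path, pending, visited):
    with open(path, "w") as f:
        json.dump({"pending": pending, "visited": sorted(visited)}, f)

def load_checkpoint(path, start_urls):
    if not os.path.exists(path):  # fresh crawl: begin at the start URLs
        return list(start_urls), set()
    with open(path) as f:
        state = json.load(f)
    return state["pending"], set(state["visited"])

path = os.path.join(tempfile.mkdtemp(), "crawl.json")
pending, visited = load_checkpoint(path, ["https://example.com/"])

# Crawl one page, discover a link, then simulate a Ctrl+C:
# checkpoint the state and stop.
url = pending.pop(0)
visited.add(url)
pending.append("https://example.com/page2")
save_checkpoint(path, pending, visited)

# On restart, the crawl resumes from the saved frontier.
pending, visited = load_checkpoint(path, ["https://example.com/"])
print(pending)       # ['https://example.com/page2']
print(len(visited))  # 1
```

A real crawler would also checkpoint per-session state (cookies, retry counts) and write the file atomically so a crash mid-save cannot corrupt the frontier.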