Scrapling
An adaptive Python web scraping framework that handles everything from single HTTP requests to full-scale concurrent crawls, with built-in anti-bot bypass and smart element tracking.
At a Glance
Fully free and open-source under the BSD-3-Clause license. Free to use, modify, and distribute.
Listed May 2026
About Scrapling
Scrapling is an adaptive web scraping framework for Python that handles everything from a single request to a full-scale crawl. Its parser learns from website changes and automatically relocates elements when pages update, while its fetchers bypass anti-bot systems like Cloudflare Turnstile out of the box. The spider framework enables concurrent, multi-session crawls with pause/resume and automatic proxy rotation — all in a few lines of Python.
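The adaptive-relocation idea can be pictured in plain Python. The sketch below is a toy illustration of the concept only, not Scrapling's actual similarity algorithm: remember a "fingerprint" of an element you scraped (tag name, classes, text), and after a redesign pick the candidate element that scores most similar to it.

```python
# Toy illustration of adaptive element relocation (not Scrapling's actual
# algorithm): remember a "fingerprint" of a scraped element, then after a
# redesign pick the candidate that scores most similar to it.

def fingerprint(tag, attrs, text):
    """Collapse an element into comparable features."""
    return {"tag": tag, "classes": set(attrs.get("class", [])), "text": text}

def similarity(fp, candidate):
    """Score a candidate element against a stored fingerprint (0..3)."""
    score = 0.0
    if fp["tag"] == candidate["tag"]:
        score += 1.0
    union = fp["classes"] | candidate["classes"]
    if union:
        score += len(fp["classes"] & candidate["classes"]) / len(union)
    if fp["text"] and fp["text"] in candidate["text"]:
        score += 1.0
    return score

def relocate(fp, candidates):
    """Return the candidate most similar to the remembered element."""
    return max(candidates, key=lambda c: similarity(fp, c))

# The price element used to be <span class="price product-price">$9.99</span>.
saved = fingerprint("span", {"class": ["price", "product-price"]}, "$9.99")

# After a redesign, the page offers these candidates:
redesigned = [
    {"tag": "div", "classes": {"nav"}, "text": "Home"},
    {"tag": "span", "classes": {"price", "new-price"}, "text": "$9.99 USD"},
    {"tag": "span", "classes": {"title"}, "text": "Widget"},
]
best = relocate(saved, redesigned)
print(best["text"])  # the renamed price element is found again
```

The same intuition underlies passing `adaptive=True` in Scrapling: elements are matched by similarity rather than by a brittle, exact selector.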
- Scrapy-like Spider API: Define spiders with `start_urls`, async `parse` callbacks, and `Request`/`Response` objects for full crawling workflows.
- Anti-bot Bypass: `StealthyFetcher` and `DynamicFetcher` classes bypass Cloudflare Turnstile/Interstitial with fingerprint spoofing and headless browser automation via Playwright.
- Adaptive Element Tracking: Smart similarity algorithms relocate scraped elements automatically after website redesigns; pass `adaptive=True` to find them again.
- Multiple Fetcher Types: `Fetcher` for fast HTTP requests with TLS fingerprint impersonation, `StealthyFetcher` for stealth mode, and `DynamicFetcher` for full browser automation.
- Session Management: Persistent sessions (`FetcherSession`, `StealthySession`, `DynamicSession`) with cookie and state management across requests, including async variants.
- Proxy Rotation: Built-in `ProxyRotator` with cyclic or custom rotation strategies across all session types, plus per-request proxy overrides.
- Pause & Resume Crawls: Checkpoint-based crawl persistence; press Ctrl+C for a graceful shutdown and restart to resume from where you left off.
- Streaming Mode: Stream scraped items in real time via `async for item in spider.stream()` with live stats, ideal for pipelines and long-running crawls.
- MCP Server: Built-in MCP server for AI-assisted web scraping with Claude, Cursor, and other AI tools, minimizing token usage by extracting targeted content first.
- CLI & Interactive Shell: Scrape URLs directly from the terminal without writing code, or launch an IPython-based interactive shell with Scrapling integration.
- Rich Selection API: CSS selectors, XPath, filter-based search, text search, regex search, and BeautifulSoup-style `find_all`, all chainable.
- High Performance: Benchmarked faster than Parsel/Scrapy, PyQuery, and BeautifulSoup for text extraction and element similarity search.
- Docker Support: Ready-to-use Docker image with all browsers pre-installed, automatically built and pushed with each release.
- Install via pip: Run `pip install scrapling` for the parser, or `pip install "scrapling[all]"` for all features including fetchers, MCP server, and shell.
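The cyclic proxy-rotation strategy is easy to picture. The following is a minimal stand-in in plain Python (a toy sketch of the idea, not Scrapling's `ProxyRotator` implementation): proxies are handed out round-robin, and a per-request override wins without advancing the cycle.

```python
from itertools import cycle

class CyclicProxyRotator:
    """Toy stand-in for a cyclic proxy rotator: hands out proxies
    round-robin, with an optional per-request override."""

    def __init__(self, proxies):
        self._pool = cycle(proxies)

    def next_proxy(self, override=None):
        # A per-request override wins; otherwise advance the cycle.
        return override if override is not None else next(self._pool)

rotator = CyclicProxyRotator([
    "http://proxy-a:8080",
    "http://proxy-b:8080",
])
print(rotator.next_proxy())                       # http://proxy-a:8080
print(rotator.next_proxy())                       # http://proxy-b:8080
print(rotator.next_proxy())                       # http://proxy-a:8080 (wraps)
print(rotator.next_proxy("http://special:9090"))  # override, cycle untouched
```

A custom strategy would replace the round-robin pool with any callable that picks the next proxy, e.g. weighted by observed latency or ban rate.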
Pricing
Open Source
- Full parser engine
- HTTP, Stealthy, and Dynamic fetchers
- Spider framework with concurrent crawling
- MCP server for AI integration
- CLI and interactive shell
Capabilities
Key Features
- Adaptive element tracking after website changes
- Anti-bot bypass (Cloudflare Turnstile/Interstitial)
- Scrapy-like Spider API with async parse callbacks
- Concurrent crawling with configurable concurrency limits
- Pause and resume crawls with checkpoint persistence
- Streaming mode with real-time stats
- Multiple fetcher types: HTTP, Stealthy, Dynamic
- Session management with cookie/state persistence
- Proxy rotation with cyclic or custom strategies
- MCP server for AI-assisted web scraping
- CLI and interactive IPython shell
- CSS, XPath, regex, text, and filter-based selectors
- BeautifulSoup-style find_all API
- Auto CSS/XPath selector generation
- DNS-over-HTTPS support for DNS leak prevention
- Domain and ad blocking in browser-based fetchers
- Built-in JSON/JSONL export
- Docker image with all browsers pre-installed
- Full async support across all fetchers
- 92% test coverage and full type hints
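Checkpoint-based pause and resume boils down to persisting the crawl frontier. The sketch below illustrates the mechanism with a toy JSON checkpoint (an illustration of the idea only; file layout, names, and URLs are invented and not Scrapling's on-disk format): pending and visited URLs are flushed to disk on shutdown, and a restart reloads them instead of starting over.

```python
import json
import os
import tempfile

# Toy checkpoint-based crawl persistence: the frontier of pending URLs and
# the set of visited URLs are flushed to JSON so an interrupted crawl can
# restart from where it left off. Paths and URLs here are illustrative.

def save_checkpoint(path, pending, visited):
    with open(path, "w") as f:
        json.dump({"pending": pending, "visited": sorted(visited)}, f)

def load_checkpoint(path, start_urls):
    if not os.path.exists(path):  # fresh crawl: begin at the start URLs
        return list(start_urls), set()
    with open(path) as f:
        state = json.load(f)
    return state["pending"], set(state["visited"])

path = os.path.join(tempfile.mkdtemp(), "crawl.json")
pending, visited = load_checkpoint(path, ["https://example.com/"])

# Crawl one page, discover a link, then simulate a Ctrl+C:
# checkpoint the state and stop.
url = pending.pop(0)
visited.add(url)
pending.append("https://example.com/page2")
save_checkpoint(path, pending, visited)

# On restart, the crawl resumes from the saved frontier.
pending, visited = load_checkpoint(path, ["https://example.com/"])
print(pending)       # ['https://example.com/page2']
print(len(visited))  # 1
```

A real crawler would also checkpoint per-session state (cookies, retry counts) and write the file atomically so a crash mid-save cannot corrupt the frontier.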