scrapling

Web scraping using Scrapling — a Python framework with anti-bot bypass (Cloudflare Turnstile, fingerprint spoofing), adaptive element tracking, stealth headless browser, and full CSS/XPath extraction. Use when web_fetch fails (Cloudflare, JS-rendered pages), or when extracting structured data from websites (prices, articles, lists). Supports HTTP, stealth, and full browser modes. Source: github.com/D4Vinci/Scrapling (PyPI: scrapling). Only use on sites you have permission to scrape.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Install skill "scrapling" with this command: npx skills add damirikys/scrapling-fetcher

Scrapling Skill

Source: https://github.com/D4Vinci/Scrapling (open source, BSD-3-Clause license)
PyPI: scrapling (install before first use; see Installation below)

⚠️ Only scrape sites you have permission to access. Respect robots.txt and Terms of Service. Do not use stealth modes to bypass paywalls or access restricted content without authorization.

Installation (one-time, confirm with user before running)

pip install "scrapling[all]"  # quotes stop shells such as zsh from expanding the brackets
patchright install chromium  # required for stealth/dynamic modes
  • scrapling[all] installs patchright (a stealth fork of Playwright, bundled as a PyPI package — not a typo), curl_cffi, MCP server deps, and IPython shell.
  • patchright install chromium downloads Chromium (~100 MB) via patchright's own installer (same mechanism as playwright install chromium).
  • Confirm with user before running — installs ~200 MB of dependencies and browser binaries.
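
Before asking the user to confirm a ~200 MB install, it can help to check what is already present. The following is a minimal sketch using only the standard library; the function name install_status is hypothetical, and the check does not verify that the Chromium binary itself has been downloaded.

```python
import importlib.util
import shutil


def install_status() -> dict:
    """Report which pieces of the install are already present (hypothetical helper)."""
    return {
        # True if the scrapling package is importable by this interpreter
        "scrapling": importlib.util.find_spec("scrapling") is not None,
        # True if the patchright package (used by stealth/dynamic modes) is importable
        "patchright": importlib.util.find_spec("patchright") is not None,
        # True if a patchright executable is on PATH
        "patchright_cli": shutil.which("patchright") is not None,
    }


if __name__ == "__main__":
    for name, present in install_status().items():
        print(f"{name}: {'found' if present else 'missing'}")
```

If every entry reports "found", the pip step can probably be skipped; the browser download may still be needed.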

Script

scripts/scrape.py — CLI wrapper for all three fetcher modes.

# Basic fetch (text output)
python3 ~/skills/scrapling/scripts/scrape.py <url> -q

# CSS selector extraction
python3 ~/skills/scrapling/scripts/scrape.py <url> --selector ".class" -q

# Stealth mode (Cloudflare bypass) — only on sites you're authorized to access
python3 ~/skills/scrapling/scripts/scrape.py <url> --mode stealth -q

# JSON output
python3 ~/skills/scrapling/scripts/scrape.py <url> --selector "h2" --json -q
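
The flags used above suggest a command-line interface roughly like the one sketched here. This is not the actual contents of scripts/scrape.py, only an illustration of the argument shape; the help texts, the three-way choice for --mode, and the reading of -q as "quiet" are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that accepts the flags shown in the examples (assumed shape)."""
    parser = argparse.ArgumentParser(prog="scrape.py")
    parser.add_argument("url", help="page to fetch")
    parser.add_argument(
        "--mode",
        choices=["http", "stealth", "dynamic"],
        default="http",
        help="fetcher mode (default: http)",
    )
    parser.add_argument("--selector", help="CSS selector; omit to get the page text")
    parser.add_argument("--json", action="store_true", help="emit JSON instead of text")
    # Assumption: -q means "quiet" (suppress log output).
    parser.add_argument("-q", dest="quiet", action="store_true", help="quiet output")
    return parser


# Parse the "JSON output" example from above.
args = build_parser().parse_args(["https://example.com", "--selector", "h2", "--json", "-q"])
print(args.mode, args.selector, args.json, args.quiet)  # http h2 True True
```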

Fetcher Modes

  • http (default): Fast HTTP requests with browser TLS fingerprint spoofing. Suitable for most sites.
  • stealth: Headless Chromium with anti-detection patches. For sites behind Cloudflare or other anti-bot protection.
  • dynamic: Full Playwright-driven browser. For JavaScript-heavy single-page apps.

When to Use Each Mode

  • web_fetch returns 403/429/Cloudflare challenge → use --mode stealth
  • Page content requires JS execution → use --mode dynamic
  • Regular site, just need text/data → use --mode http (default)
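
The rules above can be written as a small decision helper. This is a sketch: the function name pick_mode and its parameters are hypothetical, and giving the blocked-request rule priority over the JS rule is an assumption rather than something the skill specifies.

```python
from typing import Optional


def pick_mode(
    status: Optional[int] = None,
    cloudflare_challenge: bool = False,
    needs_js: bool = False,
) -> str:
    """Map the symptoms listed above to a --mode value (hypothetical helper)."""
    # A 403/429 response or a Cloudflare challenge calls for stealth mode.
    if status in (403, 429) or cloudflare_challenge:
        return "stealth"
    # Content that only appears after JavaScript runs calls for dynamic mode.
    if needs_js:
        return "dynamic"
    # Everything else works with the default HTTP fetcher.
    return "http"


print(pick_mode(status=429))     # stealth
print(pick_mode(needs_js=True))  # dynamic
print(pick_mode(status=200))     # http
```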

Python Inline Usage

For custom logic beyond the CLI, write inline Python. See references/patterns.md for:

  • Adaptive scraping (auto_save / adaptive — saves element fingerprints locally)
  • Session/cookie handling
  • Async usage
  • XPath, find_similar, attribute extraction
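
As a starting point for inline use, the sketch below fetches a page and returns the text of the matching elements. It assumes the class names from recent Scrapling releases (Fetcher, StealthyFetcher, DynamicFetcher in scrapling.fetchers); older releases may name them differently. The mapping from the CLI modes to these classes and the helper name fetch_texts are assumptions, not taken from scripts/scrape.py.

```python
VALID_MODES = ("http", "stealth", "dynamic")


def fetch_texts(url: str, selector: str, mode: str = "http") -> list:
    """Fetch url and return the text of every element matching a CSS selector."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown mode: {mode!r}")

    # Imported here so this module still loads when scrapling is not installed.
    from scrapling.fetchers import DynamicFetcher, Fetcher, StealthyFetcher

    if mode == "http":
        page = Fetcher.get(url)  # plain HTTP request
    elif mode == "stealth":
        page = StealthyFetcher.fetch(url, headless=True)  # stealth browser
    else:
        page = DynamicFetcher.fetch(url, headless=True)  # full browser

    # Use a plain element selector here (for example "h2"), not a ::text selector.
    return [str(element.text) for element in page.css(selector)]
```

A call such as fetch_texts("https://example.com", "h1") needs network access and, for the stealth and dynamic modes, the Chromium download described under Installation.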

Notes

  • MCP server (scrapling mcp): starts a local network service for AI-native scraping. Only start if explicitly needed and trusted — it exposes a local HTTP server.
  • auto_save=True: persists element fingerprints to disk for adaptive re-scraping. Creates local state in working directory.
  • Stealth/dynamic modes use Chromium headless — no xvfb-run needed.
  • For large-scale crawls, use the Spider API (see Scrapling docs).
