scrapeninja

ScrapeNinja API for web scraping. Use when the user mentions "ScrapeNinja", "scrape", "web scraping", or data extraction.


Installation

Install the skill with:

npx skills add vm0-ai/vm0-skills/vm0-ai-vm0-skills-scrapeninja

ScrapeNinja

High-performance web scraping API with Chrome TLS fingerprint, rotating proxies, smart retries, and optional JavaScript rendering.

Official docs: https://scrapeninja.net/docs/


When to Use

Use this skill when you need to:

  • Scrape websites with anti-bot protection (Cloudflare, Datadome)
  • Extract data without running a full browser (fast /scrape endpoint)
  • Render JavaScript-heavy pages (/scrape-js endpoint)
  • Use rotating proxies with geo selection (US, EU, Brazil, etc.)
  • Extract structured data with Cheerio extractors
  • Intercept AJAX requests
  • Take screenshots of pages

Prerequisites

  1. Get an API key from RapidAPI or APIRoad.
  2. Set the environment variable:

# For RapidAPI
export SCRAPENINJA_TOKEN="your-rapidapi-key"

# For APIRoad (use X-Apiroad-Key header instead)
export SCRAPENINJA_TOKEN="your-apiroad-key"
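All commands below assume the variable is exported in the current shell. A small guard like this (a sketch, not part of the API) fails fast when the key is missing instead of producing an opaque auth error:

```shell
# Fail fast if the API key variable from the step above is missing.
check_token() {
  if [ -z "${SCRAPENINJA_TOKEN:-}" ]; then
    echo "SCRAPENINJA_TOKEN is not set" >&2
    return 1
  fi
  echo "token present"
}
```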

How to Use

1. Basic Scrape (Non-JS, Fast)

High-performance scraping with Chrome TLS fingerprint, no JavaScript:

Write to /tmp/scrapeninja_request.json:

{
  "url": "https://example.com"
}

Then run:

curl -s -X POST "https://scrapeninja.p.rapidapi.com/scrape" \
  --header "Content-Type: application/json" \
  --header "X-RapidAPI-Key: $SCRAPENINJA_TOKEN" \
  -d @/tmp/scrapeninja_request.json \
  | jq '{status: .info.statusCode, url: .info.finalUrl, bodyLength: (.body | length)}'

With custom headers and retries:

Write to /tmp/scrapeninja_request.json:

{
  "url": "https://example.com",
  "headers": ["Accept-Language: en-US"],
  "retryNum": 3,
  "timeout": 15
}

Then run:

curl -s -X POST "https://scrapeninja.p.rapidapi.com/scrape" \
  --header "Content-Type: application/json" \
  --header "X-RapidAPI-Key: $SCRAPENINJA_TOKEN" \
  -d @/tmp/scrapeninja_request.json

2. Scrape with JavaScript Rendering

For JavaScript-heavy sites (React, Vue, etc.):

Write to /tmp/scrapeninja_request.json:

{
  "url": "https://example.com",
  "waitForSelector": "h1",
  "timeout": 20
}

Then run:

curl -s -X POST "https://scrapeninja.p.rapidapi.com/scrape-js" \
  --header "Content-Type: application/json" \
  --header "X-RapidAPI-Key: $SCRAPENINJA_TOKEN" \
  -d @/tmp/scrapeninja_request.json \
  | jq '{status: .info.statusCode, bodyLength: (.body | length)}'

With screenshot:

Write to /tmp/scrapeninja_request.json:

{
  "url": "https://example.com",
  "screenshot": true
}

Then run:

# Get the screenshot (base64-encoded PNG, per the Response Format section) from the response
curl -s -X POST "https://scrapeninja.p.rapidapi.com/scrape-js" \
  --header "Content-Type: application/json" \
  --header "X-RapidAPI-Key: $SCRAPENINJA_TOKEN" \
  -d @/tmp/scrapeninja_request.json \
  | jq -r '.info.screenshot'
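The Response Format section lists screenshot as base64-encoded PNG. If your plan returns it that way, a small helper (a sketch; assumes jq and GNU base64, which uses -d, while BSD/macOS uses -D) writes it to disk:

```shell
# Decode the base64 .info.screenshot field of a saved response into a PNG file.
save_screenshot() {
  # $1: path to saved response JSON, $2: output PNG path
  jq -r '.info.screenshot' "$1" | base64 -d > "$2"
}
```

Usage: save the curl output to a file first, then `save_screenshot /tmp/response.json /tmp/page.png`.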

3. Geo-Based Proxy Selection

Use proxies from specific regions:

Write to /tmp/scrapeninja_request.json:

{
  "url": "https://example.com",
  "geo": "eu"
}

Then run:

curl -s -X POST "https://scrapeninja.p.rapidapi.com/scrape" \
  --header "Content-Type: application/json" \
  --header "X-RapidAPI-Key: $SCRAPENINJA_TOKEN" \
  -d @/tmp/scrapeninja_request.json \
  | jq .info

Available geos: us, eu, br (Brazil), fr (France), de (Germany), 4g-eu

4. Smart Retries

Retry on specific HTTP status codes or text patterns:

Write to /tmp/scrapeninja_request.json:

{
  "url": "https://example.com",
  "retryNum": 3,
  "statusNotExpected": [403, 429, 503],
  "textNotExpected": ["captcha", "Access Denied"]
}

Then run:

curl -s -X POST "https://scrapeninja.p.rapidapi.com/scrape" \
  --header "Content-Type: application/json" \
  --header "X-RapidAPI-Key: $SCRAPENINJA_TOKEN" \
  -d @/tmp/scrapeninja_request.json

5. Extract Data with Cheerio

Extract structured JSON using Cheerio extractor functions:

Write to /tmp/scrapeninja_request.json:

{
  "url": "https://news.ycombinator.com",
  "extractor": "function(input, cheerio) { let $ = cheerio.load(input); return $(\".titleline > a\").slice(0,5).map((i,el) => ({title: $(el).text(), url: $(el).attr(\"href\")})).get(); }"
}

Then run:

curl -s -X POST "https://scrapeninja.p.rapidapi.com/scrape" \
  --header "Content-Type: application/json" \
  --header "X-RapidAPI-Key: $SCRAPENINJA_TOKEN" \
  -d @/tmp/scrapeninja_request.json \
  | jq '.extractor'
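Hand-escaping the extractor's quotes inside the JSON body is error-prone. One way around it (a sketch, assuming jq is installed; the h1 extractor here is a simplified example) is to keep the extractor as a raw shell string and let jq --arg do the JSON encoding:

```shell
# Build the request body from a raw extractor string; jq handles all
# JSON escaping of the embedded double quotes.
extractor='function(input, cheerio) { let $ = cheerio.load(input); return $("h1").text(); }'
jq -n --arg url "https://news.ycombinator.com" --arg ex "$extractor" \
  '{url: $url, extractor: $ex}' > /tmp/scrapeninja_request.json
```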

6. Intercept AJAX Requests

Capture XHR/fetch responses:

Write to /tmp/scrapeninja_request.json:

{
  "url": "https://example.com",
  "catchAjaxHeadersUrlMask": "api/data"
}

Then run:

curl -s -X POST "https://scrapeninja.p.rapidapi.com/scrape-js" \
  --header "Content-Type: application/json" \
  --header "X-RapidAPI-Key: $SCRAPENINJA_TOKEN" \
  -d @/tmp/scrapeninja_request.json \
  | jq '.info.catchedAjax'

7. Block Resources for Speed

Speed up JS rendering by blocking images and media:

Write to /tmp/scrapeninja_request.json:

{
  "url": "https://example.com",
  "blockImages": true,
  "blockMedia": true
}

Then run:

curl -s -X POST "https://scrapeninja.p.rapidapi.com/scrape-js" \
  --header "Content-Type: application/json" \
  --header "X-RapidAPI-Key: $SCRAPENINJA_TOKEN" \
  -d @/tmp/scrapeninja_request.json
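Per the parameter table, screenshot defaults to true on /scrape-js, so blocking can be combined with screenshot: false for the leanest render (a sketch combining this section with Guideline 6):

```shell
# Fastest JS render: block heavy resources and skip the screenshot.
cat > /tmp/scrapeninja_request.json <<'EOF'
{
  "url": "https://example.com",
  "blockImages": true,
  "blockMedia": true,
  "screenshot": false
}
EOF
```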

API Endpoints

Endpoint        Description
/scrape         Fast non-JS scraping with Chrome TLS fingerprint
/scrape-js      Full Chrome browser with JS rendering
/v2/scrape-js   Enhanced JS rendering for protected sites (APIRoad only)

Request Parameters

Common Parameters (all endpoints)

Parameter           Type       Default      Description
url                 string     (required)   URL to scrape
headers             string[]   -            Custom HTTP headers
retryNum            int        1            Number of retry attempts
geo                 string     us           Proxy geo: us, eu, br, fr, de, 4g-eu
proxy               string     -            Custom proxy URL (overrides geo)
timeout             int        10/16        Timeout per attempt in seconds
textNotExpected     string[]   -            Text patterns that trigger a retry
statusNotExpected   int[]      [403, 502]   HTTP status codes that trigger a retry
extractor           string     -            Cheerio extractor function

JS Rendering Parameters (/scrape-js, /v2/scrape-js)

Parameter                 Type     Default     Description
waitForSelector           string   -           CSS selector to wait for
postWaitTime              int      -           Extra wait time after load (1-12 s)
screenshot                bool     true        Take a page screenshot
blockImages               bool     false       Block image loading
blockMedia                bool     false       Block CSS/font loading
catchAjaxHeadersUrlMask   string   -           URL pattern to intercept AJAX
viewport                  object   1920x1080   Custom viewport size

Response Format

{
  "info": {
    "statusCode": 200,
    "finalUrl": "https://example.com",
    "headers": ["content-type: text/html"],
    "screenshot": "base64-encoded-png",
    "catchedAjax": {
      "url": "https://example.com/api/data",
      "method": "GET",
      "body": "...",
      "status": 200
    }
  },
  "body": "<html>...</html>",
  "extractor": { "extracted": "data" }
}
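Responses in this shape split cleanly with jq into metadata and page content; the sketch below creates a minimal sample response inline purely for illustration:

```shell
# Minimal sample response matching the shape above (illustration only).
cat > /tmp/scrapeninja_response.json <<'EOF'
{"info":{"statusCode":200,"finalUrl":"https://example.com"},"body":"<html>ok</html>"}
EOF

# Separate the HTML body from the metadata for downstream tools.
jq -r '.body' /tmp/scrapeninja_response.json > /tmp/page.html
jq '{status: .info.statusCode, finalUrl: .info.finalUrl}' /tmp/scrapeninja_response.json
```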

Guidelines

  1. Start with /scrape: Use the fast non-JS endpoint first, only switch to /scrape-js if needed
  2. Retries: Set retryNum to 2-3 for unreliable sites
  3. Geo Selection: Use eu for European sites, us for American sites
  4. Extractors: Test extractors at https://scrapeninja.net/cheerio-sandbox/
  5. Blocked Sites: For Cloudflare/Datadome protected sites, use /v2/scrape-js via APIRoad
  6. Screenshots: Set screenshot: false to speed up JS rendering
  7. Rate Limits: Check your plan limits on RapidAPI/APIRoad dashboard
