xhs

小红书 (Xiaohongshu/RedNote) research - search, analyze posts in depth, view images, read comments, output Chinese recommendations. Combines CLI tool usage with research methodology.

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Install skill "xhs" with this command: npx skills add cryinglee/openclaw-skill-xhs/cryinglee-openclaw-skill-xhs-xhs

小红书 Research 📕

Research tool for Chinese user-generated content — travel, food, lifestyle, local discoveries.

When to Use

  • Travel planning and itineraries
  • Restaurant/cafe/bar recommendations
  • Activity and weekend planning
  • Product reviews and comparisons
  • Local discovery and hidden gems
  • Any question where Chinese perspectives help

Recommended Model

When spawning as a sub-agent: Sonnet 4.5 (model: "claude-sonnet-4-5-20250929")

  • Fast enough for the slow XHS API calls
  • Good at Chinese content understanding
  • More cost-effective than Opus for research grunt work
  • Opus is overkill for the search → synthesize workflow

Context Management (Always Use)

ALWAYS use dynamic context monitoring — even 5 posts with images can hit 75-300k tokens.

The Problem

  • Each post with images = 15-60k tokens
  • 200k context fills fast
  • Context is append-only (can't "forget" within session)
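The arithmetic behind those numbers can be sketched as follows. This is a rough estimate only: it assumes the whole window is available for posts (it ignores the system prompt, tool output, and search results), and the function name is illustrative, not part of the skill.

```python
def posts_before_checkpoint(tokens_per_post: int,
                            context_window: int = 200_000,
                            checkpoint_percent: int = 70) -> int:
    """Whole posts that fit before the checkpoint threshold is reached."""
    # Integer math: 70% of 200k is a 140,000-token budget.
    budget = context_window * checkpoint_percent // 100
    return budget // tokens_per_post


# Range quoted above: 15k tokens (light post) to 60k tokens (image-heavy post).
best_case = posts_before_checkpoint(15_000)
worst_case = posts_before_checkpoint(60_000)
print(best_case, worst_case)  # prints: 9 2
```

With image-heavy posts, a checkpoint can be due after only two posts, which is why the check runs after every post rather than per batch.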

The Solution: Monitor + Checkpoint + Continue

1. After EACH post, do two things:

a) Write findings to disk immediately:
   /research/{task-id}/findings/post-{n}.md

b) Check context usage:
   session_status → look for "Context: XXXk/200k (YY%)"

2. When context hits 70%, STOP and checkpoint:

Write state file:
/research/{task-id}/state.json
{
  "processed": 15,
  "pendingUrls": ["url16", "url17", ...],
  "summaries": ["Post 1: 火塘...", ...]
}

Return to caller:
{
  "complete": false,
  "processed": 15,
  "remaining": 25,
  "statePath": "/research/{task-id}/state.json",
  "findingsDir": "/research/{task-id}/findings/"
}

3. Caller spawns fresh sub-agent to continue:

sessions_spawn(
  task="Continue XHS research from /research/{task-id}/state.json",
  model="claude-sonnet-4-5-20250929"
)

New sub-agent has fresh 200k context, reads state.json, continues from post 16.
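A minimal sketch of the "check context usage" step, assuming the session_status output contains a line shaped like the "Context: XXXk/200k (YY%)" hint above. The exact output format is not confirmed here, and both function names are hypothetical.

```python
import re
from typing import Optional

# Assumed shape, taken from the hint above, e.g. "Context: 142k/200k (71%)".
_CONTEXT_RE = re.compile(
    r"Context:\s*(\d+(?:\.\d+)?)k/(\d+(?:\.\d+)?)k\s*\((\d+(?:\.\d+)?)%\)"
)


def context_percent(status_text: str) -> Optional[float]:
    """Return the reported context percentage, or None if no such line is found."""
    match = _CONTEXT_RE.search(status_text)
    return float(match.group(3)) if match else None


def should_checkpoint(status_text: str, threshold: float = 70.0) -> bool:
    """True when usage is at or above the threshold, or cannot be read at all."""
    percent = context_percent(status_text)
    return percent is None or percent >= threshold
```

Treating an unreadable status as "checkpoint now" is a design choice of this sketch: stopping early costs one extra spawn, while overflowing the context loses the session.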

State File Schema

{
  "taskId": "kunming-food-2026-02-01",
  "query": "昆明美食",
  "searchesCompleted": ["昆明美食", "昆明美食推荐"],  // Keywords already searched
  "processedUrls": ["url1", "url2", ...],             // Explicit URL tracking (prevents duplicates)
  "pendingUrls": ["url3", "url4", ...],               // Remaining URLs to process
  "nextPostNumber": 16,                                // Next post-XXX.md number
  "summaries": [                                       // 1-liner per post for final synthesis
    "Post 1: 火塘餐厅 | 🟢 | ¥80 | 本地人推荐",
    "Post 2: 野生菌火锅 | 🟢 | ¥120 | 菌子新鲜"
  ],
  "batchNumber": 1,
  "contextCheckpoint": "70%"
}

Critical fields for handoff:

  • processedUrls: Prevents re-processing same post across sub-agents
  • pendingUrls: Exact work remaining
  • nextPostNumber: Ensures sequential file naming
  • searchesCompleted: Prevents duplicate searches
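One way the handoff fields could be maintained is sketched below. The helper names are hypothetical; only the field names come from the schema above, and the temp-file write is an added precaution rather than something the skill prescribes.

```python
import json
from pathlib import Path


def merge_pending(state: dict, found_urls: list) -> dict:
    """Queue newly found URLs, skipping any already processed or already queued."""
    seen = set(state["processedUrls"]) | set(state["pendingUrls"])
    for url in found_urls:
        if url not in seen:
            state["pendingUrls"].append(url)
            seen.add(url)
    return state


def mark_processed(state: dict, url: str, summary: str) -> dict:
    """Move one URL from pending to processed and advance the file counter."""
    state["pendingUrls"].remove(url)
    state["processedUrls"].append(url)
    state["summaries"].append(summary)
    state["nextPostNumber"] += 1
    return state


def save_state(state: dict, path: Path) -> None:
    """Write to a temp file first so a crash never leaves a half-written state.json."""
    tmp = path.with_suffix(".json.tmp")
    tmp.write_text(json.dumps(state, ensure_ascii=False, indent=2), encoding="utf-8")
    tmp.replace(path)
```

Calling save_state after every mark_processed keeps state.json in step with the findings directory, so a continuation sub-agent never repeats a post.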

Workflow for Large Research

Caller should use longer timeout:

sessions_spawn(
  task="...",
  model="claude-sonnet-4-5-20250929",
  runTimeoutSeconds=1800  // 30 minutes for research tasks
)

Default is 600s (10 min) — too short for XHS research with slow API calls.

Interleave search and processing (don't collect all URLs first):

[XHS Sub-agent 1]
    ├── Check for state.json (none = fresh start)
    ├── Search keyword 1 → get 20 URLs
    ├── Process 5-10 posts immediately (writing each to disk)
    ├── Search keyword 2 → get more URLs (dedupe)
    ├── Process more posts
    ├── Context hits 70% → write state.json
    └── Return {complete: false, remaining: N}

This keeps a timeout from wiping out all work: each post is saved as soon as it is processed.

Full continuation pattern:

[Caller]
    ↓ spawn (runTimeoutSeconds=1800)
[XHS Sub-agent 1]
    ├── Search + process interleaved
    ├── Context hits 70% → write state.json
    └── Return {complete: false, remaining: 25}
    
[Caller sees incomplete]
    ↓ spawn continuation (runTimeoutSeconds=1800)
[XHS Sub-agent 2]  ← fresh 200k context!
    ├── Read state.json (has processedUrls, pendingUrls)
    ├── Continue processing + more searches if needed
    ├── Context hits 70% → write state.json
    └── Return {complete: false, remaining: 10}
    
[Caller sees incomplete]
    ↓ spawn continuation
[XHS Sub-agent 3]
    ├── Read state.json
    ├── Process remaining posts
    ├── All done → write synthesis.md
    └── Return {complete: true, synthesisPath: "..."}
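From the caller's side, the continuation pattern above amounts to a loop. The sketch below assumes a hypothetical `spawn` callable that wraps the platform's spawn tool and returns the structured result as a dict; `max_rounds` is an added safety limit, not something the skill defines.

```python
from typing import Callable


def run_until_complete(spawn: Callable[[str], dict],
                       task_id: str,
                       max_rounds: int = 10) -> dict:
    """Spawn sub-agents until one reports complete, or max_rounds is exhausted."""
    task = f"XHS research for task {task_id}"
    result: dict = {"complete": False}
    for _ in range(max_rounds):
        result = spawn(task)
        if result.get("complete"):
            break
        # Hand the next sub-agent the checkpoint written by the previous one.
        task = f"Continue XHS research from {result['statePath']}"
    return result
```

If the loop ends with "complete" still false, the caller can still work from the findings directory, since every processed post is already on disk.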

Output Directory Structure

/research/{task-id}/
├── state.json              # Checkpoint for continuation
├── findings/
│   ├── post-001.md         # Full analysis + image paths
│   ├── post-002.md
│   └── ...
├── images/
│   ├── post-001/
│   │   ├── 1.jpg
│   │   └── 2.jpg
│   └── ...
├── summaries.md            # All 1-liners (for quick scan)
└── synthesis.md            # Final output (when complete)
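A small sketch of how that layout and the zero-padded names could be generated. The helper names are hypothetical, and the three-digit padding is inferred from the post-001 entries in the tree.

```python
from pathlib import Path


def init_task_dir(root: Path, task_id: str) -> Path:
    """Create the findings/ and images/ folders for one research task."""
    task_dir = root / task_id
    (task_dir / "findings").mkdir(parents=True, exist_ok=True)
    (task_dir / "images").mkdir(parents=True, exist_ok=True)
    return task_dir


def finding_path(task_dir: Path, post_number: int) -> Path:
    """Zero-padded file name, e.g. post 7 -> findings/post-007.md."""
    return task_dir / "findings" / f"post-{post_number:03d}.md"


def image_dir(task_dir: Path, post_number: int) -> Path:
    """Per-post image folder, e.g. post 7 -> images/post-007."""
    return task_dir / "images" / f"post-{post_number:03d}"
```

Passing nextPostNumber from state.json into finding_path keeps file names sequential across sub-agents.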

Key Rules (ALWAYS FOLLOW)

  1. Write after EVERY post — crash-safe, no work lost
  2. Check context after EVERY post — use session_status tool
  3. Stop at 70% — leave room for synthesis + buffer
  4. Return structured result — caller decides next step
  5. Read all images — they're pre-compressed (600px, q85)
  6. Skip videos — already marked in fetch-post

⚠️ This is not optional. Even small research can overflow context with image-heavy posts.


Scripts (Mechanical Tasks)

These scripts handle the repetitive CLI work:

| Script | Purpose |
|--------|---------|
| bin/preflight | Verify tool is working before research |
| bin/search "keywords" [limit] [timeout] [sort] | Search for posts (sort: general/newest/hot) |
| bin/get-content "url" | Get full note content (text only) |
| bin/get-comments "url" | Get comments on a note |
| bin/get-images "url" [dir] | Download images only |
| bin/fetch-post "url" [cache] [retries] | Fetch content + comments + images (with retries) |

All scripts are at /root/clawd/skills/xhs/bin/

Preflight (always run first)

/root/clawd/skills/xhs/bin/preflight

Checks: rednote-mcp installed, cookies valid, stealth patches, test search. Don't proceed until preflight passes.

Search

/root/clawd/skills/xhs/bin/search "昆明美食推荐" [limit] [timeout] [sort]

Returns JSON with post results.

Parameters:

| Param | Default | Description |
|-------|---------|-------------|
| keywords | (required) | Search terms in Chinese |
| limit | 10 | Max results (scroll pagination when >20) |
| timeout | 180 | Seconds before giving up |
| sort | general | Sort order (see below) |

Sort options:

| Value | XHS Label | When to use |
|-------|-----------|-------------|
| general | 综合 | Default. XHS algorithm balances relevance + engagement. Best for most research. |
| newest | 最新 | Sentiment monitoring (舆情监控), breaking news, recent experiences, time-sensitive topics |
| hot | 最热 | Finding viral/popular posts, trending content |

Examples:

# Default sort (recommended for most research)
bin/search "昆明美食推荐" 20

# Recent posts first (舆情, current events)
bin/search "某品牌 评价" 20 180 newest

# Most popular posts
bin/search "网红打卡地" 15 180 hot

Scroll pagination enabled (patched): When limit > 20, the tool scrolls to load more results via XHS infinite scroll. Actual results depend on available content.

For maximum coverage, combine:

  1. Higher limits (e.g., limit=50) to scroll for more
  2. Multiple keyword variations for different result sets:
    • 香蕉攀岩, 香蕉攀岩馆, 香蕉攀岩体验, 香蕉攀岩评价
    • 昆明美食, 昆明美食推荐, 昆明必吃, 昆明本地人推荐

Results vary by query — popular topics may return 30-50+, niche topics fewer.

Choosing sort order:

  • Most research → general (default). Let XHS's algorithm surface the best content.
  • 舆情监控 / sentiment tracking → newest. You want recent opinions, not old viral posts.
  • Trend discovery → hot. See what's currently popular.
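When driving bin/search from a script, the positional arguments can be assembled as below. This is a sketch: the script path, argument order, defaults, and sort values come from this document, while the function name and the validation step are assumptions.

```python
SEARCH_BIN = "/root/clawd/skills/xhs/bin/search"
SORT_OPTIONS = ("general", "newest", "hot")


def build_search_cmd(keywords: str, limit: int = 10,
                     timeout: int = 180, sort: str = "general") -> list:
    """Build the argv list for bin/search, rejecting unknown sort values."""
    if sort not in SORT_OPTIONS:
        raise ValueError(f"sort must be one of {SORT_OPTIONS}, got {sort!r}")
    # Arguments are positional: keywords, limit, timeout, sort.
    return [SEARCH_BIN, keywords, str(limit), str(timeout), sort]
```

The returned list can be handed to `subprocess.run(cmd, capture_output=True, text=True)`. Passing a list instead of a shell string avoids quoting problems with Chinese keywords that contain spaces.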

Get Content

/root/clawd/skills/xhs/bin/get-content "FULL_URL_WITH_XSEC_TOKEN"

⚠️ Must use full URL with xsec_token from search results.

Get Comments

/root/clawd/skills/xhs/bin/get-comments "FULL_URL_WITH_XSEC_TOKEN"

Get Images

Download all images from a post to local files:

/root/clawd/skills/xhs/bin/get-images "FULL_URL" /tmp/my-images

Fetch Post (Deep Dive with Images)

Fetch content, comments, and images in one call — with built-in retries:

/root/clawd/skills/xhs/bin/fetch-post "FULL_URL" /path/to/cache [max_retries]

Features:

  • Retries on timeout (60s → 90s → 120s)
  • Clear error reporting in JSON output
  • Images cached locally, bypassing CDN protection

Returns JSON:

{
  "success": true,
  "postId": "abc123",
  "content": { 
    "title": "...", 
    "author": "...", 
    "desc": "...", 
    "likes": "983", 
    "tags": [...],
    "postDate": "2025-09-04"  // ← Added via patch!
  },
  "comments": [{ "author": "...", "content": "...", "likes": "3" }, ...],
  "imagePaths": ["/cache/images/abc123/1.jpg", ...],
  "errors": []
}

Date filtering: Use postDate to filter out old posts. Skip posts older than your threshold (e.g., 6-12 months for restaurants).

Workflow:

1. fetch-post → JSON + cached images
2. Read each imagePath directly (Claude sees images natively)
3. Combine text + comments + what you see into findings

Viewing images:

Read("/path/to/1.jpg")  # Claude sees it directly - no special tool needed

Look for: visible text (addresses, prices, hours), atmosphere, food presentation, crowd levels.
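A hedged sketch of step 1 of that workflow: pulling the useful fields out of the fetch-post result. It assumes the script prints plain JSON with the field names shown above (the "//" note in the sample is an annotation, not part of the output); the function name is hypothetical.

```python
import json


def summarize_fetch(raw_json: str) -> dict:
    """Pick out the fields the workflow above relies on from fetch-post output."""
    data = json.loads(raw_json)
    content = data.get("content") or {}
    likes = content.get("likes")
    return {
        "ok": bool(data.get("success")),
        "errors": data.get("errors", []),
        "title": content.get("title"),
        "postDate": content.get("postDate"),  # may be missing or null
        # Likes arrive as a string; non-numeric values fall back to None.
        "likes": int(likes) if isinstance(likes, str) and likes.isdecimal() else None,
        "imagePaths": data.get("imagePaths", []),
        "commentCount": len(data.get("comments", [])),
    }
```

Each entry in imagePaths is then read directly, as described above, and combined with the text and comments when writing findings.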


Research Methodology (Judgment Tasks)

This is where you think. Scripts do the fetching; you do the analyzing.

Depth Levels

| Depth | Posts | When to Use |
|-------|-------|-------------|
| Minimum | 5+ | Quick checks, simple queries |
| Standard | 8-10 | Default for most research |
| Deep | 15+ | Complex topics, trip planning |

Minimum is 5 — unless fewer exist. Note limited coverage if <5 results.

Research Workflow

Step 0: Preflight

Run bin/preflight. Don't proceed until it passes.

Step 1: Plan Your Searches

Think: "What would a Chinese user search on 小红书?"

  • Include location when relevant
  • Add qualifiers: 推荐, 攻略, 测评, 探店, 打卡, 避坑
  • Consider synonyms and variations
  • Plan 2-3 different search angles

Date filtering: Posts include postDate field (e.g., "2025-09-04"). The calling agent specifies the date filter based on research type:

| Research Type | Suggested Filter | Why |
|---------------|------------------|-----|
| 舆情监控 (sentiment) | 1-4 weeks | Only current discourse matters |
| Breaking news/events | 1-7 days | Time-critical |
| Travel planning | 6-12 months | Recent but reasonable window |
| Product reviews | 1-2 years | Longer product cycles |
| Trend analysis | Custom range | Compare specific periods |
| Historical/general | No limit | Want the full archive |

Caller should specify in task description, e.g.:

  • "Only posts from last 30 days" (舆情)
  • "Posts from 2025 or later" (travel)
  • "No date filter" (general research)

If no filter specified: Default to 12 months (safe middle ground).

Fallback when postDate is null: Add recency hints to the search keywords, such as 2025, 最近 (recently), 最新 (latest).
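The date rules above could be applied as in this sketch. It assumes postDate is an ISO "YYYY-MM-DD" string as in the example, approximates "12 months" as 365 days, and returns None for a missing date so the caller can fall back to other recency signals. The function name is illustrative.

```python
from datetime import date, timedelta
from typing import Optional

DEFAULT_MAX_AGE_DAYS = 365  # "12 months" approximated as 365 days


def within_window(post_date: Optional[str], today: date,
                  max_age_days: Optional[int] = DEFAULT_MAX_AGE_DAYS) -> Optional[bool]:
    """True/False when postDate is known; None when it is missing (caller decides)."""
    if max_age_days is None:  # "No date filter"
        return True
    if not post_date:
        return None
    posted = date.fromisoformat(post_date)  # expects "YYYY-MM-DD"
    return today - posted <= timedelta(days=max_age_days)
```

For a 舆情 task limited to the last 30 days, the caller would pass `max_age_days=30`; for archive-style research, `max_age_days=None`.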

Language strategy:

| Location | Language | Example |
|----------|----------|---------|
| China | Chinese | 昆明攀岩 |
| English-named venues | Both | Rock Tenet 昆明 |
| International | Chinese | 巴黎旅游 |

Step 2: Search & Scan

Run your searches. Results are already ranked by XHS's algorithm (relevance + engagement).

Use judgment based on preview — like a human deciding what to click:

Think: "Given my research goal, would this post likely contain useful information?"

| Research Type | What to prioritize |
|---------------|--------------------|
| 舆情监控 (sentiment) | Any opinion/experience, even low engagement (complaints matter!) |
| Travel planning | High engagement + detailed experiences |
| Product reviews | Mix of positive AND negative reviews |
| Trend analysis | Variety of perspectives |

| Preview Signal | Action |
|----------------|--------|
| Relevant content in preview | ✅ Fetch |
| Matches research goal | ✅ Fetch |
| Low engagement but relevant opinion | ✅ Fetch (esp. for 舆情) |
| High engagement but off-topic | ❌ Skip |
| Official announcements only | ⚠️ Context-dependent |
| 广告/合作 markers | ⚠️ Note as sponsored if fetching |
| Clearly off-topic | ❌ Skip |
| Duplicate content | ❌ Skip |

Key insight: For 舆情监控, a 3-like complaint post may be more valuable than a 500-like promotional post. Engagement ≠ relevance for all research types.

Step 3: Deep Dive Each Post

For each selected post, use fetch-post to get everything:

bin/fetch-post "url_from_search" {{RESEARCH_DIR}}/xhs

Returns JSON with content, comments, and cached images. Has built-in retries. Then:

A. Review content

  • Extract key facts from title/description
  • Note author's perspective/bias
  • Check tags for categorization

B. View images (critical!)

For each imagePath in the result, just read it:

Read("/path/to/1.jpg")  # You see it directly

  • Look for text overlays: addresses, prices, hours
  • Note visual details: ambiance, crowd levels, food presentation

⚠️ Don't describe images in isolation. Synthesize what you see with the post content and comments to form a holistic view. An image of a crowded restaurant + author saying "周末排队1小时" (an hour's queue on weekends) + comments confirming "人超多" (really crowded) = that's your finding about crowds.

C. Review comments (gold for updates)

  • "已经关门了" = already closed
  • Real experiences vs sponsored hype
  • Tips not in main post

D. Return picked images

Include paths to the best/most informative images in your findings. The calling agent decides whether and how to use them (embed in reports, reference, etc.). You're curating: pick images that show something useful (venue exterior, menu with prices, actual food, atmosphere), not just decorative shots.

Step 4: Synthesize

  • What do multiple sources agree on?
  • Any contradictions?
  • What's the overall consensus?
  • What would you actually recommend?

Step 5: Output

Facts + Flavor — structured findings that preserve the XHS voice.

## XHS Research: [Topic]

### Search Summary
| Search | Results | Notes |
|--------|---------|-------|
| 昆明攀岩 | 10 | Good coverage |

### Findings

#### [Venue Name] (中文名)
- **Type:** Restaurant / Activity / Attraction
- **Address:** [from post or image]
- **Price:** ¥XX/person
- **Hours:** [if found]
- **The vibe:** [atmosphere, energy — preserved voice]
- **Why people like it:** [opinions, impressions]
- **Watch out for:** [warnings from comments]
- **Source:** [full URL]
- **Engagement:** X likes
- **Images:** [paths for calling agent to use]
  - `/path/to/1.jpg` — exterior/entrance
  - `/path/to/3.jpg` — menu with prices

> "引用原文..." — @username

### Overall Impressions
- Consensus across posts
- Patterns in preferences
- Things only locals know
- Disagreements worth noting

The XHS value is the human perspective. A recommendation that says "环境一般但是味道绝了" (so-so ambience, but the food is incredible) tells you more than "Rating: 4.2/5".

Think: "What would a friend who just spent an hour on XHS tell me?"


Quality Signals

Trustworthy:

  • 100+ likes with real comments
  • Detailed personal experience
  • Multiple photos from actual visit
  • Specific details (prices, hours)
  • Recent posts (look for date mentions in content: "上周", "昨天", "2025年X月")
  • Year in title (e.g., "2025上海咖啡必喝榜")

Checking recency:

  • Look for dates in post text/title
  • Check if prices seem current
  • Comments asking "还在吗" (is it still there?) or "现在还有吗" (is it still available?) = might be outdated
  • Comments with recent dates confirm post is still relevant

Suspicious:

  • 广告/合作/赞助 markers
  • Overly positive, no specifics
  • Stock photos only
  • No comments or generic ones
  • Very old posts

Timing & Efficiency

XHS is SLOW — Plan Accordingly

The rednote-mcp CLI is slow (30-90s per search). Don't rapid-fire poll.

When running searches via exec:

# GOOD: Give it time to complete
exec(command, yieldMs: 60000)  # Wait 60s before checking
process(poll)  # Then poll every 30s if still running

DON'T:

  • Poll every 2-3 seconds (wastes tokens, no benefit)
  • Start multiple searches simultaneously (overloads)
  • Wait indefinitely without writing partial results

Write Incrementally

Don't wait until you've analyzed everything to start writing. After each batch of 3-5 posts:

  • Append findings to your output file
  • This protects against timeout/termination losing all work

## Findings (in progress)

### Batch 1: 美食搜索 (3 posts analyzed)
[findings...]

### Batch 2: 攻略搜索 (analyzing...)

Time Budget Awareness

If you've been running 15+ minutes:

  • Prioritize writing what you have
  • Note incomplete searches in output
  • Better to deliver 80% findings than lose 100% to termination

Retry Pattern

rednote-mcp is slow. If a command times out:

Attempt 1: default timeout
Attempt 2: +60s
Attempt 3: +120s

If all fail, report the failure. Do NOT fall back to web_search — defeats the purpose.
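The retry schedule can be sketched as follows. It reads "+60s" and "+120s" as offsets from the default timeout, which is an interpretation, and `run` is a hypothetical callable that executes one command with the given timeout and returns None on timeout.

```python
from typing import Callable, List, Optional


def retry_timeouts(default_timeout: int) -> List[int]:
    """Timeouts for attempts 1-3: default, default + 60s, default + 120s."""
    return [default_timeout, default_timeout + 60, default_timeout + 120]


def run_with_retries(run: Callable[[int], Optional[str]],
                     default_timeout: int = 180) -> dict:
    """Call run(timeout) up to three times; run returns None to signal a timeout."""
    attempts = []
    for timeout in retry_timeouts(default_timeout):
        attempts.append(timeout)
        output = run(timeout)
        if output is not None:
            return {"success": True, "output": output, "timeoutsTried": attempts}
    # All attempts failed: report it rather than switching to another source.
    return {"success": False, "output": None, "timeoutsTried": attempts}
```

The failure result lists the timeouts tried, so the report to the caller shows what was attempted.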


Error Handling

| Error | Cause | Fix |
|-------|-------|-----|
| Timeout | Network/XHS slow | Retry with longer timeout |
| Login/cookie error | Session expired | xvfb-run -a rednote-mcp init |
| 404 / xsec_token | Missing token | Use full URL from search |
| Empty results | No posts | Try different keywords |

Setup & Maintenance

First-Time Setup

npm install -g rednote-mcp
npx playwright install
/root/clawd/skills/xhs/patches/apply-all.sh
xvfb-run -a rednote-mcp init

Re-login (when cookies expire)

xvfb-run -a rednote-mcp init

After rednote-mcp updates

/root/clawd/skills/xhs/patches/apply-all.sh

Role Clarification

This skill = Research tool that outputs structured findings

Calling agent = Synthesizes XHS + other sources into final reports, decides which images to embed

You return:

  • Synthesized findings (text + images + comments → holistic view)
  • Curated image paths (calling agent decides how to use them)
  • Preserved human voice (opinions, vibes, tips)

You don't:

  • Describe images in isolation ("I see a restaurant...")
  • Generate final reports (that's the caller's job)
  • Decide image layout/placement

XHS is like having a Chinese-speaking friend spend an hour researching for you. They'd give you facts, but also opinions, vibes, and insider tips. That's what you're capturing.


Remember: Research like a curious human. Explore, cross-reference, look at pictures, read comments. The "这家真的绝了" (this place is truly amazing) matters as much as the address.

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
