book-scout

Expert book recommendation engine via web search. Finds high-quality books (Douban ≥7.5 or Goodreads ≥3.8) based on topic, with deduplication and comprehensive scoring. Use when you need to recommend books for reading tasks, skill building, or research.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Install skill "book-scout" with this command: npx skills add kedoupi/book-scout

Book Scout

Expert book recommendation engine that finds high-quality books via web search.

When to Use

  • Recommending books for a specific topic (e.g., "user growth", "decision science")
  • Finding books for reading tasks (morning/noon/evening reading reports)
  • Building a reading list for skill development
  • Avoiding books that have already been analyzed

Input

  • topic (required): Subject/theme (e.g., "用户增长" (user growth), "决策科学" (decision science), "AI技术" (AI technology))
  • used_models (optional): Array of book title strings to exclude (e.g., ["《精益创业》", "《从0到1》"], i.e. The Lean Startup and Zero to One)

Output

JSON object with the highest-scoring book:

{
  "book_title": "Book title",
  "author": "Author",
  "author_nationality": "Nationality, or '未知' (unknown)",
  "publish_date": "YYYY-MM or YYYY",
  "rating": 8.9,
  "review_count": 15000,
  "score": 112.08,
  "summary": "Core summary in about 100 characters",
  "reasoning": "Reason for the recommendation"
}

Core Workflow (Two-Phase Search)

Phase 1: Discover Book Titles

Goal: Get a list of 5-8 candidate book names. Do NOT try to get ratings here.

Search Queries (execute 2-3 queries in parallel):

| Query Type | Template | Example |
|---|---|---|
| Chinese book lists | "{topic} 经典书籍推荐 书单" | "用户增长 经典书籍推荐 书单" |
| English book lists | "{topic_en} best books goodreads" | "user growth best books goodreads" |
| Community picks | "{topic} 必读书 知乎推荐" | "用户增长 必读书 知乎推荐" |

Extract: Collect book titles + authors from search results. Ignore ratings at this stage.

Deduplicate immediately: Compare against used_models — remove any matches.

Minimum: Need at least 3 candidate books after dedup. If fewer, broaden the topic and search again.
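
The query templates and the deduplication step above can be sketched as follows. This is a minimal illustration, not the skill's actual code: the function names `build_queries`, `normalize_title`, and `dedupe` are hypothetical, and the normalization (stripping 《》 brackets, whitespace, and case) is an assumption about how titles should be matched.

```python
def build_queries(topic: str, topic_en: str) -> list[str]:
    """Fill the three Phase 1 query templates for one topic."""
    return [
        f"{topic} 经典书籍推荐 书单",
        f"{topic_en} best books goodreads",
        f"{topic} 必读书 知乎推荐",
    ]


def normalize_title(title: str) -> str:
    """Strip 《》 brackets and whitespace, then lowercase (assumed matching rule)."""
    return title.strip().strip("《》").strip().lower()


def dedupe(candidates: list[dict], used_models: list[str]) -> list[dict]:
    """Drop candidates listed in used_models, plus repeated titles."""
    seen = {normalize_title(t) for t in used_models}
    kept = []
    for book in candidates:
        key = normalize_title(book["book_title"])
        if key not in seen:
            seen.add(key)
            kept.append(book)
    return kept
```

If `len(dedupe(...))` comes back below 3, the caller would broaden the topic and run the queries again.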

Phase 2: Get Ratings (Per-Book Lookup)

Goal: Get accurate rating + review_count for each candidate.

Strategy (try in order, stop at first success):

Method A: WebFetch Douban Page (Preferred)

For each candidate book, search for its Douban page then fetch it:

  1. web_search: "{book_title}" site:book.douban.com
  2. If a book.douban.com/subject/ URL is found → web_fetch that URL
  3. Extract: rating, review_count, publish_date, author from the page

Why this works: Douban book pages have structured rating data that WebFetch can reliably parse.

Method B: Direct Search (Fallback)

If Method A fails (no Douban URL found, or WebFetch blocked):

  • web_search: "{book_title}" "{author}" 豆瓣评分 评价人数
  • Extract rating and review_count from search snippets

Method C: Goodreads Lookup (For English Books)

  • web_search: "{book_title}" "{author}" site:goodreads.com
  • If URL found → web_fetch the Goodreads page
  • Extract rating and ratings_count

Important Rules:

  • Each book gets its OWN individual lookup — never combine multiple books into one query
  • Each book gets up to 2 attempts (e.g., Method A fails → try Method B)
  • Process books in parallel when possible
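
The "try in order, stop at first success, at most 2 attempts" rule can be sketched like this. The lookup methods are passed in as plain callables because `web_search` and `web_fetch` are agent tools rather than Python functions; the names `lookup_rating` and `lookup_all`, and the choice of a thread pool for parallelism, are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Optional

# A lookup method takes a candidate book and returns rating fields, or None.
Lookup = Callable[[dict], Optional[dict]]


def lookup_rating(book: dict, methods: list[Lookup], max_attempts: int = 2) -> Optional[dict]:
    """Try lookup methods in order; stop at the first one that yields a rating."""
    for method in methods[:max_attempts]:
        try:
            result = method(book)
        except Exception:
            result = None  # a blocked fetch or parse error counts as a failed attempt
        if result and result.get("rating") is not None:
            return {**book, **result}
    return None  # left for the missing-data rules to handle


def lookup_all(books: list[dict], methods: list[Lookup]) -> list[Optional[dict]]:
    """Run one independent lookup per book, in parallel, preserving input order."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(lambda b: lookup_rating(b, methods), books))
```

Each book is looked up on its own, so a failure for one title never affects another.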

Phase 2.5: Handle Missing Data

After Phase 2, some books may still lack ratings. Apply these rules:

| Missing Field | Action |
|---|---|
| rating missing after 2 attempts | Use LLM estimate from search context (mark as "rating_source": "estimated"). If no context at all, drop the book. |
| review_count missing | Default to 500 (neutral: neither penalized nor boosted) |
| publish_date missing | Default to 2020 |
| author_nationality missing | Output "未知" (NEVER fabricate) |

LLM Estimation Rule: If multiple search results consistently describe a book as "高分" / "经典" / "highly rated" but no exact number is found, estimate conservatively (7.5-8.0 for Chinese, 3.8-4.0 for English). Always mark estimated ratings.
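
A minimal sketch of these missing-data rules, assuming each book is a plain dict with the output field names. The function name `fill_missing` and the `estimated_rating` parameter (the LLM's conservative estimate, or `None` when there is no context) are hypothetical.

```python
from typing import Optional


def fill_missing(book: dict, estimated_rating: Optional[float] = None) -> Optional[dict]:
    """Apply the Phase 2.5 defaults; return None when the book must be dropped."""
    filled = dict(book)
    if filled.get("rating") is None:
        if estimated_rating is None:
            return None  # no rating and no context to estimate from
        filled["rating"] = estimated_rating
        filled["rating_source"] = "estimated"  # always mark estimates
    if filled.get("review_count") is None:
        filled["review_count"] = 500
    if not filled.get("publish_date"):
        filled["publish_date"] = "2020"
    if not filled.get("author_nationality"):
        filled["author_nationality"] = "未知"  # never fabricated
    return filled
```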

Phase 3: 3D Scoring Algorithm

Action: Collect ALL surviving candidate books into a single JSON array. Pass this entire array to scripts/score_books.py via stdin for batch scoring. The script returns sorted results.

(If script unavailable, calculate manually using the formula below.)

Formula:

Total Score = (Base Quality + Popularity Bonus) × Recency Multiplier

A. Base Quality:

Base = rating × 10
If review_count < 100: Base = Base × 0.8 (small sample penalty)

B. Popularity Bonus:

Bonus = log₁₀(review_count) × 2

C. Recency Multiplier (based on publish_date):

Published within the last 2 years (2024-now):  × 1.2
Published 3-5 years ago (2021-2023):           × 1.0
Published more than 5 years ago (≤2020):       × 0.8

Example:

《增长黑客》: rating=8.5, review_count=10000, publish=2015
Base = 8.5 × 10 = 85
Bonus = log₁₀(10000) × 2 = 8
Recency = 0.8
Total = (85 + 8) × 0.8 = 74.4
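
For manual calculation, the formula can be sketched in Python as below. This is not the contents of scripts/score_books.py, which is not shown here. The `current_year` parameter, the age cut-offs derived from it, the guard against `review_count` of 0, and the rounding to 2 decimals are all assumptions; the year ranges listed above line up with these cut-offs when `current_year` is 2026.

```python
import math


def recency_multiplier(publish_date: str, current_year: int) -> float:
    """Map a 'YYYY' or 'YYYY-MM' publish date to the recency multiplier."""
    age = current_year - int(publish_date[:4])
    if age <= 2:
        return 1.2
    if age <= 5:
        return 1.0
    return 0.8


def score_book(rating: float, review_count: int, publish_date: str, current_year: int) -> float:
    """Total Score = (Base Quality + Popularity Bonus) x Recency Multiplier."""
    base = rating * 10
    if review_count < 100:
        base *= 0.8  # small sample penalty
    # max(..., 1) avoids log10(0); that guard is an assumption, not from the formula
    bonus = math.log10(max(review_count, 1)) * 2
    return round((base + bonus) * recency_multiplier(publish_date, current_year), 2)


print(score_book(8.5, 10000, "2015", current_year=2026))  # 74.4
```

The printed value reproduces the 《增长黑客》 example above.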

Phase 4: Output

Return the highest-scoring book in the structured JSON format.

Reasoning field must include: score justification, recency consideration, author background (if known).

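If rating_source is "estimated", add this note: "注意:评分为根据多源信息估算,非精确数据" (in English: "Note: this rating is estimated from multiple sources and is not an exact figure.")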

Quality Filters

Minimum Standards:

  • Douban rating ≥ 7.5 OR Goodreads rating ≥ 3.8
  • Estimated ratings: apply the same thresholds

Exclusions:

  • Books with "21天" (21 days), "速成" (crash course), or "一本通" (all-in-one guide) in the title
  • Marketing-heavy books with no substance
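
The mechanical part of these filters can be sketched as a single predicate. The `rating_platform` field is hypothetical (the output format does not define it) and is used here only to pick the right threshold; judging whether a book is "marketing-heavy" is left to the model and is not covered.

```python
BANNED_TITLE_WORDS = ("21天", "速成", "一本通")


def passes_quality_filters(book: dict) -> bool:
    """Check the title exclusions and the minimum rating for the book's platform."""
    if any(word in book["book_title"] for word in BANNED_TITLE_WORDS):
        return False
    # rating_platform is an assumed field; Douban is taken as the default
    platform = book.get("rating_platform", "douban")
    threshold = 7.5 if platform == "douban" else 3.8
    return book["rating"] >= threshold
```

Estimated ratings go through the same predicate, since the same thresholds apply to them.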

Fallback & Error Handling

Scenario 1: Web Search Failure

  • Retry once after 2-3 seconds
  • If still fails, try alternative query phrasing
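  • After 3 total failures, return this error (in English: "The network connection timed out 3 times in a row, so the latest book list data could not be retrieved. Please try again later."):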
{
  "error": "网络连接连续 3 次超时,无法获取最新书单数据,请稍后重试。"
}
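
The retry sequence for Scenario 1 can be sketched as follows. The `search` callable stands in for the web_search tool, and the names `search_with_retry`, `alt_query`, and `NETWORK_ERROR` are hypothetical; the 2.5-second pause is one value picked from the 2-3 second range.

```python
import time
from typing import Callable

NETWORK_ERROR = {"error": "网络连接连续 3 次超时,无法获取最新书单数据,请稍后重试。"}


def search_with_retry(
    search: Callable[[str], list],
    query: str,
    alt_query: str,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Attempt 1: query. Attempt 2: same query after a pause. Attempt 3: alt phrasing."""
    for attempt, q in enumerate((query, query, alt_query)):
        try:
            return {"results": search(q)}
        except Exception:
            if attempt == 0:
                sleep(2.5)  # wait before the single retry of the original query
    return NETWORK_ERROR  # three failures in total
```

Passing `sleep` as a parameter keeps the function easy to exercise without real delays.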

Scenario 2: Topic Too Niche

  • Broaden search: remove professional jargon, use parent category
  • Example: "认知负荷理论" (cognitive load theory) → "认知心理学 经典书籍" (cognitive psychology classics)

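If the broadened search also fails, return this error (in English: "No classic books with enough review data were found for this topic. Try a broader topic or a general industry term."):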

{
  "error": "该主题下未找到具备足够评价数据的经典书籍,请尝试更换更宽泛的主题或行业大词。"
}

Scenario 3: All Candidates Dropped

If after Phase 2.5 no books survive:

  • Return to Phase 1 with broader topic
  • Temporarily lower the quality filter to Douban ≥ 7.0 / Goodreads ≥ 3.5
  • If still nothing, return the best estimated candidate with a warning

Implementation Notes

  • Phase 1 (discover): pure web_search, focus on book list articles
  • Phase 2 (ratings): web_search + web_fetch combo, target Douban/Goodreads pages
  • Phase 3 (scoring): scripts/score_books.py (deterministic)
  • Parallelism: Phase 1 queries can run in parallel; Phase 2 per-book lookups can run in parallel
  • Prioritize Douban/Goodreads/Zhihu/Reddit sources; ignore ads and promotional content
