local-web-search

Free, private, real-time web search for OpenClaw, with zero API keys required. Powered by self-hosted SearXNG and the Scrapling anti-bot engine. Features: multi-engine parallel search (Bing, DuckDuckGo, Google, Startpage, Qwant), intent-aware Agent Reach query expansion, three-tier page browsing (Fetcher → StealthyFetcher → DynamicFetcher for Cloudflare and JS-heavy sites), cross-engine anti-hallucination validation, and automatic fallback to a public SearXNG instance.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Install skill "local-web-search" with this command: npx skills add wd041216-bit/openclaw-free-web-search

Local Free Web Search v3.0

Use this skill when the user needs current or real-time web information. Powered by Scrapling (anti-bot) + SearXNG (self-hosted search). Zero API keys. Zero cost. Runs entirely locally.


External Endpoints

Endpoint                           Data Sent                  Purpose
http://127.0.0.1:18080 (local)     Search query string only   Local SearXNG instance
https://searx.be (fallback only)   Search query string only   Public fallback when local SearXNG is down
Any URL passed to browse_page.py   HTTP GET request only      Fetch page content for reading

No personal data, credentials, or conversation history is ever sent to any endpoint.
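The local-first, public-fallback behaviour described in the table can be sketched as follows. This is an illustration only, not the skill's actual code: it assumes the instance exposes SearXNG's JSON search API (`/search?q=...&format=json`), which must be enabled in the instance settings and is often disabled on public instances. The function names `build_search_url`, `http_get_json`, and `search_with_fallback` are hypothetical.

```python
import json
import urllib.parse
import urllib.request

LOCAL = "http://127.0.0.1:18080"   # local SearXNG instance from the table above
FALLBACK = "https://searx.be"      # public fallback from the table above


def build_search_url(base_url, query):
    """Build a search URL that carries only the query string and output format."""
    params = urllib.parse.urlencode({"q": query, "format": "json"})
    return f"{base_url}/search?{params}"


def http_get_json(url, timeout=10):
    """Plain HTTP GET; returns the decoded JSON body."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def search_with_fallback(query, fetch=http_get_json, endpoints=(LOCAL, FALLBACK)):
    """Try each endpoint in order and return (endpoint_used, payload)."""
    last_error = None
    for base in endpoints:
        try:
            return base, fetch(build_search_url(base, query))
        except (OSError, ValueError) as exc:  # refused connection, timeout, bad JSON
            last_error = exc
    # Nothing answered: surface the failure instead of inventing results.
    raise RuntimeError(f"all search endpoints failed: {last_error}")
```

The `fetch` parameter is injectable so the fallback order can be exercised without network access.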


Security & Privacy

  • All search queries go to your local SearXNG instance by default — no third-party tracking
  • Public fallback (searx.be) is only used when local service is unavailable, and only receives the raw query string
  • browse_page.py makes standard HTTP GET requests to URLs you explicitly pass — no data is posted
  • Scrapling runs entirely locally — no cloud API calls, no telemetry
  • No API keys required or stored
  • No conversation history or personal data leaves your machine

Trust Statement: This skill sends search queries to your local SearXNG instance (default) or searx.be (fallback). Page content is fetched via standard HTTP GET. No personal data is transmitted. Only install if you trust the public SearXNG instance at searx.be as a fallback.


Model Invocation Note

This skill is invoked autonomously by the agent when a query requires live web information. You can disable autonomous invocation by removing this skill from your workspace. The agent will only use this skill when it determines real-time information is needed.


Tool 1 — Web Search

python3 ~/.openclaw/workspace/skills/local-web-search/scripts/search_local_web.py \
  --query "YOUR QUERY" \
  --intent general \
  --limit 5

Intent options (controls engine selection + query expansion):

Intent       Best for
general      Default, mixed queries
factual      Facts, definitions, official docs
news         Latest events, breaking news
research     Papers, GitHub, technical depth
tutorial     How-to guides, code examples
comparison   A vs B, pros/cons
privacy      Sensitive queries (ddg/startpage/qwant only)

Additional flags:

Flag                                   Description
--engines bing,duckduckgo,...          Override engine selection
--freshness hour|day|week|month|year   Filter by recency
--max-age-days N                       Downrank results older than N days
--browse                               Auto-fetch top result with browse_page.py
--no-expand                            Disable Agent Reach query expansion
--json                                 Machine-readable JSON output
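When the search script is called from another program, the flags above map naturally onto an argument list. The wrapper below is a minimal sketch that uses only the flags and intent values documented in this section; `build_search_command` and `run_search` are hypothetical helper names, and the shape of the `--json` output is not documented here, so the sketch returns raw stdout and does not parse it.

```python
import subprocess
from pathlib import Path

# Script path as given in the Tool 1 command above.
SCRIPT = Path.home() / ".openclaw/workspace/skills/local-web-search/scripts/search_local_web.py"

INTENTS = {"general", "factual", "news", "research", "tutorial", "comparison", "privacy"}
FRESHNESS = {"hour", "day", "week", "month", "year"}


def build_search_command(query, intent="general", limit=5, engines=None,
                         freshness=None, max_age_days=None, browse=False,
                         expand=True, as_json=False):
    """Translate keyword arguments into the documented command-line flags."""
    if intent not in INTENTS:
        raise ValueError(f"unknown intent: {intent}")
    if freshness is not None and freshness not in FRESHNESS:
        raise ValueError(f"unknown freshness: {freshness}")
    cmd = ["python3", str(SCRIPT), "--query", query,
           "--intent", intent, "--limit", str(limit)]
    if engines:
        cmd += ["--engines", ",".join(engines)]
    if freshness:
        cmd += ["--freshness", freshness]
    if max_age_days is not None:
        cmd += ["--max-age-days", str(max_age_days)]
    if browse:
        cmd.append("--browse")
    if not expand:
        cmd.append("--no-expand")
    if as_json:
        cmd.append("--json")
    return cmd


def run_search(query, **options):
    """Run the script and return its stdout (assumes the skill is installed)."""
    result = subprocess.run(build_search_command(query, **options),
                            capture_output=True, text=True, check=True)
    return result.stdout
```

Passing the arguments as a list avoids shell quoting problems when the query contains spaces or quotes.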

Tool 2 — Browse/Viewing (read full page)

python3 ~/.openclaw/workspace/skills/local-web-search/scripts/browse_page.py \
  --url "https://example.com/article" \
  --max-words 600

Fetcher modes (use --mode flag):

Mode      Fetcher           Use case
auto      Tier 1 → 2 → 3    Default (tries the fast tier first)
fast      Fetcher           Normal sites
stealth   StealthyFetcher   Cloudflare / anti-bot sites
dynamic   DynamicFetcher    Heavy JS / SPA sites

Returns: title, published date, word count, confidence (HIGH/MEDIUM/LOW), full extracted text, and anti-hallucination advisory.
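The tier escalation behind the auto mode can be pictured roughly as below. This is a generic sketch, not the script's implementation: each tier is passed in as a plain callable, so no particular Scrapling API is assumed, and the `is_usable` check is a hypothetical stand-in for whatever blocked-page detection the real script performs.

```python
def fetch_with_escalation(url, tiers, is_usable=lambda text: bool(text and text.strip())):
    """Try each (name, fetch) tier in order.

    Returns (tier_name, text) for the first tier that yields usable content,
    and raises RuntimeError if every tier fails.
    """
    errors = []
    for name, fetch in tiers:
        try:
            text = fetch(url)
        except Exception as exc:  # a failing tier must not stop the escalation
            errors.append(f"{name}: {exc}")
            continue
        if is_usable(text):
            return name, text
        errors.append(f"{name}: empty or blocked response")
    raise RuntimeError("all tiers failed: " + "; ".join(errors))
```

In practice the `tiers` list would hold three entries named fast, stealth, and dynamic, ordered from cheapest to most expensive, so the slower browser-based fetchers only run when the fast one does not return content.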


Recommended Workflow

  1. Run search_local_web.py — review results by Score and [cross-validated] tag
  2. Run browse_page.py on the top URL — check Confidence level
  3. If Confidence is LOW (paywall/blocked) — retry with --mode stealth or try next URL
  4. Answer only after reading HIGH-confidence page content
  5. Never state facts from snippets alone
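The workflow above amounts to a small loop over the search results. The sketch below assumes a hypothetical `browse(url, mode)` callable that wraps browse_page.py and returns a dict with a `confidence` key set to HIGH, MEDIUM, or LOW; that return shape is an assumption for illustration, not the script's documented output format.

```python
def first_trustworthy_page(urls, browse):
    """Return (url, page) for the first HIGH-confidence page, or None.

    Each URL is first fetched in auto mode; a LOW result is retried once
    in stealth mode before moving on to the next URL.
    """
    for url in urls:
        page = browse(url, "auto")
        if page["confidence"] == "LOW":
            page = browse(url, "stealth")  # step 3: retry blocked pages
        if page["confidence"] == "HIGH":
            return url, page               # step 4: safe to answer from this
    return None                            # nothing reliable: do not guess
```

A `None` result means no page met the bar, in which case the answer should say so and not fall back to snippet text.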

Rules

  • Always use --intent to match the query type for best results.
  • When local SearXNG is unavailable, both scripts automatically fall back to searx.be.
  • If the fallback also fails, tell the user to start local SearXNG:
    cd "$(cat ~/.openclaw/workspace/skills/local-web-search/.project_root)" && ./start_local_search.sh
  • Do NOT invent search results if all sources fail.
  • search_local_web.py and browse_page.py are complementary: search first, browse second.
  • Prefer [cross-validated] results (appeared in multiple engines) for factual claims.
  • For sites behind Cloudflare or other anti-bot protection, use browse_page.py --mode stealth; for heavy JS / SPA sites, use --mode dynamic.
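To make the [cross-validated] idea concrete, the sketch below tags URLs that were returned by more than one engine. It is an illustration under assumptions: the input shape (a mapping from engine name to a list of URLs), the URL normalization, and the helper names are all hypothetical, and the skill's real scoring may differ.

```python
from urllib.parse import urlsplit


def normalize_url(url):
    """Reduce a URL to host + path so trivial variants compare equal.

    Drops the scheme, a leading "www.", any trailing slash, and the query string.
    """
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path.rstrip("/")


def tag_cross_validated(results_by_engine, min_engines=2):
    """Return one row per distinct URL, most widely confirmed first."""
    seen = {}
    for engine, urls in results_by_engine.items():
        for url in urls:
            seen.setdefault(normalize_url(url), set()).add(engine)
    rows = [
        {"url": key, "engines": sorted(engines),
         "cross_validated": len(engines) >= min_engines}
        for key, engines in seen.items()
    ]
    rows.sort(key=lambda row: (-len(row["engines"]), row["url"]))
    return rows
```

A result confirmed by several independent engines is less likely to be a single-source artifact, which is why such results are preferred for factual claims.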

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
