References (archive): SCAFFOLD_SKILLS_ARCHIVE_MAP.md — embedding/semantic tool discovery from everything-claude-code backend-patterns, tdd-workflow.

Tool Search Skill

Identity

Tool Search - Provides semantic tool discovery using embeddings to scale from dozens to thousands of tools with 90%+ context reduction.

Capabilities

Semantic Tool Search: Find relevant tools based on task context
Embedding-Based Matching: Use embeddings for accurate tool discovery
On-Demand Loading: Load tools only when needed
Context Efficiency: 90%+ reduction in tool definition tokens

The Problem

Traditional tool loading:

All tools loaded upfront
58 tools = ~55K tokens
Context fills quickly
Hard to scale beyond ~100 tools

The Solution

Tool Search with Embeddings:

Only Tool Search Tool loaded initially (~500 tokens)
Tools discovered on-demand via semantic search
3-5 relevant tools loaded per search (~3K tokens)
Total: ~8.7K tokens vs. ~77K traditional (85% reduction)

How It Works

Initial State: Only Tool Search Tool + critical tools loaded
Tool Discovery: Agent searches for tools based on task
Semantic Matching: Embeddings match tools to task context
Tool Expansion: Matching tools expanded into full definitions
Tool Use: Agent uses discovered tools

Configuration

MCP Configuration (.claude/.mcp.json )

{ "betaFeatures": ["advanced-tool-use-2025-11-20"], "toolSearch": { "enabled": true, "autoEnableThreshold": 20, "defaultDeferLoading": true }, "mcpServers": { "repo": { "deferLoading": true, "alwaysLoadTools": ["search_code", "read_file"] }, "github": { "deferLoading": true, "alwaysLoadTools": ["create_pull_request", "get_issue"] } } }

Always Load Critical Tools

Keep 3-5 most-used tools always loaded:

Core file operations: read_file , write_file , search_code
Essential integrations: create_pull_request , get_issue
Frequently used: take_screenshot , navigate_page

Usage Patterns

When to Use Tool Search

Most Beneficial When:

Tool definitions consuming >10K tokens
Tool library has 10+ tools
Experiencing tool selection accuracy issues
Building MCP-powered systems with multiple servers

Less Beneficial When:

Small tool library (<10 tools)
All tools used frequently in every session
Tool definitions are compact

Tool Discovery

Agent Workflow:

Agent needs capability (e.g., "create a pull request")
Agent searches: "github pull request creation"
Tool Search returns: create_pull_request tool
Tool expanded into full definition
Agent uses tool

Example:

User: "Create a pull request for my changes"

Agent searches: "github pull request creation" Tool Search finds: create_pull_request tool Tool loaded and used

Best Practices

Clear Tool Names and Descriptions

Good:

{ "name": "search_customer_orders", "description": "Search for customer orders by date range, status, or total amount. Returns order details including items, shipping, and payment info." }

Bad:

{ "name": "query_db_orders", "description": "Execute order query" }

System Prompt Guidance

Add guidance in agent prompts:

You have access to tools for Slack messaging, Google Drive file management, Jira ticket tracking, and GitHub repository operations. Use the tool search to find specific capabilities when needed.

Keep Critical Tools Always Loaded

Don't defer loading for:

Core file operations
Essential integrations
Frequently used tools

Monitor Tool Usage

Track which tools are discovered:

Most searched tools
Tool discovery patterns
Context savings achieved

Implementation

Embedding-Based Tool Search

The tool search uses embeddings to match tools to queries:

Tool Indexing: Create embeddings for all tool definitions
Query Embedding: Create embedding for user query
Similarity Search: Find tools with similar embeddings
Tool Expansion: Load matching tools into context

Tool Search Tool

The Tool Search Tool itself:

Searches tool library semantically
Returns relevant tools based on query
Expands tools into full definitions
Maintains tool index

Benefits

Context Efficiency

85% reduction in tool definition tokens
52.5% total context (down from 87%)
Within optimal range (60-70% target)

Improved Accuracy

11% improvement in tool selection accuracy
79.5% → 88.1% (Opus 4.5)
Better tool matching for complex queries

Scalability

Scales to thousands of tools
No context limit concerns
Dynamic tool discovery

Examples

Example 1: GitHub Operations

User: "Create a pull request"

Agent workflow:

Searches: "github pull request creation"
Tool Search finds: create_pull_request tool
Tool loaded (3K tokens)
Agent uses tool
Total context: ~8.7K tokens (vs. 55K traditional)

Example 2: File Operations

User: "Search for authentication code"

Agent workflow:

Searches: "code search file operations"
Tool Search finds: search_code, read_file tools
Tools loaded (5K tokens)
Agent uses tools

Example 3: Multiple Integrations

User: "Check Slack messages and create Jira ticket"

Agent workflow:

Searches: "slack message reading"
Tool Search finds: read_slack_message tool
Searches: "jira ticket creation"
Tool Search finds: create_jira_ticket tool
Both tools loaded (6K tokens total)

Integration

With MCP Servers

Tool search works with MCP servers:

GitHub MCP: 35 tools → 3-5 loaded on-demand
Slack MCP: 11 tools → 2-3 loaded on-demand
Custom MCPs: Any number of tools → Loaded as needed

With Agent System

All agents benefit from tool search:

Reduced context usage
Better tool selection
Scalable tool libraries

Troubleshooting

Tools Not Found

Check tool names and descriptions are clear
Verify tool search is enabled
Review search queries
Check tool index is up to date

Context Still High

Verify deferLoading is enabled
Check alwaysLoadTools list (should be minimal)
Review tool definitions (may be too verbose)
Monitor actual tool usage

Tool Selection Issues

Improve tool descriptions
Add more context to search queries
Review tool naming conventions
Check embedding quality

Integration with Programmatic Tool Calling (PTC)

Tool Search works excellently with Programmatic Tool Calling:

Tool Search finds relevant tools (on-demand loading)
PTC orchestrates tools efficiently (reduced context)
Result: Optimal tool usage with minimal token consumption

Example Workflow:

Tool Search finds tools

tools = search_tools("github issue management")

PTC orchestrates multiple tool calls

team = await get_team_members("engineering") issues = await asyncio.gather(*[ get_issue(member["github_username"]) for member in team ])

Only final results in context, not all intermediate data

See PTC Patterns Guide for comprehensive PTC documentation.

Search for git-related tools

node .claude/tools/tool_search.mjs --query "git"

Search for database tools

node .claude/tools/tool_search.mjs --query "database" --limit 3

Search for testing tools

node .claude/tools/tool_search.mjs --query "testing"

</usage_example>

Memory Protocol (MANDATORY)

Before starting: Read .claude/context/memory/learnings.md

After completing:

New pattern -> .claude/context/memory/learnings.md
Issue found -> .claude/context/memory/issues.md
Decision made -> .claude/context/memory/decisions.md

ASSUME INTERRUPTION: If it's not in memory, it didn't happen.

tool-search

Safety Notice

Copy this and send it to your AI assistant to learn

Tool Search finds tools

PTC orchestrates multiple tool calls

Only final results in context, not all intermediate data

Search for git-related tools

Search for database tools

Search for testing tools

Source Transparency

Related Skills

pyqt6-ui-development-rules

code-analyzer

gcloud-cli

github-mcp