Code Semantic Search
Overview
Semantic code search using Phase 1 vector embeddings and Phase 2 hybrid search (semantic + structural). Find code by meaning, not just keywords.
Core principle: Search code by what it does, not what it's called.
Phase 2: Hybrid Search
This skill now supports three search modes:
Hybrid (Default):
-
Combines semantic + structural search
-
Best accuracy (95%+)
-
Slightly slower but still <150ms
-
Recommended for all searches
Semantic-Only:
-
Uses only Phase 1 semantic vectors
-
Fastest (<50ms)
-
Good for conceptual searches
-
Use when structure doesn't matter
Structural-Only:
-
Uses only ast-grep patterns
-
Precise for exact matches
-
Best for finding function/class definitions
-
Use when you need exact structural patterns
Performance Comparison
Mode Speed Accuracy Best For
Hybrid <150ms 95% General search
Semantic-only <50ms 85% Concepts
Structural-only <50ms 100% Exact patterns
Phase 1 only <50ms 80% Legacy (fallback)
When to Use
Always:
-
Finding authentication logic without knowing function names
-
Searching for error handling patterns
-
Locating database queries
-
Finding similar code to a concept
-
Discovering implementation patterns
Don't Use:
-
Exact text matching (use Grep instead)
-
File name searches (use Glob instead)
-
Simple keyword searches (use ripgrep instead)
Usage Examples
Hybrid Search (Recommended)
// Basic hybrid search Skill({ skill: 'code-semantic-search', args: 'find authentication logic' });
// With options Skill({ skill: 'code-semantic-search', args: 'database queries', options: { mode: 'hybrid', language: 'javascript', limit: 10, }, });
Semantic-Only Search
// Fast conceptual search Skill({ skill: 'code-semantic-search', args: 'find authentication', options: { mode: 'semantic-only' }, });
Structural-Only Search
// Exact pattern matching Skill({ skill: 'code-semantic-search', args: 'find function authenticate', options: { mode: 'structural-only' }, });
Implementation Reference
Hybrid Search: .claude/lib/code-indexing/hybrid-search.cjs
Query Analysis: .claude/lib/code-indexing/query-analyzer.cjs
Result Ranking: .claude/lib/code-indexing/result-ranker.cjs
Integration Points
-
developer: Code exploration, implementation discovery
-
architect: System understanding, pattern analysis
-
code-reviewer: Finding similar patterns, consistency checks
-
reverse-engineer: Understanding unfamiliar codebases
-
researcher: Research existing implementations
Iron Laws
-
ALWAYS use hybrid mode (semantic + structural) for general searches — semantic-only misses exact matches; structural-only misses conceptual variants; hybrid provides 95% accuracy vs 85% for single mode.
-
ALWAYS use ripgrep/keyword search first for fast keyword discovery — semantic search is for meaning-based queries; exact strings, function names, and filenames are found faster with ripgrep.
-
NEVER use semantic search without a meaningful natural-language query — single-word or code-syntax queries produce poor semantic results; describe what the code does, not what it's called.
-
ALWAYS combine with code-structural-search for precision refinement — start broad with semantic discovery, then use ast-grep patterns to find exact structural matches from the semantic results.
-
NEVER ignore low-similarity results without checking them — similarity scores are approximations; a 0.7 score result may be more relevant than a 0.9 score result for uncommon patterns.
Anti-Patterns
Anti-Pattern Why It Fails Correct Approach
Semantic search for exact string matching Slower and less accurate than text search Use ripgrep/Grep for exact keyword matching
Single-word queries ("auth") Too vague for semantic matching; returns noise Use natural-language descriptions ("authentication token validation logic")
Using semantic-only mode for general searches Misses structural variants; 85% vs 95% accuracy Use hybrid mode (default) for general queries
Ignoring search results that don't match expectations Semantic results find surprising-but-relevant code Read all results; unexpected matches are often the most valuable
Not combining with structural search Finds concepts but not exact patterns Use semantic for discovery → structural for precision
Memory Protocol (MANDATORY)
Before starting: Read .claude/context/memory/learnings.md
After completing:
-
New pattern -> .claude/context/memory/learnings.md
-
Issue found -> .claude/context/memory/issues.md
-
Decision made -> .claude/context/memory/decisions.md
ASSUME INTERRUPTION: If it's not in memory, it didn't happen.