index-knowledge

Generate hierarchical AGENTS.md files. Root + complexity-scored subdirectories.

Usage

--create-new # Read existing → remove all → regenerate from scratch --max-depth=2 # Limit directory depth (default: 5)

Default: Update mode (modify existing + create new where warranted)

Workflow (High-Level)

Discovery + Analysis (concurrent)
Launch parallel explore agents (multiple Task calls in one message)
Main session: bash structure + LSP codemap + read existing AGENTS.md
Score & Decide - Determine AGENTS.md locations from merged findings
Generate - Root first, then subdirs in parallel
Review - Deduplicate, trim, validate

TodoWrite([ { id: "discovery", content: "Fire explore agents + LSP codemap + read existing", status: "pending", priority: "high" }, { id: "scoring", content: "Score directories, determine locations", status: "pending", priority: "high" }, { id: "generate", content: "Generate AGENTS.md files (root + subdirs)", status: "pending", priority: "high" }, { id: "review", content: "Deduplicate, validate, trim", status: "pending", priority: "medium" } ])

Phase 1: Discovery + Analysis (Concurrent)

Mark "discovery" as in_progress.

Launch Parallel Explore Agents

Multiple Task calls in a single message execute in parallel. Results return directly.

// All Task calls in ONE message = parallel execution

Task( description="project structure", subagent_type="explore", prompt="Project structure: PREDICT standard patterns for detected language → REPORT deviations only" )

Task( description="entry points", subagent_type="explore", prompt="Entry points: FIND main files → REPORT non-standard organization" )

Task( description="conventions", subagent_type="explore", prompt="Conventions: FIND config files (.eslintrc, pyproject.toml, .editorconfig) → REPORT project-specific rules" )

Task( description="anti-patterns", subagent_type="explore", prompt="Anti-patterns: FIND 'DO NOT', 'NEVER', 'ALWAYS', 'DEPRECATED' comments → LIST forbidden patterns" )

Task( description="build/ci", subagent_type="explore", prompt="Build/CI: FIND .github/workflows, Makefile → REPORT non-standard patterns" )

Task( description="test patterns", subagent_type="explore", prompt="Test patterns: FIND test configs, test structure → REPORT unique conventions" )

Factor Threshold Additional Agents

Total files

100 +1 per 100 files

Total lines

10k +1 per 10k lines

Directory depth ≥4 +2 for deep exploration

Large files (>500 lines)

10 files +1 for complexity hotspots

Monorepo detected +1 per package/workspace

Multiple languages

1 +1 per language

Measure project scale first

total_files=$(find . -type f -not -path '/node_modules/' -not -path '/.git/' | wc -l) total_lines=$(find . -type f ( -name ".ts" -o -name ".py" -o -name ".go" ) -not -path '/node_modules/' -exec wc -l {} + 2>/dev/null | tail -1 | awk '{print $1}') large_files=$(find . -type f ( -name ".ts" -o -name ".py" ) -not -path '/node_modules/' -exec wc -l {} + 2>/dev/null | awk '$1 > 500 {count++} END {print count+0}') max_depth=$(find . -type d -not -path '/node_modules/' -not -path '/.git/*' | awk -F/ '{print NF}' | sort -rn | head -1)

Example spawning (all in ONE message for parallel execution):

// 500 files, 50k lines, depth 6, 15 large files → spawn additional agents Task( description="large files", subagent_type="explore", prompt="Large file analysis: FIND files >500 lines, REPORT complexity hotspots" )

Task( description="deep modules", subagent_type="explore", prompt="Deep modules at depth 4+: FIND hidden patterns, internal conventions" )

Task( description="cross-cutting", subagent_type="explore", prompt="Cross-cutting concerns: FIND shared utilities across directories" ) // ... more based on calculation

Main Session: Concurrent Analysis

While Task agents execute, main session does:

Bash Structural Analysis

Directory depth + file counts

find . -type d -not -path '/.' -not -path '/node_modules/' -not -path '/venv/' -not -path '/dist/' -not -path '/build/' | awk -F/ '{print NF-1}' | sort -n | uniq -c

Files per directory (top 30)

find . -type f -not -path '/.' -not -path '/node_modules/' | sed 's|/[^/]*$||' | sort | uniq -c | sort -rn | head -30

Code concentration by extension

find . -type f ( -name ".py" -o -name ".ts" -o -name ".tsx" -o -name ".js" -o -name ".go" -o -name ".rs" ) -not -path '/node_modules/' | sed 's|/[^/]*$||' | sort | uniq -c | sort -rn | head -20

Existing AGENTS.md / CLAUDE.md

find . -type f ( -name "AGENTS.md" -o -name "CLAUDE.md" ) -not -path '/node_modules/' 2>/dev/null

Read Existing AGENTS.md

For each existing file found: Read(filePath=file) Extract: key insights, conventions, anti-patterns Store in EXISTING_AGENTS map

If --create-new : Read all existing first (preserve context) → then delete all → regenerate.

LSP Codemap (if available)

lsp_servers() # Check availability

Entry points (parallel)

lsp_document_symbols(filePath="src/index.ts") lsp_document_symbols(filePath="main.py")

Key symbols (parallel)

lsp_workspace_symbols(filePath=".", query="class") lsp_workspace_symbols(filePath=".", query="interface") lsp_workspace_symbols(filePath=".", query="function")

Centrality for top exports

lsp_find_references(filePath="...", line=X, character=Y)

LSP Fallback: If unavailable, rely on explore agents + AST-grep.

Merge: bash + LSP + existing + Task agent results. Mark "discovery" as completed.

Phase 2: Scoring & Location Decision

Mark "scoring" as in_progress.

Scoring Matrix

Factor Weight High Threshold Source

File count 3x

20 bash

Subdir count 2x

5 bash

Code ratio 2x

70% bash

Unique patterns 1x Has own config explore

Module boundary 2x Has index.ts/init.py bash

Symbol density 2x

30 symbols LSP

Export count 2x

10 exports LSP

Reference centrality 3x

20 refs LSP

Decision Rules

Score Action

Root (.) ALWAYS create

15 Create AGENTS.md

8-15 Create if distinct domain

<8 Skip (parent covers)

Output

AGENTS_LOCATIONS = [ { path: ".", type: "root" }, { path: "src/hooks", score: 18, reason: "high complexity" }, { path: "src/api", score: 12, reason: "distinct domain" } ]

Mark "scoring" as completed.

Phase 3: Generate AGENTS.md

Mark "generate" as in_progress.

Root AGENTS.md (Full Treatment)

PROJECT KNOWLEDGE BASE

Generated: {TIMESTAMP} Commit: {SHORT_SHA} Branch: {BRANCH}

OVERVIEW

{1-2 sentences: what + core stack}

STRUCTURE

``` {root}/ ├── {dir}/ # {non-obvious purpose only} └── {entry} ```

WHERE TO LOOK

Task	Location	Notes

CODE MAP

{From LSP - skip if unavailable or project <10 files}

CONVENTIONS

{ONLY deviations from standard}

ANTI-PATTERNS (THIS PROJECT)

{Explicitly forbidden here}

UNIQUE STYLES

{Project-specific}

COMMANDS

```bash {dev/test/build} ```

NOTES

{Gotchas}

Quality gates: 50-150 lines, no generic advice, no obvious info.

Subdirectory AGENTS.md (Parallel)

Launch general agents for each location in ONE message (parallel execution):

// All in single message = parallel Task( description="AGENTS.md for src/hooks", subagent_type="general", prompt="Generate AGENTS.md for: src/hooks - Reason: high complexity - 30-80 lines max - NEVER repeat parent content - Sections: OVERVIEW (1 line), STRUCTURE (if >5 subdirs), WHERE TO LOOK, CONVENTIONS (if different), ANTI-PATTERNS - Write directly to src/hooks/AGENTS.md" )

Task( description="AGENTS.md for src/api", subagent_type="general", prompt="Generate AGENTS.md for: src/api - Reason: distinct domain - 30-80 lines max - NEVER repeat parent content - Sections: OVERVIEW (1 line), STRUCTURE (if >5 subdirs), WHERE TO LOOK, CONVENTIONS (if different), ANTI-PATTERNS - Write directly to src/api/AGENTS.md" ) // ... one Task per AGENTS_LOCATIONS entry

Results return directly. Mark "generate" as completed.

Phase 4: Review & Deduplicate

Mark "review" as in_progress.

For each generated file:

Remove generic advice
Remove parent duplicates
Trim to size limits
Verify telegraphic style

Mark "review" as completed.

Final Report

=== index-knowledge Complete ===

Mode: {update | create-new}

Files: ✓ ./AGENTS.md (root, {N} lines) ✓ ./src/hooks/AGENTS.md ({N} lines)

Dirs Analyzed: {N} AGENTS.md Created: {N} AGENTS.md Updated: {N}

Hierarchy: ./AGENTS.md └── src/hooks/AGENTS.md

Anti-Patterns

Static agent count: MUST vary agents based on project size/depth
Sequential execution: MUST parallel (multiple Task calls in one message)
Ignoring existing: ALWAYS read existing first, even with --create-new
Over-documenting: Not every dir needs AGENTS.md
Redundancy: Child never repeats parent
Generic content: Remove anything that applies to ALL projects
Verbose style: Telegraphic or die