Deep Dive Analysis Skill
Overview
This skill combines mechanical structure extraction with Claude's semantic understanding to produce comprehensive codebase documentation. Unlike simple AST parsing, this skill captures:
- WHAT the code does (structure, functions, classes)
- WHY it exists (business purpose, design decisions)
- HOW it integrates (dependencies, contracts, flows)
- CONSEQUENCES of changes (side effects, failure modes)
Capabilities
Mechanical Analysis (Scripts):
- Extract code structure (classes, functions, imports)
- Map dependencies (internal/external)
- Find symbol usages across the codebase
- Track analysis progress
- Classify files by criticality
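As a rough illustration of the first capability, the sketch below extracts top-level classes, functions, and imports with Python's standard `ast` module. It is not the skill's own script; `extract_structure` and `SAMPLE` are hypothetical names used only for this example.

```python
import ast


def extract_structure(source: str) -> dict:
    """Return top-level classes, functions, and imported module names."""
    tree = ast.parse(source)
    structure = {"classes": [], "functions": [], "imports": []}
    for node in tree.body:  # top-level statements only
        if isinstance(node, ast.ClassDef):
            methods = [
                child.name
                for child in node.body
                if isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef))
            ]
            structure["classes"].append(
                {"name": node.name, "line": node.lineno, "methods": methods}
            )
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            structure["functions"].append({"name": node.name, "line": node.lineno})
        elif isinstance(node, ast.Import):
            structure["imports"].extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            # Relative imports without a module name are recorded as dots.
            structure["imports"].append(node.module or "." * node.level)
    return structure


# Hypothetical input used only to exercise the sketch.
SAMPLE = '''\
import os
from collections import deque

class CircuitBreaker:
    def call(self):
        pass

def helper():
    return 1
'''
```

Recording `lineno` alongside each name keeps the output usable for the file-and-line citations the skill requires.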
Semantic Analysis (Claude AI):
- Recognize architectural and design patterns
- Identify red flags and anti-patterns
- Trace data and control flows
- Document contracts and invariants
- Assess quality and maintainability
Documentation Maintenance:
- Review and maintain documentation (Phase 8)
- Fix broken links and update navigation indexes
- Analyze and rewrite code comments (antirez standards)
Use this skill when:
- Analyzing a codebase you're unfamiliar with
- Generating documentation that explains WHY, not just WHAT
- Identifying architectural patterns and anti-patterns
- Performing code review with semantic understanding
- Onboarding to a new project
Prerequisites
- analysis_progress.json must exist in project root (created by DEEP_DIVE_PLAN setup)
- DEEP_DIVE_PLAN.md should be reviewed to understand phase structure
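A pre-flight check for the first prerequisite might look like the sketch below. It only confirms that analysis_progress.json exists and parses as JSON; the file's schema is not described here, so nothing about its contents is assumed. `check_prerequisites` is a hypothetical helper, not part of the skill's scripts.

```python
import json
import tempfile
from pathlib import Path


def check_prerequisites(project_root: str) -> list[str]:
    """Return a list of problems; an empty list means the check passed."""
    problems = []
    progress = Path(project_root) / "analysis_progress.json"
    if not progress.is_file():
        problems.append("analysis_progress.json is missing from the project root")
    else:
        try:
            json.loads(progress.read_text(encoding="utf-8"))
        except json.JSONDecodeError as exc:
            problems.append(f"analysis_progress.json is not valid JSON: {exc.msg}")
    return problems


# Demonstration against a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    before = check_prerequisites(tmp)
    Path(tmp, "analysis_progress.json").write_text("{}", encoding="utf-8")
    after = check_prerequisites(tmp)
```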
CRITICAL PRINCIPLE: ABSOLUTE SOURCE OF TRUTH
THE DOCUMENTATION GENERATED BY THIS SKILL IS THE ABSOLUTE AND UNQUESTIONABLE SOURCE OF TRUTH FOR YOUR PROJECT.
ANY INFORMATION NOT VERIFIED WITH IRREFUTABLE EVIDENCE FROM SOURCE CODE IS FALSE, UNRELIABLE, AND UNACCEPTABLE.
Mandatory Rules (VIOLATION = FAILURE)
- NEVER document anything without reading the actual source code first
- NEVER assume any existing documentation, comment, or docstring is accurate
- NEVER write documentation based on memory, inference, or "what should be"
- ALWAYS derive truth EXCLUSIVELY from reading and tracing actual code
- ALWAYS provide source file + line number for every technical claim
- ALWAYS verify state machines, enums, constants against actual definitions
- TREAT all pre-existing docs as unverified claims requiring validation
- MARK any unverifiable statement as [UNVERIFIED - REQUIRES CODE CHECK]
See references/analysis-templates.md for the full verification trust model, temporal purity principle, and documentation status markers.
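One way to picture the evidence rules is a small record type that refuses to render a claim as verified unless it carries both a source file and a line number. This is an illustrative sketch only; the `Claim` class and the example claim text and line number are hypothetical, and the full trust model is defined in references/analysis-templates.md.

```python
from dataclasses import dataclass
from typing import Optional

# Marker text taken from the mandatory rules above.
UNVERIFIED = "[UNVERIFIED - REQUIRES CODE CHECK]"


@dataclass(frozen=True)
class Claim:
    text: str
    source_file: Optional[str] = None
    line: Optional[int] = None

    @property
    def verified(self) -> bool:
        # A claim counts as verified only with both a file and a line.
        return self.source_file is not None and self.line is not None

    def render(self) -> str:
        if self.verified:
            return f"{self.text} ({self.source_file}:{self.line})"
        return f"{self.text} {UNVERIFIED}"


# Hypothetical claims, for illustration only.
backed = Claim("Breaker opens after repeated failures", "src/utils/circuit_breaker.py", 42)
unbacked = Claim("Retries are idempotent")
```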
Output Usage Guide
After analysis completes, consult the right file for your task:
| Your Task | Start With | Also Check |
|---|---|---|
| Onboarding / understanding the project | 07-final-report, 01-structure | 04-semantics |
| Writing new feature | 01-structure (Where to Add), 02-interfaces | 04-semantics |
| Fixing a bug | 03-flows, 05-risks | 01-structure |
| Refactoring | 01-structure, 04-semantics, 05-risks | 03-flows |
| Code review | 02-interfaces, 05-risks | 06-documentation |
| Updating documentation | 06-documentation, 04-semantics | 02-interfaces |
Forbidden Files
The analysis NEVER reads or includes contents from sensitive files: .env, .env.*, credentials.*, secrets.*, *.pem, *.key, .p12, .pfx, id_rsa, id_ed25519, .npmrc, .pypirc, .netrc, or any file containing API keys, passwords, or tokens. If such a file is encountered, note only that it exists; never quote its contents.
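The name-based part of this rule can be sketched as a glob match on the file's base name. The snippet below is an assumption about how such a filter could work, not the skill's actual implementation; it treats `.p12` and `.pfx` as file extensions (`*.p12`, `*.pfx`), and it cannot detect the "any file containing API keys" case, which requires inspecting contents. `is_forbidden` is a hypothetical name.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Patterns transcribed from the list above; *.p12 and *.pfx are an
# assumed reading of ".p12" and ".pfx" as extensions.
FORBIDDEN_PATTERNS = (
    ".env", ".env.*", "credentials.*", "secrets.*",
    "*.pem", "*.key", "*.p12", "*.pfx",
    "id_rsa", "id_ed25519", ".npmrc", ".pypirc", ".netrc",
)


def is_forbidden(path: str) -> bool:
    """Return True if the file name matches a sensitive-file pattern."""
    name = PurePosixPath(path).name.lower()
    return any(fnmatchcase(name, pattern) for pattern in FORBIDDEN_PATTERNS)
```

Matching on the base name rather than the full path means a forbidden file is caught in any directory.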
Available Commands
- Analyze Single File
python .claude/skills/deep-dive-analysis/scripts/analyze_file.py \
  --file src/utils/circuit_breaker.py \
  --output-format markdown
Parameters:
- --file / -f: Relative path to file (REQUIRED)
- --output-format / -o: Output format (json, markdown, summary); default: summary
- --find-usages / -u: Find all usages of exported symbols; default: false
- --update-progress / -p: Update analysis_progress.json; default: false
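For readers who want to see how these four parameters fit together, the following `argparse` sketch mirrors the list above. It is a reconstruction from this document, not the source of analyze_file.py, and it omits options such as `--symbol`, `--phase`, and `--output-file` that appear in other invocations; `build_parser` is a hypothetical name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser covering only the four parameters documented above."""
    parser = argparse.ArgumentParser(prog="analyze_file.py")
    parser.add_argument("--file", "-f", required=True,
                        help="Relative path to the file to analyze")
    parser.add_argument("--output-format", "-o",
                        choices=("json", "markdown", "summary"),
                        default="summary",
                        help="Output format")
    parser.add_argument("--find-usages", "-u", action="store_true",
                        help="Find all usages of exported symbols")
    parser.add_argument("--update-progress", "-p", action="store_true",
                        help="Update analysis_progress.json")
    return parser


# Equivalent to the single-file example shown earlier.
args = build_parser().parse_args(
    ["--file", "src/utils/circuit_breaker.py", "--output-format", "markdown"]
)
```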
- Check Progress
python .claude/skills/deep-dive-analysis/scripts/check_progress.py \
  --phase 1 --status pending
- Find Usages
python .claude/skills/deep-dive-analysis/scripts/analyze_file.py \
  --symbol CircuitBreaker --file src/utils/circuit_breaker.py
- Generate Phase Report
python .claude/skills/deep-dive-analysis/scripts/analyze_file.py \
  --phase 1 --output-format markdown --output-file docs/01_domains/COMMON_LIBRARY.md
Phase 8: Documentation Review Commands
- Scan Documentation Health
python .claude/skills/deep-dive-analysis/scripts/doc_review.py scan \
  --path docs/ --output doc_health_report.json
- Validate Links
python .claude/skills/deep-dive-analysis/scripts/doc_review.py validate-links \
  --path docs/ --fix
- Verify Against Source Code
python .claude/skills/deep-dive-analysis/scripts/doc_review.py verify \
  --doc docs/agents/lifecycle.md --source src/agents/lifecycle.py
- Update Navigation Indexes
python .claude/skills/deep-dive-analysis/scripts/doc_review.py update-indexes \
  --search-index docs/00_navigation/SEARCH_INDEX.md \
  --by-domain docs/00_navigation/BY_DOMAIN.md
- Full Documentation Maintenance
python .claude/skills/deep-dive-analysis/scripts/doc_review.py full-maintenance \
  --path docs/ --auto-fix --output doc_health_report.json
Executes: scan health, validate/fix links, identify obsolete files, update indexes, generate report.
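To make the link-validation step more concrete, here is a minimal sketch of how relative Markdown links could be checked against a known set of files. How doc_review.py actually does this is not documented here; `broken_links`, its signature, and the sample paths are assumptions for illustration.

```python
import posixpath
import re

# Matches inline Markdown links and captures the target: [text](target)
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def broken_links(markdown: str, doc_path: str, existing_files: set[str]) -> list[str]:
    """Return relative link targets that do not resolve to a known file."""
    base = posixpath.dirname(doc_path)
    broken = []
    for target in LINK_RE.findall(markdown):
        # External links and same-page anchors are out of scope here.
        if target.startswith(("http://", "https://", "mailto:", "#")):
            continue
        path = target.split("#", 1)[0]  # drop any anchor suffix
        resolved = posixpath.normpath(posixpath.join(base, path))
        if resolved not in existing_files:
            broken.append(target)
    return broken


# Hypothetical documentation tree and page content.
files = {"docs/agents/lifecycle.md", "docs/00_navigation/SEARCH_INDEX.md"}
text = (
    "See [index](../00_navigation/SEARCH_INDEX.md#agents), "
    "[old](./retired.md) and [site](https://example.com)."
)
result = broken_links(text, "docs/agents/lifecycle.md", files)
```

Passing the set of existing files in, rather than touching the filesystem, keeps the check easy to test in isolation.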
Comment Quality Commands (Antirez Standards)
- Analyze Comment Quality
python .claude/skills/deep-dive-analysis/scripts/rewrite_comments.py analyze \
  src/main.py --report
- Scan Directory for Comment Issues
python .claude/skills/deep-dive-analysis/scripts/rewrite_comments.py scan \
  src/ --recursive --issues-only
- Generate Comment Health Report
python .claude/skills/deep-dive-analysis/scripts/rewrite_comments.py report \
  src/ --output comment_health.md
- Rewrite Comments
python .claude/skills/deep-dive-analysis/scripts/rewrite_comments.py rewrite \
  src/main.py --apply --backup
- View Standards Reference
python .claude/skills/deep-dive-analysis/scripts/rewrite_comments.py standards
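As a small taste of what comment analysis involves, the sketch below uses the standard `tokenize` module to pull comments out of Python source and flag ones that look like technical-debt notes. The marker list and the `find_debt_comments` helper are assumptions for this example; the skill's real classification rules live in references/ANTIREZ_COMMENTING_STANDARDS.md and are not reproduced here.

```python
import io
import tokenize

# Assumed debt markers; the skill's own list may differ.
DEBT_MARKERS = ("TODO", "FIXME", "XXX", "HACK")


def find_debt_comments(source: str) -> list[tuple[int, str]]:
    """Return (line number, comment text) for comments starting with a debt marker."""
    results = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT:
            text = tok.string.lstrip("#").strip()
            if text.upper().startswith(DEBT_MARKERS):
                results.append((tok.start[0], text))
    return results


# Hypothetical source used only to exercise the sketch.
SAMPLE = '''\
# Increment the retry counter.
retries += 1
# TODO: make the limit configurable
limit = 3  # FIXME hard-coded
'''
```

Using the tokenizer instead of a regular expression avoids mistaking `#` characters inside string literals for comments.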
File Classification Criteria
| Classification | Criteria | Verification |
|---|---|---|
| Critical | Handles authentication, security, encryption, sensitive data | Mandatory |
| High-Complexity | >300 LOC, >5 dependencies, state machines, async patterns | Mandatory |
| Standard | Normal business logic, data models, utilities | Recommended |
| Utility | Pure functions, helpers, constants | Optional |
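The criteria can be read as an ordered set of checks, with the most sensitive class tested first. The sketch below is one possible reading, not the skill's classifier: it assumes the High-Complexity criteria are alternatives (any one is enough), treats 300 LOC and 5 dependencies as exclusive lower bounds, and uses a hypothetical keyword list to spot Critical files by path.

```python
from dataclasses import dataclass

# Assumed keywords; a real classifier would inspect code, not just paths.
CRITICAL_KEYWORDS = ("auth", "security", "encrypt", "password", "token")


@dataclass
class FileFacts:
    path: str
    loc: int
    dependency_count: int
    has_state_machine: bool = False
    uses_async: bool = False
    is_pure_helper: bool = False


def classify(facts: FileFacts) -> tuple[str, str]:
    """Return (classification, verification level) for a file."""
    lowered = facts.path.lower()
    if any(keyword in lowered for keyword in CRITICAL_KEYWORDS):
        return "Critical", "Mandatory"
    if (facts.loc > 300 or facts.dependency_count > 5
            or facts.has_state_machine or facts.uses_async):
        return "High-Complexity", "Mandatory"
    if facts.is_pure_helper:
        return "Utility", "Optional"
    return "Standard", "Recommended"
```

Checking Critical before High-Complexity ensures a small authentication module is never downgraded because of its size.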
AI-Powered Semantic Analysis
Five Layers of Understanding
| Layer | What | Who Does It |
|---|---|---|
| WHAT | Classes, functions, imports | Scripts (AST) |
| HOW | Algorithm details, data flow | Claude's first pass |
| WHY | Business purpose, design decisions | Claude's deep analysis |
| WHEN | Triggers, lifecycle, concurrency | Claude's behavioral analysis |
| CONSEQUENCES | Side effects, failure modes | Claude's systems thinking |
Pattern Recognition
| Pattern Type | Examples | Documentation Focus |
|---|---|---|
| Architectural | Repository, Service, CQRS, Event-Driven | Responsibilities, boundaries |
| Behavioral | State Machine, Strategy, Observer, Chain | Transitions, variations |
| Resilience | Circuit Breaker, Retry, Bulkhead, Timeout | Thresholds, fallbacks |
| Data | DTO, Value Object, Aggregate | Invariants, relationships |
| Concurrency | Producer-Consumer, Worker Pool | Thread safety, backpressure |
Red Flags to Identify
ARCHITECTURE:
- GOD CLASS: >10 public methods or >500 LOC
- CIRCULAR DEPENDENCY: A -> B -> C -> A
- LEAKY ABSTRACTION: Implementation details in interface
RELIABILITY:
- SWALLOWED EXCEPTION: Empty catch blocks
- MISSING TIMEOUT: Network calls without timeout
- RACE CONDITION: Shared mutable state without sync
SECURITY:
- HARDCODED SECRET: Passwords, API keys in code
- SQL INJECTION: String concatenation in queries
- MISSING VALIDATION: Unsanitized user input
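Two of these flags lend themselves to mechanical detection. The sketch below uses the `ast` module to report except handlers whose body is only `pass` (one form of swallowed exception) and classes with more than 10 public methods (one of the two GOD CLASS thresholds). It is an illustrative assumption about how such checks could be written, and `find_red_flags` is a hypothetical name; the remaining flags generally need semantic review.

```python
import ast


def find_red_flags(source: str, max_public_methods: int = 10) -> list[tuple[str, int]]:
    """Return (flag name, line number) pairs, ordered by line."""
    flags = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ExceptHandler):
            # A handler containing nothing but `pass` silently drops the error.
            if all(isinstance(stmt, ast.Pass) for stmt in node.body):
                flags.append(("SWALLOWED EXCEPTION", node.lineno))
        elif isinstance(node, ast.ClassDef):
            public = [
                child for child in node.body
                if isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef))
                and not child.name.startswith("_")
            ]
            if len(public) > max_public_methods:
                flags.append(("GOD CLASS", node.lineno))
    return sorted(flags, key=lambda flag: flag[1])


# Hypothetical input used only to exercise the sketch.
SAMPLE = '''\
def load(path):
    try:
        return open(path).read()
    except OSError:
        pass
'''
```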
AI Analysis Workflow
- SCRIPTS RUN FIRST -> classifier.py, ast_parser.py, usage_finder.py
- CLAUDE ANALYZES -> Read source, apply semantic questions, recognize patterns, identify red flags
- CLAUDE DOCUMENTS -> Use template, explain WHY not just WHAT, document contracts
- VERIFY -> Check against runtime behavior, validate with code traces
Analysis Loop Workflow
- CLASSIFY -> LOC, dependencies, critical patterns, assign classification
- READ & MAP -> AST structure, classes, functions, constants, state mutations
- DEPENDENCY CHECK -> Internal imports, external imports, external calls
- CONTEXT ANALYSIS -> Symbol usages, importing modules, message flows
- RUNTIME VERIFICATION (Critical/High-Complexity) -> Log analysis, flow verification
- DOCUMENTATION -> Update progress, generate report, cross-reference
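Once the dependency check has produced a map of which module imports which, the CIRCULAR DEPENDENCY red flag (A -> B -> C -> A) reduces to finding a cycle in a directed graph. The depth-first search below is a generic sketch of that idea, assuming the import graph is already available as a dictionary; `find_cycle` is a hypothetical helper and not one of the skill's scripts.

```python
from typing import Optional


def find_cycle(graph: dict[str, list[str]]) -> Optional[list[str]]:
    """Return one import cycle as a list of modules, or None if there is none."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {node: WHITE for node in graph}
    stack: list[str] = []

    def visit(node: str) -> Optional[list[str]]:
        color[node] = GREY
        stack.append(node)
        for dep in graph.get(node, ()):
            state = color.get(dep, WHITE)
            if state == GREY:
                # dep is already on the current path, so the path loops back.
                return stack[stack.index(dep):] + [dep]
            if state == WHITE:
                found = visit(dep)
                if found:
                    return found
        stack.pop()
        color[node] = BLACK
        return None

    for node in graph:
        if color[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None


cycle = find_cycle({"A": ["B"], "B": ["C"], "C": ["A"]})
```

The recursive form is fine for a sketch; a very deep import chain would call for an explicit stack instead.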
Best Practices
Source Code Analysis (Phases 1-7)
- Start with Phase 1 - foundation modules inform everything else
- Track progress with --update-progress
- Never skip runtime verification for critical/high-complexity files
- Cross-reference with CONTEXT.md after analysis
Documentation Maintenance (Phase 8)
- Run scan first to understand current state
- Fix links before content - broken links indicate structural issues
- Verify against code before updating documentation
- Update indexes last to reflect final state
References
- references/analysis-templates.md: Verification trust model, temporal purity principle, documentation status markers, comment classification, maintenance workflows
- references/AI_ANALYSIS_METHODOLOGY.md: Complete analysis methodology
- references/SEMANTIC_PATTERNS.md: Pattern recognition guide
- references/ANTIREZ_COMMENTING_STANDARDS.md: Comment taxonomy
- references/DEEP_DIVE_PLAN.md: Master analysis plan with all phase definitions
- templates/semantic_analysis.md: AI-powered per-file analysis template
- templates/analysis_report.md: Module-level report template
Resources
- Scripts: scripts/ (Python analysis tools)
  - analyze_file.py: Source code analysis (Phases 1-7)
  - check_progress.py: Progress tracking
  - doc_review.py: Documentation maintenance (Phase 8)
  - comment_rewriter.py: Comment analysis engine
  - rewrite_comments.py: Comment quality CLI tool