Codebase Analysis
Evidence-based investigation → findings → confidence-tracked conclusions.
<when_to_use>
- Codebase exploration and understanding
- Architecture analysis and mapping
- Pattern extraction and recognition
- Technical research within code
- Performance or security analysis
NOT for: wild guessing, assumptions without evidence, conclusions before investigation
</when_to_use>
| Bar | Lvl | Name | Action |
| --- | --- | --- | --- |
| ░░░░░ | 0 | Gathering | Collect initial evidence |
| ▓░░░░ | 1 | Surveying | Broad scan, surface patterns |
| ▓▓░░░ | 2 | Investigating | Deep dive, verify patterns |
| ▓▓▓░░ | 3 | Analyzing | Cross-reference, fill gaps |
| ▓▓▓▓░ | 4 | Synthesizing | Connect findings, high confidence |
| ▓▓▓▓▓ | 5 | Concluded | Deliver findings |
Calibration: 0=0–19%, 1=20–39%, 2=40–59%, 3=60–74%, 4=75–89%, 5=90–100%
Start honest. Clear codebase + focused question → level 2–3. Vague or complex → level 0–1.
At level 4: "High confidence in findings. One more angle would reach full certainty. Continue or deliver now?"
Below level 5: include △ Caveats section.
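The calibration bands above can be sketched as a small helper. This is an illustrative sketch only (the function names are hypothetical); the thresholds are copied directly from the calibration line.

```python
# Map a confidence percentage to the 0-5 level and render its bar.
LEVELS = ["Gathering", "Surveying", "Investigating",
          "Analyzing", "Synthesizing", "Concluded"]
THRESHOLDS = [20, 40, 60, 75, 90]  # lower bounds for levels 1-5

def confidence_level(pct: int) -> int:
    """Return the 0-5 level for a confidence percentage (0-100)."""
    level = 0
    for bound in THRESHOLDS:
        if pct >= bound:
            level += 1
    return level

def render_bar(level: int) -> str:
    """Render the five-segment bar, e.g. level 3 -> '▓▓▓░░'."""
    return "▓" * level + "░" * (5 - level)

lvl = confidence_level(65)
print(render_bar(lvl), LEVELS[lvl])  # → ▓▓▓░░ Analyzing
```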
Core Methodology
Evidence over assumption — investigate when you can, guess only when you must.
Multi-source gathering — code, docs, tests, history, web research, runtime behavior.
Multiple angles — examine from different perspectives before concluding.
Document gaps — flag uncertainty with △, track what's unknown.
Show your work — findings include supporting evidence, not just conclusions.
Calibrate confidence — distinguish fact from inference from assumption.
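The fact/inference/assumption distinction can be made mechanical by tagging each finding with its basis. This is a hypothetical sketch (the class and field names are not part of any real API) showing one way to keep the △ flag attached to anything that is not direct observation.

```python
# Tag each finding with how it was established so fact, inference,
# and assumption never blur together in the final report.
from dataclasses import dataclass, field
from enum import Enum

class Basis(Enum):
    FACT = "direct observation"
    INFERENCE = "logical deduction"
    ASSUMPTION = "flagged guess"

@dataclass
class Finding:
    claim: str
    basis: Basis
    evidence: list[str] = field(default_factory=list)  # paths, URLs, SHAs

    def render(self) -> str:
        # Anything short of direct observation carries the △ marker.
        flag = "" if self.basis is Basis.FACT else "△ "
        return f"{flag}{self.claim} — evidence: {', '.join(self.evidence) or 'none'}"

f = Finding("Retries use exponential backoff", Basis.FACT,
            ["src/http/retry.py:42-60"])
print(f.render())
```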
<evidence_gathering>
Source Priority
- Direct observation — read code, run searches, examine files
- Documentation — official docs, inline comments, ADRs
- Tests — reveal intended behavior and edge cases
- History — git log, commit messages, PR discussions
- External research — library docs, Stack Overflow, RFCs
- Inference — logical deduction from available evidence
- Assumption — clearly flagged when other sources unavailable
Investigation Patterns
Start broad, then narrow:
- File tree → identify relevant areas
- Search patterns → locate specific code
- Code structure → understand without full content
- Read targeted files → examine implementation
- Cross-reference → verify understanding
Layer evidence:
- What does the code do? (direct observation)
- Why was it written this way? (history, comments)
- How does it fit the system? (architecture, dependencies)
- What are the edge cases? (tests, error handling)
Follow the trail:
- Function calls → trace execution paths
- Imports/exports → map dependencies
- Test files → understand usage patterns
- Error messages → reveal assumptions
- Comments → capture historical context
</evidence_gathering>
<output_format>
During Investigation
After each evidence-gathering step emit:
- Confidence: {BAR} {NAME}
- Found: { key discoveries }
- Patterns: { emerging themes }
- Gaps: { what's still unclear }
- Next: { investigation direction }
At Delivery (Level 5)
Findings
{ numbered list of discoveries with supporting evidence }
- {FINDING} — evidence: {SOURCE}
- {FINDING} — evidence: {SOURCE}
Patterns
{ recurring themes or structures identified }
Implications
{ what findings mean for the question at hand }
Confidence Assessment
Overall: {BAR} {PERCENTAGE}%
High confidence areas:
- {AREA} — {REASON}
Lower confidence areas:
- {AREA} — {REASON}
Supporting Evidence
- Code: { file paths and line ranges }
- Docs: { references }
- Tests: { relevant test files }
- History: { commit SHAs if relevant }
- External: { URLs if applicable }
Below Level 5
△ Caveats
Assumptions:
- {ASSUMPTION} — { why necessary, impact if wrong }
Gaps:
- {GAP} — { what's missing, how to fill }
Unknowns:
- {UNKNOWN} — { noted for future investigation }
</output_format>
<specialized_techniques>
Load micro-skills for specialized analysis:
- Pattern analysis → load pattern-analysis skill
- Root cause investigation → load root-cause-analysis skill
- Research synthesis → load report-findings skill
- Architecture analysis → see architecture-analysis.md
These provide deep-dive methodologies for specific analysis types.
</specialized_techniques>
Loop: Gather → Analyze → Update Confidence → Next step
- Calibrate starting confidence — what do we already know?
- Identify evidence sources — where can we look?
- Gather systematically — collect from multiple angles
- Cross-reference findings — verify patterns hold
- Flag uncertainties — mark gaps with △
- Synthesize conclusions — connect evidence to insights
- Deliver with confidence level — clear about certainty
At each step:
- Document what you found (evidence)
- Note what it means (interpretation)
- Track what's still unclear (gaps)
- Update confidence bar
Before concluding (level 4+):
Check evidence quality:
- ✓ Multiple sources confirm pattern?
- ✓ Direct observation vs inference clearly marked?
- ✓ Assumptions explicitly flagged?
- ✓ Counter-examples considered?
Check completeness:
- ✓ Original question fully addressed?
- ✓ Edge cases explored?
- ✓ Alternative explanations ruled out?
- ✓ Known unknowns documented?
Check deliverable:
- ✓ Findings supported by evidence?
- ✓ Confidence calibrated honestly?
- ✓ Caveats section included if <100%?
- ✓ Next steps clear if incomplete?
ALWAYS:
- Investigate before concluding
- Cite evidence sources with file paths/URLs
- Use confidence bars to track certainty
- Flag assumptions and gaps with △
- Cross-reference from multiple angles
- Document investigation trail
- Distinguish fact from inference
- Include caveats below level 5
NEVER:
- Guess when you can investigate
- State assumptions as facts
- Conclude from single source
- Hide uncertainty or gaps
- Skip validation checks
- Deliver without confidence assessment
- Conflate evidence with interpretation
Core methodology:
- confidence.md — confidence calibration (shared with pathfinding)
- FORMATTING.md — formatting conventions
Micro-skills (load as needed):
- pattern-analysis — extracting and validating patterns
- root-cause-analysis — systematic problem diagnosis
- report-findings — multi-source research synthesis
Local references:
- architecture-analysis.md — system structure mapping
Related skills:
- pathfinding — clarifying requirements before analysis
- debugging-and-diagnosis — structured bug investigation (loads root-cause-analysis)