investigate

Systematically investigate bugs, test failures, build errors, performance issues, or unexpected behavior by cycling through characterize-isolate-hypothesize-test steps. Use when the user asks to "investigate this bug", "debug this", "figure out why this fails", "find the root cause", "why is this broken", "troubleshoot this", "diagnose the issue", "what's causing this error", "look into this failure", "why is this test failing", or "track down this bug".

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Install the "investigate" skill by sending this command to your AI assistant:

npx skills add tobihagemann/turbo/tobihagemann-turbo-investigate

Investigate

Systematic methodology for finding the root cause of bugs, failures, and unexpected behavior. Cycle through characterize-isolate-hypothesize-test steps, with oracle escalation for hard problems. Diagnose the root cause — do not apply fixes. Return results for the main agent to act on.

Optional: $ARGUMENTS contains the problem description or error message.

Step 1: Characterize

Gather the symptom and establish what is actually happening:

  1. Collect evidence — error message, stack trace, test output, log entries, or user description of unexpected behavior
  2. Classify the problem type:
Signal                                      Type
Stack trace / exception                     Runtime error
Test assertion failure                      Test failure
Compilation / bundler / build error         Build failure
Type checker error (tsc, mypy, pyright)     Type error
Slow response / high CPU / memory growth    Performance
"It does X instead of Y" / no error         Unexpected behavior
  3. Establish reproduction — run the failing command, test, or operation. If the problem cannot be reproduced (intermittent, environment-specific), document the constraints and proceed with historical evidence.

Record the exact reproduction command and its output for verification.
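As a sketch of what "record the reproduction" can look like in a shell session — the inline failing command here is a stand-in; substitute the real failing test or build command:

```shell
# Sketch: capture the exact reproduction command, its output, and its exit code.
# The inline python3 assertion is a stand-in for the real failing command.
repro_cmd='python3 -c "assert 1 + 1 == 3"'
bash -c "$repro_cmd" > /tmp/repro.log 2>&1
echo "exit code: $?" >> /tmp/repro.log
cat /tmp/repro.log
```

Keeping the log alongside the exact command string lets the later verification step re-run the same reproduction and diff the output.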

Step 2: Isolate

Narrow from "something is wrong" to "the problem is in this area." Read references/problem-type-playbooks.md for type-specific first moves and tool sequences.

Git Archeology

For all problem types, check what changed recently near the failure point:

git log --oneline -20 -- <file>
git blame -L <start>,<end> <file>

If a known-good state exists (e.g., "this worked yesterday"), consider git bisect to pinpoint the breaking commit.

Scope Narrowing

  • Stack traces: Read the throwing function and its callers — full functions, not just the flagged line
  • Test failures: Read both the test and the system under test
  • Build errors: Read the config file and the referenced source
  • Unexpected behavior: Trace the data flow from input to the unexpected output
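The narrowing moves above often start with a project-wide search for the suspect symbol. A throwaway sketch — the files and the symbol `parse_config` are hypothetical stand-ins:

```shell
# Sketch: locate every producer and consumer of a suspect symbol.
# The files and the symbol "parse_config" are hypothetical stand-ins.
demo=$(mktemp -d)
printf 'def parse_config(path):\n    return {}\n' > "$demo/config.py"
printf 'from config import parse_config\ncfg = parse_config("app.ini")\n' > "$demo/main.py"
grep -rn "parse_config" "$demo"   # definition, import, and call site
```

Reading every hit — the definition plus all call sites — is what turns "something is wrong" into "the problem is in this area."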

Step 3: Hypothesize

Generate 2-4 hypotheses ranked by likelihood. Each hypothesis must be falsifiable — specify what evidence would confirm or refute it.

Format:

H1 (most likely): [description] — confirmed if [X], refuted if [Y]
H2: [description] — confirmed if [X], refuted if [Y]
H3: [description] — confirmed if [X], refuted if [Y]

Parallel Investigation

For complex problems with 3+ hypotheses and a non-obvious root cause, spawn parallel background investigators.

Spawn condition: 3+ hypotheses AND the problem is not a simple typo, missing import, or syntax error.

Skip when 1-2 hypotheses are obvious (e.g., stack trace points directly to the bug).

Launch in parallel (model: "opus", run_in_background: true):

  1. One subagent per hypothesis — each receives the hypothesis, relevant file paths, what evidence to look for, and instructions to report confirmed / refuted / inconclusive with evidence. Budget: max 5 tool calls per subagent.
  2. Codex exec (read-only) — run the /codex skill in exec mode with a focused prompt describing the problem, reproduction, and files examined. Provides an independent perspective that may spot patterns the hypothesis-driven subagents miss. Run the /evaluate-findings skill on its output.

After all investigators complete, merge results. Codex findings that overlap with a subagent's confirmed hypothesis reinforce confidence. Novel codex findings become additional hypotheses to test in Step 4.

Step 4: Test

Verify each hypothesis with minimal, targeted actions:

Action Type                  Tool
Find usage or pattern        Grep
Read surrounding code        Read
Check recent changes         Bash (git log, git blame, git diff)
Run isolated test            Bash (specific test command)
Check dependency version     Bash (npm ls, pip3 show, etc.)
Inspect runtime state        Bash (add temporary logging, run, check output)
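The "inspect runtime state" action can be sketched as a temporary print probe — `normalize()` and its input are invented for illustration, and the probe line is deleted once the hypothesis is settled:

```shell
# Sketch: inspect runtime state with a temporary print probe, then remove it.
# normalize() and its input are invented for illustration.
cat > /tmp/probe.py <<'EOF'
def normalize(value):
    print(f"DEBUG normalize input={value!r}")  # temporary probe; delete after use
    return value.strip().lower()

print(normalize("  Hello "))
EOF
python3 /tmp/probe.py
```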

Record each result:

Hypothesis    Verdict                               Evidence
H1            confirmed / refuted / inconclusive    [what was found]
H2            confirmed / refuted / inconclusive    [what was found]

Iteration

If all hypotheses are refuted or inconclusive:

  1. Document what was learned — each refuted hypothesis eliminates a possibility and narrows the search
  2. Return to Step 2 with the new information to re-isolate
  3. Generate new hypotheses in Step 3 based on updated understanding

Cycle budget: maximum 2 full cycles (hypothesize → test → learn → repeat) before escalating.

Escalation

After 2 failed hypothesis cycles, offer escalation to /oracle via AskUserQuestion:

Investigation stalled after [N] hypothesis cycles.

Tested: [summary of hypotheses and evidence]
Remaining unknowns: [what is still unclear]

Escalate to Oracle? (consults external model with full context)

Proceed only if the user approves.

Investigation Report

Present results using AskUserQuestion:

Investigation Report:

Problem: [one-line description]
Type: [runtime error | test failure | build failure | type error | performance | unexpected behavior]
Root cause: [confirmed cause, or "unresolved" with best hypothesis]

Evidence:
- [what confirmed the root cause]

Suggested fix: [description of what to change, or "needs further investigation"]
Reproduction command: [command to verify the fix once applied]

Hypotheses tested:
1. [hypothesis] — [confirmed/refuted/inconclusive] — [evidence]
2. [hypothesis] — [confirmed/refuted/inconclusive] — [evidence]

Escalation: [none | oracle]

Rules

  • If the problem turns out to be environmental (wrong Node version, missing dependency, OS-specific), report that clearly — it may not require a code fix.
  • If the problem is in a dependency (not the project's code), document the dependency issue and suggest workaround options rather than patching the dependency.

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals (all in the Coding category; no summaries provided by the upstream source):

  • find-dead-code
  • codex
  • code-style
  • simplify-code