systematic-debugging

Systematic Debugging

Core Principle

NO FIXES WITHOUT ROOT CAUSE INVESTIGATION FIRST.

Never apply symptom-focused patches that mask underlying problems. Understand WHY something fails before attempting to fix it.

The Four-Phase Framework

Phase 1: Root Cause Investigation

Before touching any code:

Read error messages thoroughly - Every word matters
Reproduce the issue consistently - If you can't reproduce it, you can't verify a fix
Examine recent changes - What changed before this started failing?
Gather diagnostic evidence - Logs, stack traces, state dumps
Trace data flow - Follow the call chain to find where bad values originate

Root Cause Tracing Technique:

Observe the symptom - Where does the error manifest?
Find immediate cause - Which code directly produces the error?
Ask "What called this?" - Map the call chain upward
Keep tracing up - Follow invalid data backward through the stack
Find original trigger - Where did the problem actually start?

Key principle: Never fix a problem only at the point where the error appears; always trace back to the original trigger.
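The tracing steps above can be sketched in Python as follows. This is a minimal illustration, not a prescribed tool: the functions `handle_order`, `price_order`, `line_total`, and `call_chain` are hypothetical names invented for the example. The error surfaces in `line_total`, but walking the traceback upward shows that the bad value entered in `handle_order`.

```python
import traceback


def handle_order(raw):
    # Original trigger: "quantity" is never validated, so a missing key
    # becomes None and is passed down the call chain.
    return price_order({"quantity": raw.get("quantity")})


def price_order(order):
    # Intermediate caller: forwards the value without inspecting it.
    return line_total(order["quantity"], unit_price=5)


def line_total(quantity, unit_price):
    # Symptom: the TypeError is raised here, far from where None originated.
    return quantity * unit_price


def call_chain(exc):
    """Return function names from the outermost caller to the frame that raised."""
    return [frame.name for frame in traceback.extract_tb(exc.__traceback__)]


try:
    handle_order({})  # "quantity" is missing from the input
except TypeError as exc:
    # Reading the chain in reverse answers "What called this?" at each step.
    print(" <- ".join(reversed(call_chain(exc))))
```

Patching `line_total` to tolerate `None` would silence the symptom; the chain points at `handle_order` as the place where the missing value should have been caught.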

Phase 2: Pattern Analysis

Locate working examples - Find similar code that works correctly
Compare implementations completely - Don't just skim
Identify differences - What's different between working and broken?
Understand dependencies - What does this code depend on?
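One way to make the comparison complete rather than a skim is to diff the two cases mechanically. The sketch below assumes the working and broken cases can each be described as a flat dictionary of settings; `diff_settings` and the sample keys are hypothetical.

```python
MISSING = "<missing>"  # marker for a key that is absent on one side


def diff_settings(working, broken):
    """Return {key: (working_value, broken_value)} for every key that differs."""
    differences = {}
    for key in sorted(set(working) | set(broken)):
        left = working.get(key, MISSING)
        right = broken.get(key, MISSING)
        if left != right:
            differences[key] = (left, right)
    return differences


working = {"timeout": 30, "retries": 3, "cache": True}
broken = {"timeout": 30, "retries": 0, "region": "eu"}

for key, (good, bad) in diff_settings(working, broken).items():
    print(f"{key}: working={good!r} broken={bad!r}")
```

Every reported key is a candidate difference to examine; keys that match on both sides can be set aside.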

Phase 3: Hypothesis and Testing

Apply the scientific method:

Formulate ONE clear hypothesis - "The error occurs because X"
Design minimal test - Change ONE variable at a time
Predict the outcome - What should happen if hypothesis is correct?
Run the test - Execute and observe
Verify results - Did it behave as predicted?
Iterate or proceed - Refine hypothesis if wrong, implement if right
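The "change ONE variable at a time" step can be sketched as a small harness. This assumes a known-good baseline configuration, a suspect configuration, and a `passes` check supplied by the caller; all names here are hypothetical.

```python
def isolate_cause(baseline, suspect, passes):
    """Apply each differing setting to the baseline alone; return those that break it."""
    if not passes(baseline):
        raise ValueError("baseline must pass before it can serve as a control")
    culprits = []
    for key, value in suspect.items():
        if baseline.get(key) == value:
            continue  # not a difference, nothing to test
        trial = {**baseline, key: value}  # exactly one variable changed
        if not passes(trial):
            culprits.append(key)
    return culprits


def pool_is_usable(config):
    # Stand-in for a real check such as running one targeted test.
    return config["pool_size"] > 0


baseline = {"pool_size": 4, "timeout": 30}
suspect = {"pool_size": 0, "timeout": 10}
print(isolate_cause(baseline, suspect, pool_is_usable))
```

Because each trial changes a single setting, this sketch will not detect failures that only appear when two settings change together; an empty result is itself evidence that the hypothesis needs refining.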

Phase 4: Implementation

Create failing test case - Captures the bug behavior
Implement single fix - Address root cause, not symptoms
Verify test passes - Confirms fix works
Run full test suite - Ensure no regressions
If fix fails, STOP - Re-evaluate hypothesis

Critical rule: If THREE or more fixes fail consecutively, STOP. This signals architectural problems requiring discussion, not more patches.
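As an illustration of the test-first sequence, the sketch below uses Python's `unittest`. The function `parse_price` and the bug it describes are hypothetical: assume the earlier version called `float(text)` directly and raised `ValueError` on thousands separators.

```python
import unittest


def parse_price(text):
    """Parse a price string such as "1,299.50" into a float.

    Fix at the root cause: strip thousands separators before converting,
    rather than catching ValueError at each call site.
    """
    return float(text.replace(",", ""))


class ParsePriceRegressionTest(unittest.TestCase):
    def test_thousands_separator(self):
        # Written first, while the bug was present, so it failed before the fix.
        self.assertEqual(parse_price("1,299.50"), 1299.50)

    def test_plain_number_still_works(self):
        # Guards against the fix breaking the case that already worked.
        self.assertEqual(parse_price("42"), 42.0)
```

Running `python -m unittest` against the module executes both tests; the first one stays in the suite as a regression guard after the fix lands.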

Red Flags - Process Violations

Stop immediately if you catch yourself thinking:

"Quick fix for now, investigate later"
"One more fix attempt" (after multiple failures)
"This should work" (without understanding why)
"Let me just try..." (without hypothesis)
"It works on my machine" (without investigating difference)

Warning Signs of Deeper Problems

When consecutive fixes keep revealing new problems in different areas, the issue is likely architectural:

Stop patching
Document what you've found
Discuss with team before proceeding
Consider if the design needs rethinking

Common Debugging Scenarios

Test Failures

Read the FULL error message and stack trace
Identify which assertion failed and why
Check test setup - is the test environment correct?
Check test data - are mocks/fixtures correct?
Trace to the source of unexpected value

Runtime Errors

Capture the full stack trace
Identify the line that throws
Check what values are undefined/null
Trace backward to find where bad value originated
Add validation at the source
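A minimal sketch of "validation at the source", assuming the bad value enters through a dictionary payload. The `User` type, `parse_user`, and the field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    user_id: int
    email: str


def parse_user(payload):
    """Validate where data enters, so a bad value fails here with a clear message."""
    user_id = payload.get("id")
    email = payload.get("email")
    # bool is a subclass of int in Python, so it is rejected explicitly.
    if not isinstance(user_id, int) or isinstance(user_id, bool):
        raise ValueError(f"id must be an integer, got {user_id!r}")
    if not isinstance(email, str) or "@" not in email:
        raise ValueError(f"email must contain '@', got {email!r}")
    return User(user_id=user_id, email=email)


print(parse_user({"id": 7, "email": "dev@example.com"}))
```

Code downstream of `parse_user` can then rely on a well-formed `User` instead of re-checking for missing values at every point where a symptom might surface.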

"It worked before"

Use git bisect to find the breaking commit
Compare the change with previous working version
Identify what assumption changed
Fix at the source of the assumption violation
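`git bisect run` can automate the search when a script reports whether each commit is good or bad through its exit code: 0 marks the commit good, 125 tells git to skip it, and other codes from 1 to 127 mark it bad. The sketch below is one possible check script; the `src` directory, the `tests/test_checkout.py` path, and the use of pytest are assumptions made for illustration.

```python
import subprocess
import sys

GOOD, BAD, SKIP = 0, 1, 125  # exit codes interpreted by `git bisect run`


def classify(build_ok, test_ok):
    """Map build and test outcomes to a bisect exit code."""
    if not build_ok:
        return SKIP  # this commit cannot be judged, so let git try another
    return GOOD if test_ok else BAD


def command_succeeds(argv):
    """Run a command and report whether it exited with status 0."""
    return subprocess.run(argv, capture_output=True).returncode == 0


def main():
    # Hypothetical project layout: sources in src/, one targeted test file.
    build_ok = command_succeeds([sys.executable, "-m", "compileall", "-q", "src"])
    test_ok = build_ok and command_succeeds(
        [sys.executable, "-m", "pytest", "-q", "tests/test_checkout.py"]
    )
    return classify(build_ok, test_ok)
```

Saved as a script (for example under the hypothetical name `bisect_check.py`) with `sys.exit(main())` as its last line, it could be driven by `git bisect start <bad> <good>` followed by `git bisect run python bisect_check.py`. Keeping the script outside the tracked tree avoids it changing as git checks out older commits.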

Intermittent Failures

Look for race conditions
Check for shared mutable state
Examine async operation ordering
Look for timing dependencies
Replace fixed sleeps with deterministic waits (wait on a condition or event) or proper synchronization
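A small sketch of the last two points, using Python's `threading` module. The `Counter` class and `run_workers` helper are hypothetical; the point is that shared mutable state is guarded by a lock, and that threads wait on explicit signals rather than on timing.

```python
import threading


class Counter:
    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # "value += 1" is a read-modify-write; the lock prevents lost updates.
        with self._lock:
            self.value += 1


def run_workers(counter, workers=8, increments=1000):
    ready = threading.Event()

    def work():
        ready.wait()  # block until signalled instead of sleeping for a guess
        for _ in range(increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for thread in threads:
        thread.start()
    ready.set()  # release all workers together
    for thread in threads:
        thread.join()  # deterministic wait for completion
    return counter.value


print(run_workers(Counter(), workers=4, increments=500))
```

With the lock in place the final count equals `workers * increments` on every run; without it, updates can be lost under contention, which is the kind of failure that appears only intermittently.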

Debugging Checklist

Before claiming a bug is fixed:

Root cause identified and documented
Hypothesis formed and tested
Fix addresses root cause, not symptoms
Failing test created that reproduces bug
Test now passes with fix
Full test suite passes
No "quick fix" rationalization used
Fix is minimal and focused

Success Metrics

Systematic debugging achieves a ~95% first-time fix rate, versus ~40% with ad-hoc approaches.

Signs you're doing it right:

Fixes don't create new bugs
You can explain WHY the bug occurred
Similar bugs don't recur
Code is better after the fix, not just "working"

Integration with Other Skills

testing-patterns: Create a test that reproduces the bug before fixing it

