Bug Investigation - Scientific Method

Apply scientific methodology to investigate and resolve software bugs systematically.

Scientific Method Process

Observe - Gather data about the problem
Hypothesize - Form testable explanations (ranked by likelihood)
Experiment - Test hypotheses with controlled changes
Analyze - Interpret results objectively
Conclude - Identify root cause and validate fix

Core Principles

Context first - Understand the project before investigating
Hypothesis-driven - Never jump to solutions without forming testable hypotheses
Isolate variables - Change one thing at a time
Reproduce reliably - You can't confirm a fix for a bug you can't reproduce
Root causes over symptoms - Dig deeper than surface fixes
Validate rigorously - Confirm fix resolves issue without regressions

Investigation Workflow

Phase 1: Project Context (2-5 min)

Discover before investigating:

Language & version (Python 3.11, Java 17, Go 1.21, etc.)
Build system (Gradle, npm, Cargo, Make, etc.)
Key dependencies & frameworks
Architecture pattern (MVC, microservices, etc.)
Testing setup

Quick discovery:

# Find dependency manifests (these reveal the package manager and build system)
view package.json / requirements.txt / Cargo.toml / pom.xml

# Check config files
view .env / config.yml / settings.py

# Identify entry points
view main.* / index.* / app.*

Output: One-line context

Python 3.11, Flask API, PostgreSQL, pytest, Docker
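
The quick discovery step can also be scripted. The following is a minimal sketch, assuming the manifest files sit directly in the project root; the `MANIFESTS` mapping and the `discover_context` helper are hypothetical names introduced for illustration, and the list of filenames is not exhaustive.

```python
from pathlib import Path

# Hypothetical mapping from manifest filename to ecosystem; extend as needed.
MANIFESTS = {
    "package.json": "Node.js (npm)",
    "requirements.txt": "Python (pip)",
    "pyproject.toml": "Python",
    "Cargo.toml": "Rust (Cargo)",
    "pom.xml": "Java (Maven)",
    "build.gradle": "Java/Kotlin (Gradle)",
    "go.mod": "Go",
    "Makefile": "Make",
}


def discover_context(root: str = ".") -> list[str]:
    """Return the ecosystems whose manifest files exist directly under root."""
    base = Path(root)
    return [label for name, label in MANIFESTS.items() if (base / name).is_file()]


if __name__ == "__main__":
    found = discover_context()
    print(", ".join(found) if found else "no known manifest found")
```

The result is only a starting point for the one-line context; language version, framework, and test setup still need a look at the files themselves.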

Phase 2: Problem Definition

Gather:

Error messages (full text, codes)
Stack traces / logs
Steps to reproduce
Expected vs actual behavior
Environment (OS, version, config)
Reproducibility (always/sometimes/rare)

Document:

Bug: [Short description]
Reproduces: [Always/Sometimes/Unable]
Error: [Key error message]
Steps:
1. [Action 1]
2. [Action 2]
3. [Failure occurs]
Expected: [What should happen]
Actual: [What happens]
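
The "Reproduces" field is easier to fill in honestly when it is measured rather than guessed. A minimal sketch, assuming the reproduction steps can be wrapped in a callable that returns True when the bug occurs; `classify_reproducibility` and its `runs` parameter are hypothetical names for this example.

```python
from typing import Callable


def classify_reproducibility(repro: Callable[[], bool], runs: int = 20) -> str:
    """Run repro() `runs` times; repro must return True when the bug occurs."""
    failures = sum(1 for _ in range(runs) if repro())
    if failures == runs:
        return "Always"
    if failures == 0:
        return "Unable"
    return f"Sometimes ({failures}/{runs})"


# Usage with a stand-in reproduction that fails on two of three attempts.
outcomes = iter([True, False, True])
print(classify_reproducibility(lambda: next(outcomes), runs=3))  # Sometimes (2/3)
```

A result of "Unable" after a small number of runs does not prove the bug is gone; rare failures may need more runs or a closer match to the original environment.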

Phase 3: Hypotheses (Ranked)

Form 2-4 testable hypotheses, ranked by likelihood.

H1: [Most likely cause]

Evidence for: [Why this is likely]
Test: [How to prove/disprove]
If true: [Expected result]
If false: [Expected result]

H2: [Alternative cause]

Evidence for: [Supporting observations]
Test: [Falsifiable experiment]

Common categories:

Logic errors (off-by-one, wrong operator, incorrect condition)
State issues (race condition, uninitialized, stale data)
Type/data (null/nil, type mismatch, parsing error)
Concurrency (data race, deadlock, thread safety)
Integration (API mismatch, version incompatibility)
Environment (config, platform-specific, resource limits)

Phase 4: Experiment

For each hypothesis:

Test H1: [Hypothesis name]

Change: [One variable to modify]
Measure: [What to observe]
Method: [Specific steps]
Result: [Actual outcome]
Conclusion: [Validated/Invalidated]

Techniques:

Add logging at key points
Use debugger breakpoints
Binary search (remove half the code)
Minimal reproduction (strip to essentials)
Diff working vs broken states
Isolate components
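
One way to combine "diff working vs broken states" with changing a single variable per experiment is sketched below. It assumes the working and broken states can be expressed as flat configuration dictionaries and that a `fails` callable runs the experiment; `isolate_variable` and the sample configuration keys are hypothetical.

```python
from typing import Any, Callable


def isolate_variable(
    good: dict[str, Any],
    bad: dict[str, Any],
    fails: Callable[[dict[str, Any]], bool],
) -> list[str]:
    """Return keys that trigger the failure when changed alone from good to bad."""
    culprits = []
    for key in bad:
        if good.get(key) == bad[key]:
            continue  # value is the same in both states, nothing to test
        trial = {**good, key: bad[key]}  # change exactly one variable
        if fails(trial):
            culprits.append(key)
    return culprits


# Usage: two settings differ, but only one of them causes the failure.
good_cfg = {"timeout": 30, "workers": 4, "cache": True}
bad_cfg = {"timeout": 30, "workers": 16, "cache": False}
print(isolate_variable(good_cfg, bad_cfg, lambda cfg: cfg["workers"] > 8))  # ['workers']
```

If no single key reproduces the failure, the cause may be an interaction between several changes, which calls for a new hypothesis rather than more of the same experiment.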

Phase 5: Root Cause

Identified: [Clear statement of actual cause]

Evidence: [Chain from observation → hypothesis → validation]

Why it occurred:

Immediate: [Technical reason]
Contributing: [What enabled this]
Systemic: [Deeper issue if any]

Phase 6: Solution & Validation

Fix: [Specific changes to make]

Why this works: [Explain causal connection]

Validation:

Reproduce bug (confirm failure)
Apply fix
Retest (confirm success)
Test edge cases
Run test suite (no regressions)
Add test for this bug

Prevention:

[Test to add]
[Assertion to include]
[Pattern to avoid]
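
As an illustration of the "Add test for this bug" step, here is a minimal sketch of a regression test written as plain assert-based functions that pytest can collect. The `page_count` function and the bug it once had are hypothetical, invented for this example.

```python
def page_count(total_items: int, page_size: int) -> int:
    """Hypothetical fixed function: number of pages needed for total_items.

    The imagined buggy version returned total_items // page_size,
    which dropped the final partial page.
    """
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    return (total_items + page_size - 1) // page_size  # ceiling division


def test_partial_last_page_is_counted():
    # The exact input that reproduced the (hypothetical) bug.
    assert page_count(11, 10) == 2


def test_edge_cases():
    assert page_count(0, 10) == 0
    assert page_count(1, 10) == 1
    assert page_count(10, 10) == 1
```

The first test pins the original failing input; the second covers the boundaries around it so a future change cannot reintroduce the same class of error unnoticed.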

Investigation Techniques

Binary Search: Remove or disable half the suspect code (or half the commit range) and test whether the bug persists. Repeat on the half that still fails until the cause is isolated.
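
The same halving idea applies to an ordered history of revisions, which is what `git bisect` automates for commits. A minimal sketch, assuming the first revision is known good, the last is known bad, and every revision after the first bad one is also bad; `first_bad` and the revision labels are hypothetical.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def first_bad(revisions: Sequence[T], is_bad: Callable[[T], bool]) -> T:
    """Return the earliest bad revision.

    Assumes revisions are ordered oldest to newest, the first is good,
    the last is bad, and once a revision is bad all later ones are too.
    """
    lo, hi = 0, len(revisions) - 1  # invariant: revisions[lo] good, revisions[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(revisions[mid]):
            hi = mid
        else:
            lo = mid
    return revisions[hi]


# Usage: pretend the bug was introduced in r6.
revisions = ["r1", "r2", "r3", "r4", "r5", "r6", "r7", "r8"]
print(first_bad(revisions, lambda r: r >= "r6"))  # r6
```

Each test halves the remaining range, so eight revisions need only three checks here.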

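Minimal Reproduction: Strip the failing case down to the smallest program that still reproduces the issue (ideally under 50 lines). This removes noise and narrows where the cause can hide.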
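
Stripping a reproduction can be partly automated. The sketch below is a simple greedy reducer, a much simplified relative of delta debugging rather than a full implementation; it assumes the input can be treated as a list of lines and that a `still_fails` callable reports whether a candidate still triggers the bug. Both names are hypothetical.

```python
from typing import Callable


def minimize(lines: list[str], still_fails: Callable[[list[str]], bool]) -> list[str]:
    """Greedily drop lines one at a time while the failure still reproduces."""
    if not still_fails(lines):
        raise ValueError("start from an input that reproduces the bug")
    current = list(lines)
    i = 0
    while i < len(current):
        candidate = current[:i] + current[i + 1:]
        if still_fails(candidate):
            current = candidate  # line i was noise; retry the same index
        else:
            i += 1  # line i is needed to reproduce; keep it
    return current


# Usage: the failure only needs the zero divisor and the division itself.
script = ["import a", "x = 1", "y = 0", "print(x)", "z = x / y"]
needs = lambda ls: "y = 0" in ls and "z = x / y" in ls
print(minimize(script, needs))  # ['y = 0', 'z = x / y']
```

A single pass like this is not guaranteed to find the smallest possible input, but it usually removes enough noise to make the remaining lines readable.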

Differential Testing: Compare working vs broken (commits, versions, configs).

Strategic Logging: Add prints at key decision points to trace execution flow.
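
In Python, the standard `logging` module is often a better fit than bare prints because the output can be switched off by level once the investigation ends. A minimal sketch follows; `apply_discount` and its coupon rules are hypothetical and exist only to show one log line per decision point.

```python
import logging
from typing import Optional

logging.basicConfig(level=logging.DEBUG, format="%(levelname)s %(funcName)s: %(message)s")
log = logging.getLogger(__name__)


def apply_discount(price: float, coupon: Optional[str]) -> float:
    """Hypothetical function instrumented at each decision point."""
    log.debug("input price=%r coupon=%r", price, coupon)
    if coupon is None:
        log.debug("branch: no coupon")
        return price
    if coupon == "HALF":
        log.debug("branch: HALF coupon")
        return price * 0.5
    log.debug("branch: unknown coupon, no discount applied")
    return price


apply_discount(100.0, "half")  # the log shows the unknown-coupon branch was taken
```

Logging the inputs and the branch taken, rather than only the result, is what makes the trace useful: here it would show that a lowercase coupon fell through to the default branch.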

Rubber Duck: Explain code line-by-line aloud. Often reveals logic errors.

Common Bug Patterns (Language-Agnostic)

Logic Errors:

Off-by-one: i < n vs i <= n
Wrong operator: && vs ||, == vs =
Negation errors: flipped !condition logic, e.g. !(a && b) written as !a && !b
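
The off-by-one pattern above, shown concretely. This is an illustrative sketch with hypothetical function names; the only difference between the two versions is the loop condition.

```python
def sum_first_n_buggy(values: list[int], n: int) -> int:
    total = 0
    i = 0
    while i <= n:  # bug: visits n + 1 elements
        total += values[i]
        i += 1
    return total


def sum_first_n(values: list[int], n: int) -> int:
    total = 0
    i = 0
    while i < n:  # fixed: visits exactly n elements
        total += values[i]
        i += 1
    return total


data = [10, 20, 30, 40]
print(sum_first_n_buggy(data, 2))  # 60: includes data[2] by mistake
print(sum_first_n(data, 2))        # 30
```

Note that the buggy version returns a plausible but wrong number for small n and only raises IndexError when n equals the list length, which is why boundary inputs belong in the test cases.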

State Issues:

Race conditions: concurrent access without synchronization
Uninitialized: using variable before setting value
Stale state: using outdated cached data

Type/Data:

Null/nil dereference
Type coercion errors
Integer overflow
Floating-point precision
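
For the floating-point item, a short demonstration in Python: exact equality on binary floats can fail for values that look equal in decimal, and a tolerance-based comparison such as `math.isclose` is the usual remedy. The `totals_match` helper is a hypothetical name for this sketch.

```python
import math


def totals_match(a: float, b: float) -> bool:
    """Compare two floats with a relative tolerance instead of ==."""
    return math.isclose(a, b, rel_tol=1e-9)


total = 0.1 + 0.2
print(total == 0.3)              # False
print(totals_match(total, 0.3))  # True
```

For money or other values that must be exact in decimal, an exact type such as `decimal.Decimal` or integer minor units is generally a safer choice than a tolerance.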

Concurrency:

Deadlock: mutual waiting for locks
Data race: unsynchronized shared access
Thread safety: non-thread-safe code on multiple threads
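
A data race and its usual fix can be sketched with Python's `threading` module. The `Counter` class and `run_workers` helper are hypothetical; the point is that the read-modify-write in `increment` is guarded by a lock so that no update can be lost.

```python
import threading


class Counter:
    """Shared counter; the lock makes the read-modify-write step atomic."""

    def __init__(self) -> None:
        self.value = 0
        self._lock = threading.Lock()

    def increment(self) -> None:
        with self._lock:  # without this, concurrent updates may be lost
            self.value += 1


def run_workers(counter: Counter, threads: int = 8, per_thread: int = 10_000) -> int:
    """Increment the counter from several threads and return the final value."""

    def work() -> None:
        for _ in range(per_thread):
            counter.increment()

    workers = [threading.Thread(target=work) for _ in range(threads)]
    for t in workers:
        t.start()
    for t in workers:
        t.join()
    return counter.value


print(run_workers(Counter()))  # 80000
```

Removing the lock will not necessarily produce a wrong total on every run, which is exactly why races tend to show up as intermittent failures.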

Integration:

API contract mismatch
Version incompatibility
Missing dependencies
Incorrect configuration

Output Format

# Bug Investigation: [Name]

## Context
[One-line: Language, framework, architecture]

## Problem
Error: [Message]
Reproduces: [Always/Sometimes]
Steps: [1,2,3]

## Hypotheses
H1: [Most likely] - Test by [method]
H2: [Alternative] - Test by [method]

## Investigation
Tested H1: [Result - validated/invalidated]
[If needed] Tested H2: [Result]

## Root Cause
[One sentence explanation]
Evidence: [What confirmed it]

## Solution
Fix: [Specific change]
Why it works: [Explanation]
Validated: [Tested successfully]

## Prevention
- Test added: [Description]
- Warning signs: [What to watch for]

Quick Decision Tree

Symptom → Likely Category:

Intermittent failure → Concurrency/state
Always fails same way → Logic error
Null/nil crash → Type/data
Specific environment only → Configuration
Performance degradation → Resource/algorithm
After dependency update → Integration
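
The mapping above can be kept as a small lookup table, for example to pre-label incoming bug reports. This is a rough sketch: the keyword matching is deliberately crude, and `DECISION_TABLE` and `likely_category` are hypothetical names. The result is a first hypothesis category to test, not a diagnosis.

```python
# Symptom keywords mapped to the likely category, mirroring the list above.
DECISION_TABLE = {
    "intermittent": "Concurrency/state",
    "always fails": "Logic error",
    "null": "Type/data",
    "nil": "Type/data",
    "environment": "Configuration",
    "performance": "Resource/algorithm",
    "dependency update": "Integration",
}


def likely_category(symptom: str) -> str:
    """Return the first category whose keyword appears in the symptom text."""
    text = symptom.lower()
    for keyword, category in DECISION_TABLE.items():
        if keyword in text:
            return category
    return "Unknown: gather more data"


print(likely_category("Intermittent failure in CI"))     # Concurrency/state
print(likely_category("Crash after dependency update"))  # Integration
```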

Critical Reminders

Start with context discovery
Form hypotheses before coding
Change ONE variable at a time
Reproduce before fixing
Validate fix rigorously
Add regression test
Document root cause
