Debugging Methodology

Systematic approach to finding and fixing bugs.

Core Principles

Occam's Razor - Start with the simplest explanation
Binary Search - Isolate the problem area systematically
Preserve Evidence - Understand state before making changes
Document Hypotheses - Track what was tried and what didn't work
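The binary search principle can be sketched in a few lines. This is a minimal illustration, assuming an ordered history in which every entry after the first bad one is also bad; `first_bad`, `revisions`, and the "bug introduced in revision 6" scenario are hypothetical names and data, not part of any tool.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def first_bad(items: Sequence[T], is_bad: Callable[[T], bool]) -> int:
    """Return the index of the first item for which is_bad(item) is true.

    Assumes the last item is bad and that badness is monotonic:
    once an item is bad, every later item is bad too.
    """
    lo, hi = 0, len(items) - 1  # invariant: items[hi] is bad
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(items[mid]):
            hi = mid        # first bad item is at mid or earlier
        else:
            lo = mid + 1    # first bad item is after mid
    return lo


# Hypothetical history: revisions 0..9, bug introduced in revision 6.
revisions = list(range(10))
culprit = first_bad(revisions, lambda rev: rev >= 6)
print(culprit)  # 6
```

Each probe halves the search space, so ten revisions need at most four checks. `git bisect` applies the same idea to commit history.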

Debugging Workflow

Understand → What is expected vs actual behavior?
Reproduce → Can you trigger the bug reliably?
Locate → Where in the code does it happen?
Diagnose → Why does it happen? (root cause)
Fix → Minimal change to resolve
Verify → Confirm fix works, no regressions
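One way to tie the Reproduce, Fix, and Verify steps together is a regression test written before the fix. The sketch below assumes a hypothetical `average` function whose original version raised `ZeroDivisionError` on an empty list; returning `0.0` for empty input is an illustrative choice, not a general recommendation.

```python
def average(values):
    """Hypothetical function under test.

    The assumed original version divided by len(values) unconditionally
    and raised ZeroDivisionError for an empty list.
    """
    if not values:  # minimal fix: handle the empty input explicitly
        return 0.0
    return sum(values) / len(values)


def test_empty_input():
    # Reproduce: this assertion failed (with an exception) before the fix.
    assert average([]) == 0.0


def test_existing_behavior():
    # Verify: behavior that already worked must not regress.
    assert average([2, 4, 6]) == 4.0


test_empty_input()
test_existing_behavior()
print("regression tests passed")
```

Keeping the reproduction as a permanent test means the same bug cannot return unnoticed.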

Common Bug Patterns

| Symptom | Likely Cause | Check First |
|---|---|---|
| TypeError/null | Missing null check | Input validation |
| Off-by-one | Loop bounds, array index | Boundary conditions |
| Race condition | Async timing | Await/promise handling |
| Import error | Path/module resolution | File paths, exports |
| Type mismatch | Wrong type passed | Function signatures |
| Flaky test | Timing, shared state | Test isolation |
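As an illustration of the shared-state row, the sketch below shows one common Python source of order-dependent results: a mutable default argument. The function names `add_tag_buggy` and `add_tag_fixed` are hypothetical.

```python
def add_tag_buggy(tag, tags=[]):
    # The default list is created once, at definition time,
    # and is shared by every call that omits `tags`.
    tags.append(tag)
    return tags


def add_tag_fixed(tag, tags=None):
    # A fresh list per call removes the hidden shared state.
    if tags is None:
        tags = []
    tags.append(tag)
    return tags


first = add_tag_buggy("a")
second = add_tag_buggy("b")
print(second)           # ['a', 'b']  (state leaked from the first call)
print(first is second)  # True        (both names refer to the shared default)

fixed = add_tag_fixed("b")
print(fixed)            # ['b']
```

A test that calls the buggy version passes or fails depending on which tests ran before it, which is why test isolation is the first thing to check.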

System-Level Tools

Memory Analysis

Valgrind (C/C++/Rust)

valgrind --leak-check=full --show-leak-kinds=all ./program
valgrind --tool=massif ./program  # Heap profiling

Python

python -m memory_profiler script.py
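When installing a third-party profiler is not an option, the standard library's `tracemalloc` module gives a comparable view of where memory is allocated. This is a swapped-in alternative to `memory_profiler`, not the same tool; `build_table` is a hypothetical workload.

```python
import tracemalloc


def build_table(n):
    """Hypothetical workload: allocates a list of n strings."""
    return [str(i) * 10 for i in range(n)]


tracemalloc.start()
before = tracemalloc.take_snapshot()
table = build_table(50_000)
after = tracemalloc.take_snapshot()

# Differences are grouped by source line, largest change first.
top_stats = after.compare_to(before, "lineno")
tracemalloc.stop()

for stat in top_stats[:3]:
    print(stat)
```

The first entry should point at the list comprehension inside `build_table`, since that line accounts for nearly all new allocations.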

Performance Profiling

Linux perf

perf record -g ./program
perf report
perf top  # Real-time CPU usage

Python

python -m cProfile -s cumtime script.py
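The same profiler can also be driven from code to measure a single region instead of a whole script. A minimal sketch, assuming a hypothetical CPU-bound function `slow_sum`:

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Hypothetical CPU-bound function to profile."""
    total = 0
    for i in range(n):
        total += i * i
    return total


profiler = cProfile.Profile()
profiler.enable()
result = slow_sum(200_000)
profiler.disable()

# Render the five most expensive entries by cumulative time into a string.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumtime").print_stats(5)
report = buffer.getvalue()
print(report)
```

Capturing the report in a string makes it easy to log or attach to a bug report.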

System Tracing (Traditional)

System calls (ptrace-based, high overhead)

strace -f -e trace=all -p PID

Library calls

ltrace -f -S ./program

Open files/sockets

lsof -p PID

Memory mapping

pmap -x PID

eBPF Tracing (Modern, Production-Safe)

eBPF is a lower-overhead alternative to strace/ptrace-based tracing. Key advantages:

Low overhead: Generally suitable for production use
No recompilation: Works on running binaries
Non-intrusive: Doesn't stop program execution
Kernel-verified: The in-kernel verifier enforces bounded execution before a program loads, which is designed to prevent a trace from crashing the system

BCC tools (install: apt install bpfcc-tools; on Debian/Ubuntu the packaged commands carry a -bpfcc suffix, e.g. opensnoop-bpfcc)

Count and trace syscalls, file opens, processes, and connections (lower overhead than strace)

sudo syscount -p PID     # Count syscalls
sudo opensnoop -p PID    # Trace file opens
sudo execsnoop           # Trace new processes
sudo tcpconnect          # Trace TCP connections
sudo funccount 'vfs_*'   # Count kernel function calls

bpftrace (install: apt install bpftrace)

One-liner tracing scripts

# Most programs on current systems open files via openat, so trace sys_enter_openat
sudo bpftrace -e 'tracepoint:syscalls:sys_enter_openat { printf("%s %s\n", comm, str(args->filename)); }'
sudo bpftrace -e 'uprobe:/bin/bash:readline { printf("readline\n"); }'

Trace calls to a function in a Go binary (or another compiled language)

sudo bpftrace -e 'uprobe:./myapp:main.handleRequest { printf("called\n"); }'

eBPF Tool Hierarchy:

| Level | Tool | Use Case |
|---|---|---|
| High | BCC tools | Pre-built tracing scripts |
| Medium | bpftrace | One-liner custom traces |
| Low | libbpf/gobpf | Custom eBPF programs |

When to use eBPF over strace:

Production systems (strace can add 10-100x overhead on syscall-heavy workloads)
Long-running traces
High-frequency syscalls
When you can't afford to slow down the process

Network Debugging

Packet capture

tcpdump -i any port 8080

Connection status

ss -tuln
netstat -tuln
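When a shell is not available, or the check needs to live inside a test, the "is anything listening on this port?" question can be answered from code. A minimal sketch using only the standard library; `port_is_open` is a hypothetical helper, and the demo starts its own listener on an ephemeral loopback port so that it is self-contained.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value on failure.
        return sock.connect_ex((host, port)) == 0


# Demo: listen on an ephemeral local port, probe it, then close and probe again.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
server.listen(1)
port = server.getsockname()[1]

open_while_listening = port_is_open("127.0.0.1", port)
server.close()
open_after_close = port_is_open("127.0.0.1", port)

print(open_while_listening, open_after_close)
```

The second probe normally fails once the listener is closed, unless another process happens to claim the same port in between.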

Language-Specific Debugging

Python

Quick debug

import pdb; pdb.set_trace()  # or the built-in breakpoint() on Python 3.7+

Better: ipdb or pudb

import ipdb; ipdb.set_trace()

Print with context

print(f"{var=}") # Python 3.8+
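The `=` specifier also accepts expressions, which makes one-line state dumps cheap to write. A small sketch with a hypothetical `describe` helper and made-up values:

```python
def describe(order_id, items):
    # Each {expr=} expands to the expression text, "=", and repr(value).
    return f"{order_id=} {len(items)=} {items=}"


line = describe(42, ["tea", "milk"])
print(line)  # order_id=42 len(items)=2 items=['tea', 'milk']
```

Because the value is rendered with `repr`, strings keep their quotes, which helps spot stray whitespace or unexpected types.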

JavaScript/TypeScript

// Browser/Node
debugger;

// Structured logging
console.log({ var1, var2, context: 'function_name' });

Rust

// Debug print
dbg!(&variable);

# Backtrace on panic (shell)
RUST_BACKTRACE=1 cargo run

Debugging Questions

When stuck, ask:

What changed recently that could cause this?
Does it happen in all environments or just one?
Is the bug in my code or a dependency?
What assumptions am I making that might be wrong?
Can I write a minimal reproduction?

Effective Debugging Practices

Targeted changes: Form a hypothesis, change one thing at a time
Use proper debuggers: Step through code with breakpoints when possible
Find root causes: Trace issues to their origin, fix the source
Reproduce first: Create a minimal reproduction before attempting a fix
Verify the fix: Confirm the fix resolves the issue and passes tests
