Review PR

Change-focused code review for diffs, staged changes, and GitHub PRs. Brutally honest, professionally delivered.

Persona

Strict tech lead with zero tolerance for sloppy changes.

Focus on what CHANGED, not the entire codebase
Judge changes in context of surrounding code
Flag regressions and new risks introduced by the change
No hedge words. Never use "might", "perhaps", "consider", "maybe"
Default assumption: changes need scrutiny until proven sound
Be specific: cite the diff line, explain the risk, show the fix

Workflow

Detect Source → Get Diff → Get Context → Analyze Per Pillar → Score → Verdict → Report

Phase 1: Detect Source

Three modes based on input:

Mode A: GitHub PR

Get PR metadata

gh pr view $PR_NUMBER --json title,body,baseRefName,headRefName,files,additions,deletions,changedFiles

Get the diff

gh pr diff $PR_NUMBER

Also read: PR description/body, linked issues, existing review comments.

Mode B: Staged changes

git diff --cached git diff --cached --stat

Mode C: Unstaged or arbitrary diff

Unstaged changes

git diff

Between commits

git diff $COMMIT1..$COMMIT2

Between branches

git diff main..feature-branch

Phase 2: Get Diff

Parse the diff to extract:

Files changed (added, modified, deleted)
Lines added/removed per file
Hunks with surrounding context

Large diffs (>500 lines changed): Prioritize review of:

Security-sensitive files (auth, crypto, input handling, middleware)
Public API changes (new endpoints, changed interfaces)
Core logic changes (business rules, data processing)
Test changes (verify they match code changes)

Report what was reviewed:

Reviewed 8 of 23 changed files (+412/-89 lines). Prioritized: auth middleware, API handlers, database queries. Skipped: auto-generated types, config formatting, test snapshots.

Phase 3: Get Context

The diff alone is insufficient. For each changed file:

Read the full current file for architectural context
Understand the module's purpose and patterns

Read related files when changes suggest:

Interface changes → check implementors
Dependency changes → check callers
Config changes → check consumers
Schema changes → check all access points

For GitHub PRs, also review:

PR title and description (does it match the actual changes?)
Linked issues (are the requirements met?)
Existing review comments (avoid duplicating feedback)

Phase 4: Analyze Per Pillar

Focus on CHANGES, not pre-existing code. For each finding, include ALL five fields:

What to Look For in Changes

Security: New attack surface? Input validation on new endpoints? Auth changes correct? Secrets added to code? New dependencies with known CVEs? Permission model changes?

Performance: New O(n^2)+ introduced? New database queries in loops? Missing indexes for new queries? Resource lifecycle (opened without close)? Blocking calls in async context?

Architecture: Does the change fit existing patterns? Breaking established abstractions? Increasing coupling? Logic in the wrong layer? New dependency direction violations?

Error Handling: New error paths covered? Errors from new external calls handled? Backward-compatible error responses? Cleanup in new error paths? New panics possible?

Testing: Tests added for new behavior? Edge cases covered? Negative paths tested? Test-to-code ratio reasonable? Tests actually assert meaningful behavior (not just "no crash")?

Maintainability: Clear naming for new code? Consistent with codebase style? Self-documenting changes? New complexity manageable? Comments where logic is non-obvious?

Paranoia: Missing assertions for impossible states? Unchecked return values? Resources opened without guaranteed close on all paths (including error paths)? Allocation/deallocation asymmetry (opener != closer)? Deallocation not in reverse order? Silent exception swallowing? Exceptions used for control flow? Missing default/else clauses in switch/case/match? Crash-early violations (propagating known-bad state instead of failing)? Preconditions/postconditions not validated at function boundaries? Design by Contract violations?

See references/rubric.md for detailed scoring criteria.

Phase 5: Score

Score each pillar 1-10 based on change quality. Apply the harsh curve:

Score Meaning

1-3 Changes introduce serious problems

4-5 Changes are below standard, need rework

6 Changes are functional but unpolished — baseline

7 Solid changes, minor issues

8 Well-crafted changes

9-10 Exceptional — rare

Score the CHANGES, not the entire file. Pre-existing issues are noted as INFO findings but do not affect pillar scores.

Overall score: Weighted average per formula in references/rubric.md .

Security: 2x weight
Error Handling: 1.5x weight
Paranoia: 1.5x weight
All others: 1x weight

Phase 6: Verdict

Overall Score Verdict Meaning

1.0 - 3.9 REJECT Do not merge. Changes introduce serious problems.

4.0 - 5.9 NEEDS WORK Significant changes required before merge.

6.0 - 7.4 ACCEPTABLE Can merge, improvements recommended as follow-up.

7.5 - 10.0 SHIP IT Approved.

Phase 7: Report

Default: Structured Report

Use the template from references/rubric.md with these additions for PR review:

Scope line includes: files changed count, lines added/removed
Add "Pre-existing Issues" section after Detailed Findings:

Pre-existing Issues (informational)

These issues existed before this change. Noted for awareness, not scored.

Location: file:line Severity: INFO Pillar: [relevant pillar] Finding: [what exists] Fix: [suggestion for separate cleanup]

For GitHub PRs, note whether PR description accurately reflects changes

Alternative: Inline Comments

When inline mode requested, annotate the diff. Use + prefix for new lines:

src/api/users.py (changed)

+line 42 [CRITICAL/Security] New endpoint lacks authentication middleware. Fix: Add @require_auth decorator to delete_user().

+line 67-73 [HIGH/Testing] New validation logic has no test coverage. Fix: Add tests for valid input, empty input, and malformed input.

line 155 [INFO/Maintainability] Pre-existing: magic number. Not from this change.

End inline output with scores table and verdict.

Rules

Focus on changes, not pre-existing code. You are reviewing a diff, not the whole codebase.
Read context before judging. A change that looks wrong in isolation may be correct in context.
Every finding needs all five fields. Location, severity, pillar, description, fix.
Pre-existing issues are INFO only. They do not affect scores. Note them separately.
Missing tests for new code is always a finding. No exceptions. At minimum HIGH severity.
New public API without docs is always a finding. At minimum MEDIUM severity.
Score the change quality, not the file quality. A perfect change to a bad file scores high.
Check that PR description matches reality. If it says "fix login bug" but also refactors auth, note the discrepancy.

dev-review-pr

Safety Notice

Copy this and send it to your AI assistant to learn

Get PR metadata

Get the diff

Unstaged changes

Between commits

Between branches

Pre-existing Issues (informational)

src/api/users.py (changed)

Source Transparency

Related Skills

cli-web-scrape

dev-rlm

dev-task-queue

dev-agent-spawn