# roast-my-code

Roast the target codebase with devastating accuracy and actionable fixes.

## Usage

```
/roast                      # Review cwd at spicy level
/roast src/                 # Review src/ directory
/roast --level=savage       # Maximum roast intensity
/roast src/ --level=sensei  # TDD-focused wisdom
/roast --diff               # Review staged changes only
/roast --quick              # Fast scan (top 3 checkers)
/roast --lang=ja            # Force output language
```
## Levels
| Level | Tone | Best For |
|---|---|---|
| gentle | Friendly mentor | Juniors, OSS contributors |
| spicy | Sarcastic reviewer (default) | Daily use, team reviews |
| savage | Brutally honest senior | Content, self-roasts |
| sensei | TDD master (t_wada style) | TDD adoption, testing culture |
## Instructions

Follow these steps precisely. Do NOT skip any step.

### Step 1: Parse Arguments

Parse the user's input to extract:
- TARGET_PATH: directory or file to review (default: current working directory)
- LEVEL: one of gentle/spicy/savage/sensei (default: spicy)
- OUTPUT_LANG: detect from the user's message language (en or ja). If the user wrote in Japanese, set to ja. Otherwise, set to en. If `--lang=` is specified, use that value instead of auto-detection.
- DIFF_MODE: boolean (default: false). If `--diff` is present, set to true.
- QUICK_MODE: boolean (default: false). If `--quick` is present, set to true.
Extract `--level=`, `--lang=`, `--diff`, `--quick` from the argument string.
Everything else is treated as the target path.
**Diff mode:** When DIFF_MODE is true, run `git diff --cached --name-only`
via Bash to get the list of staged files. Only analyze those files instead
of the full project. If no files are staged, inform the user and exit.
Project detection (Step 2) still runs normally to determine framework context.
**Quick mode:** When QUICK_MODE is true, run only these 3 checkers:
Security, Architecture, TDD. Skip all other checkers. Output includes a
note that this was a quick scan.
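The parsing rules above can be sketched in Python for illustration. This is not part of the command itself (which runs as agent instructions); `parse_roast_args` and the `RoastArgs` container are invented names for this sketch:

```python
# Illustrative sketch of the Step 1 parsing rules; names are invented.
from dataclasses import dataclass

@dataclass
class RoastArgs:
    target_path: str = "."      # default: current working directory
    level: str = "spicy"        # default level
    output_lang: str = "en"
    diff_mode: bool = False
    quick_mode: bool = False

def parse_roast_args(arg_string: str, user_wrote_japanese: bool = False) -> RoastArgs:
    args = RoastArgs()
    # Auto-detect language from the user's message; --lang= overrides below.
    args.output_lang = "ja" if user_wrote_japanese else "en"
    leftovers = []
    for token in arg_string.split():
        if token.startswith("--level="):
            args.level = token.split("=", 1)[1]
        elif token.startswith("--lang="):
            args.output_lang = token.split("=", 1)[1]
        elif token == "--diff":
            args.diff_mode = True
        elif token == "--quick":
            args.quick_mode = True
        else:
            leftovers.append(token)  # everything else is the target path
    if leftovers:
        args.target_path = leftovers[0]
    return args
```

For example, `parse_roast_args("src/ --level=sensei --quick")` yields a sensei-level quick scan of `src/`.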
### Step 2: Project Detection

Analyze the target to detect the project type. Run these checks:

- Glob: `TARGET_PATH/**/package.json`
- Glob: `TARGET_PATH/**/tsconfig.json`
- Glob: `TARGET_PATH/**/Cargo.toml`
- Glob: `TARGET_PATH/**/go.mod`
- Glob: `TARGET_PATH/**/requirements.txt`
- Glob: `TARGET_PATH/**/pyproject.toml`
- Glob: `TARGET_PATH/**/*.sln`
- Glob: `TARGET_PATH/.git`
Read the root-level package.json (at `TARGET_PATH/package.json`) first.
If none exists, fall back to the nearest package.json to TARGET_PATH.
Extract:

- `dependencies` and `devDependencies` keys
- `scripts` keys
- `engines` field
Detect frameworks by checking dependency names:
- HTTP frameworks: express, hono, fastify, koa, @nestjs/core
- Frontend frameworks: react, vue, svelte, @angular/core, next, nuxt
- Test frameworks: jest, vitest, mocha, @testing-library/*
- Language: presence of tsconfig.json = TypeScript
Store results as:
- LANG: typescript | javascript | rust | go | python | csharp | unknown
- HAS_TESTS: boolean
- HAS_HTTP_FRAMEWORK: boolean
- HAS_FRONTEND_FRAMEWORK: boolean
- HAS_GIT: boolean
- HAS_PACKAGE_MANIFEST: boolean
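The detection logic above can be sketched roughly as follows. This is an illustrative Python sketch, not the command's actual mechanism (which uses Glob and Read tools); the `detect_project` helper is invented, and only the framework names listed above are checked:

```python
# Illustrative sketch of Step 2: inspect package.json and marker files.
# Field names follow the spec above; the helper itself is invented.
import json
from pathlib import Path

HTTP_FRAMEWORKS = {"express", "hono", "fastify", "koa", "@nestjs/core"}
FRONTEND_FRAMEWORKS = {"react", "vue", "svelte", "@angular/core", "next", "nuxt"}
TEST_FRAMEWORKS = {"jest", "vitest", "mocha"}

def detect_project(target: Path) -> dict:
    manifest = target / "package.json"
    deps: set = set()
    if manifest.exists():
        pkg = json.loads(manifest.read_text())
        # Dependency names from both dependencies and devDependencies.
        deps = set(pkg.get("dependencies", {})) | set(pkg.get("devDependencies", {}))
    return {
        # Presence of tsconfig.json = TypeScript.
        "LANG": "typescript" if (target / "tsconfig.json").exists()
                else "javascript" if manifest.exists() else "unknown",
        "HAS_TESTS": bool(deps & TEST_FRAMEWORKS)
                     or any(d.startswith("@testing-library/") for d in deps),
        "HAS_HTTP_FRAMEWORK": bool(deps & HTTP_FRAMEWORKS),
        "HAS_FRONTEND_FRAMEWORK": bool(deps & FRONTEND_FRAMEWORKS),
        "HAS_GIT": (target / ".git").exists(),
        "HAS_PACKAGE_MANIFEST": manifest.exists(),
    }
```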
### Step 3: Checker Activation
Determine which checkers to run based on detection results:
| Checker | Reference File | Condition |
|---|---|---|
| Security | references/security.md | Always |
| Architecture | references/architecture.md | Always |
| Complexity | references/complexity.md | Always |
| TDD | references/tdd.md | Always |
| Type Safety | references/type-safety.md | LANG = typescript OR .ts/.tsx files found |
| Error Handling | references/error-handling.md | Always |
| Naming | references/naming.md | Always |
| Dead Code | references/dead-code.md | Always |
| Performance | references/performance.md | Always |
| Dependencies | references/dependencies.md | HAS_PACKAGE_MANIFEST |
| API Design | references/api-design.md | HAS_HTTP_FRAMEWORK |
| Frontend | references/frontend.md | HAS_FRONTEND_FRAMEWORK |
| Git Hygiene | references/git-hygiene.md | HAS_GIT |
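The table reads as one predicate per checker. A minimal sketch, assuming the detection fields from Step 2 (the `activate_checkers` helper is invented for illustration):

```python
# Illustrative sketch of Step 3: map detection results to active checkers.
ALWAYS_ON = ["Security", "Architecture", "Complexity", "TDD",
             "Error Handling", "Naming", "Dead Code", "Performance"]

def activate_checkers(detection: dict, has_ts_files: bool = False) -> list:
    active = list(ALWAYS_ON)
    # Type Safety: TypeScript project OR loose .ts/.tsx files found.
    if detection.get("LANG") == "typescript" or has_ts_files:
        active.append("Type Safety")
    if detection.get("HAS_PACKAGE_MANIFEST"):
        active.append("Dependencies")
    if detection.get("HAS_HTTP_FRAMEWORK"):
        active.append("API Design")
    if detection.get("HAS_FRONTEND_FRAMEWORK"):
        active.append("Frontend")
    if detection.get("HAS_GIT"):
        active.append("Git Hygiene")
    return active
```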
### Step 4: Execute Checks (2-Phase)

Uses a compact index for fast detection, then reads full reference files only for categories with findings.

#### Phase 1: Quick Scan

1. Read `references/checks-index.md` (single file — all 162 check patterns in one place).
2. Batch Grep execution — combine patterns per category x severity:
   - For each active checker category, combine all Grep patterns at the same severity level using `|` (OR) into a single Grep call.
   - Example: Security Grep [critical] has 6 patterns — combine into 1 Grep call with patterns joined by `|`.
   - This reduces ~100+ individual Greps to ~20-30 batched calls.
   - Presence (P): pattern found = potential finding. Map matched lines back to the specific check by inspecting which sub-pattern matched.
   - Absence (A): run a separate Grep per absence check. Zero results = finding.
3. Glob checks — run all Glob patterns from the index in parallel.
4. Bash checks — run Bash commands from the index in parallel where independent. Group related commands together.
5. Map results — for each match, identify the specific check name from the index. Record:
   - Which categories have findings (= need Phase 2 detail)
   - Which categories are clean (= Strengths, skip in Phase 2)
   - Approximate severity distribution per category

Phase 1 rules:

- Skip `node_modules/`, `dist/`, `build/`, `.next/`, `vendor/`, `target/`, `.git/` directories.
- Sample up to 20 files per category for tractability.
- For large projects, prioritize: `src/` > `lib/` > `app/` > others.

Phase 1 target: ~20-30 Grep + 10-15 Glob/Bash calls total.
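The batch-and-map-back step can be sketched in Python, with the standard `re` module standing in for the agent's Grep tool. Purely illustrative, and it assumes the individual patterns contain no capturing groups of their own (otherwise group numbering shifts):

```python
# Illustrative sketch of Phase 1 batching: join per-severity patterns with "|",
# then map each match back to the specific sub-pattern (= check) that fired.
# Assumes individual patterns contain no capturing groups of their own.
import re

def batch_scan(patterns: dict, lines: list) -> dict:
    """patterns: check name -> regex; returns check name -> matching lines."""
    # One combined scan instead of one scan per pattern.
    combined = re.compile("|".join(f"({p})" for p in patterns.values()))
    hits = {name: [] for name in patterns}
    names = list(patterns)
    for line in lines:
        m = combined.search(line)
        if m:
            # m.lastindex is the capturing group (= sub-pattern) that matched.
            hits[names[m.lastindex - 1]].append(line)
    return hits
```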
#### Phase 2: Detailed Roast

1. Read `references/roast-style.md` to load the roast persona for the selected LEVEL.
2. For each category WITH findings from Phase 1:
   a. Read the category's full reference file (e.g., `references/security.md`).
   b. For each finding detected in Phase 1, extract from the reference:
      - Roast line (en and ja — select based on OUTPUT_LANG)
      - Fix description
      - Exact deduction value
   c. Refine findings: verify context around matches, discard false positives from batch grep, deepen analysis where needed.
   d. Record each confirmed finding with:
      - category: checker name
      - check: check item name
      - severity: critical / error / warning / info
      - location: file path and line number(s)
      - evidence: the actual code/pattern found
      - deduction: point deduction value
3. For categories with zero findings: do NOT read the reference file. Add the category to Strengths in the final output.

Important rules:

- Only report findings backed by actual evidence found in the code.
- Do NOT invent findings for entertainment. Every roast must be grounded in fact.
- Deduplication: If the same issue appears in multiple checkers (e.g., CORS in Security and API Design, Error Boundaries in Error Handling and Frontend), count the deduction in only ONE category — whichever is more specific. Note the cross-reference in the other category without deducting.
- Per-check cap: max 3 findings per check type count toward the score. Additional instances are reported but do not deduct further.
- Parallelism: Run Grep and Glob calls for independent checks in parallel where possible. Batch tool calls to minimize round-trips.
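The deduplication rule can be sketched as follows. The `SPECIFICITY` ordering below is an assumption for illustration (framework-specific checkers beat general ones), and `dedupe` is an invented helper:

```python
# Illustrative sketch of the deduplication rule: when the same issue is
# flagged by two checkers, only the more specific category keeps the
# deduction; the other becomes a non-deducting cross-reference.
# The SPECIFICITY ordering is an assumption for this sketch.
SPECIFICITY = {"API Design": 2, "Frontend": 2, "Security": 1, "Error Handling": 1}

def dedupe(findings: list) -> list:
    """findings: [{"issue": ..., "category": ..., "deduction": ...}]."""
    by_issue = {}
    for f in findings:
        by_issue.setdefault(f["issue"], []).append(f)
    out = []
    for issue, group in by_issue.items():
        # Most specific category first; it keeps the deduction.
        group.sort(key=lambda f: SPECIFICITY.get(f["category"], 0), reverse=True)
        for i, f in enumerate(group):
            if i > 0:
                f = {**f, "deduction": 0, "cross_ref": group[0]["category"]}
            out.append(f)
    return out
```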
### Step 5: Calculate Scores
For each active checker category:
- Start at 100 points.
- Apply deductions for each finding:
  - critical: -20 points
  - error: -10 points
  - warning: -4 points
  - info: -1 point
- Per-check cap: Max 3 findings per check type count toward the score. Additional instances are reported but do not deduct further points.
- Floor at 0 (no negative scores).
- The category score displayed in the table is always out of 100 (do NOT multiply by weight here).
Calculate the Overall Score using category weights:
- Weights: Security 1.5x, Architecture 1.2x, TDD 1.5x in sensei mode (1.0x otherwise), all others 1.0x.
- `overall = sum(score_i * weight_i) / sum(weight_i)`
- Weights are used ONLY in this overall calculation, not on individual category scores.
Determine the Grade:
| Grade | Score Range | Label |
|---|---|---|
| S | 90 - 100 | Immaculate |
| A | 80 - 89 | Solid |
| B | 70 - 79 | Decent |
| C | 60 - 69 | Needs Work |
| D | 40 - 59 | Rough |
| F | 0 - 39 | Dumpster Fire |
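Step 5's arithmetic, including the per-check cap, the floor at 0, the weighted overall, and the grade lookup, can be sketched as (helper names invented):

```python
# Illustrative sketch of Step 5 scoring; helper names are invented.
from collections import Counter

DEDUCTION = {"critical": 20, "error": 10, "warning": 4, "info": 1}
GRADES = [(90, "S"), (80, "A"), (70, "B"), (60, "C"), (40, "D"), (0, "F")]

def category_score(findings: list) -> int:
    """findings: [{"check": ..., "severity": ...}]."""
    seen = Counter()
    score = 100
    for f in findings:
        seen[f["check"]] += 1
        if seen[f["check"]] <= 3:  # per-check cap: only first 3 deduct
            score -= DEDUCTION[f["severity"]]
    return max(score, 0)  # floor at 0, no negative scores

def overall_score(scores: dict, sensei: bool = False) -> float:
    # Security 1.5x, Architecture 1.2x, TDD 1.5x in sensei mode, else 1.0x.
    weights = {"Security": 1.5, "Architecture": 1.2,
               "TDD": 1.5 if sensei else 1.0}
    total = sum(scores[c] * weights.get(c, 1.0) for c in scores)
    return total / sum(weights.get(c, 1.0) for c in scores)

def grade(score: float) -> str:
    return next(g for floor, g in GRADES if score >= floor)
```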
### Step 6: Generate Output

Compose the final output using the Output Format below. Apply the roast persona from `roast-style.md` for the selected LEVEL.
Rules:
- Every finding gets a roast line AND a concrete fix.
- Group findings by category.
- Show the most critical findings first within each category.
- Keep total output under 300 lines. If more findings exist, show top 5 per category and note how many were omitted.
- Lines should be under 80 characters where possible for screenshot-friendliness.
- Language: If OUTPUT_LANG is ja, write all roast lines, fix descriptions, summary text, and section labels in Japanese. Category names in the scorecard table and structural elements (box-drawing, header) remain in English. If OUTPUT_LANG is en, write everything in English.
## Output Format

Generate output following this structure (adapt rows to active checkers):

```
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
🔥 ROAST MY CODE — REPORT CARD 🔥
Level: {LEVEL} | Target: {TARGET_PATH}
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

📊 OVERALL: {GRADE} ({SCORE}/100)

┌─────────────────────────────────────┐
│ Category         Score      Grade   │
├─────────────────────────────────────┤
│ Security         {xx}/100    {G}    │
│ Architecture     {xx}/100    {G}    │
│ Complexity       {xx}/100    {G}    │
│ TDD              {xx}/100    {G}    │
│ Type Safety      {xx}/100    {G}    │
│ Error Handling   {xx}/100    {G}    │
│ Naming           {xx}/100    {G}    │
│ Dead Code        {xx}/100    {G}    │
│ Performance      {xx}/100    {G}    │
│ Dependencies     {xx}/100    {G}    │
│ API Design       {xx}/100    {G}    │
│ Frontend         {xx}/100    {G}    │
│ Git Hygiene      {xx}/100    {G}    │
└─────────────────────────────────────┘

(Only show rows for active checkers)

━━━ 🔥 FINDINGS ━━━━━━━━━━━━━━━━━━━━━━

## {CATEGORY} ({SCORE}/100)

### 🚨 {CHECK_NAME} [{SEVERITY}]
📍 {file_path}:{line}
> {code evidence or description}

💬 "{ROAST_LINE}"
🔧 Fix: {actionable fix description}

---

(Repeat for each finding, grouped by category)

━━━ 📋 SUMMARY ━━━━━━━━━━━━━━━━━━━━━━━

🏆 Top 3 Strengths:
1. {strength}
2. {strength}
3. {strength}

💀 Top 3 Priorities to Fix:
1. {priority with file reference}
2. {priority with file reference}
3. {priority with file reference}

📈 Quick Wins (< 5 min each):
- {quick fix 1}
- {quick fix 2}
- {quick fix 3}

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Roasted with ❤️ by roast-my-code
⭐ github.com/sakimyto/roast-my-code
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
```
## Output Rules
- Scorecard table always appears first.
- Findings are grouped by category, ordered by score (worst category first).
- Within each category, findings are ordered: critical > error > warning > info.
- Summary always appears last.
- If a category scored 100 (no findings), show it in the table but skip the findings section for it. Add it to Strengths.
- The roast line MUST match the selected LEVEL's tone from roast-style.md.
- Every roast MUST be paired with a Fix.
- Use box-drawing characters for the table exactly as shown.
- If the overall grade is S, add a congratulatory roast: "I came here to roast, but your code left me speechless. Well played."
## Scoring Quick Reference
| Severity | Deduction | Examples |
|---|---|---|
| critical | -20 | Hardcoded secrets, SQL injection, no tests |
| error | -10 | Empty catch, God files, `any` abuse |
| warning | -4 | `console.log`, TODO comments, inline styles |
| info | -1 | Missing engine field, no readonly |

| Weight | Category | Multiplier | Note |
|---|---|---|---|
| High | Security | 1.5x | Always |
| Medium | Architecture | 1.2x | Always |
| Elevated | TDD | 1.5x | sensei mode only (1.0x otherwise) |
| Normal | All others | 1.0x | |
Note: Weights apply ONLY to the overall weighted average, not to individual category scores (which are always displayed out of 100). Per-check cap: max 3 findings per check type count toward the score.

| Grade | Range | Vibe |
|---|---|---|
| S | 90-100 | Ship it yesterday |
| A | 80-89 | Production-ready |
| B | 70-79 | Code review approved |
| C | 60-69 | Needs another pass |
| D | 40-59 | Intern's first week? |
| F | 0-39 | Call the fire department |