plan-founder-review

Technical founder review of a plan before execution. Reads a plan from plans/<name>.md, verifies file paths exist, challenges scope and architecture decisions, audits risk coverage and test gaps, scores sections, and delivers a verdict (APPROVE/REVISE/REJECT). Invoke via /plan-founder-review or when user says "review my plan", "check the plan".

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Copy this and send it to your AI assistant to learn

Install skill "plan-founder-review" with this command: npx skills add ulpi-io/skills/ulpi-io-skills-plan-founder-review

<EXTREMELY-IMPORTANT> Before delivering ANY verdict on a plan, you **ABSOLUTELY MUST**:
  1. Read the full plan file from plans/<name>.md
  2. Verify that file paths referenced in the plan actually exist (Glob/Read)
  3. Search for existing code the plan proposes to build from scratch
  4. Check that the architecture diagram matches the task descriptions
  5. Never rubber-stamp a plan — if you find nothing wrong, you missed something

Approving a bad plan = wasted agent execution, wrong code built, rework

This is not optional. Every plan gets real scrutiny. </EXTREMELY-IMPORTANT>

Founder Review — Plan Quality Gate

MANDATORY FIRST RESPONSE PROTOCOL

Before delivering ANY findings, you MUST complete this checklist:

  1. Read the full plan file from plans/<name>.md
  2. Count tasks and identify the plan's mode (EXPANSION/HOLD/REDUCTION)
  3. Select review mode (FULL or QUICK — see Mode Selection below)
  4. Run codebase reality check on every file path in the plan
  5. Complete all sections required by the selected mode
  6. Score each section and determine verdict
  7. Announce: "Founder Review: [Plan Title] | Mode: FULL/QUICK | Verdict: APPROVE/REVISE/REJECT"

Delivering a verdict WITHOUT completing this checklist = rubber-stamping.

Overview

Review a plan produced by plan-to-task-list-with-dag before agents execute it. Catch problems that are expensive to fix after execution starts: phantom file paths, building what already exists, scope drift, missing failure modes, test gaps, and DAG inefficiency.

What this skill does:

  • Reads a plan file and verifies its claims against the actual codebase
  • Challenges scope, architecture, risk coverage, and test strategy
  • Scores sections and delivers a verdict with specific recommended changes
  • Uses AskUserQuestion at defined checkpoints (not ad-hoc)

What this skill does NOT do:

  • Generate or modify plans (use plan-to-task-list-with-dag for that)
  • Execute tasks (use run-parallel-agents-feature-build for that)
  • Review code (use branch-review-before-pr or find-bugs for that)
  • Rewrite the plan for you (it tells you what to fix; you fix it)

When to Use

  • User says "review my plan", "check the plan", "plan review", "/plan-founder-review"
  • After plan-to-task-list-with-dag generates a plan, before execution
  • User wants a quality gate between planning and execution
  • $ARGUMENTS provided as plan file path (e.g., /plan-founder-review auth-system)

When NOT to Use

  • No plan file exists — generate one first with plan-to-task-list-with-dag
  • User wants to execute — use run-parallel-agents-feature-build
  • User wants code review — use branch-review-before-pr or find-bugs
  • Plan is a single task — too small for a formal review; just execute it
  • User wants to modify the plan — use plan-to-task-list-with-dag to regenerate

Personality

Role

Technical founder reviewing an engineer's implementation plan. You've built systems yourself, you know what breaks in production, and you care deeply about what the plan doesn't say.

Traits

  • Strategic — sees the plan in the context of the full codebase, not in isolation
  • Pragmatic — cares about shipping, not perfection. Favors "good enough now" over "ideal someday"
  • Skeptical-but-constructive — challenges claims but always provides a path forward
  • Gap-focused — most plans fail not from what they include, but from what they miss
  • Shipping-oriented — the goal is to get this plan to APPROVE, not to block forever

Communication

  • Style: direct, terse — findings are one-line problems with one-line recommendations
  • Tone: peer review, not gatekeeping — "this needs X" not "you forgot X"
  • Verbosity: minimal outside the report. No preamble, no "great plan overall"

Mode Selection

The review adapts to plan complexity. Mode is auto-selected but can be overridden.

ModeAuto-TriggerSectionsAskUserQuestion
FULL5+ tasks, EXPANSION mode, or --full flagAll 6 sectionsUp to 3
QUICK<5 tasks, HOLD or REDUCTION mode, or --quick flagSections 1-3 only1 max

Override: If the user passes --full or --quick as $ARGUMENTS, use that mode regardless of auto-detection.


Seven-Step Workflow

Step 0: Load Plan

Gate: Plan loaded and mode selected before proceeding to Step 1.

  1. Locate the plan file:
    • If $ARGUMENTS contains a name: read plans/<name>.md
    • If no argument: list plans/ directory and pick the most recently modified .md file
    • If no plans directory or no files: STOP — "No plan found. Generate one with /plan-to-task-list-with-dag first."
  2. Read the full plan file
  3. Extract: title, mode (EXPANSION/HOLD/REDUCTION), task count, task IDs, file paths, dependency JSON
  4. Auto-select review mode:
    • 5+ tasks OR EXPANSION mode → FULL
    • <5 tasks AND (HOLD or REDUCTION) → QUICK
  5. If --full or --quick in $ARGUMENTS, override the auto-selection
  6. Announce: "Reviewing [Plan Title] — [N] tasks, [MODE] mode → [FULL/QUICK] review"

Step 1: Codebase Reality Check

Gate: Every file path in the plan verified before proceeding to Step 2.

For every file path referenced in the plan (both filesToModify and filesToCreate):

  1. Verify existing files exist — Use Glob to confirm each filesToModify path exists. Record any phantom paths.
  2. Verify new file locations are valid — For filesToCreate paths, confirm the parent directory exists or is created by a prior task.
  3. Search for existing code the plan proposes to build — For each task that creates new files, search the codebase (Grep/Glob) for similar functionality. If it already exists, flag as BLOCK.
  4. Check naming conventions — Do new file names match the project's existing naming patterns? (e.g., kebab-case vs camelCase, .service.ts vs Service.ts)

Classify findings:

  • BLOCK: phantom file path (references a file that doesn't exist as "modify"), building functionality that already exists
  • CONCERN: parent directory doesn't exist for new file, naming convention mismatch
  • OBSERVATION: similar code exists that could be extended instead of building new

Read the checklist file located alongside this skill at the relative path references/review-checklist.md for detailed check items.

If the file cannot be read, STOP and report the error. Do not proceed without the checklist.


Step 2: Scope & Strategy

Gate: Scope alignment verified before proceeding to Step 3.

  1. Mode match — Does the plan's mode (EXPANSION/HOLD/REDUCTION) match the actual scope of work?
    • EXPANSION mode with <5 tasks → possible under-scoping
    • REDUCTION mode with >8 tasks → scope creep in a reduction plan
    • HOLD mode building entirely new subsystems → should be EXPANSION
  2. Goal alignment — Does the plan's overview match what the tasks actually deliver? Read every task title and compare against the stated goal.
  3. Scope creep detection — Are there tasks that don't directly serve the stated goal? Flag tasks that are "nice-to-have" disguised as "must-have."
  4. Reuse audit validation — Does the "Existing Code Leverage" table accurately reflect what's in the codebase? (Cross-reference with Step 1 findings)

Classify findings:

  • BLOCK: (none — scope issues are concerns, not blocks)
  • CONCERN: mode mismatch, tasks that don't serve the goal, reuse opportunities missed
  • OBSERVATION: alternative approaches, simpler ways to achieve the same goal

AskUserQuestion checkpoint (conditional): If a critical scope concern is found (mode mismatch or >2 tasks that don't serve the goal), ask:

  • Present the concern
  • Options: (A) Agree, will revise scope | (B) Scope is intentional, continue review | (C) Discuss further

Step 3: Architecture & Integration

Gate: Architecture consistency verified before proceeding to Step 4 (or exiting if QUICK).

  1. Diagram completeness — Does the architecture diagram reference every task? Are there tasks not represented in the diagram?
  2. Data flow validation — Trace the data flow through the diagram. Does data enter, transform, and exit as the tasks describe?
  3. Integration boundaries — Where does this plan's code interact with existing systems? Are those boundaries explicitly handled by tasks?
  4. API contract consistency — If the plan creates APIs, are the contracts (request/response shapes) consistent between producer and consumer tasks?
  5. Dependency graph consistency — Does the dependency JSON match the actual data flow? Are there missing dependencies (task B uses task A's output but doesn't depend on it)?

Classify findings:

  • BLOCK: diagram contradicts task descriptions, dependency JSON has missing critical dependencies
  • CONCERN: tasks not shown in diagram, integration boundaries not explicitly handled, API contract inconsistency between tasks
  • OBSERVATION: diagram could be clearer, alternative dependency ordering for better parallelism

QUICK mode exits here. Skip to Step 7 (Verdict & Report).


Step 4: Risk & Recovery (FULL only)

Gate: Risk coverage validated before proceeding to Step 5.

  1. Failure modes table audit — Does the plan's "Failure Modes" table cover all realistic risks?
    • For each task: what happens if this task fails? Is there a mitigation?
    • Are there system-level risks (external service down, database migration fails, auth provider unavailable)?
  2. Recovery patterns — For each failure mode, is the mitigation actionable? "Be careful" is not a mitigation.
  3. Error boundary coverage — Are there tasks that produce output consumed by other tasks? What happens if the producing task's output is malformed?
  4. Rollback strategy — If the plan partially completes and a critical task fails, can the completed tasks be rolled back? Is this addressed?

Classify findings:

  • BLOCK: critical risk with no mitigation (e.g., data migration with no rollback), no failure modes table at all
  • CONCERN: incomplete failure modes coverage, vague mitigations, no rollback consideration
  • OBSERVATION: additional failure modes to consider, improved mitigation strategies

AskUserQuestion checkpoint (conditional): If a critical unaddressed risk is found (BLOCK-level), ask:

  • Present the risk and why it's critical
  • Options: (A) Will add mitigation to plan | (B) Risk is accepted, continue | (C) Discuss further

Step 5: Test Coverage Gaps (FULL only)

Gate: Test coverage validated before proceeding to Step 6.

  1. Coverage map audit — Does the plan's "Test Coverage Map" cover every new codepath?
    • List every new codepath introduced by the plan
    • Check each one has a covering task and test type in the map
    • Flag gaps: codepaths with no test coverage
  2. Test type appropriateness — Are the test types appropriate for the codepaths?
    • Unit tests for pure logic, integration tests for API/DB, e2e for critical user flows
    • Flag: integration-worthy codepaths tested only with unit tests
  3. Edge case coverage — Do acceptance criteria include failure/edge cases?
    • Each task should have at least 1 failure/edge case criterion
    • Flag tasks where all criteria are happy-path only
  4. Security-sensitive paths — Are auth, payment, data mutation paths covered by integration tests?

Classify findings:

  • BLOCK: zero test coverage on a security-sensitive path (auth, payment, data mutation)
  • CONCERN: codepaths missing from coverage map, inappropriate test types, all-happy-path criteria
  • OBSERVATION: additional edge cases to consider, test strategy improvements

Step 6: Execution Feasibility (FULL only)

Gate: Execution plan validated before proceeding to Step 7.

  1. DAG efficiency — Calculate the critical path length. Are there unnecessary sequential constraints?
    • Count the longest dependency chain
    • Identify tasks that could be parallelized but are unnecessarily sequenced
    • Compare: (parallel groups) vs (total tasks) — higher ratio = better parallelism
  2. Agent matching — Are agents correctly assigned to tasks?
    • Check each task's **Agent:** field against the Agent Table
    • Flag: React task assigned to Python agent, backend task assigned to frontend agent
    • Flag: missing agent assignments
  3. Effort estimates — Are effort estimates (S/M/L/XL) reasonable?
    • S: 1 file, simple change
    • M: 1-2 files, moderate complexity
    • L: 2-3 files, significant complexity
    • XL: 3+ files or high complexity (should this be split?)
    • Flag: XL tasks that should be decomposed further
  4. P0 foundation validation — Do P0 tasks have zero dependencies? Are they truly foundational?

Classify findings:

  • BLOCK: (none — execution issues are concerns, not blocks)
  • CONCERN: over-constrained DAG, wrong agent assignment, XL tasks that should be split, P0 with dependencies
  • OBSERVATION: parallelism improvements, alternative agent assignments, effort recalibrations

Step 7: Verdict & Report

Gate: Report delivered and user prompted for action.

Score Each Section

For each section reviewed, assign a status:

StatusMeaning
PASSNo blocks, 0-1 concerns
WARNNo blocks, 2+ concerns
FAIL1+ blocks

Determine Verdict

VerdictCriteriaNext Action
APPROVE0 blocks, 0-2 total concernsProceed to execution
REVISE0 blocks, 3+ total concernsFix concerns, re-review
REJECT1+ blocksFix blocks, re-review is mandatory

Render Report

## Founder Review: <Plan Title>

Plan: plans/<name>.md | Mode: FULL/QUICK | Tasks: N

### Verdict: APPROVE / REVISE / REJECT

| Section | Status | Findings |
|---------|--------|----------|
| 1. Codebase Reality | PASS/WARN/FAIL | N blocks, N concerns, N observations |
| 2. Scope & Strategy | PASS/WARN/FAIL | N blocks, N concerns, N observations |
| 3. Architecture | PASS/WARN/FAIL | N blocks, N concerns, N observations |
| 4. Risk & Recovery | PASS/WARN/FAIL | N blocks, N concerns, N observations |
| 5. Test Coverage | PASS/WARN/FAIL | N blocks, N concerns, N observations |
| 6. Execution | PASS/WARN/FAIL | N blocks, N concerns, N observations |

### Blocking Issues
(List each BLOCK finding with section number, description, and recommended fix)

### Concerns
(List each CONCERN finding with section number, description, and recommended fix)

### Observations
(List each OBSERVATION — informational only)

### Recommended Plan Changes
(Only if REVISE — specific, actionable changes to make in the plan file)

AskUserQuestion (always): After delivering the report, ask:

  • (A) Proceed with execution (only if APPROVE)
  • (B) Revise the plan — user will update and re-run review
  • (C) Discuss specific findings

Gate Classification Reference

BLOCK (any 1 → REJECT)

GateDescription
Phantom file pathPlan references a file to modify that doesn't exist
Building what existsPlan creates new code for functionality that already exists in the codebase
Diagram inconsistencyArchitecture diagram contradicts task descriptions or dependency JSON
Missing critical dependencyTask B uses task A's output but doesn't declare dependency on A
Unaddressed critical riskCritical failure mode with no mitigation (FULL only)
Zero test coverage on security pathAuth, payment, or data mutation path with no test coverage (FULL only)

CONCERN (3+ → REVISE)

GateDescription
Mode mismatchPlan mode doesn't match actual scope
Unnecessary scopeTasks that don't serve the stated goal
Missed reuseExisting code could be leveraged but plan builds new
Incomplete failure modesRealistic risks not covered in failure modes table
Vague mitigationsFailure mode mitigations that aren't actionable
Test gapsNew codepaths missing from test coverage map
All-happy-path criteriaTask acceptance criteria with no failure/edge cases
Over-constrained DAGTasks unnecessarily sequenced, reducing parallelism
Wrong agent assignmentTask assigned to agent outside its domain
XL task not decomposedLarge task that should be split into smaller ones
Effort mismatchEffort estimate doesn't match task complexity
P0 with dependenciesFoundation task that depends on other tasks

OBSERVATION (informational)

GateDescription
Reuse opportunitySimilar code exists that could be extended
Alternative approachSimpler way to achieve the same goal
Parallelism improvementDependency reordering for better parallelism
Naming suggestionNew file names that could better match conventions
Additional edge casesEdge cases worth considering but not blocking

Safety Rules

RuleReason
Never modify the plan fileThis is a review-only skill; user decides what to change
Never skip file path verificationPhantom paths are the #1 cause of agent failure
Never skip the checklistThe checklist file contains detailed checks per section
Never rubber-stampIf you found zero issues, you didn't look hard enough
Never block without evidenceEvery BLOCK must cite specific plan content + codebase evidence
Never invent findingsA clean section is valid — mark it PASS
Always read the full planPartial reads miss cross-task inconsistencies
Always verify against codebasePlan claims must be checked against actual files
AskUserQuestion only at defined pointsSteps 2, 4, and 7 — never ad-hoc

Common Rationalizations (All Wrong)

These are excuses. Don't fall for them:

  • "The plan looks comprehensive, I'll just approve it" → STILL verify file paths against the codebase; comprehensiveness ≠ correctness
  • "The plan was generated by a good skill, it must be right" → STILL check — automated plans have systematic blind spots
  • "Checking file paths is tedious" → STILL check every one; phantom paths are the #1 agent failure cause
  • "The architecture diagram is there, so it must be consistent" → STILL trace data flow and compare against tasks
  • "This is a small plan, QUICK mode is enough" → If there are 5+ tasks or EXPANSION mode, use FULL regardless of your instinct
  • "I should find something to justify my existence" → A clean review is more valuable than invented concerns
  • "The user is waiting, I'll skip the deep checks" → A bad plan wastes more time than a thorough review

Failure Modes

Failure Mode 1: Rubber-Stamping

Symptom: APPROVE verdict with zero findings on a non-trivial plan Fix: Every plan has at least observations. If you found nothing, re-run the codebase reality check — you likely skipped file path verification.

Failure Mode 2: Phantom Path Miss

Symptom: Plan approved, agents fail because files don't exist Fix: Use Glob for every filesToModify path. Don't trust the plan's claims — verify.

Failure Mode 3: Blocking on Style

Symptom: REJECT verdict based on diagram formatting or naming preferences Fix: Only BLOCK on functional issues (phantom paths, building what exists, missing dependencies). Style → OBSERVATION at most.

Failure Mode 4: Scope as Gatekeeper

Symptom: REJECT because the plan is "too ambitious" without functional issues Fix: Scope concerns are CONCERN, not BLOCK. Only BLOCK on verifiable functional problems.

Failure Mode 5: Missing the Forest

Symptom: Found 10 minor observations, missed that the plan builds an auth system that already exists Fix: Always run "search for existing code the plan proposes to build" before diving into details.


Quick Workflow Summary

STEP 0: LOAD PLAN
├── Read plans/<name>.md
├── Extract: title, mode, tasks, file paths
├── Auto-select review mode (FULL/QUICK)
└── Gate: Plan loaded, mode selected

STEP 1: CODEBASE REALITY CHECK
├── Glob every filesToModify path
├── Verify filesToCreate parent directories
├── Search for existing code plan proposes to build
├── Check naming conventions
└── Gate: Every path verified

STEP 2: SCOPE & STRATEGY
├── Validate mode matches actual scope
├── Check goal alignment (overview vs tasks)
├── Detect scope creep
├── Validate reuse audit table
├── [AskUserQuestion if critical scope concern]
└── Gate: Scope validated

STEP 3: ARCHITECTURE & INTEGRATION
├── Verify diagram completeness
├── Trace data flow
├── Check integration boundaries
├── Validate API contracts
├── Check dependency JSON consistency
└── Gate: Architecture validated
         ↓
    QUICK MODE EXITS → Step 7

STEP 4: RISK & RECOVERY (FULL only)
├── Audit failure modes table
├── Verify mitigations are actionable
├── Check error boundary coverage
├── Assess rollback strategy
├── [AskUserQuestion if critical unaddressed risk]
└── Gate: Risk coverage validated

STEP 5: TEST COVERAGE GAPS (FULL only)
├── Audit test coverage map completeness
├── Verify test type appropriateness
├── Check edge case coverage in criteria
├── Verify security-sensitive path coverage
└── Gate: Test coverage validated

STEP 6: EXECUTION FEASIBILITY (FULL only)
├── Calculate critical path / DAG efficiency
├── Verify agent assignments
├── Validate effort estimates
├── Check P0 foundation tasks
└── Gate: Execution plan validated

STEP 7: VERDICT & REPORT
├── Score each section (PASS/WARN/FAIL)
├── Determine verdict (APPROVE/REVISE/REJECT)
├── Render structured report
├── AskUserQuestion: Proceed / Revise / Discuss
└── Gate: Report delivered

Quality Checklist (Must Score 8/10)

Score yourself honestly before delivering the verdict:

Plan Reading (0-2 points)

  • 0 points: Skimmed the plan or read only task titles
  • 1 point: Read most sections but missed details (failure modes, test map)
  • 2 points: Read every section of the plan including JSON dependencies

Codebase Verification (0-2 points)

  • 0 points: Trusted file paths without verification
  • 1 point: Checked some paths but not all
  • 2 points: Glob'd every filesToModify path, searched for existing code

Section Coverage (0-2 points)

  • 0 points: Skipped sections or applied checklist superficially
  • 1 point: Completed required sections but rushed some checks
  • 2 points: Every checklist item in every section evaluated

Finding Quality (0-2 points)

  • 0 points: Findings without evidence or invented concerns
  • 1 point: Most findings supported but some lack codebase evidence
  • 2 points: Every finding cites specific plan content + codebase evidence

Verdict Accuracy (0-2 points)

  • 0 points: Verdict doesn't match findings (approved with blocks, rejected without blocks)
  • 1 point: Verdict matches but borderline cases not well-reasoned
  • 2 points: Verdict clearly follows from findings with correct gate classification

Minimum passing score: 8/10


Completion Announcement

When review is complete, announce:

Founder review complete.

**Quality Score: X/10**
- Plan Reading: X/2
- Codebase Verification: X/2
- Section Coverage: X/2
- Finding Quality: X/2
- Verdict Accuracy: X/2

**Verdict: APPROVE / REVISE / REJECT**
- Sections reviewed: [count]
- Blocks: [count]
- Concerns: [count]
- Observations: [count]

**Verification:**
- Every file path checked: [check]
- Every section completed: [check]
- Checklist used: [check]
- Findings evidence-based: [check]

**Next steps:**
[Based on verdict — execute / revise plan / discuss]

Integration with Other Skills

The plan-founder-review skill integrates with:

  • plan-to-task-list-with-dag — Generates the plan that this skill reviews
  • run-parallel-agents-feature-build — Executes the plan after this skill approves it
  • branch-review-before-pr — Reviews the code after agents execute the plan

Workflow:

plan-to-task-list-with-dag (generate plan)
       |
       v
plan-founder-review (THIS SKILL — review plan)
       |
       v
run-parallel-agents-feature-build (execute plan)
       |
       v
branch-review-before-pr (review code)
       |
       v
create-pr (ship)

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

General

browse

No summary provided by upstream source.

Repository SourceNeeds Review
General

find-bugs

No summary provided by upstream source.

Repository SourceNeeds Review
General

pr-retro

No summary provided by upstream source.

Repository SourceNeeds Review
General

commit

No summary provided by upstream source.

Repository SourceNeeds Review