Spec Executor

Execution workflow centered on state navigation and state assurance. At any moment, know where you are, what comes next, and how to recover if interrupted.

⚠️ Dependency Notice

This skill CANNOT be used standalone. It depends on spec-workflow to produce the plan and tasks.md tracker before execution begins.

Skill	Phase	Responsibility
`spec-workflow`	Planning	Requirements → Design → Task breakdown
`spec-executor`	Execution	State navigation → Task execution → State recovery

Handoff example:

spec-workflow output:
  docs/login-feature/tasks.md
  docs/login-feature/requirements.md
  docs/login-feature/design.md

spec-executor input:
  Read docs/login-feature/tasks.md
  Execute tasks in order
  Update tasks.md after each task

Do NOT use this skill for work that has not been planned by spec-workflow.

Validation: Before executing, confirm tasks.md exists and contains valid task entries (Scope + Verification fields). If the file is missing or malformed, stop and ask the user.
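
The validation step can be sketched as a small check. This is a minimal sketch, assuming the tracker layout shown later in this document; the function names `validate_tracker_text` and `validate_tracker` and the regular expressions are hypothetical and not part of the skill.

```python
import re
from pathlib import Path

# Hypothetical patterns for the tracker layout described in this document.
TASK_LINE = re.compile(r"^- \[[ ~✓⏭]\] (.+)$")
FIELD_LINE = re.compile(r"^\s+- ([A-Za-z ]+):")
REQUIRED_FIELDS = ("Scope", "Verification")  # the two fields the rule names


def validate_tracker_text(text):
    """Return a list of problems; an empty list means the tracker looks usable."""
    tasks = {}      # task name -> set of field names found beneath it
    current = None
    for line in text.splitlines():
        task_match = TASK_LINE.match(line)
        if task_match:
            current = task_match.group(1)
            tasks[current] = set()
            continue
        field_match = FIELD_LINE.match(line)
        if field_match and current is not None:
            tasks[current].add(field_match.group(1))
    if not tasks:
        return ["no task entries found"]
    return [
        f"task '{name}' is missing {required}"
        for name, found in tasks.items()
        for required in REQUIRED_FIELDS
        if required not in found
    ]


def validate_tracker(path):
    """Validate a tracker file; a missing file is reported as a problem."""
    tracker = Path(path)
    if not tracker.is_file():
        return [f"{path} not found"]
    return validate_tracker_text(tracker.read_text(encoding="utf-8"))
```

A non-empty result is the signal to stop and ask the user rather than to repair the file silently.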

Quick Start

First time using this skill with a confirmed tasks.md:

1. Read tasks.md → find first [ ] task
2. Mark it [~] → execute per Scope description
3. When done: mark [✓], fill Verification, stage changes
4. Show diff → wait for user review and explicit commit approval
5. User says "next" → repeat from step 1

Session compressed? Jump to Session Recovery.

State Navigation

Where Am I?

Received user instruction
│
├─ Is the scope clear and small? ──→ YES → Execute directly (Simple Task)
│                                    NO
├─ Does it need design/architecture? → YES → Plan first → Write tasks.md → Execute
│                                    NO (unclear)
└─ Investigate → Reclassify

Current State	What To Do Next	Key Output
Just received instruction	Classify task (simple / complex / exploratory)	Decision: plan or execute directly
Planning in progress	Write design doc + tasks.md → wait for user confirmation	Confirmed `tasks.md`
Ready to implement	Read `tasks.md`, mark first pending task as `[~]`	Task in progress
Just finished a task	Verify → update `tasks.md` to `[✓]` → commit → next task	Updated tracker
Hit an error	Stop → diagnose → fix → re-verify → resume	Documented fix
Session interrupted	Read `tasks.md` → verify last done task → resume first pending	Recovered context
User inserts unrelated query	Handle query → return to previous state without losing position	Continuity preserved

State Transition Rules

[Classified] --(simple)--> [Executing] --(task done)--> [Update tracker] --(more tasks)--> [Executing]
      │                              │                                    └─(all done)--> [Complete]
      └─(complex)--> [Planning] --(confirmed)--> [Executing]
                             └─(rejected)--> [Revise plan]

[Executing] --(error)--> [Diagnosing] --(fixed)--> [Update tracker] --> [Executing]
                                    └─(stuck)--> [Ask user]

[Any state] --(unrelated user query)--> [Handle query] --> [Return to prior state]

Critical rule: You cannot transition from [Executing] to the next task without updating the tracker. Updating tasks.md is the definition of task completion.
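
The transition rules above can be read as a lookup table. The sketch below is one possible encoding, with hypothetical names (`State`, `TRANSITIONS`, `next_state`); it omits the unrelated-query detour, which returns to the prior state unchanged. There is deliberately no entry that leaves Executing for the next task without passing through Update tracker.

```python
from enum import Enum, auto


class State(Enum):
    CLASSIFIED = auto()
    PLANNING = auto()
    REVISE_PLAN = auto()
    EXECUTING = auto()
    UPDATE_TRACKER = auto()
    DIAGNOSING = auto()
    ASK_USER = auto()
    COMPLETE = auto()


# (current state, event) -> next state, copied from the diagram.
TRANSITIONS = {
    (State.CLASSIFIED, "simple"): State.EXECUTING,
    (State.CLASSIFIED, "complex"): State.PLANNING,
    (State.PLANNING, "confirmed"): State.EXECUTING,
    (State.PLANNING, "rejected"): State.REVISE_PLAN,
    (State.EXECUTING, "task done"): State.UPDATE_TRACKER,
    (State.EXECUTING, "error"): State.DIAGNOSING,
    (State.UPDATE_TRACKER, "more tasks"): State.EXECUTING,
    (State.UPDATE_TRACKER, "all done"): State.COMPLETE,
    (State.DIAGNOSING, "fixed"): State.UPDATE_TRACKER,
    (State.DIAGNOSING, "stuck"): State.ASK_USER,
}


def next_state(state, event):
    """Return the next state, or raise ValueError for a transition not in the table."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"no transition from {state.name} on '{event}'") from None
```

Because the table is the only source of legal moves, skipping the tracker update surfaces as an error instead of silent drift.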

Task Classification

Before writing code, classify the task:

Type	Criteria	Planning Required
Complex	New feature, architecture change, multi-file refactor, API design	Yes — use `spec-workflow` to produce design doc + tasks.md
Simple	Single-file change, config tweak, clear bug fix with known scope	No — brief explanation, then implement
Exploratory	Unclear scope, needs investigation to understand the problem	Investigate first, then reclassify

Boundary rule: If a "simple" task grows to need design decisions, interface changes, or cross-file coordination, stop immediately and upgrade to "complex" with planning.

Task Type Reference

User Intent	Type	Action
New feature, architecture change, large refactor	Complex	Trigger `spec-workflow` for full planning
Bug fix with ticket/reference, interface alignment	Fix / Complex	Trigger `spec-workflow`, Phase 1-2 may be simplified
Single-file change, config tweak, clear small fix	Simple	Brief explanation, then implement directly
Explain code, check logs, `updatecode`, info query	Routine	Execute directly, no Spec

Session split rule: If tasks.md has > 8 pending tasks or estimated work > 2 hours, suggest /clean and continue in a new session.

Must Ask the User

Stop and ask before proceeding when:

Situation	Question to Ask
Unclear requirement background	"What problem should this code solve?"
Design conflict (A vs B)	"Both approaches have tradeoffs — what's your priority?"
Deleting code	"This code appears used in X — are you sure you want to delete it?"
Deleting code — verification	Run `grep -r "ClassName" --include="*.java" --include="*.xml"` to confirm no downstream references before deleting
Scope expands beyond original	"This seems broader than the original request — should we expand the design?"
Better approach found	"I see a better refactoring path — can you explain the original design intent?"
Uncommitted user changes exist	"I see you have uncommitted changes — should I work from the latest code?"
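
Where grep is not available, the reference check for deletions can be approximated in a few lines. This is a sketch only: `find_references` is a hypothetical helper, and it does plain substring matching over `.java` and `.xml` files, which can miss reflective or string-built references.

```python
from pathlib import Path


def find_references(root, name, suffixes=(".java", ".xml")):
    """Return (path, line number, line text) for every line that mentions name."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for number, line in enumerate(text.splitlines(), start=1):
            if name in line:
                hits.append((str(path), number, line.strip()))
    return hits
```

An empty result supports the deletion; any hit is a reason to ask the user before proceeding.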

State Assurance

The Tracker (`tasks.md`)

tasks.md is the primary state reference. If memory, conversation, and tracker disagree, pause and reconcile with the user — never silently override user intent.

Location: docs/{feature-name}/tasks.md (or project-defined location).

Format:

- [ ] Task name (≤10 words)
  - Scope: files and specific methods/fields changed
  - Affected: files that need sync changes due to this task
  - Verification: how to verify (compile, test, grep, etc.)
  - Commit: the actual commit message used
  - User corrections: 0 | Compression recoveries: 0
  - Notes: dependencies, blockers, scope changes

Critical field — Scope: Must be precise down to method/field level. After session compression, recovery depends entirely on this field to reconstruct what was done and what remains.

Example:

# Implementation Plan

- [✓] 1. Add user authentication endpoint
  - Scope: `AuthController.java` add `login()` method; `AuthService.java` add `authenticate()`
  - Affected: `UserRepository.java` (add `findByUsername`)
  - Verification: `./gradlew :api:compileJava` passes
  - Commit: `feat(auth): add login endpoint`
  - User corrections: 0 | Compression recoveries: 0
  - Notes: depends on task 0 (DB schema)

- [~] 2. Add JWT token generation
  - Scope: `JwtUtil.java` add `generateToken()`; `AuthService.java` inject JwtUtil
  - Affected: `application.yml` (add jwt.secret)
  - Verification: —
  - Commit: —
  - User corrections: 0 | Compression recoveries: 0
  - Notes: —

- [ ] 3. Add auth middleware
  - Scope: ...

Status values:

[ ] — Not started
[~] — In progress (set this before editing code)
[✓] — Done (set this before committing or starting next task)
[⏭] — Skipped
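
To make the format concrete, here is a minimal parser sketch for the layout above. The `Task` class, `parse_tasks`, and the status names are assumptions chosen for illustration; the skill itself does not prescribe a parser.

```python
import re
from dataclasses import dataclass, field

# Status marks from this document mapped to hypothetical readable names.
STATUS_BY_MARK = {" ": "not_started", "~": "in_progress", "✓": "done", "⏭": "skipped"}
TASK_LINE = re.compile(r"^- \[(.)\] (.+)$")
FIELD_LINE = re.compile(r"^\s+- (Scope|Affected|Verification|Commit|Notes): ?(.*)$")


@dataclass
class Task:
    name: str
    status: str
    fields: dict = field(default_factory=dict)


def parse_tasks(text):
    """Parse tracker text into Task objects, in file order."""
    tasks = []
    for line in text.splitlines():
        task_match = TASK_LINE.match(line)
        if task_match and task_match.group(1) in STATUS_BY_MARK:
            tasks.append(Task(task_match.group(2), STATUS_BY_MARK[task_match.group(1)]))
            continue
        field_match = FIELD_LINE.match(line)
        if field_match and tasks:
            # Attach the field to the most recent task line.
            tasks[-1].fields[field_match.group(1)] = field_match.group(2).strip()
    return tasks
```

Lines that match neither pattern, such as a heading or the counters line, are ignored by this sketch.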

Correction tracking:

User corrections ≥ 2: Escalate — record the pattern in Notes and adjust approach
Compression recoveries ≥ 6: Suggest user /clean and start fresh
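
The two thresholds can be checked mechanically from the counters line. A small sketch follows; `parse_counters`, `tracking_actions`, and the action labels are hypothetical names.

```python
import re

USER_CORRECTION_LIMIT = 2        # "User corrections ≥ 2"
COMPRESSION_RECOVERY_LIMIT = 6   # "Compression recoveries ≥ 6"
COUNTER_LINE = re.compile(r"User corrections: (\d+) \| Compression recoveries: (\d+)")


def parse_counters(line):
    """Extract (user corrections, compression recoveries) from a tracker line."""
    match = COUNTER_LINE.search(line)
    if match is None:
        raise ValueError("line does not contain the two counters")
    return int(match.group(1)), int(match.group(2))


def tracking_actions(user_corrections, compression_recoveries):
    """Return the actions the thresholds call for, in the order listed above."""
    actions = []
    if user_corrections >= USER_CORRECTION_LIMIT:
        actions.append("escalate")
    if compression_recoveries >= COMPRESSION_RECOVERY_LIMIT:
        actions.append("suggest_clean")
    return actions
```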

Real-Time Update Discipline

Event	Action	Why
Start a task	Change `[ ]` → `[~]`	Prevents duplicate work
Finish a task	Change `[~]` → `[✓]`, fill Verification + Commit	Defines completion
User changes scope	Add note, increment User corrections	Prevents drift
Session compresses	Mark `[~]` → `[ ]` if uncertain	Prevents false progress

This takes precedence over compilation, testing, and starting the next task. A task whose tracker entry is not updated is considered NOT done.
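
A status change is a one-character edit, which keeps it cheap enough to do before anything else. The sketch below assumes each task name is unique within the tracker; `set_status` is a hypothetical helper.

```python
import re

VALID_MARKS = (" ", "~", "✓", "⏭")


def set_status(text, task_name, new_mark):
    """Return tracker text with the status mark of the named task replaced."""
    if new_mark not in VALID_MARKS:
        raise ValueError(f"unknown status mark: {new_mark!r}")
    pattern = re.compile(
        r"^(- \[)[ ~✓⏭](\] " + re.escape(task_name) + r")$", re.MULTILINE
    )
    updated, count = pattern.subn(
        lambda m: m.group(1) + new_mark + m.group(2), text
    )
    if count != 1:
        # Zero or several matches means the tracker is ambiguous: stop and ask.
        raise ValueError(f"expected exactly one task named {task_name!r}, found {count}")
    return updated
```

Raising when the task is not found exactly once avoids recording progress against the wrong entry.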

Timeliness Guarantee

tasks.md must be updated in real time — never retroactively:

After completing a task → update status, verification, and commit message immediately
After starting a task → change [ ] to [~] before editing code
After session compression → mark [~] → [ ] if uncertain about state
After user adds/changes scope → record in Notes immediately
Never reconstruct progress after compression without re-verifying code state first

Compression Recovery

tasks.md is the operational basis for recovery:

Step	How to use `tasks.md`
Read and confirm progress	Scan statuses in order; identify boundary between `[✓]` and `[ ]`
Verify code state	Only verify the last `[✓]` row (state boundary). Re-run its verification method. If it fails, state is stale — pause and reconcile
Determine next step	Find first `[ ]` or `[~]` row; execute from its Scope description

Session Recovery (After Interruption)

When context is compressed and the session continues:

1. Report count — state "This session has recovered from compression N times"
2. Read tasks.md — scan statuses, identify the boundary between [✓] and [ ]
3. Validate tracker — verify tasks.md structure is intact (expected fields, no corruption). If malformed, ask user before proceeding.
4. Verify boundary — re-run the verification for the last [✓] task. If it fails, the state is stale; pause and reconcile.
5. Resume — continue the task already marked [~], or mark the first [ ] task as [~], and work from its Scope description.
6. Count recoveries — increment Compression recoveries. If the count is ≥ 6, stop execution, leave the current task marked [~] in tasks.md, and wait for the user to run /clean.
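
Finding the state boundary is the mechanical part of recovery. This sketch, with a hypothetical `recovery_plan` function, returns the last done task before the first pending one (the task to re-verify) and the task to resume; skipped tasks are passed over.

```python
import re

TASK_LINE = re.compile(r"^- \[(.)\] (.+)$")


def recovery_plan(text):
    """Return which task to re-verify and which task to resume."""
    last_done = None
    resume = None
    for line in text.splitlines():
        match = TASK_LINE.match(line)
        if match is None:
            continue
        mark, name = match.group(1), match.group(2)
        if mark in (" ", "~"):
            resume = name   # first pending or in-progress task
            break
        if mark == "✓":
            last_done = name
    return {"verify": last_done, "resume": resume}
```

The plan only says where to look; re-running the verification command for the boundary task is still required before resuming.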

Context Switching

Scenario	Rule
User asks unrelated question	Handle it, do NOT modify tracker, return to prior task afterward
User starts a new feature	Pause current task (mark `[~]`), create new tracker for new work
User changes direction mid-task	Treat as error response: stop, document in Notes, wait for confirmation
User corrects design assumption	Not a context switch — follow Error Response Protocol. Update `tasks.md` Notes and `design.md` if assumptions changed

Key principle: One tasks.md per feature. Different features must have different tracker files.

Execution Rules

Incremental Delivery

Implement one small task at a time
Max ~300 lines changed per task
Stop for review after each task unless user explicitly authorizes continuous mode
Update tasks.md status immediately after each task — before committing or starting the next

Task Size Exemptions

Scenario	Relaxed Limit	Required Condition
User authorizes continuous mode	≤ 500 lines	User explicitly says "continuous implement" or "finish the rest"
Pure deletion	No limit	Only removing code, no new logic, verified no downstream references
Batch rename/migration	≤ 500 lines	Mechanical replacement (e.g., package rename), zero logic change
Config/constant extraction	≤ 400 lines	Extracting scattered magic values to unified constants, no behavior change

If exceeding limit: Must split into smaller tasks or explain split plan to user and get approval.
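
The limits above reduce to a lookup. In this sketch the exemption keys and the `within_limit` function are hypothetical, and the approximate 300-line default is treated as a hard cap for simplicity.

```python
BASE_LIMIT = 300  # default cap per task ("~300 lines" treated as exact here)

# Relaxed limits from the exemption table; None means no limit.
EXEMPTION_LIMITS = {
    "continuous_mode": 500,
    "pure_deletion": None,
    "batch_rename": 500,
    "constant_extraction": 400,
}


def within_limit(lines_changed, exemption=None):
    """Return True if the change size fits the applicable limit."""
    limit = BASE_LIMIT if exemption is None else EXEMPTION_LIMITS[exemption]
    return limit is None or lines_changed <= limit
```

A False result means the task should be split, or the split plan explained to the user for approval.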

Continuous Mode Authorization

Only proceed without stopping for review when user explicitly says:

"You can continuous implement"
"Finish the rest without asking"
"Batch execute the remaining tasks"

NOT authorization: "OK", "Continue", "Go ahead", "Sure" — these confirm the current task only. Must ask: "Do you mean I can continuously implement remaining tasks, or just confirming this one?"

When in continuous mode:

Each task must still be independently verified (compile/type-check)
Each task must still be independently committed
tasks.md must be updated before starting the next task
User can interrupt at any time — stop immediately when they do
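
The distinction between explicit authorization and a plain confirmation can be illustrated with a phrase check. This is a rough sketch: `is_continuous_authorization` and the phrase list are assumptions drawn from the examples above, and substring matching cannot detect negation such as "do not finish the rest".

```python
# Key phrases taken from the authorizing examples in this section.
AUTHORIZING_PHRASES = ("continuous implement", "finish the rest", "batch execute")


def is_continuous_authorization(reply):
    """Return True only when the reply contains an explicit authorizing phrase."""
    normalized = " ".join(reply.lower().split())
    return any(phrase in normalized for phrase in AUTHORIZING_PHRASES)
```

Anything that returns False, including "OK" or "Continue", confirms the current task only and calls for the clarifying question.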

Risk Prediction (Before Starting Each Task)

Spend 30 seconds on pre-flight risk analysis:

1. Match task type — What category is this? (API alignment / dependency injection / method refactor / DTO addition / deletion)
2. Extract lessons — What has gone wrong with this type of task before?
3. Pre-verify — Run the recommended verification before writing code:
   - API alignment: Extract signatures with language-specific tools, diff against interface
   - Deletion: grep -r "targetName" to confirm no downstream references
   - Method refactor: grep -r "oldMethodName" to find all call sites
   - Dependency change: Verify all injection points and mock references
   - DTO addition: Check serialization paths and downstream consumers
   - Mock update: Verify test files reference the correct type (interface vs concrete)
   - Package rename: grep -r "old.package.name" to find all imports

Review Rules

Default: Stop and wait for user review after every task.

Review exemptions (no need to stop, but still run quality gates):

Scenario	Condition
User authorized continuous mode	Explicit continuous implement statement
Pure formatting	Only whitespace, line breaks, import sorting — zero logic change
Pure test data adjustment	Only test inputs or expected values — no production code touched
Emergency hotfix	User explicitly says "urgent fix" and change is ≤ 10 lines

Code Quality Gates

Before marking any task [✓]:

Compiles — no build/type errors. Prefer quiet single-module compilation.
- Build tool available: run ./gradlew :module:compileJava --quiet or equivalent
- Build tool unavailable: use javap -public to extract signatures, or grep/diff to verify definitions align
No dead code — unused imports, helpers, or fields removed
No magic values — hardcoded literals (appearing 2+ times) extracted to named constants
Size limits — methods ≤ 30 lines, nesting ≤ 2 levels, files ≤ 500 lines
Naming — booleans use is/has/should prefixes, names are self-explanatory
Task status updated — tasks.md reflects current state with verification and commit message
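
Two of these gates are mechanical enough to check with a script. The sketch below covers only the file-length limit and the boolean naming prefix; the function names are hypothetical, and the naming check is a heuristic for camelCase or snake_case identifiers.

```python
import re

MAX_FILE_LINES = 500
# A prefix counts only when followed by an uppercase letter or underscore,
# so "isActive" and "is_valid" pass while "island" does not.
BOOLEAN_PREFIX = re.compile(r"^(is|has|should)[A-Z_]")


def file_within_limit(source):
    """Return True if the source text has at most MAX_FILE_LINES lines."""
    return len(source.splitlines()) <= MAX_FILE_LINES


def is_well_named_boolean(name):
    """Return True if a boolean identifier uses an is/has/should prefix."""
    return bool(BOOLEAN_PREFIX.match(name))
```

Method length, nesting depth, and dead-code checks need a real parser or linter and are outside this sketch.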

Pre-Commit Checklist

Before every commit:

Change scope matches the task (or fits within an exemption)
All quality gates passed
No secrets or credentials in diff
Diff reviewed — user has seen the changes and approved
Commit message provided and follows project convention
tasks.md updated with status, commit message, and verification
Self-check: Did I follow the risk prediction? Did I avoid patterns known to fail?

Do not commit if self-check fails.
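
The checklist can be modeled as a record in which every item defaults to unchecked, so a commit is blocked until each one is set deliberately. `PreCommitChecklist` and its field names are hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass
class PreCommitChecklist:
    scope_matches_task: bool = False
    quality_gates_passed: bool = False
    no_secrets_in_diff: bool = False
    diff_approved_by_user: bool = False
    commit_message_ready: bool = False
    tracker_updated: bool = False
    self_check_passed: bool = False

    def failures(self):
        """Return the names of all items that are still unchecked."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]

    def may_commit(self):
        """Return True only when every item is checked."""
        return not self.failures()
```

Listing the failing items, rather than returning a bare boolean, gives the user something specific to review.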

Error Response Protocol

When compilation fails, tests fail, tools error, or user points out a mistake:

1. Stop — no further edits until resolved
2. Locate — identify affected files and methods from error output
3. Verify root cause — use tools (grep, diff, compiler) to confirm. Never rely on reading alone.
4. Fix — apply minimal corrective change
5. Document — distinguish the nature of the fix:
   - Code error: Record cause and fix in tasks.md Notes
   - User corrected design assumption: Update tasks.md Notes and sync changes to design.md
6. Re-verify — run full quality gates and pre-commit checklist
7. Resume — only after explicit user confirmation

Tool Verification Rules

Agent is for exploration (reading code, gathering context) — never for precision verification
Bash is for precision (grep, diff, compiler output, line counts) — always cross-check agent conclusions
Never judge signature alignment, overload resolution, or parameter types by reading alone
Never trust "looks correct" from an agent without independent verification

Core Principles

State over speed — document progress before moving on
Verify with tools — grep, diff, compiler; never eyeball
Stop and ask — when uncertain, pause for clarification
Tracker is truth — if memory and tracker disagree, the tracker wins; if the conversation also conflicts, pause and reconcile with the user
Agent is not evidence — agent conclusions require Bash cross-check
Plan before complex work — multi-file or architectural changes need design first
