Long Task Coordinator
Keep long-running work recoverable, stateful, and honest.
When to Use This Skill
Use this skill when the work:
- Spans multiple turns or multiple sessions
- Involves handoffs to workers, subagents, or background jobs
- Needs explicit waiting states instead of "still looking" updates
- Must survive interruption and resume from a durable state file
Skip this skill for small, single-turn tasks. Use planning-with-files when simple planning is enough and recovery logic is not the main concern.
Related Skills
- planning-with-files keeps multi-step work organized in files.
- workflow-orchestrator chains follow-up skills after milestones.
- long-task-coordinator makes long-running work resumable, auditable, and safe to hand off.
Core Rules
1. Create one source of truth
For any real long task, maintain one durable state file. Chat history is not a reliable state store.
The state file should capture at least:
- Goal
- Success criteria
- Current status
- Current step
- Completed work
- Next action
- Next checkpoint
- Blockers
- Active owners or workers
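The field list above can be sketched as a small record type. This is a minimal illustration, not the skill's own template: the `TaskState` name, the field names, and the JSON serialization are assumptions made for the example (the skill itself suggests Markdown state files).

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class TaskState:
    """Hypothetical record mirroring the minimum state-file fields."""
    goal: str
    success_criteria: List[str]
    status: str = "running"
    current_step: str = ""
    completed_work: List[str] = field(default_factory=list)
    next_action: str = ""
    next_checkpoint: str = ""
    blockers: List[str] = field(default_factory=list)
    owners: List[str] = field(default_factory=list)


def to_json(state: TaskState) -> str:
    # Serialize every field so the file alone is enough to resume.
    return json.dumps(asdict(state), indent=2)


def from_json(text: str) -> TaskState:
    # Rebuild the record from the saved text, not from chat history.
    return TaskState(**json.loads(text))
```

Round-tripping through `to_json` and `from_json` should return an equal record, which is a cheap check that nothing needed for recovery lives only in memory.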
2. Separate roles only when needed
Use the smallest role model that fits the task:
- Origin: owns the goal and acceptance criteria
- Coordinator: owns state, sequencing, and recovery
- Worker: executes bounded sub-work
- Watchdog: checks liveness and recovery only
Simple tasks can collapse these roles into one agent. Long or delegated tasks should make the split explicit.
3. Run every cycle in this order
For each coordination round:
READ -> RECOVER -> DECIDE -> PERSIST -> REPORT -> END
Do not report conclusions before the state file has been updated.
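One way to make the ordering hard to violate is to put it in a single function. The sketch below assumes the state is a plain dictionary with `status` and `next_action` keys, and that `load`, `recover`, `decide`, and `save` are callables supplied by the caller; all of these names are hypothetical.

```python
from typing import Callable, Dict

State = Dict[str, str]


def run_round(
    load: Callable[[], State],
    recover: Callable[[State], State],
    decide: Callable[[State], State],
    save: Callable[[State], None],
) -> str:
    state = load()           # READ the durable state
    state = recover(state)   # RECOVER: repair drift before any new work
    state = decide(state)    # DECIDE the next action
    save(state)              # PERSIST before anything is reported
    persisted = load()       # REPORT is built from what was actually saved
    return f"status={persisted['status']}; next={persisted['next_action']}"  # END
```

Because the report string is built from a fresh `load()` after `save()`, it cannot describe a conclusion that was never written down.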
4. Treat awaiting-result as a valid state
If a worker or background job was dispatched successfully, the task is not failing just because the result is not back yet.
Valid transitions include:
- running -> awaiting-result
- awaiting-result -> running
- running -> paused
- running -> complete
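The listed transitions can be encoded as a lookup so that an invalid status change fails loudly. This is a sketch only: the source says valid transitions "include" these four, so the set is not exhaustive, and the `ALLOWED_TRANSITIONS` and `apply_transition` names are assumptions.

```python
# Only the transitions named in the text; extend as the task requires.
ALLOWED_TRANSITIONS = {
    ("running", "awaiting-result"),
    ("awaiting-result", "running"),
    ("running", "paused"),
    ("running", "complete"),
}


def can_transition(current: str, new: str) -> bool:
    return (current, new) in ALLOWED_TRANSITIONS


def apply_transition(state: dict, new_status: str) -> dict:
    current = state["status"]
    if not can_transition(current, new_status):
        raise ValueError(f"transition {current} -> {new_status} is not allowed")
    # Return a copy so the caller decides when to persist the change.
    return {**state, "status": new_status}
```

Note that waiting is an ordinary transition here, not an error path.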
5. Non-terminal rounds must create real progress
A coordination round is only valid if it does at least one of the following:
- Dispatches bounded work
- Consumes new results
- Updates the current stage or decision
- Persists a new next step or checkpoint
- Performs explicit recovery
If nothing changed, do not pretend the task advanced.
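A coordinator could enforce this rule by recording what happened during the round and checking it before closing. The event names below are hypothetical labels for the five kinds of progress listed above.

```python
from typing import Set

# Hypothetical labels, one per kind of progress the rule accepts.
PROGRESS_EVENTS = {
    "dispatched_work",
    "consumed_result",
    "updated_stage",
    "persisted_next_step",
    "performed_recovery",
}


def round_made_progress(events: Set[str]) -> bool:
    # A round counts only if at least one recognized event occurred.
    return bool(events & PROGRESS_EVENTS)
```

A round whose events are empty, or contain only unrecognized labels, returns `False` and should be reported as unchanged rather than as an advance.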
6. Keep recovery separate from domain work
Recovery answers:
- Did execution drift from the saved state?
- Is the expected worker result still pending?
- Do we need to wait, retry, or re-dispatch?
Domain work answers:
- What should we build, analyze, or deliver next?
Recover first, then continue domain work.
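The recovery questions can be isolated in a function that knows nothing about the domain. The sketch below is one possible policy, not the skill's: the timeout rule, the `worker_alive` flag, and all parameter names are assumptions chosen for illustration.

```python
from enum import Enum


class RecoveryAction(Enum):
    PROCEED = "proceed"          # nothing to repair; continue domain work
    WAIT = "wait"                # result is pending and still within budget
    RETRY = "retry"              # worker is alive but overdue
    REDISPATCH = "re-dispatch"   # worker is gone; hand the work out again


def recover(
    status: str,
    result_available: bool,
    seconds_since_dispatch: float,
    timeout_seconds: float,
    worker_alive: bool,
) -> RecoveryAction:
    if status != "awaiting-result" or result_available:
        return RecoveryAction.PROCEED
    if seconds_since_dispatch < timeout_seconds:
        return RecoveryAction.WAIT
    return RecoveryAction.RETRY if worker_alive else RecoveryAction.REDISPATCH
```

Only when the function returns `PROCEED` does the coordinator move on to deciding what to build, analyze, or deliver next.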
Operating Workflow
Step 1: Decide whether the task needs coordination
Use this skill when at least one of the following is true:
- The task will outlive the current turn
- The task will hand off work to another execution unit
- The task needs checkpoints, polling, or scheduled follow-up
- The task has enough complexity that loss of state would be expensive
Step 2: Create or load the state file
Prefer a path that is easy to rediscover, such as:
- docs/<topic>-execution-plan.md
- docs/<topic>-state.md
- worklog/<topic>-state.md
If no durable state exists yet, create one from references/workflow.md.
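Rediscovery can be as simple as probing the suggested locations in a fixed order. In this sketch the helper names and the search order are assumptions; only the three path patterns come from the text above.

```python
from pathlib import Path
from typing import List, Optional, Union


def candidate_paths(topic: str, root: Union[str, Path] = ".") -> List[Path]:
    base = Path(root)
    return [
        base / "docs" / f"{topic}-execution-plan.md",
        base / "docs" / f"{topic}-state.md",
        base / "worklog" / f"{topic}-state.md",
    ]


def find_state_file(topic: str, root: Union[str, Path] = ".") -> Optional[Path]:
    # Return the first existing candidate, or None if a new file is needed.
    for path in candidate_paths(topic, root):
        if path.is_file():
            return path
    return None
```

A `None` result is the signal to create a fresh state file rather than to guess at prior progress.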
Step 3: Recover before acting
At the start of every new round:
- Read the state file
- Check whether the recorded next step still makes sense
- Confirm whether any delegated work returned
- Repair stale assumptions before new action
Step 4: Persist before reporting
After deciding the next action:
- Update the state file
- Record new status, owners, blockers, and checkpoint
- Only then report progress to the user or caller
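Persisting first is only useful if the write itself cannot leave a half-written file behind. The sketch below writes to a temporary file and swaps it into place with `os.replace`, then builds the report from the saved copy. The JSON format and the function names are assumptions for the example.

```python
import json
import os
import tempfile
from pathlib import Path


def persist_state(path: Path, state: dict) -> None:
    path.parent.mkdir(parents=True, exist_ok=True)
    # Write next to the target so the final rename stays on one filesystem.
    fd, tmp = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(state, handle, indent=2)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp, path)  # swap the new file in as a single step
    finally:
        if os.path.exists(tmp):
            os.unlink(tmp)  # clean up only if the swap did not happen


def persist_then_report(path: Path, state: dict) -> str:
    persist_state(path, state)
    saved = json.loads(path.read_text(encoding="utf-8"))
    return f"Status: {saved['status']}. Next: {saved['next_action']}."
```

If the process is interrupted before `os.replace`, the previous state file is still intact and the next round can recover from it.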
Step 5: Close the round honestly
End each round with one of these states:
- running
- awaiting-result
- paused
- blocked
- complete
The reported status should match the persisted status exactly.
Output Expectations
When using this skill, produce updates that are grounded in saved state:
- What status the task is in now
- What changed this round
- What is expected next
- What would unblock or complete the task
Acceptance Criteria
Treat the coordination work as complete only when all relevant items below are true:
- A durable state file exists in a predictable path
- The saved status matches the real task state
- Completed work, next action, and blockers are recorded explicitly
- Any delegated work has a named owner and a return condition
- The final report is derived from the persisted state, not from transient reasoning
If the task is not truly complete, end in running, awaiting-result, paused, or blocked rather than pretending the work is done.
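Some of these criteria can be checked mechanically before a round is allowed to close as complete. The sketch below covers only the items that are visible in the state itself; the dictionary layout, the `delegated` list, and its `owner` and `return_condition` keys are hypothetical.

```python
from typing import List


def unmet_criteria(state: dict) -> List[str]:
    problems = []
    # Completed work, next action, and blockers must be recorded explicitly,
    # even when the recorded value is empty.
    for key in ("completed_work", "next_action", "blockers"):
        if key not in state:
            problems.append(f"{key} is not recorded")
    # Every delegated job needs a named owner and a return condition.
    for job in state.get("delegated", []):
        name = job.get("name", "<unnamed>")
        if not job.get("owner"):
            problems.append(f"delegated work {name} has no owner")
        if not job.get("return_condition"):
            problems.append(f"delegated work {name} has no return condition")
    return problems
```

An empty list does not prove the task is done; it only shows that the recorded state is complete enough to be trusted as the basis for the final report.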
Anti-Patterns
Avoid:
- Reconstructing progress from memory instead of the state file
- Reporting a conclusion before saving it
- Marking waiting as failure
- Ending a round with no new action and no state change
- Mixing recovery checks with domain decisions in one fuzzy step
References
- references/workflow.md: Detailed workflow, state template, and recovery checklist