SOP Runner Skill

Execute Standard Operating Procedures (SOPs) - predefined workflows for repeatable operational tasks. SOPs ensure consistent execution of complex procedures.

When to Use

✅ USE this skill when:

Documented procedures exist
Compliance requirements
Runbook execution
On-call incident response

❌ DON'T use this skill when:

Ad-hoc one-off tasks
Procedures still being defined
Highly dynamic situations

SOP Format

SOPs are defined in YAML:

sops/deploy-to-prod.yaml

name: Deploy to Production version: "1.0" description: Deploy application to production environment

preconditions:

type: branch value: main
type: ci_status value: success

steps:

id: notify name: Notify team type: message channel: alerts message: "Starting production deployment"
id: backup name: Create database backup type: command command: "pg_dump > backup_$(date +%s).sql" timeout: 300 onSuccess: continue onFailure: abort
id: deploy name: Deploy application type: command command: "kubectl apply -f k8s/" timeout: 600
id: verify name: Verify deployment type: http url: "https://api.example.com/health" expectStatus: 200 retries: 3
id: complete name: Notify completion type: message channel: alerts message: "Deployment completed successfully"

postconditions:

type: rollback_on_failure

Usage

Execute SOP

const { executeSOP } = require('/job/.pi/skills/sop-runner/runner.js');

const result = await executeSOP('deploy-to-prod', { params: { version: 'v2.1.0', environment: 'production' }, notifyChannel: 'alerts' });

console.log(result.status); // success|failed|aborted console.log(result.steps); // Step-by-step results

List Available SOPs

const { listSOPs } = require('/job/.pi/skills/sop-runner/runner.js');

const sops = await listSOPs(); console.log(sops); // ['deploy-to-prod', 'backup-database', 'incident-response']

Get SOP Details

const { getSOP } = require('/job/.pi/skills/sop-runner/runner.js');

const sop = await getSOP('deploy-to-prod'); console.log(sop.name); console.log(sop.steps.length);

Resume from Checkpoint

const { resumeSOP } = require('/job/.pi/sop-runner/runner.js');

const result = await resumeSOP('backup-database', { runId: 'run_abc123', fromStep: 'verify' });

Step Types

Command - Execute shell command

type: command command: "npm run build" timeout: 300 workingDir: "."

HTTP - Make HTTP request

type: http method: POST url: "https://api.example.com/deploy" headers: Authorization: Bearer $DEPLOY_TOKEN body: version: "${params.version}" expectStatus: 200

Message - Send notification

type: message channel: alerts template: "SOP {{name}} step {{stepId}} completed"

Agent - Delegate to AI agent

type: agent agent: reviewer task: "Review the deployment logs" context: "${previous_step.output}"

Approval - Require human approval

type: approval message: "Proceed with production deployment?" timeout: 600 # seconds approvers:

oncall-lead
sre-team

Wait - Delay execution

type: wait duration: 60 # seconds

Failure Handling

steps:

id: deploy type: command command: "kubectl apply -f k8s/" onFailure: action: rollback

or: abort|retry|max:3|continue
id: rollback type: command command: "kubectl rollout undo deployment/app" onlyIf: previous_step_failed

Execution Context

Each step has access to:

${params.*}
SOP parameters
${previous_step.output}
Previous step output
${env.VARIABLE}
Environment variables
${runId}
Current run ID
${startedAt}
Run start time

Example: Incident Response

sops/incident-response.yaml

name: Incident Response version: "2.0"

steps:

id: notify type: message channel: incidents message: "🚨 Incident detected! Starting response procedure"
id: gather_logs type: command command: "kubectl logs -l app=${params.service} --tail=200 > logs.txt"
id: analyze type: agent agent: analyst task: "Analyze logs for errors" context: "${previous_step.output}"
id: check_metrics type: http url: "https://prometheus/api/v1/query" params: query: "rate(http_requests_total{status='500'}[5m])"
id: escalation type: approval message: "Escalate to on-call engineer?" autoApprove: condition: "${check_metrics.value > 100}"
id: remediation type: agent agent: remediator task: "Suggest remediation steps" context: "${analyze.output}"
id: report type: message channel: incidents message: "Incident analysis complete"

Running SOPs

Execute SOP

node /job/.pi/skills/sop-runner/runner.js --sop deploy-to-prod

With parameters

node /job/.pi/skills/sop-runner/runner.js
--sop deploy-to-prod
--params '{"version":"v2.1.0","env":"prod"}'

Dry run

node /job/.pi/skills/sop-runner/runner.js
--sop deploy-to-prod
--dry-run

Resume failed run

node /job/.pi/skills/sop-runner/runner.js
--resume run_abc123

Output

{ runId: "run_abc123", sopName: "deploy-to-prod", status: "success", startedAt: "2026-02-25T13:30:00Z", completedAt: "2026-02-25T13:35:00Z", steps: [ { id: "notify", status: "success", output: "...", duration: 1200 }, { id: "backup", status: "success", output: "...", duration: 45000 }, { id: "deploy", status: "success", output: "...", duration: 120000 }, { id: "verify", status: "success", output: "...", duration: 5000 } ] }

sop-runner

Safety Notice

Copy this and send it to your AI assistant to learn

sops/deploy-to-prod.yaml

or: abort|retry|max:3|continue

sops/incident-response.yaml

Execute SOP

With parameters

Dry run

Resume failed run

Source Transparency

Related Skills

vector-memory

model-router

rss-reader

video-frames