Skill System Behavior — Spec Engine
Define what a skill does before writing how. This skill enforces a BDD-style workflow:
SKILL.spec.yaml → Acceptance Tests → Implementation → SKILL.behavior.yaml (generated)
(WHAT) (VERIFY) (HOW) (DOCUMENT)
Spec Format
Every skill's behavior is defined in SKILL.spec.yaml:
schema_version: 1
skill_name: my-skill
description: "What this skill does"
operations:
- name: do-something
intent: "Why this operation exists"
inputs: [...]
outputs: [...]
constraints: ["Must not...", "Must always..."]
expected_effects: [...]
acceptance_tests:
structural:
- id: manifest-valid
assert: "manifest exposes all operations"
behavioral:
- id: happy-path
given: "Valid input"
when: "do-something is called"
then: "Returns expected output"
Full schema: schema/spec-v1.yaml
Operations
create-spec
Scaffold a new SKILL.spec.yaml from the schema template.
- Read
schema/spec-v1.yamlfor the format - Ask user for: skill name, description, operations (name + intent + inputs/outputs)
- Generate
SKILL.spec.yamlin the target skill directory - Include placeholder acceptance tests (structural defaults + behavioral stubs)
validate-spec
Validate a SKILL.spec.yaml against the v1 schema.
- Read the target
SKILL.spec.yaml - Check required fields:
schema_version,skill_name,description,operations - For each operation: verify
name,intent, at least one input or output, constraints present - For acceptance_tests: verify structural tests include
manifest-validandscripts-exist - Report: PASS with summary, or FAIL with specific violations
Run via: python3 scripts/validate_spec.py <path-to-SKILL.spec.yaml>
verify-structural
Run structural acceptance tests against a built skill.
- Read the skill's
SKILL.spec.yamlfor declared operations - Read the skill's
SKILL.mdfor theskill-manifestblock - Verify:
- Every operation in spec has a matching operation in manifest
- Every entrypoint script referenced in manifest exists on disk
skills-index.jsonincludes this skill's capabilities
- Report: checklist with PASS/FAIL per test
generate-contract
Generate a SKILL.behavior.yaml (Mermaid DAG) from a completed spec + implementation.
- Read
SKILL.spec.yamlfor operations, inputs, outputs, constraints - Map operations → stages in a flowchart
- Map inputs → input nodes, outputs → output nodes
- Map constraints → error paths
- Write
SKILL.behavior.yamlwith embedded Mermaid diagram
This is the last step — only run after implementation is complete and tests pass.
Coverage Gate
Scan a project tree for scripts/modules that are missing behavior specs.
- Scan
project_dirfor scripts that match the configured patterns - Scan for
*.behavior.yamlfiles in the same tree - Report covered/uncovered scripts using exact-name and SKILL convention matching
- Emit a human-readable summary and last-line JSON report
Run via: python3 scripts/coverage_gate.py <project_dir> [--patterns ...] [--exclude ...]
Use this before declaring "behavior coverage is complete."
Workflow Integration
For skill-system-creator (updated flow)
1. Understand the skill (examples, triggers)
2. Create spec → skill-system-behavior:create-spec
3. Validate spec → skill-system-behavior:validate-spec
4. Write tests → define acceptance criteria from spec
5. Initialize skill → skill-system-creator:init
6. Implement skill → edit SKILL.md, scripts, references
7. Verify structural → skill-system-behavior:verify-structural
8. Generate contract → skill-system-behavior:generate-contract
9. Package → skill-system-creator:package
Spec as Single Source of Truth
SKILL.spec.yaml— the authoritative definition (human-written, reviewed)SKILL.md+ manifest — implementation (must conform to spec)SKILL.behavior.yaml— generated documentation (derived from spec)
Governance Tiers
Use this 3-tier hierarchy as guidance for organizing cross-file governance docs:
Tier 1 — System Architecture
├── SYSTEM_ARCHITECTURE.md (system boundaries + data flow)
└── PIPELINE_DAG.md (zoom-in companion, state machines)
Tier 2 — Behavior Contracts
├── *.behavior.yaml (single-script: inputs/outputs/stages/error_paths)
└── BEHAVIOR_RUNTIME.md (strictly cross-script scenarios only)
Tier 3 — Tests
└── test_*.py (each test traces to a specific behavior scenario)
For Tier 2 → Tier 3 traceability, use acceptance_tests.behavioral[].tested_by in SKILL.spec.yaml to point to validating tests.
For cross-tier references, use optional governance.tier and governance.traces_to in SKILL.spec.yaml to declare where a spec sits and what higher/lower artifacts it traces to.
Cross-File Boundary Rules
*.behavior.yamldefines single-script behavior only (inputs, outputs, stages, error paths for one script).BEHAVIOR_RUNTIME.mdis for cross-script scenarios only (2+ scripts interacting).- If a scenario involves only one script, place it in that script's
.behavior.yaml, not inBEHAVIOR_RUNTIME.md. - When extending to new phases, avoid full-copy duplication; use references/overlay patterns so shared flows stay canonical.
Guidelines
- Write specs in the language your team uses (English, Chinese, mixed — consistency within a spec)
- Constraints are invariants: if violated, the skill has a bug
intentdescribes WHY, not HOW — implementation details go in SKILL.md- Acceptance tests use Given/When/Then for behavioral, assert-statements for structural
- Generate contracts only after all tests pass — contracts are docs, not specs
{
"schema_version": "2.0",
"id": "skill-system-behavior",
"version": "2.0.0",
"capabilities": ["spec-create", "spec-validate", "spec-verify", "contract-generate", "coverage-gate", "skill-graph-build"],
"effects": ["fs.read", "fs.write", "proc.exec"],
"operations": {
"create-spec": {
"description": "Scaffold a new SKILL.spec.yaml from the schema template.",
"input": {
"skill_name": { "type": "string", "required": true, "description": "Name of the skill to spec" },
"skill_dir": { "type": "string", "required": false, "description": "Target directory (default: skills/<skill_name>)" }
},
"output": {
"description": "Path to created SKILL.spec.yaml",
"fields": { "path": "string" }
},
"entrypoints": {
"agent": "Read schema/spec-v1.yaml, then scaffold SKILL.spec.yaml in skill_dir"
}
},
"validate-spec": {
"description": "Validate a SKILL.spec.yaml against the v1 schema.",
"input": {
"spec_path": { "type": "string", "required": true, "description": "Path to SKILL.spec.yaml" }
},
"output": {
"description": "Validation result with pass/fail and violations list",
"fields": { "status": "string", "violations": "array" }
},
"entrypoints": {
"unix": ["python3", "scripts/validate_spec.py", "{spec_path}"],
"windows": ["python", "scripts/validate_spec.py", "{spec_path}"]
}
},
"verify-structural": {
"description": "Run structural acceptance tests against a built skill.",
"input": {
"skill_dir": { "type": "string", "required": true, "description": "Path to skill directory" }
},
"output": {
"description": "Test results checklist",
"fields": { "status": "string", "results": "array" }
},
"entrypoints": {
"unix": ["python3", "scripts/verify_structural.py", "{skill_dir}"],
"windows": ["python", "scripts/verify_structural.py", "{skill_dir}"]
}
},
"generate-contract": {
"description": "Generate SKILL.behavior.yaml (Mermaid DAG) from a completed spec.",
"input": {
"skill_dir": { "type": "string", "required": true, "description": "Path to skill directory with SKILL.spec.yaml" }
},
"output": {
"description": "Path to generated SKILL.behavior.yaml",
"fields": { "path": "string" }
},
"entrypoints": {
"unix": ["python3", "scripts/generate_contract.py", "{skill_dir}"],
"windows": ["python", "scripts/generate_contract.py", "{skill_dir}"]
}
},
"coverage-gate": {
"description": "Scan project for scripts/modules missing behavior specs.",
"input": {
"project_dir": { "type": "string", "required": true, "description": "Root directory to scan" },
"patterns": { "type": "string", "required": false, "description": "Comma-separated glob patterns (default: **/*.py,**/*.sh)" },
"exclude": { "type": "string", "required": false, "description": "Patterns to exclude (default: test_*,__pycache__,.*)" }
},
"output": {
"description": "Coverage report with covered/uncovered lists and percentage",
"fields": { "covered": "array", "uncovered": "array", "coverage_pct": "number" }
},
"entrypoints": {
"unix": ["python3", "scripts/coverage_gate.py", "{project_dir}"],
"windows": ["python", "scripts/coverage_gate.py", "{project_dir}"]
}
},
"build-graph": {
"description": "Build a cross-skill dependency graph from SKILL.spec.yaml files.",
"input": {
"skills_dir": { "type": "string", "required": false, "description": "Root directory to scan (default: ./skills)" },
"output_format": { "type": "string", "required": false, "description": "mermaid, json, or sql" }
},
"output": {
"description": "Graph output in the requested format",
"fields": { "graph": "string|json" }
},
"entrypoints": {
"unix": ["python3", "scripts/build_graph.py", "--skills-dir", "{skills_dir}", "--format", "{output_format}"]
}
},
"refresh-projections": {
"description": "Refresh behavior projection bundles from SKILL.spec.yaml files for behavior_* tables.",
"input": {
"skills_dir": { "type": "string", "required": false, "description": "Root directory to scan (default: ./skills)" },
"output_format": { "type": "string", "required": false, "description": "json or sql" }
},
"output": {
"description": "Projection bundle or SQL statements for behavior_* refresh",
"fields": { "bundle": "json|string" }
},
"entrypoints": {
"unix": ["python3", "scripts/refresh_behavior_projections.py", "--skills-dir", "{skills_dir}", "--format", "{output_format}"]
}
}
},
"stdout_contract": {
"last_line_json": true
}
}