# Skill Create Flow
Build a new skill using a repeatable, high-signal workflow that separates methodology discovery from packaging, without requiring any other skills to be installed or invoked.
## Best for

This workflow excels at creating procedural agent skills where the value comes from following a structured methodology rather than raw knowledge retrieval.

Ideal skill types:

- Methodology-based: debugging, code reviews, writing specs, systematic problem-solving
- Decision-heavy: architecture decisions, UX design, business strategy, negotiation
- Multi-step workflows: a defined sequence of operations with clear deliverables
- Quality-focused: where "excellent vs mediocre" output is distinguishable

Less ideal for:

- Pure reference/lookup skills (use knowledge retrieval instead)
- Simple one-shot commands (use direct tool invocation instead)
- Skills requiring external dependencies (this flow produces standalone skills)
## Scope & Outputs

You are producing a new skill folder (with its own SKILL.md and optional bundled resources). This flow helps you:

- Narrow the intent until “excellent vs mediocre” is judgeable.
- Extract expert frameworks (when needed).
- Generate a production-ready, testable skill spec and artifacts.

Target artifacts (recommended):

- `SKILL.md`
- `examples/` (1 tiny example)
- `tests/` (3–5 eval prompts or scenarios)
- `index_entry.json` (if you maintain a skill index)
- `CHANGELOG.md` (optional but helpful for iteration)
## Decision Rules (Pick the Track)

### Track A — Non-technical / judgment-heavy (default if unsure)

Examples: writing, sales, hiring, product decisions, strategy, negotiation.

Use the full flow below (Steps 1–6).

### Track B — Technical / objective correctness dominates

Examples: code generation patterns, CLI automation, infrastructure scripts, data pipelines.

Usually shorten Step 2 (“expert frameworks”) and spend more time on Step 5 (“validation”).
## Workflow (Standalone)

### Step 1 — Lock the intent (narrowing)

Converge from broad → specific using this 5-layer funnel:

1. Domain: what domain is the skill for?
2. Context (5W1H): who/what/where/when/why/how constraints.
3. Comparative choice: pick the closest of 2–3 similar scenarios.
4. Boundaries (via negativa): confirm what is explicitly excluded.
5. Concrete anchor: one real, recent case with inputs + desired output.
Deliverable from Step 1:

- A 5–10 line “skill brief” covering:
  - target user + context
  - inputs the user will provide
  - outputs the skill must produce
  - what’s explicitly out of scope
  - quality bar (“excellent output looks like…”)
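As an illustration, a finished brief for a hypothetical PR-security-review skill might look like the sketch below; every detail in it is invented for the example.

```text
Skill brief: pr-security-review
- Target user + context: backend engineers reviewing pull requests before merge
- Inputs: a PR diff plus the repo's security guidelines
- Outputs: a findings list (severity, file:line, suggested fix) and a merge/block recommendation
- Out of scope: performance review, style nits, full penetration testing
- Quality bar: every finding cites a concrete line and a concrete fix; no generic "be careful" advice
```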
### Step 2 — Extract expert frameworks (masters-level)

Produce:

- 1–3 core frameworks/checklists experts use in this exact scenario.
- Common failure modes + counterexamples.
- Minimal decision rules (when to do A vs B; when to stop; when to ask for more info).

Stop condition: you can explain the method in a way that is specific, testable, and not generic advice.
### Step 3 — Draft the trigger contract (frontmatter)

Write a frontmatter description that triggers reliably:

- Include 3–6 concrete “when to use” phrases users will actually type.
- Include exclusions if you keep getting false triggers.
- Avoid “does everything” descriptions.

Template:

```yaml
name: <skill-name>
description: <What it does>. Use when <trigger 1>, <trigger 2>, or <trigger 3>. Avoid when <non-goal>.
```
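Filled in for a hypothetical PR-security-review skill (the name and trigger phrases below are invented for illustration), the contract might read:

```yaml
name: pr-security-review
description: >-
  Reviews pull request diffs for security issues. Use when asked to "review
  this PR", "check this diff for vulnerabilities", or "audit this change".
  Avoid when the request is about performance or code style.
```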
### Step 4 — Write SKILL.md (lean + procedural)

Keep SKILL.md short and procedural:

- Decision rules come first.
- The output contract (what files/sections are produced) is explicit.
- Long templates go into references/ if they exceed ~100 lines.

Minimum recommended sections:

- Scope & boundaries
- Inputs (what the user must provide)
- Procedure (step-by-step)
- Decision rules & failure handling
- Output contract
- Test prompts (what you will validate with)
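A minimal skeleton with those sections, sketched here as a starting point rather than a required layout (all placeholders are hypothetical):

```markdown
# <Skill Name>

## Scope & boundaries
Handles X. Does not handle Y or Z.

## Inputs
- <input 1 the user must provide>
- <input 2>

## Procedure
1. <step>
2. <step>

## Decision rules & failure handling
- If <condition>, do A; otherwise do B.
- If required inputs are missing, ask before proceeding.

## Output contract
Produces <file/section> containing <required properties>.

## Test prompts
- <prompt that must trigger>
- <prompt that must not trigger>
```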
### Step 5 — Create supporting artifacts (examples/evals/index/changelog)

A) Tiny example (`examples/<name>-example.txt`)

- ≤30 lines; show the most common invocation and expected output shape.

B) Evals / test prompts (`tests/evals_<name>.yaml`)

Include 3–5 scenarios:

- happy path
- missing input
- edge case
- out-of-scope
- safety/constraint case (if relevant)
Minimal template:

```yaml
cases:
  - name: happy_path
    prompt: |
      <a realistic user request that should trigger the skill>
    expect:
      - <must-have output property 1>
      - <must-have output property 2>
```
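Once an eval file exists, a few lines of Python can catch structural mistakes (missing prompt, empty expect list, duplicate names) before you run anything against a model. This is a hypothetical helper, not part of the flow itself; the inline `cases` data is illustrative and stands in for a parsed eval file.

```python
# Validate the structure of eval cases before running them against a model.
# The inline `cases` list mirrors what tests/evals_<name>.yaml would contain
# after parsing with a YAML loader; the data here is invented for illustration.
cases = [
    {"name": "happy_path",
     "prompt": "Review this PR for security issues",
     "expect": ["lists findings", "cites line numbers"]},
    {"name": "out_of_scope",
     "prompt": "What's the weather today?",
     "expect": ["declines and explains scope"]},
]

def validate_cases(cases):
    """Return a list of human-readable problems; an empty list means well-formed."""
    problems = []
    seen = set()
    for i, case in enumerate(cases):
        name = case.get("name")
        if not name:
            problems.append(f"case {i}: missing name")
        elif name in seen:
            problems.append(f"case {i}: duplicate name {name!r}")
        seen.add(name)
        if not case.get("prompt", "").strip():
            problems.append(f"case {i}: empty prompt")
        if not case.get("expect"):
            problems.append(f"case {i}: no expected properties")
    return problems

print(validate_cases(cases))  # [] when every case is well-formed
```

Running this as a pre-check keeps Step 6 focused on behavior rather than typos in the eval file.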
C) Index entry (`index_entry.json`), only if you keep an index:

```json
{
  "slug": "<skill-name>",
  "name": "<Human title>",
  "summary": "<=160 chars>",
  "entry": "skills/<skill-name>/SKILL.md"
}
```
D) `CHANGELOG.md` (optional)

Use it if you expect iteration; otherwise skip.
### Step 6 — Validate via simulations

Dry-run the new skill against your test prompts:

- Does it trigger when it should?
- Does it avoid false triggers?
- Does it produce the promised output contract?

If it fails, iterate:

- Tighten the frontmatter description.
- Move long content into references/.
- Add or adjust decision rules.
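As a rough first pass before any live dry-run, you can approximate trigger behavior with simple keyword overlap between the frontmatter description and each test prompt. This sketch assumes trigger phrases appear near-verbatim in prompts; real skill routing is semantic, so treat a mismatch as a hint to tighten the description, not a verdict. The description and prompts below are invented for the example.

```python
# Crude trigger check: flag test prompts that share no meaningful words with
# the skill's frontmatter description. Purely lexical; a real router matches
# semantically, so use this only to spot obvious gaps.
STOPWORDS = {"a", "an", "the", "for", "to", "of", "when", "use", "or", "and", "in", "on", "my"}

def keywords(text):
    """Lowercase words with trailing punctuation stripped, minus stopwords."""
    return {w.strip(".,?!").lower() for w in text.split()} - STOPWORDS

def overlap_report(description, prompts):
    """Map each prompt to the sorted keywords it shares with the description."""
    desc_words = keywords(description)
    return {p: sorted(desc_words & keywords(p)) for p in prompts}

description = "Reviews pull requests for security issues. Use when asked to review a PR or audit code."
prompts = [
    "Please review this PR for vulnerabilities",   # should trigger
    "What's the weather in Paris?",                # should not trigger
]
for prompt, shared in overlap_report(description, prompts).items():
    print(f"{'TRIGGER?' if shared else 'NO MATCH':8} {prompt} -> {shared}")
```

An empty overlap for a prompt that should trigger usually means the description is missing a phrase users actually type; add it in Step 3 and re-check.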
## Prompt Templates

### Template 1 — Step 1 (narrowing)
“I want a skill for: <topic>. Help me narrow it using this 5-layer funnel: domain → 5W1H → pick closest scenario → boundaries → one real recent case. Ask only the minimum questions (1–3 at a time).”
### Template 2 — Step 2 (expert frameworks)
“For the exact scenario we narrowed to, give me 1–3 expert-level frameworks/checklists, decision rules, and common failure modes. Keep it specific and testable (no generic advice).”
### Template 3 — Step 4/5 (write artifacts)

“Using the skill brief + frameworks, draft:

- SKILL.md (lean + procedural),
- 1 tiny example,
- 3–5 eval prompts,
- optional index entry + changelog.

Keep templates short; push long templates into references/.”
## Notes (Keep It Lean)

- Prefer decision rules over long explanations.
- Prefer one tiny example over many medium ones.
- If you feel tempted to add a long “how skills work” section, don’t: keep the flow operational.