# Skill Optimizer

Reduce skill token cost without losing coverage. Every token in the SKILL.md body is paid per conversation — references/ files are loaded on demand.

## Optimization Workflow

### Phase 1: Analyze
Measure the current skill before changing anything.
- Count SKILL.md body lines (excluding frontmatter) and estimate tokens (~4.5 tokens/line for mixed code/prose).
- Count description characters.
- List every references/ file with its line count.
- Identify duplication: for each body section (at any heading level), check whether the same concept or procedure is also covered in a reference file. Count those body lines and divide by total body lines to get the overlap percentage.
- List nouns from the description that appear verbatim in a sibling skill's description — these need domain qualification in Phase 2.

If the skill has no references/ directory, optimization may require creating reference files first. See the playbook for guidance on this edge case.
Output a table:
| Metric | Current |
|---|---|
| Description chars | ??? |
| Body lines | ??? |
| Body tokens (est.) | ??? |
| Duplication % | ??? |
| Reference files | ??? |
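
These measurements can be automated. A minimal sketch, assuming a standard `---`-delimited YAML frontmatter block with a single-line `description:` field (the `measure_skill` helper and its parsing heuristics are illustrative, not part of this skill):

```python
import re

def measure_skill(skill_md: str) -> dict:
    """Estimate the Phase 1 metrics for the contents of a SKILL.md file."""
    # Split off YAML frontmatter delimited by --- lines, if present.
    match = re.match(r"^---\n(.*?)\n---\n(.*)$", skill_md, re.DOTALL)
    frontmatter, body = (match.group(1), match.group(2)) if match else ("", skill_md)

    # Description characters: length of the YAML description value
    # (naive single-line parse; a real frontmatter may need a YAML parser).
    desc = ""
    desc_match = re.search(r"^description:\s*(.+)$", frontmatter, re.MULTILINE)
    if desc_match:
        desc = desc_match.group(1).strip()

    # Body lines: non-blank lines after the frontmatter.
    body_lines = [line for line in body.splitlines() if line.strip()]
    return {
        "description_chars": len(desc),
        "body_lines": len(body_lines),
        "body_tokens_est": round(len(body_lines) * 4.5),  # ~4.5 tokens/line heuristic
    }
```

Duplication percentage still requires the per-section judgment described above; it does not reduce to a line count.
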
### Phase 2: Plan
Decide what stays in the body, what moves to references, and what gets compressed.
Body retention criteria — keep a section in the body ONLY if it meets at least one of the following:

- Complex multi-step pattern requiring coordination across multiple sections or files
- Non-obvious logic, parameters, or decision rules that agents frequently get wrong without inline guidance
- A concept unique to this skill with no external documentation
- The primary use case the skill exists for (the thing agents reach for most often)

Everything else belongs in the appropriate references/ file. See the playbook decision tree for concrete examples of what typically stays vs. what moves.
Description compression rules:

- Lead with the package/tool name and a one-line identity
- Replace enumerations of 4+ specific names (APIs, checks, steps) with category-based phrasing (e.g., "hooks for auth, sessions, tokens" instead of listing each hook name)
- Qualify generic keywords with the skill's domain to reduce false positives (e.g., "MyLib integrations with Redis", not "Redis integration")
- Merge items that share a theme into a single line (e.g., "error handling" + "retry logic" → "Error handling and retry logic in MyLib")
- Verify every original trigger category maps 1:1 to the compressed version — no categories dropped
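
The 1:1 mapping check in the last rule can be mechanized as a rough keyword scan. A sketch, assuming each trigger category is represented by a hand-picked keyword list (the function, the category names, and the keywords are all hypothetical):

```python
def check_trigger_coverage(categories: dict[str, list[str]], compressed_desc: str) -> list[str]:
    """Return the names of trigger categories with no surviving signal
    in the compressed description (case-insensitive substring check)."""
    desc = compressed_desc.lower()
    return [
        name for name, keywords in categories.items()
        if not any(kw.lower() in desc for kw in keywords)
    ]

# Hypothetical categories for a skill whose original description listed them all:
categories = {
    "error handling": ["error", "retry"],
    "auth hooks": ["auth", "session", "token"],
}
# "auth hooks" has no keyword in the compressed text, so it is flagged as dropped.
dropped = check_trigger_coverage(categories, "MyLib error handling and retry logic")
```

A substring scan is only a smoke test; a category can survive via a synonym the scan misses, so review flagged items by hand before restoring anything.
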
Plan the Reference Guide section — for each reference file, write a one-line description of when to read it. This section is load-bearing: it tells agents which file to consult.
Target metrics:

- Body: under ~250 lines
- Description: under ~700 characters
- Duplication with references: 0%
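
A small helper for checking these targets at the end of Phase 2 or Phase 4. A sketch; the function name and the treatment of the `~` limits as hard cutoffs are assumptions:

```python
def meets_targets(body_lines: int, description_chars: int, duplication_pct: float) -> dict[str, bool]:
    """Check the three optimization targets; the line and character
    limits are soft (~) targets, treated here as simple cutoffs."""
    return {
        "body_under_250_lines": body_lines <= 250,
        "description_under_700_chars": description_chars <= 700,
        "zero_duplication": duplication_pct == 0.0,
    }
```
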
### Phase 3: Execute

Apply the plan. Work in this order:

1. Compress the description — rewrite the YAML description field. Keep all trigger categories; do not remove any "when to use" signals.
2. Remove duplicate sections from the body — delete sections already covered in references.
3. Add the Reference Guide section — explicit pointers to each reference file with descriptions. See the playbook for the recommended format.
4. Add a Maintenance Note at the bottom of the body with: (a) the body-line budget (~250 lines), (b) a pointer to the ADR if one exists, and (c) a one-sentence rationale for the split. See the playbook template.
5. Bump the version — increment the minor component of metadata.version if the skill uses versioning.
Do NOT:

- Move "when to use" triggers from description to body (the description is the only field read for triggering)
- Remove code examples from retained body sections (they are the value)
- Create new reference files just to move content — use existing files when possible
- Add content that duplicates what is already in references
### Phase 4: Validate

Use the Task tool to spawn a subagent (opus model) to challenge coverage. Provide it:

- The full SKILL.md (body + frontmatter)
- All reference files
- A list of 15-25 questions the skill must answer (provided by the user, or derived from trigger categories — see the playbook for derivation rules)
The subagent evaluates each question:

- From SKILL.md alone: YES / PARTIAL / NO
- From SKILL.md + references: YES / PARTIAL / NO
- Gap: content missing from ALL files
Pass criteria:

- 0 regressions (nothing answerable before that isn't answerable after)
- All trigger categories in the description still present
- Body under ~250 lines
If gaps are found, determine whether they are pre-existing (never covered) or regressions (lost during optimization). Only regressions require fixes — restore or rewrite the missing content in the body or appropriate reference file, then re-evaluate only the affected questions.
Fallback: If subagent spawning is unavailable, self-evaluate: for each question, attempt to answer it using only the optimized files and rate confidence as HIGH / MEDIUM / LOW. Any LOW-confidence answer on a question that was previously answerable is a regression.
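
The regression check can be sketched from the per-question ratings. This assumes ratings are collected as YES / PARTIAL / NO per question; the same shape works for the HIGH / MEDIUM / LOW fallback by treating LOW as not answerable:

```python
def find_regressions(before: dict[str, str], after: dict[str, str]) -> list[str]:
    """Questions answerable before optimization (YES or PARTIAL) that are
    no longer answerable after (NO), per the Phase 4 pass criteria.
    Gaps that were NO before and NO after are pre-existing, not regressions."""
    answerable = {"YES", "PARTIAL"}
    return [
        q for q, rating in before.items()
        if rating in answerable and after.get(q, "NO") not in answerable
    ]
```

Only the returned questions need fixes and re-evaluation; pre-existing gaps are reported but left alone.
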
## Output

After validation, produce a summary table:
| Metric | Before | After | Change |
|---|---|---|---|
| Description chars | ??? | ??? | -??% |
| Body lines | ??? | ??? | -??% |
| Body tokens (est.) | ??? | ??? | -??% |
| Duplication % | ??? | 0% | -??% |
| Regressions | n/a | 0 | n/a |
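
The Change column is the signed percentage difference relative to the Before value. A sketch of the computation (the helper name and formatting choices are illustrative):

```python
def pct_change(before: float, after: float) -> str:
    """Format the Change column as a signed percentage of the Before value."""
    if before == 0:
        return "n/a"  # no baseline to compare against
    return f"{(after - before) / before:+.0%}"

# e.g. a description compressed from 900 to 650 characters shows "-28%"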
## Reference
For detailed checklists, before/after examples, and the full validation methodology, see optimization-playbook.md.
## Maintenance Note
Body budget: ~120 lines (general target for optimized skills: ~250). The optimization workflow and decision rules are the core value and stay in the body; expanded examples, checklists, and the decision tree live in the playbook reference.