agent-skills-authoring

Guides agents through creating, validating, and optimizing Agent Skills (SKILL.md files) from user requests. Covers all YAML frontmatter fields, directory layout, progressive disclosure, scripts, quality patterns, and evaluation workflows. Use when a user asks you to create, edit, audit, or improve an Agent Skill, even if they don't use the word "skill" — any request about reusable agent instructions, workflow packaging, or capability extension should trigger this skill.

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Copy this and send it to your AI assistant to learn

Install skill "agent-skills-authoring" with this command: npx skills add zebbern/agent-skills-authoring/zebbern-agent-skills-authoring-agent-skills-authoring

Agent Skills Authoring Guide

This is your authoritative reference for producing high-quality SKILL.md files. When a user asks you to create, edit, or improve a skill, follow these instructions to produce the best possible result — one that any consuming agent can discover, activate, and apply effectively.


1. Core Principles

These principles come from Anthropic's official best practices. Apply them to every skill.

Claude is already very smart. Only add context Claude doesn't already have. Challenge each piece of content: "Does Claude really need this explanation?" "Can I assume Claude knows this?" "Does this paragraph justify its token cost?" Concise skills perform better because the context window is shared with conversation history, system prompts, and other skills' metadata.

Explain WHY, not just WHAT. Anthropic's guidance: "Explain to the model why things are important in lieu of heavy-handed musty MUSTs." When agents understand reasoning, they generalize correctly across novel situations. Rigid imperative-only rules cause brittle, over-literal behavior.

Match specificity to fragility. Use high freedom (general text guidance) when multiple approaches are valid. Use medium freedom (pseudocode/parameterized scripts) when a preferred pattern exists. Use low freedom (exact scripts, no modifications) when operations are fragile and consistency is critical. Think of it as: narrow bridge (one safe path, give exact steps) vs. open field (many valid routes, give direction and trust the agent).

Use consistent terminology. Pick one term and use it throughout. Don't mix "API endpoint" / "URL" / "path" or "extract" / "pull" / "get" / "retrieve" in the same skill.


2. Directory Structure

skill-name/
├── SKILL.md          # REQUIRED — frontmatter + instructions
├── scripts/          # Optional — executable code (runs without loading into context)
│   └── validate.py
├── references/       # Optional — docs loaded on demand
│   ├── REFERENCE.md
│   └── domain.md
└── assets/           # Optional — templates, data files, schemas
    └── template.json
  • Directory name must match the name field in frontmatter exactly
  • scripts/ — self-contained executables. Handle errors explicitly ("solve, don't punt"). Scripts execute via bash and only their output enters context, not their code
  • references/ — additional docs loaded on demand. Keep files focused. Add a table of contents at the top of any reference file over 100 lines
  • assets/ — static resources: templates, configuration files, schemas, lookup tables
  • Use forward slashes in all paths, even on Windows: scripts/validate.py not scripts\validate.py

Domain organization: For skills supporting multiple variants, split into references:

cloud-deploy/
├── SKILL.md           # Overview + variant selection
└── references/
    ├── aws.md
    ├── gcp.md
    └── azure.md

The agent loads only the relevant reference, saving context.


3. YAML Frontmatter — Complete Field Reference

Base Spec Fields (agentskills.io)

The open standard defines exactly six fields:

name (REQUIRED)

Unique identifier. 1–64 chars. Lowercase alphanumeric + hyphens only. No leading/trailing/consecutive hyphens. No XML tags. Cannot contain reserved words "anthropic" or "claude". Must match the parent directory name.

Prefer gerund naming (verb + -ing): processing-pdfs, analyzing-spreadsheets, testing-code. Also acceptable: noun phrases (pdf-processing) or action-oriented (process-pdfs). Avoid vague names (helper, utils, tools).

name: processing-pdfs

description (REQUIRED)

Natural-language summary. 1–1,024 chars. Plain text only. No XML tags. This is the primary triggering mechanism — agents use it to decide which skills to load.

Always write in third person. "Processes Excel files..." not "I can help you..." or "You can use this..." (inconsistent POV causes discovery problems).

Make descriptions slightly "pushy." Claude tends to undertrigger skills. Include both what the skill does AND specific trigger contexts. List keywords users would say. Add "even if they don't explicitly ask for X" where appropriate.

description: >-
  Extracts text and tables from PDF files, fills forms, and merges documents.
  Use when working with PDF files or when the user mentions PDFs, forms, or
  document extraction, even if they don't explicitly ask for "PDF processing."

license (OPTIONAL)

SPDX identifier or bundled file reference. Omit if unknown.

compatibility (OPTIONAL)

Environment requirements. 1–500 chars. Plain text. Only include if the skill has genuine requirements — most skills don't need this field.

metadata (OPTIONAL)

String key-value map. Use reasonably unique key names to avoid collisions.

metadata:
  author: "your-org"
  version: "1.0"
  category: "document-processing"

allowed-tools (OPTIONAL, EXPERIMENTAL)

Space-delimited list. Use scoped format where the platform supports it.

allowed-tools: Bash(git:*) Bash(jq:*) Read Write

Claude Code Extended Fields

Claude Code extends the open standard with additional fields. These are platform-specific and not part of the base agentskills.io spec:

FieldPurpose
argument-hintAutocomplete hint: [issue-number] or [filename] [format]
disable-model-invocationtrue to prevent auto-loading (manual /name only)
user-invocablefalse to hide from / menu (background knowledge skills)
modelModel override when this skill is active
contextfork to run in a forked subagent context
agentSubagent type to use when context: fork is set
hooksHooks scoped to this skill's lifecycle

In Claude Code, name falls back to directory name if omitted, and description falls back to the first paragraph of the markdown body.


4. Body Content

Everything after the closing --- is free-form Markdown. Keep the body under 500 lines. If approaching this limit, move detail into references/ and add clear pointers.

Recommended Structure

# [Skill Name]

[1-3 sentence overview]

## When to use this skill

[Trigger conditions — what user requests or contexts activate this]

## How to [primary action]

[Step-by-step instructions]

## Examples

[Input/output pairs]

## Common edge cases

[Boundary conditions and handling]

## Guidelines

[Key principles with reasoning]

Progressive Disclosure

LevelContentBudgetWhen Loaded
1Frontmatter (name + description)~100 tokensAlways — at agent startup
2SKILL.md body< 500 lines / < 5K tokensWhen skill activates
3scripts/, references/, assets/UnlimitedOn demand during execution

Level 3 scripts execute without loading into context — only their output enters the context window. Reference files are read on demand. This means you can bundle comprehensive resources with no context penalty until they're accessed.

File References

Use standard Markdown links with relative paths:

See [the reference guide](references/REFERENCE.md) for detailed API documentation.
Run the extraction script: `python scripts/extract.py`
For TypeScript specifics, see [TypeScript patterns](references/typescript.md).

Keep references one level deep from SKILL.md. Avoid nested reference chains — Claude may partially read files referenced from other referenced files.

Script Guidance

When referencing scripts, be explicit about intent:

  • Execute (most common): "Run scripts/analyze.py to extract fields"
  • Read as reference (for complex logic): "See scripts/analyze.py for the algorithm"

Scripts should handle errors explicitly. Don't let scripts fail and expect Claude to figure it out — provide fallbacks, helpful error messages, and document dependencies.

For MCP tools, use fully-qualified references: ServerName:tool_name (e.g., BigQuery:bigquery_schema, GitHub:create_issue).


5. Creating a Skill from a User Request

Step 1: Capture Intent

Understand what the user needs before writing anything:

  • What should this skill enable an agent to do?
  • When should it trigger? (phrases, tasks, contexts)
  • What's the expected output format?
  • Are there edge cases or constraints?
  • Does this need testing? (Verifiable outputs benefit from evals; subjective outputs like writing style often don't)

If the conversation already contains a workflow ("turn this into a skill"), extract answers from context first — tools used, step sequence, corrections the user made.

Step 2: Choose the Name

Prefer gerund form: {verb-ing}-{noun}processing-pdfs, testing-code. Validate: lowercase, hyphens only, 1-64 chars, no leading/trailing/consecutive hyphens, no XML tags, no "anthropic" or "claude" in the name.

Step 3: Write the Description

Third person always. Start with action verb. Include keyword triggers. State "Use when..." Make it slightly pushy to prevent undertriggering. Aim for 200-400 characters.

Step 4: Set Optional Fields

  • license: User-specified, or MIT/Apache-2.0, or omit
  • compatibility: Only if genuine environment requirements exist
  • metadata: Add author, version, category at minimum
  • allowed-tools: Use scoped format: Bash(git:*) Read
  • Claude Code fields: Set disable-model-invocation, argument-hint, etc. if relevant

Step 5: Write the Body

Start with a draft. Then review with fresh eyes and improve.

  • 1-3 sentence overview
  • "When to use this skill" with trigger conditions
  • Step-by-step instructions explaining WHY, not just WHAT
  • Input/output examples (crucial for quality)
  • Edge cases
  • For complex workflows: provide a copyable checklist Claude can check off as it progresses

Provide a default approach with an escape hatch rather than listing many options. "Use pdfplumber for text extraction. For scanned PDFs requiring OCR, use pdf2image instead."

Step 6: Iterate (Eval-First)

Anthropic's recommended pattern uses two Claude instances:

  1. Establish baseline — run Claude on representative tasks WITHOUT the skill
  2. Write minimal draft — only enough to address observed gaps
  3. Test with Claude B — a fresh instance with the skill loaded on 2-3 realistic prompts
  4. Observe Claude B's behavior — note where it struggles or misses instructions
  5. Return to Claude A to refine — share what you observed, ask for improvements
  6. Repeat until satisfied, then expand the test set

This is evaluation-driven development: create evals BEFORE writing extensive documentation. Test without the skill first, then with the skill, and measure the improvement.

Step 7: Validate

  • Directory name matches name field exactly
  • Description: 1-1,024 chars, third person, action verb, keyword triggers, "Use when..."
  • No XML tags in name or description
  • All Markdown links resolve to files that exist
  • SKILL.md body under 500 lines
  • Reference files over 100 lines have a table of contents
  • Run skills-ref validate ./skill-directory if available
  • Run skills-ref to-prompt ./skill-directory to preview the XML agents see

6. Quality Patterns

Phased Workflows

## Phase 1: Analyze

[Understand the domain, gather context]

## Phase 2: Implement

[Execute the primary task]

## Phase 3: Validate

[Check quality, run validation]

Feedback Loops

The run-validate-fix pattern dramatically improves output quality:

1. Make changes
2. Run: `python scripts/validate.py`
3. If validation fails: fix and return to step 1
4. Only proceed when validation passes

Copyable Checklists

For complex multi-step tasks:

Copy this checklist and track progress:

- [ ] Step 1: Analyze input
- [ ] Step 2: Create plan
- [ ] Step 3: Validate plan
- [ ] Step 4: Execute
- [ ] Step 5: Verify output

Verifiable Intermediate Outputs

For high-stakes operations: plan → validate plan → execute → verify. The plan file is machine-verifiable before changes are applied.

Decision Tables

ScenarioApproachWhy
< 100 itemsInline arrayNo pagination overhead
100-10K itemsCursor paginationStable ordering
> 10K itemsKeyset paginationO(1) page fetch

Before/After Examples

**Before:** `app.get('/getUser', ...)`
**After:** `app.get('/users/:id', ...)`

7. Anti-Patterns

Anti-PatternWhy It FailsFix
Vague descriptionAgents can't match to requestsKeyword triggers + "Use when..." + be pushy
First/second person descriptionDiscovery problemsThird person: "Processes..." not "I can..."
Over-explaining basicsWastes tokens; Claude already knowsOnly add what Claude doesn't have
Too many optionsConfusing; agent may pick wrong oneDefault approach + escape hatch
Rigid MUST-only rulesBrittle; agents over-literalizeExplain reasoning so agent generalizes
Body over 500 linesContext waste, agent may truncateMove detail to references/
Code in descriptionViolates spec, breaks parsersCode in body only
Nested file referencesClaude partially reads deeper filesOne level deep from SKILL.md
Name with "anthropic"/"claude"Reserved words; validation failsChoose different name
Backslash pathsBreak on Unix systemsForward slashes always
Time-sensitive infoBecomes wrong; no auto-updateUse "old patterns" section
No examplesAgent guesses output format2-3 input/output pairs minimum

8. Security

  • Never include secrets, API keys, or credentials in any skill file
  • Never instruct agents to disable security controls
  • Scripts should handle errors explicitly with fallbacks
  • Include input validation guidance for skills handling user data
  • Include sanitization guidance for skills generating browser-rendered output
  • Flag destructive tools in allowed-tools so users can review

9. Testing

  1. Syntax: skills-ref validate ./skill-name
  2. Prompt preview: skills-ref to-prompt ./skill-name — generates XML agents see
  3. Activation test: Prompt that SHOULD trigger → confirm it fires
  4. Negative test: Prompt that should NOT trigger → confirm it stays dormant
  5. Quality test: Agent executes skill → review output quality
  6. Multi-model test: Test with Haiku, Sonnet, AND Opus — what works for Opus may need more detail for Haiku

For verifiable outputs, create structured evaluations comparing with-skill vs without-skill.


10. Platform Usage

  • Claude Code: /plugin marketplace add anthropics/skills/plugin install <name>@anthropic-agent-skills
    • Skills at: ~/.claude/skills/ (personal) or .claude/skills/ (project)
  • Claude.ai: Pre-built skills on paid plans. Custom skills uploaded via settings
  • Claude API: Container parameter with skills-2025-10-02 beta flag

11. Resources


12. Pre-Delivery Checklist

Frontmatter

  • name: 1-64 chars, lowercase alphanum + hyphens, gerund preferred, matches directory
  • name: no XML tags, no "anthropic"/"claude"
  • description: 1-1,024 chars, third person, action verb, trigger keywords, "Use when..."
  • description: slightly pushy to prevent undertriggering
  • Optional fields set where relevant — no invented fields

Body

  • 1-3 sentence overview
  • Trigger section explaining when this skill activates
  • Instructions explain WHY, not just WHAT
  • 2-3 input/output examples minimum
  • Edge cases covered
  • Under 500 lines, pure Markdown
  • Consistent terminology throughout

Files

  • All Markdown links resolve to real files, one level deep
  • Reference files >100 lines have table of contents
  • Scripts handle errors explicitly with helpful messages
  • No secrets in any file
  • Forward slashes in all paths

Quality

  • Description alone determines relevance
  • Body alone enables execution without reading references
  • Tested with representative prompts (activation + negative + quality)
  • A real user would find the output useful, not just spec-compliant

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

Security

idor vulnerability testing

No summary provided by upstream source.

Repository SourceNeeds Review
Security

security scanning tools

No summary provided by upstream source.

Repository SourceNeeds Review
Security

jwt security testing

No summary provided by upstream source.

Repository SourceNeeds Review
Security

mobile application security testing

No summary provided by upstream source.

Repository SourceNeeds Review