prompt-engineering

Prompt engineering and agentic orchestration patterns. Use when crafting prompts for reasoning models, implementing chain-of-thought or Tree-of-Thoughts, designing ReAct loops, building few-shot examples, optimizing prompt performance, structuring system prompts, using extended thinking, or designing tool use workflows. Use for prompt templates, multi-step agent workflows, structured thinking protocols, and multimodal prompting.

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.


Install skill "prompt-engineering" with this command: npx skills add oakoss/agent-skills/oakoss-agent-skills-prompt-engineering

Prompt Engineering

Advanced prompt design for LLMs and autonomous agents. Covers reasoning patterns, template systems, optimization workflows, agentic orchestration, extended thinking, and tool use prompting.

When to use: Designing prompts that require structured reasoning, building agent loops, optimizing LLM output quality, creating reusable prompt templates, configuring extended thinking for complex tasks, or designing multimodal prompts with images and text.

When NOT to use: Simple factual queries, direct lookups, or creative writing that benefits from open-ended generation.

Key Principles

  1. Explicit over implicit -- Modern models (Claude 4.x, GPT-4.1) follow instructions literally. Be specific about desired output, format, and behavior rather than relying on the model to infer intent.
  2. Objective over instruction -- For reasoning models (OpenAI o-series, Claude with extended thinking), state the goal rather than prescribing step-by-step methods. These models plan natively.
  3. Structure signals intent -- Use XML tags, clear delimiters, and consistent formatting to communicate prompt structure. Models trained on structured prompts parse them more reliably than plain text.
  4. One good example beats many rules -- Few-shot examples with consistent formatting anchor model behavior more effectively than verbose instructions.
  5. Feedback loops are built-in -- Design prompts that ask the model to verify, critique, or score its own output before finalizing.
  6. Token economy matters -- Every extra token adds latency and cost. Compress context, remove filler, and front-load critical information.
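Principles 3 and 6 combine naturally in a small prompt-assembly helper. A minimal sketch; the tag names (`<rules>`, `<context>`, `<output_format>`) follow the conventions used elsewhere in this document and are illustrative, not a fixed schema:

```python
def build_prompt(rules: list[str], context: str, output_format: str) -> str:
    """Assemble a structured prompt from XML-style sections.

    Front-loads rules, strips filler whitespace, and keeps each
    section explicitly delimited so the model can parse intent.
    """
    rules_block = "\n".join(f"- {r}" for r in rules)
    return (
        f"<rules>\n{rules_block}\n</rules>\n\n"
        f"<context>\n{context.strip()}\n</context>\n\n"
        f"<output_format>\n{output_format.strip()}\n</output_format>"
    )

prompt = build_prompt(
    rules=["Answer in one sentence.", "Quote the context verbatim."],
    context="The 2024 report lists revenue of $12M.",
    output_format="A single plain-text sentence.",
)
```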

Model-Specific Considerations

Claude 4.x models follow instructions with high precision. They take prompts literally and do exactly what is asked -- no more, no less. Use XML tags to structure prompt sections (<rules>, <context>, <output_format>). Frame instructions positively (describe what to do, not what to avoid). Provide context or motivation behind instructions so Claude can generalize. Extended thinking and interleaved thinking provide native reasoning capabilities.
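As a sketch, a Messages API request enabling extended thinking might look like the payload below. The model id is an example and the `thinking` parameter shape should be verified against your installed SDK version; the one real constraint shown is that the thinking budget must fit inside `max_tokens`:

```python
# Request payload for Claude with extended thinking enabled.
# Field names follow the Anthropic Messages API; verify against
# the SDK version you have installed before relying on them.
request = {
    "model": "claude-sonnet-4-20250514",  # example model id
    "max_tokens": 4096,
    "thinking": {"type": "enabled", "budget_tokens": 2048},
    "system": "<rules>\n- Be concise.\n- Cite sources.\n</rules>",
    "messages": [
        {"role": "user", "content": "Plan a migration from REST to gRPC."}
    ],
}
```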

OpenAI o-series models (o3, o4-mini) use internal reasoning tokens before responding. Use developer messages instead of system messages. Write detailed function descriptions as interface contracts. Do not add explicit reasoning prompts -- these models reason natively and additional planning prompts can hurt performance. Pass back persisted reasoning items for multi-turn conversations.
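A hedged sketch of an objective-based request for an o-series model: note the `developer` role and the deliberate absence of step-by-step instructions. The task content and schema tag are invented for illustration:

```python
# Objective-based prompt for a reasoning model: state the goal,
# not the method. No "think step by step" trigger is included,
# since the model reasons natively.
request = {
    "model": "o4-mini",
    "messages": [
        {
            "role": "developer",
            "content": "Find the minimal set of schema changes that "
                       "makes the migration backward compatible. "
                       "Output JSON only.",
        },
        {"role": "user", "content": "<schema_diff>...</schema_diff>"},
    ],
}
```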

GPT-4.1 and standard models benefit from explicit step-by-step instructions, few-shot examples, and structured output schemas. These models do not have native reasoning loops, so CoT prompting and structured thinking protocols add measurable value.
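For these models the classic zero-shot CoT trigger still adds value; a trivial wrapper, to be used only on models without native reasoning loops:

```python
def zero_shot_cot(question: str) -> str:
    """Append the zero-shot chain-of-thought trigger.

    Useful for standard models like GPT-4.1; skip on o-series
    models, where explicit reasoning prompts can hurt performance.
    """
    return f"{question}\n\nLet's think step by step."

prompt = zero_shot_cot("A train leaves at 3pm averaging 80 km/h. When does it cover 200 km?")
```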

Multimodal models (GPT-4o, Claude with vision, Gemini) accept images alongside text. Provide context about what each image represents, use clear action verbs, and crop images to relevant regions. Label multiple images explicitly and specify their relationship.
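A sketch of a labeled multi-image message in the OpenAI chat content-block format (URLs are placeholders; Claude accepts a similar list of blocks with a different image shape). The surrounding text blocks name each image, state their relationship, and end with a clear action verb:

```python
# Multimodal message content: label images explicitly, state how
# they relate, then give one clear instruction.
content = [
    {"type": "text",
     "text": "Image 1 is the current dashboard; Image 2 is the redesign mockup."},
    {"type": "image_url",
     "image_url": {"url": "https://example.com/dash_v1.png"}},
    {"type": "image_url",
     "image_url": {"url": "https://example.com/dash_v2.png"}},
    {"type": "text",
     "text": "List every widget present in Image 1 but missing from Image 2."},
]
```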

Quick Reference

| Pattern | API / Technique | Key Point |
| --- | --- | --- |
| Zero-shot CoT | "Let's think step by step" trigger | Elicits reasoning without examples |
| Few-shot CoT | Explicit reasoning chain examples | One good example beats many rules |
| Self-consistency | Multiple paths + majority vote | Higher accuracy on complex tasks |
| Tree-of-Thoughts | Generate 3+ strategies, eliminate weakest | Parallel exploration with pruning; high cost |
| ReAct loop | Thought-Action-Observation cycle | Agent reasons and acts in unison |
| System prompt | Role + Expertise + Guidelines + Format | Foundation for all LLM behavior |
| Prompt template | Modular composition with variable slots | Reusable, validated, cacheable |
| A/B testing | Statistical comparison of prompt variants | Isolate variables, measure significance |
| Extended thinking | Budget-controlled deep reasoning (Claude) | Let model think before responding |
| Interleaved thinking | Think between tool calls (Claude 4) | Reason after each tool result |
| Think tool | No-op tool for structured reasoning space | Gives agents a place to reason mid-turn |
| Reasoning models | Objective-based prompting for o3/o4-mini | Let the model plan its own reasoning |
| Structured thinking | Understanding-Analysis-Execution protocol | Forces verification before acting |
| XML structuring | Tags to delimit prompt sections | Models parse structured prompts reliably |
| Multimodal prompting | Text + image context for vision models | Provide spatial context and clear action verbs |
| Confidence scoring | Model self-reports certainty per claim | Quantifies reliability of output |
| Token optimization | Compress context, remove filler words | Reduce latency and cost |
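Self-consistency from the table above reduces to sampling several reasoning paths and majority-voting the extracted answers. A minimal sketch; `sample` is a hypothetical callable standing in for one temperature > 0 chain-of-thought completion that returns only the final answer:

```python
import collections
from typing import Callable

def self_consistency(sample: Callable[[], str], n: int = 5) -> str:
    """Sample n independent reasoning paths; return the majority answer."""
    votes = collections.Counter(sample() for _ in range(n))
    return votes.most_common(1)[0][0]

# Stub standing in for five independent model calls.
answers = iter(["42", "41", "42", "42", "40"])
result = self_consistency(lambda: next(answers), n=5)  # → "42"
```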

Common Mistakes

| Mistake | Correct Pattern |
| --- | --- |
| Overloading a single prompt with too many instructions | Use hierarchical rules with clear priority ordering |
| Forcing rigid step-by-step on reasoning models | Use objective-based prompts; reasoning models plan natively |
| Setting max output tokens too low for reasoning models | Allocate sufficient tokens for internal chain-of-thought |
| Using static examples for complex tasks | Select examples dynamically via semantic similarity |
| Inconsistent formatting across few-shot examples | All examples must follow identical input-output structure |
| Manually parsing unstructured LLM output | Use JSON mode or structured output schemas |
| Ignoring token budget allocation | Reserve tokens for system prompt, examples, input, and response |
| Skipping baseline measurement before optimizing | Establish metrics first, then change one variable at a time |
| Using CoT prompts on reasoning models | Redundant; these models reason natively without explicit triggers |
| Telling models what NOT to do instead of what to do | Frame instructions positively: describe the desired behavior |
| Passing thinking blocks back as user text (Claude) | Pass thinking blocks unmodified in assistant message only |
| Over-prompting reasoning models to "plan more" | Additional planning prompts can degrade reasoning model performance |
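The structured-output row above in practice: with JSON mode or an output schema enabled, the response body is guaranteed-parseable JSON, so a parse-and-validate step replaces regex scraping of prose. The payload below is invented for illustration:

```python
import json

# Raw model response produced under JSON mode (invented example).
raw = '{"claims": [{"text": "Revenue grew 8%", "confidence": 0.9}]}'

# Parse once, then validate fields instead of scraping free text.
data = json.loads(raw)
assert all(0.0 <= c["confidence"] <= 1.0 for c in data["claims"])
```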

Prompt Engineering Workflow

  1. Define the objective -- State what the prompt should achieve and how success is measured
  2. Choose the right pattern -- Match the task to CoT, ReAct, ToT, or simple prompting based on complexity
  3. Select the model tier -- Route to lightweight, standard, or reasoning models based on task difficulty
  4. Write the baseline prompt -- Start simple; use system prompt structure with XML tags for complex cases
  5. Add examples -- Include 1-3 few-shot examples with consistent formatting if the task requires them
  6. Test and measure -- Establish baseline metrics (accuracy, latency, token usage) on representative inputs
  7. Analyze failures -- Categorize errors (format, factual, logical, incomplete) and address the most impactful
  8. Iterate one variable -- Change one element at a time to isolate what improves performance
  9. Version and deploy -- Track prompt versions alongside performance data for rollback capability

Delegation

  • Explore prompt variants and compare model responses: Use Explore agent to test prompt strategies across different inputs
  • Build multi-step agentic workflows with tool use: Use Task agent to implement and validate ReAct loops and autonomous chains
  • Design hierarchical prompt architecture for complex systems: Use Plan agent to structure prompt systems with verification loops
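The ReAct loops referenced above follow a fixed Thought-Action-Observation skeleton. A minimal sketch; `llm` and the step-dict shape (`thought`, `action`, `action_input`, `final_answer`) are assumptions for illustration, not a real SDK interface:

```python
from typing import Callable

def react_loop(llm: Callable[[str], dict], tools: dict,
               task: str, max_steps: int = 5) -> str:
    """Run a Thought-Action-Observation loop until a final answer."""
    transcript = f"Task: {task}\n"
    for _ in range(max_steps):
        step = llm(transcript)
        if "final_answer" in step:
            return step["final_answer"]
        # Execute the chosen tool and feed the observation back in.
        observation = tools[step["action"]](step["action_input"])
        transcript += (f"Thought: {step['thought']}\n"
                       f"Action: {step['action']}({step['action_input']})\n"
                       f"Observation: {observation}\n")
    return "max steps reached"

# Stubbed model and tool for demonstration.
calls = iter([
    {"thought": "Need the population figure.",
     "action": "search", "action_input": "population of Oslo"},
    {"final_answer": "about 700k"},
])
tools = {"search": lambda q: "Oslo population: 709,000"}
out = react_loop(lambda t: next(calls), tools, "What is Oslo's population?")
```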

If the expert-instruction skill is available, delegate system prompt design and agent persona crafting to it.

