prompt-assemble

Token-safe prompt assembly with memory orchestration. Use it for any agent that constructs LLM prompts with memory retrieval. Designed to prevent API failures caused by memory-related token overflow. Implements two-phase context construction, a memory safety valve, and hard limits on memory injection.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Install skill "prompt-assemble" with this command: npx skills add alexunitario-sketch/prompt-assemble

Prompt Assemble

Overview

A standardized, token-safe prompt assembly framework aimed at keeping every API call within the model's context limit. It implements Two-Phase Context Construction and a Memory Safety Valve to prevent token overflow while maximizing relevant context.

Design Goals:

  • ✅ Never fail due to memory-related token overflow
  • ✅ Memory is always a discardable enhancement, never a rigid dependency
  • ✅ Token budget decisions are centralized in the prompt assembly layer

When to Use

Use this skill when:

  1. Building or modifying any agent that constructs prompts
  2. Implementing memory retrieval systems
  3. Adding new prompt-related logic to existing agents
  4. Working in any scenario where token budget safety is required

Core Workflow

User Input
    ↓
Need-Memory Decision
    ↓
Minimal Context Build
    ↓
Memory Retrieval (Optional)
    ↓
Memory Summarization
    ↓
Token Estimation
    ↓
Safety Valve Decision
    ↓
Final Prompt → LLM Call

Phase Details

Phase 0: Base Configuration

# Model Context Windows (2026-02-04)
# - MiniMax-M2.1: 204,000 tokens (default)
# - Claude 3.5 Sonnet: 200,000 tokens
# - GPT-4o: 128,000 tokens

MAX_TOKENS = 204_000                    # Set to your model's context limit
SAFETY_MARGIN = int(0.75 * MAX_TOKENS)  # Conservative 75% threshold = 153,000 tokens
MEMORY_TOP_K = 3                        # Retrieve at most 3 memories
MEMORY_SUMMARY_MAX = 3                  # At most 3 lines per memory summary

Design Philosophy:

  • Leave 25% buffer for safety (model overhead, estimation errors, spikes)
  • Better to underutilize capacity than to overflow
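
To make the 75% rule concrete, the sketch below computes the memory-injection threshold for each context window listed in the configuration comments. The helper `safety_threshold` and the `CONTEXT_WINDOWS` dictionary are illustrative names, not part of the skill's script.

```python
# Context window sizes as listed in the Phase 0 configuration comments.
CONTEXT_WINDOWS = {
    "MiniMax-M2.1": 204_000,
    "Claude 3.5 Sonnet": 200_000,
    "GPT-4o": 128_000,
}

SAFETY_RATIO = 0.75  # use at most 75% of the window, leaving a 25% buffer


def safety_threshold(max_tokens: int, ratio: float = SAFETY_RATIO) -> int:
    """Return the largest estimated prompt size that still allows memory injection."""
    return int(max_tokens * ratio)


thresholds = {name: safety_threshold(size) for name, size in CONTEXT_WINDOWS.items()}
for name, limit in thresholds.items():
    print(f"{name}: {limit:,} tokens")
```

Switching models only requires changing the window size; the ratio stays the same.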

Phase 1: Minimal Context

  • System prompt
  • Recent N messages (N=3, trimmed)
  • Current user input
  • No memory by default
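
The minimal context described above could be built roughly as follows. This is a sketch under assumptions: the message format (dicts with `role` and `content`), the helper names, and the 500-character trim length are hypothetical and may differ from what scripts/prompt_assemble.py does.

```python
from typing import Dict, List

RECENT_N = 3             # number of recent messages to keep (N=3 in the text)
MAX_MESSAGE_CHARS = 500  # hypothetical per-message trim length


def trim(text: str, max_chars: int = MAX_MESSAGE_CHARS) -> str:
    """Cut a history message to max_chars, marking the cut with an ellipsis."""
    if len(text) <= max_chars:
        return text
    return text[: max_chars - 3] + "..."


def build_minimal_context(
    system_prompt: str, history: List[Dict[str, str]], user_input: str
) -> List[str]:
    """Return system prompt + last RECENT_N trimmed messages + current input."""
    context = [system_prompt]  # the system prompt is kept in full
    for message in history[-RECENT_N:]:
        context.append(f"{message['role']}: {trim(message['content'])}")
    context.append(f"user: {user_input}")  # the user input is never trimmed
    return context
```

Only history messages are trimmed here; the system prompt and the current user input pass through unchanged, matching the hard rules stated for the safety valve.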

Phase 2: Memory Need Decision

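def need_memory(user_input):
    """Return True if the input contains a phrase that refers to earlier context."""
    # All triggers are lowercase; "previously" also covers "previously mentioned".
    triggers = [
        "previously",
        "earlier we discussed",
        "do you remember",
        "as I mentioned before".lower(),
        "continuing from",
        "before we",
        "last time",
    ]
    text = user_input.lower()  # Lowercase once instead of once per trigger
    return any(trigger in text for trigger in triggers)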

Phase 3: Memory Retrieval (Optional)

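summarized_memories = []
if need_memory(user_input):  # Skip retrieval entirely when no trigger matched
    memories = memory_search(query=user_input, top_k=MEMORY_TOP_K)
    for mem in memories:
        summarized_memories.append(summarize(mem, max_lines=MEMORY_SUMMARY_MAX))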

Phase 4: Token Estimation

Estimate the token count of base_context + summarized_memories and store it as estimated_tokens for the safety valve check.
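
One simple way to produce that estimate is a character-count heuristic. The sketch below assumes roughly 4 characters per token, a common rule of thumb for English text and not the method documented in references/token_estimation.md. It can undercount for code or non-English text, which is part of what the 25% buffer is meant to absorb.

```python
import math
from typing import Iterable

CHARS_PER_TOKEN = 4  # assumption: rough average for English text


def estimate_tokens(text: str) -> int:
    """Approximate the token count of one string, rounding up."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def estimate_total(parts: Iterable[str]) -> int:
    """Approximate the token count of all prompt parts combined."""
    return sum(estimate_tokens(part) for part in parts)
```

For tighter budgets, the same two functions could wrap the model's real tokenizer instead of the heuristic.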

Phase 5: Safety Valve (Critical)

if estimated_tokens > SAFETY_MARGIN:
    base_context.append("[System Notice] Relevant memory skipped due to token budget.")
    return assemble(base_context)

Hard Rules:

  • ❌ Never downgrade the system prompt
  • ❌ Never truncate the user input
  • ❌ No "lucky splicing"
  • ✅ Only the memory layer is expendable
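
The safety valve and its hard rules can be sketched as a single pure function. The name `apply_safety_valve` and its parameters are hypothetical, and unlike the snippet above, this version returns a new list rather than appending to `base_context` in place.

```python
from typing import Callable, List

SKIP_NOTICE = "[System Notice] Relevant memory skipped due to token budget."


def apply_safety_valve(
    base_context: List[str],
    summarized_memories: List[str],
    estimate: Callable[[List[str]], int],
    threshold: int,
) -> List[str]:
    """Drop the whole memory layer if the combined prompt exceeds the threshold."""
    candidate = base_context + summarized_memories
    if estimate(candidate) > threshold:
        # Memory is all-or-nothing: the base context is returned untouched,
        # with a notice so the model knows memory was omitted.
        return base_context + [SKIP_NOTICE]
    return candidate
```

Note that the function never shortens the base context, so a base context that is itself over budget has to be handled earlier, for example by keeping fewer recent messages.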

Phase 6: Final Assembly

final_prompt = assemble(base_context + summarized_memories)
return final_prompt
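
The `assemble` helper is not shown in this listing. A minimal sketch, assuming the context is a list of strings joined into one prompt, might look like this:

```python
from typing import Iterable


def assemble(parts: Iterable[str], separator: str = "\n\n") -> str:
    """Join non-empty prompt parts into one string, separated by blank lines."""
    cleaned = (part.strip() for part in parts if part)
    return separator.join(part for part in cleaned if part)
```

Agents that send structured chat messages instead of a single string would return a list of role/content pairs here instead.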

Memory Data Standards

Allowed in Long-Term Memory

  • ✅ User preferences / identity / long-term goals
  • ✅ Confirmed important conclusions
  • ✅ System-level settings and rules

Forbidden in Long-Term Memory

  • ❌ Raw conversation logs
  • ❌ Reasoning traces
  • ❌ Temporary discussions
  • ❌ Information recoverable from chat history

Quick Start

Copy scripts/prompt_assemble.py into your agent and call it as follows:

from prompt_assemble import build_prompt

# In your agent's prompt construction:
final_prompt = build_prompt(user_input, memory_search_fn, get_recent_dialog_fn)

Resources

scripts/

  • prompt_assemble.py - Complete implementation with all phases (PromptAssembler class)

references/

  • memory_standards.md - Detailed memory content guidelines
  • token_estimation.md - Token counting strategies

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
