agent:memory

Guides the user through designing memory for AI agents. Based on "Principles of Building AI Agents" (Bhagwat & Gienow, 2025), Chapters 7-8: Agent Memory and Dynamic Agents.


Memory Architecture

When to use

Use this skill when the user needs to:

  • Design how an agent remembers information across turns and sessions

  • Choose between context window, working memory, and semantic recall

  • Set up memory processors (token limiting, tool call filtering)

  • Plan long-term memory and user profile storage

Instructions

Step 1: Understand Memory Requirements

Use the AskUserQuestion tool to gather context:

  • Does the agent need to remember across sessions? (ephemeral vs. persistent)

  • What user-specific information matters? (preferences, history, profile)

  • How long are typical conversations? (few turns vs. dozens)

  • Does the agent call tools with large outputs? (search results, code, documents)

  • What model and context window are you using?

Read any existing spec documents (.specs/<spec-name>/) before proceeding.

Step 2: Memory Architecture Design

Present the three-layer memory model:

Memory Architecture

Layer 1: Conversation Window (Short-term)

Recent messages kept verbatim in the context window.

  • Scope: Current session only
  • Implementation: Last N messages (sliding window)
  • Tuning: lastMessages parameter — how many recent turns to keep

Layer 2: Working Memory (Persistent state)

Long-term facts about the user or task, always included in context.

  • Scope: Across sessions
  • Implementation: Key-value store or structured profile
  • Examples: User name, preferences, subscription tier, language, past decisions
  • Tuning: Keep small — this is injected into every request

Layer 3: Semantic Recall (Long-term, on-demand)

Past conversations and knowledge retrieved by relevance.

  • Scope: Across sessions
  • Implementation: RAG over past conversations / documents
  • Tuning: topK (number of results), messageRange (context around each match)
  • When to use: User references past interactions, asks "remember when..."
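The three layers above can be sketched as a single configuration object. The parameter names mirror the tuning knobs named in the layer descriptions (lastMessages, topK, messageRange); the overall shape is an illustrative assumption, not a specific framework's API.

```typescript
// Illustrative three-layer memory configuration. Parameter names follow
// the tuning knobs described above; the shape itself is an assumption.
interface MemoryConfig {
  lastMessages: number; // Layer 1: size of the verbatim sliding window
  workingMemory: {
    enabled: boolean;   // Layer 2: persistent profile/task state
    template?: string;  // hint for what the agent should remember
  };
  semanticRecall: {
    enabled: boolean;   // Layer 3: RAG over past conversations
    topK: number;       // matches to retrieve
    messageRange: number; // surrounding messages kept per match
  };
}

// A personal assistant typically enables all three layers.
const personalAssistantMemory: MemoryConfig = {
  lastMessages: 20,
  workingMemory: { enabled: true, template: "name, language, preferences" },
  semanticRecall: { enabled: true, topK: 5, messageRange: 2 },
};
```

An ephemeral one-shot tool would instead set `workingMemory.enabled` and `semanticRecall.enabled` to `false` and keep `lastMessages` small.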

Use AskUserQuestion to determine which layers the agent needs:

| Agent Type | Layer 1 | Layer 2 | Layer 3 |
| --- | --- | --- | --- |
| One-shot tool (e.g., code formatter) | Minimal | No | No |
| Chatbot (no memory) | Yes | No | No |
| Personal assistant | Yes | Yes | Yes |
| Support agent | Yes | Yes (ticket context) | Maybe (past tickets) |
| Research agent | Yes | No | Yes (past research) |

Step 3: Working Memory Design

If the agent needs persistent state (Layer 2), define what it stores:

Working Memory Schema

User Profile

| Field | Type | Source | Updated |
| --- | --- | --- | --- |
| name | string | User input | On first mention |
| language | string | User input / detection | On change |
| tier | enum (free/pro/enterprise) | Auth system | On login |
| preferences | object | Accumulated from conversations | Continuously |

Task State

| Field | Type | Purpose |
| --- | --- | --- |
| currentGoal | string | What the user is trying to achieve |
| completedSteps | string[] | What has been done |
| pendingActions | string[] | What needs to happen next |

Injection Strategy

Working memory is injected into the system prompt as:

<working_memory>
{serialized working memory}
</working_memory>

Size budget: [N] tokens max — keep concise
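The schema and injection strategy above can be sketched in TypeScript. The field names mirror the tables; the `WorkingMemory` shape and `injectWorkingMemory` helper are illustrative, not a specific framework's API.

```typescript
// Hedged sketch of the working-memory schema above. Field names mirror
// the User Profile and Task State tables; the shapes are illustrative.
interface UserProfile {
  name?: string;
  language?: string;
  tier?: "free" | "pro" | "enterprise";
  preferences?: Record<string, unknown>;
}

interface TaskState {
  currentGoal?: string;
  completedSteps: string[];
  pendingActions: string[];
}

interface WorkingMemory {
  profile: UserProfile;
  task: TaskState;
}

// Serialize working memory and wrap it in <working_memory> tags,
// as described in the injection strategy above.
function injectWorkingMemory(systemPrompt: string, memory: WorkingMemory): string {
  const serialized = JSON.stringify(memory, null, 2);
  return `${systemPrompt}\n\n<working_memory>\n${serialized}\n</working_memory>`;
}
```

Because this string is added to every request, keeping the serialized profile within the token budget matters more than completeness.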

Use AskUserQuestion to identify the specific fields for the user's domain.

Step 4: Semantic Recall Configuration

If the agent needs long-term recall (Layer 3), configure it:

Semantic Recall

What is Stored

  • Full conversation transcripts
  • Agent-generated summaries of conversations
  • Tool call results (selectively)
  • User-provided documents
  • Decision rationale

Retrieval Settings

| Parameter | Value | Rationale |
| --- | --- | --- |
| topK | [3-10] | Number of past messages/chunks to retrieve |
| messageRange | [1-5] | Messages of context around each match |
| similarityThreshold | [0.7-0.9] | Minimum relevance score to include |
| embedding model | [Model] | Matches quality needs |
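How `topK` and `similarityThreshold` interact can be shown with a small sketch. The scoring itself would come from a vector store; here the scores are supplied directly, and the function name and message shape are illustrative.

```typescript
// Sketch: apply semantic-recall settings to scored matches from a vector
// store. Scores are assumed to be cosine similarities in [0, 1].
interface ScoredMessage {
  id: number;
  score: number;
}

function selectRecall(
  matches: ScoredMessage[],
  topK: number,
  similarityThreshold: number,
): ScoredMessage[] {
  return matches
    .filter((m) => m.score >= similarityThreshold) // drop weak matches
    .sort((a, b) => b.score - a.score)             // best first
    .slice(0, topK);                               // cap at topK
}
```

Raising `similarityThreshold` trades recall for precision: fewer, more relevant past messages reach the context window.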

Storage

| Option | Pros | Cons |
| --- | --- | --- |
| pgvector (on existing Postgres) | No new infra, familiar | May need tuning for scale |
| Pinecone | Managed, fast, scalable | Additional service + cost |
| Chroma | Open-source, local dev friendly | Self-hosted in production |

When to Recall

  • User references past interactions ("last time", "remember when", "as before")
  • Agent needs historical context for the current task
  • Retrieval is triggered automatically on every turn (configurable)

Step 5: Memory Processors

Design processors that manage context size and relevance:

Memory Processors

TokenLimiter

Prevents exceeding context window by removing oldest messages.

  • Trigger: Total tokens > [X% of context window]
  • Strategy: Remove oldest messages first, preserve system prompt and working memory
  • Protected: System prompt, working memory, last [N] messages
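A minimal TokenLimiter can be sketched as follows. Token counts are approximated as characters/4 here; a real implementation would use the model's tokenizer, and the message shape is an assumption.

```typescript
// Minimal token-limiter sketch: drop the oldest unprotected messages
// until the estimated total fits the budget.
interface Message {
  role: string;
  content: string;
  protected?: boolean; // system prompt, working memory, recent turns
}

// Rough heuristic; swap in a real tokenizer in practice.
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

function limitTokens(messages: Message[], maxTokens: number): Message[] {
  const result = [...messages];
  let total = result.reduce((sum, m) => sum + estimateTokens(m.content), 0);
  let i = 0;
  while (total > maxTokens && i < result.length) {
    if (result[i].protected) {
      i++; // never remove protected messages
      continue;
    }
    total -= estimateTokens(result[i].content);
    result.splice(i, 1); // remove oldest unprotected message first
  }
  return result;
}
```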

ToolCallFilter

Removes verbose tool call results from history to save tokens.

  • When to use: Agent calls tools that return large payloads (search, code analysis)
  • Strategy: Keep tool call intent, remove raw response; OR summarize response
  • Tradeoff: Agent always calls tools fresh (no cached results) vs. seeing past tool outputs
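The "keep the intent, drop the raw response" strategy can be sketched like this. The message shape and the 200-character cutoff are illustrative assumptions.

```typescript
// Sketch of a tool-call filter: keep the record that a tool was called,
// but replace large raw results with a short placeholder.
interface ToolMessage {
  role: string;
  content: string;
  toolName?: string; // set only on tool-result messages
}

function filterToolResults(
  messages: ToolMessage[],
  maxChars = 200, // illustrative cutoff for "large" payloads
): ToolMessage[] {
  return messages.map((m) =>
    m.toolName && m.content.length > maxChars
      ? { ...m, content: `[${m.toolName} result omitted: ${m.content.length} chars]` }
      : m,
  );
}
```

A summarizing variant would replace the placeholder with an LLM-generated digest of the payload instead of discarding it.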

SummaryProcessor (optional)

Periodically summarizes older conversation turns.

  • Trigger: Conversation exceeds [N] turns
  • Strategy: Summarize turns [1..N-K] into a paragraph, keep last K turns verbatim
  • Protected: Key decisions, user corrections, error context
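The split between "summarize turns [1..N-K]" and "keep last K verbatim" reduces to a slice. The summarization itself would be an LLM call in practice; this sketch only shows the partition, and the function name is illustrative.

```typescript
// Sketch of the summarization split: older turns go to a summarizer
// (an LLM call in practice), the last K turns stay verbatim.
function splitForSummary<T>(
  turns: T[],
  keepVerbatim: number,
): { toSummarize: T[]; kept: T[] } {
  const cut = Math.max(0, turns.length - keepVerbatim);
  return { toSummarize: turns.slice(0, cut), kept: turns.slice(cut) };
}
```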

Use AskUserQuestion to select which processors are needed.

Step 6: Dynamic Memory Configuration

If the agent adapts based on user context (Pattern 3 from Patterns book):

Dynamic Memory Configuration

| User Signal | Memory Adjustment |
| --- | --- |
| Free tier | topK=3, no semantic recall, basic working memory |
| Pro tier | topK=10, full semantic recall, rich working memory |
| Enterprise | topK=20, full recall, extended working memory with org context |
| New user | No working memory yet; rely on conversation window |
| Returning user | Load working memory, enable semantic recall |
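The tier-based adjustments above can be expressed as a simple mapping function. The values come straight from the table; the function name and config shape are illustrative.

```typescript
// Sketch: map a user's tier to memory settings, mirroring the table above.
type Tier = "free" | "pro" | "enterprise";

interface DynamicMemory {
  topK: number;
  semanticRecall: boolean;
  workingMemory: "basic" | "rich" | "extended";
}

function memoryForTier(tier: Tier): DynamicMemory {
  switch (tier) {
    case "free":
      return { topK: 3, semanticRecall: false, workingMemory: "basic" };
    case "pro":
      return { topK: 10, semanticRecall: true, workingMemory: "rich" };
    case "enterprise":
      return { topK: 20, semanticRecall: true, workingMemory: "extended" };
  }
}
```

The same pattern extends to the new-user/returning-user signals: resolve the signal at session start, then construct the memory configuration from it.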

Step 7: Summarize and Offer Next Steps

Present all findings to the user as a structured summary directly in the conversation. Do NOT write to .specs/ — this skill works in the conversation rather than producing a spec.

Use AskUserQuestion to offer:

  • Implement memory — scaffold memory configuration and processors in code

  • Set up RAG — run agent:rag if semantic recall was selected

  • Comprehensive design — run agent:design to cover all areas with a spec

Arguments

  • <args>: optional description of the agent, or a path to existing code

Examples:

  • agent:memory personal-assistant — design memory for a personal assistant

  • agent:memory src/agents/support.ts — review memory in existing agent

  • agent:memory — start fresh


Related Skills

  • agent:secure

  • agent:design

  • agent:prompt