Memory Quality Auditor
Audit the memory system as a unified retrieval layer (STM/MTM/LTM files + index + spawn citation outcomes).
Scope
- Retrieval drift signals
- stale memory ratio
- evidence injection coverage
- citation usage/groundedness continuity
Workflow
- Read memory artifacts and the latest eval reports.
- Compute quality metrics and threshold status.
- Emit a remediation backlog with TDD checks.
- Record findings in memory and, optionally, an evolution recommendation.
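The metric step in the workflow above can be sketched as follows. This is a minimal illustration, not the auditor's actual implementation: the `MemoryEntry` fields, the 30-day staleness window, and the threshold values are all assumptions chosen for the example. Only the metric names `citation_coverage` and `grounded_ratio` come from this document.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class MemoryEntry:
    """Hypothetical record shape for one memory item (assumed, not from the source)."""
    tier: str                # "STM", "MTM", or "LTM"
    last_verified: datetime  # when the entry was last confirmed accurate
    injected: bool           # was it injected into a spawn as evidence?
    cited: bool              # did the spawn output cite it?
    grounded: bool           # did the citation actually support the claim?


def ratio(numerator: int, denominator: int) -> float:
    """Safe division: an empty denominator yields 0.0 instead of raising."""
    return numerator / denominator if denominator else 0.0


def compute_metrics(entries, now, max_age=timedelta(days=30)):
    """Compute the three audit metrics over a list of MemoryEntry objects."""
    stale = sum(1 for e in entries if now - e.last_verified > max_age)
    injected = [e for e in entries if e.injected]
    cited = [e for e in injected if e.cited]
    return {
        "stale_ratio": ratio(stale, len(entries)),
        "citation_coverage": ratio(len(cited), len(injected)),
        "grounded_ratio": ratio(sum(1 for e in cited if e.grounded), len(cited)),
    }


# Assumed thresholds: ("max", x) means the metric must stay at or below x,
# ("min", x) means it must stay at or above x.
THRESHOLDS = {
    "stale_ratio": ("max", 0.20),
    "citation_coverage": ("min", 0.80),
    "grounded_ratio": ("min", 0.90),
}


def threshold_status(metrics, thresholds=THRESHOLDS):
    """Map each metric to "pass" or "fail" against its threshold."""
    status = {}
    for name, (kind, limit) in thresholds.items():
        value = metrics[name]
        ok = value <= limit if kind == "max" else value >= limit
        status[name] = "pass" if ok else "fail"
    return status
```

Keeping metric computation separate from threshold evaluation means the same snapshot can be stored as a baseline and re-evaluated later if the thresholds change.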
Iron Laws
- ALWAYS establish a baseline metric snapshot before auditing. Drift is only meaningful relative to a prior measurement; auditing without a baseline produces absolute numbers that cannot identify regression.
- NEVER close a memory finding without re-running the affected retrieval query. Closing without verification creates false improvement metrics and masks persistent degradation.
- ALWAYS include citation-groundedness checks in every audit run. Uncited memory injections are the primary source of hallucination in agent spawns; skipping this check leaves the highest-risk failure mode undetected.
- NEVER audit only the STM tier. Degradation often originates in MTM/LTM promotion corruption; all three tiers must be sampled in every full audit cycle.
- ALWAYS emit TDD-ready remediation items with a failing-test condition and expected metric threshold. Vague findings ("memory quality is low") cannot be actioned by any agent.
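Two of these laws, the baseline delta and the TDD-ready remediation item, lend themselves to a short sketch. The function names and the dictionary layout below are hypothetical, chosen only to show what "a failing-test condition and expected metric threshold" might look like in practice.

```python
def metric_deltas(baseline, current):
    """Change in each metric since the baseline snapshot (current minus baseline)."""
    return {
        name: round(current[name] - baseline[name], 4)
        for name in baseline
        if name in current
    }


def remediation_item(metric, observed, kind, limit):
    """Build one backlog entry with an explicit failing-test condition.

    kind is "max" (metric must be <= limit) or "min" (metric must be >= limit).
    The field names are illustrative assumptions, not a documented schema.
    """
    op = "<=" if kind == "max" else ">="
    return {
        "metric": metric,
        "observed": observed,
        "failing_test": f"assert {metric} {op} {limit}",
        "expected_threshold": limit,
        "close_condition": (
            "re-run the affected retrieval query and confirm the assertion passes"
        ),
    }


# Example: stale_ratio rose from 0.10 to 0.25 against an assumed ceiling of 0.2.
deltas = metric_deltas({"stale_ratio": 0.10}, {"stale_ratio": 0.25})
item = remediation_item("stale_ratio", 0.25, "max", 0.2)
```

An item of this shape gives the next agent a concrete assertion to make pass, rather than a finding such as "memory quality is low".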
Anti-Patterns
| Anti-Pattern | Why It Fails | Correct Approach |
| --- | --- | --- |
| Auditing without a baseline | Cannot distinguish regression from steady-state; all findings are ambiguous | Snapshot current metrics at session start; compute delta against the previous run |
| Closing findings without re-check | Produces false-positive resolution; degradation persists silently behind green metrics | Re-run the specific retrieval query after each remediation; close only on confirmed green metric |
| Skipping citation groundedness | Citation failures are the leading cause of agent hallucination; missing this check omits the highest-severity defect class | Include citation_coverage and grounded_ratio metrics in every audit report |
| Full-mode audit on every spawn | Full audit is expensive; running it unconditionally inflates cost and slows workflows | Use --mode summary for routine checks; reserve --mode full for scheduled or triggered audits |
| Auditing STM only | MTM/LTM corruption is invisible in STM-only scans; stale LTM entries contaminate future sessions | Sample all three tiers: STM (current session), MTM (last 10 sessions), LTM (permanent summaries) |
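To avoid the STM-only anti-pattern, a full audit can draw a sample from each tier and report any tier it could not sample. The sketch below assumes entries are already grouped by tier in a dictionary; `sample_tiers` and its parameters are hypothetical names, not part of the auditor.

```python
import random

TIERS = ("STM", "MTM", "LTM")


def sample_tiers(entries_by_tier, per_tier=5, seed=0):
    """Draw up to per_tier entries from each tier.

    Returns (samples, skipped), where skipped lists tiers with no entries so
    the audit report can flag incomplete coverage instead of hiding it.
    """
    rng = random.Random(seed)  # fixed seed keeps audit runs reproducible
    samples, skipped = {}, []
    for tier in TIERS:
        items = list(entries_by_tier.get(tier, []))
        if not items:
            skipped.append(tier)
            continue
        samples[tier] = rng.sample(items, min(per_tier, len(items)))
    return samples, skipped
```

Returning the skipped tiers explicitly, rather than silently omitting them, keeps an empty MTM or LTM store visible in the audit output.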
Memory Protocol (MANDATORY)
Before starting: Read .claude/context/memory/learnings.md
After completing:
- New pattern → .claude/context/memory/learnings.md
- Issue found → .claude/context/memory/issues.md
- Decision made → .claude/context/memory/decisions.md
ASSUME INTERRUPTION: If it's not in memory, it didn't happen.
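A minimal sketch of the "after completing" step, assuming each memory file is an append-only markdown list. The three file paths come from the protocol above; the `record` helper, the `kind` keys, and the dated bullet format are illustrative assumptions.

```python
from datetime import datetime, timezone
from pathlib import Path

# File names taken from the protocol; the kind keys are hypothetical.
MEMORY_FILES = {
    "pattern": "learnings.md",
    "issue": "issues.md",
    "decision": "decisions.md",
}


def record(kind, text, memory_dir=".claude/context/memory"):
    """Append one dated bullet to the memory file matching kind."""
    path = Path(memory_dir) / MEMORY_FILES[kind]
    path.parent.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d")
    # Append mode: earlier entries are never overwritten.
    with path.open("a", encoding="utf-8") as fh:
        fh.write(f"- [{stamp}] {text}\n")
    return path
```

Writing each finding as soon as it is confirmed, rather than batching at the end, fits the interruption assumption: whatever has been appended survives a cut-off session.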