state-management-patterns

State Management Patterns Skill

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Install skill "state-management-patterns" with this command: npx skills add akaszubski/autonomous-dev/akaszubski-autonomous-dev-state-management-patterns

Standardized state management and persistence patterns for the autonomous-dev plugin ecosystem. Ensures reliable, crash-resistant state persistence across Claude restarts and system failures.

When This Skill Activates

  • Implementing state persistence

  • Managing crash recovery

  • Handling concurrent state access

  • Versioning state schemas

  • Tracking batch operations

  • Managing user preferences

  • Keywords: "state", "persistence", "JSON", "atomic", "crash recovery", "checkpoint"

Core Patterns

  1. JSON Persistence with Atomic Writes

Definition: Store state in JSON files with atomic writes to prevent corruption on crash.

Pattern:

import json
import os
import tempfile
from pathlib import Path
from typing import Any, Dict


def save_state_atomic(state: Dict[str, Any], state_file: Path) -> None:
    """Save state with atomic write to prevent corruption.

    Args:
        state: State dictionary to persist
        state_file: Target state file path

    Security:
        - Atomic Write: Prevents partial writes on crash
        - Temp File: Write to temp, then rename (atomic operation)
        - Permissions: mkstemp creates the temp file owner-only (0o600),
          and the target takes on that mode after the rename
    """
    # Write to a temporary file in the same directory first, so the
    # rename stays on one filesystem
    temp_fd, temp_path = tempfile.mkstemp(
        dir=state_file.parent,
        prefix=f".{state_file.name}.",
        suffix=".tmp"
    )

    try:
        # Write JSON to temp file and flush it to disk before publishing
        with os.fdopen(temp_fd, 'w') as f:
            json.dump(state, f, indent=2)
            f.flush()
            os.fsync(f.fileno())

        # Atomic rename (overwrites target)
        os.replace(temp_path, state_file)

    except Exception:
        # Clean up temp file on failure
        if Path(temp_path).exists():
            Path(temp_path).unlink()
        raise

See: docs/json-persistence.md, examples/batch-state-example.py
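To see why the write-then-rename step matters, the sketch below restates the pattern in compact form and then forces a write to fail partway through. The helper `demo_failed_write_keeps_old_state` and the file name `state.json` are hypothetical, chosen only for illustration; the point is that the previously saved state survives and no temp file is left behind.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Any, Dict


def save_state_atomic(state: Dict[str, Any], state_file: Path) -> None:
    """Compact restatement of the pattern: write a temp file, then os.replace."""
    fd, tmp = tempfile.mkstemp(
        dir=state_file.parent, prefix=f".{state_file.name}.", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(state, f, indent=2)
        os.replace(tmp, state_file)
    except Exception:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise


def demo_failed_write_keeps_old_state(workdir: Path) -> Dict[str, Any]:
    """Hypothetical demo: a write that fails midway must not touch the target."""
    state_file = workdir / "state.json"
    save_state_atomic({"count": 1}, state_file)

    try:
        # A set is not JSON-serializable, so json.dump raises TypeError
        # after the temp file has already been created.
        save_state_atomic({"count": 2, "bad": {1, 2}}, state_file)
    except TypeError:
        pass

    leftovers = [p.name for p in workdir.iterdir() if p.name.endswith(".tmp")]
    return {"state": json.loads(state_file.read_text()), "leftovers": leftovers}


with tempfile.TemporaryDirectory() as d:
    result = demo_failed_write_keeps_old_state(Path(d))
    print(result)
```

Because the failure happens while writing the temp file, the target is never opened for writing, which is the property a direct `open(state_file, "w")` cannot offer.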

  2. File Locking for Concurrent Access

Definition: Use file locks to prevent concurrent modification of state files.

Pattern:

import fcntl  # POSIX-only; not available on Windows
import json
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def file_lock(filepath: Path):
    """Acquire exclusive file lock for state file.

    Args:
        filepath: Path to file to lock

    Yields:
        Open file handle with exclusive lock

    Example:
        >>> with file_lock(state_file) as f:
        ...     state = json.load(f)
        ...     state['count'] += 1
        ...     f.seek(0)
        ...     f.truncate()
        ...     json.dump(state, f)
    """
    with filepath.open('r+') as f:
        # Blocks until no other process holds the lock
        fcntl.flock(f.fileno(), fcntl.LOCK_EX)
        try:
            yield f
        finally:
            fcntl.flock(f.fileno(), fcntl.LOCK_UN)

See: docs/file-locking.md, templates/file-lock-template.py
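One caveat when combining this pattern with atomic writes: `flock` attaches to the open file, and `os.replace` swaps in a new file under the same name, so a lock taken on the state file itself no longer covers the file that later readers open. A common workaround, sketched below under that assumption, is to lock a separate sidecar file that is never replaced. The names `sidecar_lock`, `update_state`, `bump`, and the `.lock` suffix are hypothetical and not taken from the autonomous-dev library; like the pattern above, this relies on `fcntl` and so is POSIX-only.

```python
import fcntl
import json
import os
import tempfile
from contextlib import contextmanager
from pathlib import Path
from typing import Any, Callable, Dict, Iterator


@contextmanager
def sidecar_lock(state_file: Path) -> Iterator[None]:
    """Hold an exclusive flock on '<state_file>.lock' (hypothetical naming)."""
    lock_path = state_file.with_name(state_file.name + ".lock")
    # Mode "a" creates the lock file if needed and never truncates it.
    with open(lock_path, "a") as lock_file:
        fcntl.flock(lock_file.fileno(), fcntl.LOCK_EX)
        try:
            yield
        finally:
            fcntl.flock(lock_file.fileno(), fcntl.LOCK_UN)


def update_state(
    state_file: Path, mutate: Callable[[Dict[str, Any]], None]
) -> Dict[str, Any]:
    """Read-modify-write under the lock, publishing via temp file + os.replace."""
    with sidecar_lock(state_file):
        state = json.loads(state_file.read_text()) if state_file.exists() else {}
        mutate(state)
        fd, tmp = tempfile.mkstemp(dir=state_file.parent, suffix=".tmp")
        try:
            with os.fdopen(fd, "w") as f:
                json.dump(state, f, indent=2)
            os.replace(tmp, state_file)
        except Exception:
            if os.path.exists(tmp):
                os.unlink(tmp)
            raise
        return state


def bump(state: Dict[str, Any]) -> None:
    """Example mutation: increment a counter, starting from zero."""
    state["count"] = state.get("count", 0) + 1


with tempfile.TemporaryDirectory() as d:
    path = Path(d) / "state.json"
    update_state(path, bump)
    final = update_state(path, bump)
    print(final)  # {'count': 2}
```

The lock serializes writers, while the rename keeps readers that skip the lock from ever seeing a half-written file.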

  3. Crash Recovery Pattern

Definition: Design state to enable recovery after crashes or interruptions.

Principles:

  • State includes enough context to resume operations

  • Progress tracking enables "resume from last checkpoint"

  • State validation detects corruption

  • Migration paths handle schema changes

Example:

from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class BatchState:
    """Batch processing state with crash recovery support.

    Attributes:
        batch_id: Unique batch identifier
        features: List of all features to process
        current_index: Index of current feature
        completed: List of completed feature names
        failed: List of failed feature names
        created_at: State creation timestamp
        last_updated: Last update timestamp
    """
    batch_id: str
    features: List[str]
    current_index: int = 0
    # None is a sentinel; __post_init__ swaps in a fresh list per instance
    completed: Optional[List[str]] = None
    failed: Optional[List[str]] = None
    created_at: Optional[str] = None
    last_updated: Optional[str] = None

    def __post_init__(self):
        if self.completed is None:
            self.completed = []
        if self.failed is None:
            self.failed = []
        if self.created_at is None:
            self.created_at = datetime.now().isoformat()
        # Refreshed on every construction, including reloads from disk
        self.last_updated = datetime.now().isoformat()

See: docs/crash-recovery.md, examples/crash-recovery-example.py
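The "resume from last checkpoint" principle can be illustrated with a trimmed-down state class and a simulated crash. This is a minimal sketch, not the library's implementation: `next_pending` and `record` are hypothetical helper names, and the crash is simulated by serializing the state to a JSON string and rebuilding it.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class BatchState:
    """Trimmed-down version of the state above, for illustration only."""
    batch_id: str
    features: List[str]
    current_index: int = 0
    completed: List[str] = field(default_factory=list)
    failed: List[str] = field(default_factory=list)


def next_pending(state: BatchState) -> Optional[str]:
    """Return the first feature that is neither completed nor failed."""
    done = set(state.completed) | set(state.failed)
    for feature in state.features:
        if feature not in done:
            return feature
    return None


def record(state: BatchState, feature: str, ok: bool) -> None:
    """Record the outcome of one feature and advance the progress index."""
    (state.completed if ok else state.failed).append(feature)
    state.current_index = state.features.index(feature) + 1


# First run: two features are handled, then the process "crashes".
state = BatchState(batch_id="batch-demo", features=["feat1", "feat2", "feat3"])
record(state, "feat1", ok=True)
record(state, "feat2", ok=False)
snapshot = json.dumps(asdict(state))  # what would be on disk at the crash

# Second run: rebuild the state from the snapshot and pick up where it stopped.
resumed = BatchState(**json.loads(snapshot))
resume_from = next_pending(resumed)
print(resume_from)  # feat3

record(resumed, "feat3", ok=True)
remaining = next_pending(resumed)
print(remaining)  # None
```

Deriving the next feature from the `completed` and `failed` lists, rather than from `current_index` alone, keeps the resume point correct even if the index and the lists were saved at slightly different moments.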

  4. State Versioning and Migration

Definition: Version state schemas to enable graceful upgrades.

Pattern:

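from typing import Any, Dict

STATE_VERSION = "2.0.0"


def migrate_state(state: Dict[str, Any]) -> Dict[str, Any]:
    """Migrate state from old version to current.

    Args:
        state: State dictionary (any version)

    Returns:
        Migrated state (current version)

    Raises:
        ValueError: If the state's version has no migration path
    """
    # State written before versioning existed is treated as 1.0.0
    version = state.get("version", "1.0.0")

    # The _migrate_* helpers are defined elsewhere in the state module
    if version == "1.0.0":
        # Migrate 1.0.0 → 1.1.0
        state = _migrate_1_0_to_1_1(state)
        version = "1.1.0"

    if version == "1.1.0":
        # Migrate 1.1.0 → 2.0.0
        state = _migrate_1_1_to_2_0(state)
        version = "2.0.0"

    # Refuse to stamp an unknown version as current
    if version != STATE_VERSION:
        raise ValueError(f"No migration path from state version {version!r}")

    state["version"] = STATE_VERSION
    return state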

See: docs/state-versioning.md, templates/state-manager-template.py
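As the number of versions grows, the chain of `if` blocks can be expressed as a lookup table that is walked until the current version is reached. The sketch below is one possible arrangement, not the library's code: the two migration bodies (adding a `failed` list, renaming `items` to `features`) are invented purely to make the example runnable, and the `MIGRATIONS` table is a hypothetical name.

```python
from typing import Any, Callable, Dict, Tuple

STATE_VERSION = "2.0.0"

State = Dict[str, Any]


def _migrate_1_0_to_1_1(state: State) -> State:
    # Hypothetical change: 1.1.0 introduced a "failed" list.
    return {**state, "failed": state.get("failed", [])}


def _migrate_1_1_to_2_0(state: State) -> State:
    # Hypothetical change: 2.0.0 renamed "items" to "features".
    new_state = dict(state)
    new_state["features"] = new_state.pop("items", [])
    return new_state


# Maps each old version to (next version, function that performs the step).
MIGRATIONS: Dict[str, Tuple[str, Callable[[State], State]]] = {
    "1.0.0": ("1.1.0", _migrate_1_0_to_1_1),
    "1.1.0": ("2.0.0", _migrate_1_1_to_2_0),
}


def migrate_state(state: State) -> State:
    """Apply migration steps one at a time until STATE_VERSION is reached."""
    version = state.get("version", "1.0.0")
    while version != STATE_VERSION:
        if version not in MIGRATIONS:
            raise ValueError(f"No migration path from state version {version!r}")
        version, step = MIGRATIONS[version]
        state = step(state)
    return {**state, "version": STATE_VERSION}


migrated = migrate_state({"items": ["a"]})  # unversioned input, treated as 1.0.0
print(migrated)

try:
    migrate_state({"version": "9.9.9"})
    unknown_rejected = False
except ValueError:
    unknown_rejected = True
```

Each step only needs to know about its immediate successor, so adding a version means adding one function and one table entry.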

Real-World Examples

BatchStateManager Pattern

From plugins/autonomous-dev/lib/batch_state_manager.py:

Features:

  • JSON persistence with atomic writes

  • Crash recovery via --resume flag

  • Progress tracking (completed/failed features)

  • Automatic context management via Claude Code (200K token budget)

  • State versioning for schema upgrades

Note (Issue #218): The deprecated context-clearing functions (should_clear_context(), pause_batch_for_clear(), get_clear_notification_message()) have been removed, because Claude Code v2.0+ handles context automatically with a 200K token budget.

Usage:

# Create batch state
state = create_batch_state(features=["feat1", "feat2", "feat3"])
state.batch_id  # "batch-20251116-123456"

# Process features
for feature in state.features:
    try:
        # Process feature
        result = process_feature(feature)
        # Feature implementation updates context automatically

    except Exception as e:
        # Track failures for audit trail
        mark_failed(state, feature, str(e))

    save_batch_state(state_file, state)  # Atomic write

# Resume after crash
state = load_batch_state(state_file)
next_feature = get_next_pending_feature(state)  # Skips completed

Context Management: Claude Code automatically manages the 200K token budget. No manual context clearing required.

Checkpoint Integration (Issue #79)

Agents save checkpoints using the portable pattern:

Portable Pattern (Works Anywhere)

from pathlib import Path
import sys

# Portable path detection: walk upward until a directory contains .git
current = Path.cwd()
project_root = None  # Stays None when no repository is found
while current != current.parent:
    if (current / ".git").exists():
        project_root = current
        break
    current = current.parent

# Add lib to path (skipped when no project root was detected)
if project_root is not None:
    lib_path = project_root / "plugins/autonomous-dev/lib"
    if lib_path.exists():
        sys.path.insert(0, str(lib_path))

try:
    from agent_tracker import AgentTracker
    success = AgentTracker.save_agent_checkpoint(
        agent_name='my-agent',
        message='Task completed - found 5 patterns',
        tools_used=['Read', 'Grep', 'WebSearch']
    )
    print(f"Checkpoint: {'saved' if success else 'skipped'}")
except ImportError:
    print("ℹ️ Checkpoint skipped (user project)")

Features

  • Portable: Works from any directory (user projects, subdirectories, fresh installs)

  • No hardcoded paths: Uses dynamic project root detection

  • Graceful degradation: Returns False, doesn't block workflow

  • Security validated: Path validation (CWE-22), no subprocess (CWE-78)

Design Patterns

  • Progressive Enhancement: Works with or without tracking infrastructure

  • Non-blocking: Never raises exceptions

  • Two-tier: Library imports instead of subprocess calls

See: LIBRARIES.md Section 24 (agent_tracker.py), DEVELOPMENT.md Scenario 2.5, docs/LIBRARIES.md for API
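The dynamic project root detection can also be packaged as a small function that returns `None` instead of failing when no marker is found, which fits the graceful-degradation goal. This is a sketch only: `find_project_root` and its `marker` parameter are hypothetical names, not part of agent_tracker.py.

```python
import tempfile
from pathlib import Path
from typing import Optional


def find_project_root(start: Path, marker: str = ".git") -> Optional[Path]:
    """Return the nearest ancestor of `start` (inclusive) containing `marker`."""
    for candidate in (start, *start.parents):
        if (candidate / marker).exists():
            return candidate
    return None  # Caller decides how to degrade, e.g. skip the checkpoint


with tempfile.TemporaryDirectory() as d:
    root = Path(d).resolve()
    (root / ".git").mkdir()
    nested = root / "src" / "pkg"
    nested.mkdir(parents=True)

    # Found from a nested subdirectory.
    found_from_nested = find_project_root(nested) == root
    # A marker that exists nowhere yields None rather than an exception.
    missing = find_project_root(nested, marker=".hypothetical-marker-for-demo")

print(found_from_nested, missing)
```

Returning `None` lets the caller print a "skipped" message and carry on, rather than wrapping the search in a broad exception handler.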

Usage Guidelines

For Library Authors

When implementing stateful features:

  • Use JSON persistence with atomic writes

  • Add file locking for concurrent access protection

  • Design for crash recovery with resumable state

  • Version your state for schema evolution

  • Validate on load to detect corruption

For Claude

When creating or analyzing stateful libraries:

  • Load this skill when keywords match ("state", "persistence", etc.)

  • Follow persistence patterns for reliability

  • Implement crash recovery for long-running operations

  • Use atomic operations to prevent corruption

  • Reference templates in templates/ directory

Token Savings

By centralizing state management patterns in this skill:

  • Before: ~50 tokens per library for inline state management docs

  • After: ~10 tokens for skill reference comment

  • Savings: ~40 tokens per library

  • Total: ~400 tokens across 10 libraries (4-5% reduction)

Progressive Disclosure

This skill uses Claude Code 2.0+ progressive disclosure architecture:

  • Metadata (frontmatter): Always loaded (~180 tokens)

  • Full content: Loaded only when keywords match

  • Result: Efficient context usage, scales to 100+ skills

When you use terms like "state management", "persistence", "crash recovery", or "atomic writes", Claude Code automatically loads the full skill content.

Templates and Examples

Templates (reusable code structures)

  • templates/state-manager-template.py: Complete state manager class

  • templates/atomic-write-template.py: Atomic write implementation

  • templates/file-lock-template.py: File locking utilities

Examples (real implementations)

  • examples/batch-state-example.py: BatchStateManager pattern

  • examples/user-state-example.py: UserStateManager pattern

  • examples/crash-recovery-example.py: Crash recovery demonstration

Documentation (detailed guides)

  • docs/json-persistence.md: JSON storage patterns

  • docs/atomic-writes.md: Atomic write implementation

  • docs/file-locking.md: Concurrent access protection

  • docs/crash-recovery.md: Recovery strategies

Cross-References

This skill integrates with other autonomous-dev skills:

  • library-design-patterns: Two-tier design, progressive enhancement

  • error-handling-patterns: Exception handling and recovery

  • security-patterns: File permissions and path validation

See: skills/library-design-patterns/, skills/error-handling-patterns/

Maintenance

This skill should be updated when:

  • New state management patterns emerge

  • State schema versioning needs change

  • Concurrency patterns evolve

  • Performance optimizations are discovered

Last Updated: 2025-11-16 (Phase 8.8 - Initial creation)

Version: 1.0.0

Hard Rules

FORBIDDEN:

  • Storing state without a defined schema or version field

  • Direct file writes without atomic operations (write-then-rename pattern)

  • State files without backup/recovery mechanism

  • Unbounded state growth (MUST have cleanup/rotation strategy)

REQUIRED:

  • All state files MUST include a schema version for migration support

  • State mutations MUST be atomic (no partial writes on failure)

  • State MUST be recoverable from corruption (fallback to defaults)

  • All state access MUST go through a single module (no scattered file reads)
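The "recoverable from corruption (fallback to defaults)" requirement might look like the following on the load side. This is a minimal sketch under stated assumptions: `DEFAULT_STATE`, `load_state_or_default`, and the `.corrupt` suffix are hypothetical, and the only validation performed is "parses as a JSON object with a version field".

```python
import copy
import json
import tempfile
from pathlib import Path
from typing import Any, Dict

# Hypothetical defaults; a real module would define its own schema.
DEFAULT_STATE: Dict[str, Any] = {"version": "1.0.0", "completed": [], "failed": []}


def load_state_or_default(state_file: Path) -> Dict[str, Any]:
    """Load state, falling back to defaults if the file is missing or invalid."""
    try:
        state = json.loads(state_file.read_text())
    except FileNotFoundError:
        return copy.deepcopy(DEFAULT_STATE)
    except ValueError:
        # Covers json.JSONDecodeError and undecodable bytes.
        state = None

    if isinstance(state, dict) and "version" in state:
        return state

    # Keep the unreadable file for inspection instead of overwriting it.
    quarantine = state_file.with_name(state_file.name + ".corrupt")
    state_file.replace(quarantine)
    return copy.deepcopy(DEFAULT_STATE)


with tempfile.TemporaryDirectory() as d:
    path = Path(d) / "state.json"
    path.write_text('{"version": "1.0.0", "compl')  # simulate a truncated write

    recovered = load_state_or_default(path)
    quarantined = (Path(d) / "state.json.corrupt").exists()
    original_gone = not path.exists()

print(recovered, quarantined, original_gone)
```

Returning a deep copy of the defaults prevents callers from mutating the shared template, and moving the bad file aside keeps evidence for debugging while the workflow continues.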
