Multi-Agent Architecture Reference

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Install skill "multi-agent-architecture-reference" with this command: npx skills add oimiragieo/agent-studio/oimiragieo-agent-studio-multi-agent-architecture-reference

Step 1: Characterize the Task

Answer these four questions before selecting a topology:

  • Task independence: Can sub-tasks run in parallel without shared state? (YES → Swarm or Fan-out)

  • Task types known: Is the set of task types stable and deterministic at design time? (YES → Supervisor)

  • Phase complexity: Does the work require multi-stage sub-orchestration? (YES → Hierarchical or Conductor)

  • Stakes: Does an incorrect outcome require multi-reviewer agreement? (YES → Consensus Voting)
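The four questions above can be captured as a small checklist. This is a minimal sketch, not part of agent-studio: `TaskProfile` and `candidate_topologies` are hypothetical names, and the function only lists the topologies that each YES answer points to, leaving the final choice to the later steps.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskProfile:
    """Answers to the four Step 1 questions (field names are illustrative)."""
    independent_subtasks: bool  # Task independence
    known_task_types: bool      # Task types known
    multi_stage: bool           # Phase complexity
    high_stakes: bool           # Stakes


def candidate_topologies(profile: TaskProfile) -> list[str]:
    """Return every topology suggested by a YES answer, in question order."""
    candidates: list[str] = []
    if profile.independent_subtasks:
        candidates += ["Swarm", "Fan-out/Fan-in"]
    if profile.known_task_types:
        candidates.append("Supervisor")
    if profile.multi_stage:
        candidates += ["Hierarchical", "Conductor"]
    if profile.high_stakes:
        candidates.append("Consensus Voting")
    return candidates


# Independent sub-tasks with high stakes, nothing else.
print(candidate_topologies(TaskProfile(True, False, False, True)))
```

An empty result means no question was answered YES, which usually indicates a single agent is enough.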

Step 2: Apply the Topology Decision Matrix

| Topology | Token Cost | Best For | Failure Modes | Existing Skill |
|---|---|---|---|---|
| Conductor | ~6x | Sequential phases, ordered agent steps, default agent-studio pattern | Orchestrator overload (SE-M01) | master-orchestrator.md |
| Supervisor | ~5x | Known task types, specialist agents, deterministic routing | Single point of failure; router miscalibration (SE-M01) | Built into Router |
| Fan-out/Fan-in | ~8x | Parallel review/analysis, map-reduce, search | Result aggregation complexity | wave-executor |
| Swarm | ~8x | Independent tasks, load balancing, fault-tolerant processing | Coordination overhead; consensus deadlock; orphaned tasks (SE-M02, SE-M05) | swarm-coordination |
| Consensus Voting | ~12x | High-stakes decisions requiring multi-reviewer agreement | Deadlock on split votes (SE-M02) | consensus-voting |
| Hierarchical | ~15x | EPIC complexity, multiple distinct phases with sub-orchestration | Cascade failures; token runaway at depth >3 (SE-M03, SE-M04) | Custom per project |

Token costs are multiples of a single-agent baseline (as of 2026). Treat them as order-of-magnitude guidance, not measurements.
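To turn the multipliers into a rough budget, multiply them by a measured single-agent token count. The sketch below assumes the approximate multipliers from the matrix; `estimate_tokens`, `cheapest`, and the 20,000-token baseline are illustrative, not agent-studio APIs or measured figures.

```python
# Approximate multipliers copied from the decision matrix (order-of-magnitude only).
TOKEN_MULTIPLIER = {
    "Supervisor": 5,
    "Conductor": 6,
    "Fan-out/Fan-in": 8,
    "Swarm": 8,
    "Consensus Voting": 12,
    "Hierarchical": 15,
}


def estimate_tokens(topology: str, single_agent_tokens: int) -> int:
    """Rough token budget: multiplier times the single-agent baseline."""
    return TOKEN_MULTIPLIER[topology] * single_agent_tokens


def cheapest(candidates: list[str]) -> str:
    """Pick the candidate with the lowest multiplier (first one wins ties)."""
    return min(candidates, key=TOKEN_MULTIPLIER.__getitem__)


baseline = 20_000  # hypothetical single-agent cost for the task
print(estimate_tokens("Consensus Voting", baseline))
print(cheapest(["Swarm", "Conductor", "Hierarchical"]))
```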

Step 3: Check Failure Mode Taxonomy

Before finalizing topology, verify mitigation for relevant failure modes:

SE-M01: Coordinator Overload

  • Topologies affected: Supervisor, Conductor, Hierarchical root

  • Symptom: Single coordinator receives more traffic than it can route

  • Fix: Distribute coordination or add routing replicas; use wave-executor for fan-out

SE-M02: Swarm Deadlock

  • Topologies affected: Swarm, Consensus Voting

  • Symptom: Agents wait for each other's consensus indefinitely

  • Fix: Timeout + majority-vote with tie-breaker; set consensus_timeout_ms
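The SE-M02 fix can be sketched with the standard library. This is one possible shape under stated assumptions, not the consensus-voting skill's implementation: voters are plain callables, the option with the most ballots wins (a plurality rule, simpler than a strict majority), and `vote_with_timeout` and `tie_breaker` are hypothetical names. Only `consensus_timeout_ms` comes from the text above.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor, wait
from typing import Callable, Sequence


def vote_with_timeout(
    voters: Sequence[Callable[[], str]],
    consensus_timeout_ms: int,
    tie_breaker: Callable[[list[str]], str],
) -> str:
    """Collect ballots until the deadline, then decide without waiting for stragglers."""
    pool = ThreadPoolExecutor(max_workers=len(voters))
    futures = [pool.submit(voter) for voter in voters]
    done, _pending = wait(futures, timeout=consensus_timeout_ms / 1000)
    # Do not block on slow voters; drop any ballots that have not started yet.
    pool.shutdown(wait=False, cancel_futures=True)

    ballots = [f.result() for f in done if f.exception() is None]
    if not ballots:
        raise TimeoutError("no votes arrived before the deadline")

    counts = Counter(ballots).most_common()
    top = counts[0][1]
    leaders = sorted(option for option, n in counts if n == top)
    # A single leader wins outright; a split vote goes to the tie-breaker.
    return leaders[0] if len(leaders) == 1 else tie_breaker(leaders)
```

Because the deadline bounds the wait and the tie-breaker always returns one option, the vote cannot block indefinitely on a silent agent or a split result.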

SE-M03: Cascade Failure

  • Topologies affected: Hierarchical

  • Symptom: A mid-level agent failure halts all downstream agents

  • Fix: Circuit breakers at each tier; retry with backoff; fallback agents
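A tier-level guard combining the three SE-M03 fixes might look like the following. It is a deliberately small sketch: `CircuitBreaker` is a hypothetical class, agents are modelled as zero-argument callables, and there is no half-open or reset timer as a production breaker would have.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class CircuitBreaker:
    """Opens after `max_failures` consecutive failures; an open circuit skips the primary agent."""

    def __init__(self, max_failures: int = 3) -> None:
        self.max_failures = max_failures
        self.failures = 0

    @property
    def is_open(self) -> bool:
        return self.failures >= self.max_failures

    def call(
        self,
        primary: Callable[[], T],
        fallback: Callable[[], T],
        retries: int = 2,
        base_delay_s: float = 0.1,
        sleep: Callable[[float], None] = time.sleep,
    ) -> T:
        if not self.is_open:
            for attempt in range(retries + 1):
                try:
                    result = primary()
                except Exception:
                    self.failures += 1
                    if self.is_open or attempt == retries:
                        break  # give up on the primary agent
                    sleep(base_delay_s * 2 ** attempt)  # exponential backoff
                else:
                    self.failures = 0  # success closes the circuit again
                    return result
        return fallback()
```

Placing one breaker per tier means a failing mid-level agent is replaced by its fallback instead of stalling every agent below it.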

SE-M04: Token Runaway

  • Topologies affected: Hierarchical

  • Symptom: Spawning too many levels burns tokens exponentially

  • Fix: Set max_depth=3; monitor token budget per level; prefer Conductor over deep Hierarchical
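The exponential growth is easy to see by counting agents. The sketch below assumes a uniform tree in which every orchestrator spawns the same number of children; `total_agents` is a hypothetical helper, and only the `max_depth=3` limit comes from the text above.

```python
MAX_DEPTH = 3  # limit recommended above


def total_agents(branching: int, depth: int, max_depth: int = MAX_DEPTH) -> int:
    """Count agents in a uniform tree with `depth` tiers (the root is tier 1)."""
    if depth > max_depth:
        raise ValueError(f"depth {depth} exceeds max_depth={max_depth} (SE-M04)")
    # Tier k holds branching ** (k - 1) agents.
    return sum(branching ** level for level in range(depth))


print(total_agents(3, 3))               # three tiers, three children each
print(total_agents(3, 4, max_depth=4))  # one extra tier, limit lifted for comparison
```

With three children per orchestrator, three tiers hold 13 agents and a fourth tier raises that to 40, so each added tier roughly triples the token spend.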

SE-M05: Orphaned Tasks

  • Topologies affected: Swarm

  • Symptom: Agents drop tasks when ownership is unclear

  • Fix: Assign task IDs; use TaskUpdate tracking; require TaskUpdate(in_progress) on pickup
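The ownership rule can be modelled with a small in-memory board. This is not the real TaskUpdate tool, whose interface is not shown here; `TaskBoard`, `claim`, and `unowned` are hypothetical names that only illustrate the "mark in_progress on pickup" discipline.

```python
class TaskBoard:
    """Tracks one status and at most one owner per task ID."""

    def __init__(self, task_ids: list[str]) -> None:
        self.status = {task_id: "pending" for task_id in task_ids}
        self.owner: dict[str, str] = {}

    def claim(self, task_id: str, agent: str) -> bool:
        """Mark a task in_progress on pickup; refuse if it is already taken."""
        if self.status[task_id] != "pending":
            return False
        self.status[task_id] = "in_progress"
        self.owner[task_id] = agent
        return True

    def complete(self, task_id: str, agent: str) -> None:
        """Only the owning agent may close a task."""
        if self.owner.get(task_id) != agent:
            raise PermissionError(f"{agent} does not own {task_id}")
        self.status[task_id] = "completed"

    def unowned(self) -> list[str]:
        """Task IDs nobody has picked up yet (candidates for orphaning)."""
        return sorted(t for t, s in self.status.items() if s == "pending")
```

A coordinator can poll `unowned()` after each wave to reassign anything that was never claimed.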

Step 4: Apply Escalation Path

Use the complexity escalation ladder when the initial topology proves insufficient:

TRIVIAL → Single agent (no multi-agent needed)
    ↓ (task types > 1, > 3 files)
LOW → Supervisor (router delegates to 2-3 specialists)
    ↓ (parallel processing needed)
MEDIUM → Conductor + Fan-out (master-orchestrator + wave-executor)
    ↓ (multi-phase with sub-orchestration)
HIGH → Hierarchical (orchestrators at multiple tiers)
    ↓ (high-stakes decision required)
EPIC → Hierarchical + Consensus Voting (max 3 tiers + voting gate)
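The ladder maps naturally onto an ordered enum. The sketch below is illustrative only: `Complexity`, `LADDER`, and `escalate` are hypothetical names, and the topology strings are copied from the ladder above.

```python
from enum import IntEnum


class Complexity(IntEnum):
    TRIVIAL = 0
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    EPIC = 4


LADDER = {
    Complexity.TRIVIAL: "Single agent",
    Complexity.LOW: "Supervisor",
    Complexity.MEDIUM: "Conductor + Fan-out",
    Complexity.HIGH: "Hierarchical",
    Complexity.EPIC: "Hierarchical + Consensus Voting",
}


def escalate(level: Complexity) -> Complexity:
    """Move one rung up the ladder; EPIC is the top rung."""
    return Complexity(min(level + 1, Complexity.EPIC))


level = Complexity.LOW
print(LADDER[level], "->", LADDER[escalate(level)])
```

Escalating one rung at a time keeps the cheaper topology in place until a concrete trigger, such as a need for parallel processing, justifies the next one.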

Step 5: Reference Existing agent-studio Patterns

| Pattern | Skill/File | Use Case |
|---|---|---|
| Conductor (DEFAULT) | .claude/agents/orchestrators/master-orchestrator.md | Sequential phase execution; TaskUpdate coordination |
| Fan-out/Fan-in | wave-executor skill | Parallel batch processing; EPIC-tier pipelines |
| Swarm | swarm-coordination skill | Concurrent independent task execution |
| Consensus | consensus-voting skill | High-stakes decisions; multi-reviewer agreement |
| Supervisor | Built into CLAUDE.md | Task routing to specialist agents |

When in doubt, start with Conductor. The master-orchestrator pattern drives sequential phases with explicit TaskUpdate coordination — the lowest-risk default for most MEDIUM/HIGH tasks.

Example 1: Code Review Pipeline

  • Task: Review 5 files for security, quality, and style

  • Character: Tasks are independent (YES), parallel OK (YES)

  • Topology: Fan-out/Fan-in (~8x)

  • Pattern: wave-executor skill — spawn 3 reviewers in parallel, aggregate results
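The shape of Example 1 can be sketched with a thread pool: three reviewers fan out over the same file list and their findings are gathered into one report. The reviewers here are stub callables and `fan_out_fan_in` is a hypothetical helper; the wave-executor skill's actual interface is not shown in this reference.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

Reviewer = Callable[[list[str]], list[str]]


def fan_out_fan_in(files: list[str], reviewers: dict[str, Reviewer]) -> dict[str, list[str]]:
    """Run every reviewer in parallel (fan-out), then collect findings by name (fan-in)."""
    with ThreadPoolExecutor(max_workers=len(reviewers)) as pool:
        futures = {name: pool.submit(review, files) for name, review in reviewers.items()}
        return {name: future.result() for name, future in futures.items()}


# Stub reviewers standing in for security, quality, and style agents.
reviewers = {
    "security": lambda files: [f"{f}: no secrets found" for f in files],
    "quality": lambda files: [],
    "style": lambda files: [f"{files[0]}: line too long"],
}
report = fan_out_fan_in(["a.py", "b.py", "c.py", "d.py", "e.py"], reviewers)
print(report["style"])
```

Keying the result by reviewer name gives the aggregation step a fixed boundary, which is what distinguishes this pattern from Swarm.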

Example 2: Feature Implementation

  • Task: Design → Implement → Test → Document

  • Character: Sequential phases, ordered steps (YES)

  • Topology: Conductor (~6x)

  • Pattern: master-orchestrator with TaskUpdate coordination between phases
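Example 2 reduces to a loop that runs phases in order and records a status change around each one. In this sketch the `log` list merely stands in for TaskUpdate calls, and `run_phases` and the lambda phases are hypothetical; the master-orchestrator's real behaviour is defined in its own file.

```python
from typing import Callable

Phase = tuple[str, Callable[[str], str]]


def run_phases(phases: list[Phase], initial: str) -> tuple[str, list[tuple[str, str]]]:
    """Run phases strictly in order, passing each phase's output to the next."""
    log: list[tuple[str, str]] = []
    artifact = initial
    for name, step in phases:
        log.append((name, "in_progress"))
        artifact = step(artifact)
        log.append((name, "completed"))
    return artifact, log


phases: list[Phase] = [
    ("Design", lambda spec: spec + " > design"),
    ("Implement", lambda design: design + " > code"),
    ("Test", lambda code: code + " > tests"),
    ("Document", lambda tested: tested + " > docs"),
]
result, log = run_phases(phases, "spec")
print(result)
```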

Example 3: Architecture Decision

  • Task: Choose between 3 database options for production system

  • Character: High stakes, requires agreement (YES)

  • Topology: Consensus Voting (~12x)

  • Pattern: consensus-voting skill, where 3 architect agents vote and the majority decides; a 1-1-1 split across the 3 options needs the SE-M02 tie-breaker

Example 4: Batch Agent Creation

  • Task: Create 10 new agents from specs

  • Character: Independent tasks (YES), fault tolerance > ordering (YES)

  • Topology: Swarm (~8x)

  • Pattern: swarm-coordination skill with task ID assignment per agent

<best_practices>

  • Default to Conductor (master-orchestrator) — it is the lowest-risk pattern for most tasks

  • Never use Hierarchical beyond depth=3 (token runaway risk SE-M04)

  • Always call TaskUpdate(in_progress) on task pickup in Swarm to prevent SE-M05

  • Use Fan-out (wave-executor) instead of Swarm when tasks have a clear aggregation boundary

  • Add consensus gate only for genuinely high-stakes decisions — 12x token cost is significant

  • Document token budget per topology tier when spawning Hierarchical

  • Cross-reference the failure mode taxonomy before finalizing the topology choice

</best_practices>

Iron Laws

  • ALWAYS start with Conductor — default to master-orchestrator for MEDIUM/HIGH tasks; only escalate to Hierarchical when sub-orchestration is explicitly required by the task structure.

  • NEVER exceed depth=3 in Hierarchical: token cost grows exponentially with depth; depth >3 triggers SE-M04 (token runaway) and is considered an architectural defect.

  • ALWAYS call TaskUpdate(in_progress) on Swarm task pickup: missing task ownership is the root cause of SE-M05 (orphaned tasks); every agent in a swarm must call TaskUpdate before doing work.

  • NEVER use Consensus Voting for low-stakes decisions — 12x token multiplier is justified only for architecture decisions, security approvals, or irreversible production changes.

  • ALWAYS cross-reference the failure mode taxonomy before finalizing topology — each topology has documented failure modes (SE-M01 through SE-M05); skipping this review leads to production incidents.

Anti-Patterns

| Anti-Pattern | Problem | Fix |
|---|---|---|
| Defaulting to Hierarchical for every complex task | Token runaway at depth >3; cascade failure risk; over-engineering most tasks | Use Conductor (sequential phases) first; only escalate to Hierarchical when sub-orchestration is mandatory |
| Using Swarm for ordered, dependent tasks | Swarm agents run concurrently and cannot enforce ordering; produces race conditions | Use Conductor or Fan-out/Fan-in when task ordering matters |
| Skipping TaskUpdate(in_progress) in Swarm | Tasks become orphaned (SE-M05); no ownership tracking; duplicated or dropped work | Require every swarm agent to call TaskUpdate(in_progress) as its first action |
| Adding Consensus Voting speculatively | 12x token overhead kills budget for non-critical decisions; slowdown on all downstream tasks | Reserve consensus gate for genuinely high-stakes, irreversible decisions only |
| Mixing topology concerns (Supervisor + Swarm + Hierarchical in one flow) | Complexity explosion; routing ambiguity; impossible to debug failures | Pick one primary topology per orchestration scope; compose only at well-defined phase boundaries |

Memory Protocol (MANDATORY)

Before starting:

Read .claude/context/memory/learnings.md to check for prior multi-agent architecture decisions.

After completing:

  • New topology decision → Append to .claude/context/memory/decisions.md

  • Failure mode encountered → Append to .claude/context/memory/issues.md

  • New pattern discovered → Append to .claude/context/memory/learnings.md

ASSUME INTERRUPTION: Your context may reset. If it's not in memory, it didn't happen.
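The "after completing" rules amount to appending one line to the right file. A minimal sketch, assuming plain Markdown bullet entries: the file names come from the protocol above, while `record`, `MEMORY_FILES`, and the entry format are hypothetical.

```python
from pathlib import Path

# File names from the Memory Protocol; the keys are illustrative labels.
MEMORY_FILES = {
    "decision": "decisions.md",
    "issue": "issues.md",
    "learning": "learnings.md",
}


def record(kind: str, text: str, memory_dir: str = ".claude/context/memory") -> Path:
    """Append one bullet to the memory file for `kind`, creating it if needed."""
    path = Path(memory_dir) / MEMORY_FILES[kind]
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as fh:
        fh.write(f"- {text}\n")
    return path
```

For example, `record("decision", "Chose Conductor for feature X")` appends to decisions.md. Opening in append mode preserves earlier entries, which matters if the context resets partway through a task.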

Related Skills

  • wave-executor — Fan-out/Fan-in implementation

  • swarm-coordination — Swarm topology execution

  • consensus-voting — Byzantine consensus for high-stakes decisions

  • architecture-review — Validate topology choices against NFRs

  • complexity-assessment — Determine complexity level before topology selection

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
