Multi-AI Collaboration Skill
Overview
This skill enables the invoking AI agent to act as an Orchestrator, coordinating multiple AI agents (Codex CLI, Gemini CLI, Claude sub-agents) with assigned Personas (specialized expert roles) for collaborative software development tasks.
The primary use case is Cross-Review: having multiple AI agents independently analyze code from different expert perspectives, then synthesizing their findings to provide comprehensive, bias-reduced results.
Prerequisites
Before using this skill, ensure the required CLI tools are installed:
```shell
# Check available agents
which codex && codex --version
which gemini && gemini --version
which claude && claude --version
```
CLI Installation
- Codex CLI: See OpenAI Codex Documentation
- Gemini CLI: npm install -g @google/gemini-cli or brew install gemini-cli
- Claude Code: See Claude Code Documentation
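The availability check can also be scripted from an orchestrator; a minimal sketch using Python's standard-library `shutil.which`, assuming the CLIs are on `PATH` under the names used above:

```python
import shutil

def detect_agents(candidates=("codex", "gemini", "claude")):
    """Return the subset of candidate CLI names found on PATH."""
    return [name for name in candidates if shutil.which(name)]

print("Available agents:", detect_agents() or "none")
```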
Workflow
```
┌────────────────────────────────────────────────────┐
│ ORCHESTRATOR (Invoking AI)                         │
│                                                    │
│ The AI agent that invokes this skill becomes the   │
│ orchestrator. It coordinates all sub-agents and    │
│ synthesizes results.                               │
└────────────────────────────────────────────────────┘
                          │
        ┌─────────────────┼─────────────────┐
        ▼                 ▼                 ▼
┌───────────────┐ ┌───────────────┐ ┌───────────────┐
│ Codex CLI     │ │ Gemini CLI    │ │ Claude (sub)  │
│ latest        │ │ latest        │ │ latest        │
│               │ │               │ │               │
│ Persona:      │ │ Persona:      │ │ Persona:      │
│ Architect     │ │ Security      │ │ QA Engineer   │
└───────────────┘ └───────────────┘ └───────────────┘
```
Phase 1: Task Analysis (Silent)
The orchestrator performs initial analysis using a Parallel Fan-Out pattern for efficiency:
```
Phase 1: Parallel Fan-Out

┌─────────────────────┐   ┌─────────────────────┐
│ Identify target     │   │ Detect available    │   PARALLEL
│ files/code          │   │ AI agents           │
└──────────┬──────────┘   └──────────┬──────────┘
           │                         │
           └────────────┬────────────┘
                        ▼
             ┌─────────────────────┐
             │ Analyze task nature │   SEQUENTIAL
             └──────────┬──────────┘
                        ▼
             ┌─────────────────────┐
             │ Recommend personas  │
             └─────────────────────┘
```
Step 1 (Parallel): Execute these tasks concurrently as they have no dependencies:
- Identify target files/code - Use Glob, Grep, Read tools to understand scope
- Detect available AI agents - Check which CLIs are installed (which codex gemini claude)
Step 2 (Sequential): After the parallel tasks complete, execute in order:
- Analyze task nature - Determine whether the task is implementation, review, refactoring, or investigation (requires the file context from Step 1)
- Recommend personas - Suggest appropriate expert roles based on task nature and available agents
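The fan-out above can be sketched with `concurrent.futures`; the worker functions here are hypothetical placeholders for the real Glob/Grep scoping and CLI detection:

```python
from concurrent.futures import ThreadPoolExecutor

def identify_targets():            # placeholder for Glob/Grep/Read scoping
    return ["src/auth/login.py"]

def detect_agents():               # placeholder for `which codex gemini claude`
    return ["codex", "gemini", "claude"]

def analyze_task(targets):         # sequential: needs the file context from Step 1
    return "review" if targets else "investigation"

def recommend_personas(task, agents):
    # Toy mapping from task nature to suggested personas.
    table = {"review": ["Architect", "Security Researcher", "Code Reviewer"]}
    return table.get(task, ["Analyzer"])[: max(1, len(agents))]

# Step 1: parallel fan-out (no dependencies between the two tasks)
with ThreadPoolExecutor() as pool:
    targets_future = pool.submit(identify_targets)
    agents_future = pool.submit(detect_agents)
    targets, agents = targets_future.result(), agents_future.result()

# Step 2: sequential, ordered by dependency
task = analyze_task(targets)
personas = recommend_personas(task, agents)
```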
Phase 2: Team Assembly (Interactive)
Use the environment-appropriate user input tool to configure the team:
Q1: Select Personas (Multiple Choice)
Which expert personas should participate in this task?
1. 🏗️ Architect - System design, modularity, dependencies
2. 🔒 Security Researcher - Vulnerabilities, OWASP, auth/authz
3. 🧪 QA Engineer - Test design, edge cases, coverage
4. 👁️ Code Reviewer - Code quality, readability, best practices
5. ⚡ Performance Engineer - Complexity, memory, caching
6. 🔍 Analyzer - Static analysis, bug patterns, type safety
7. 📝 Documentarian - API docs, comments, README
8. 🧠 Domain Expert - Business logic, requirements fit
Recommended based on task analysis: 1, 2, 4
Q2: Assign AI Agents to Personas
Assign an AI agent to each selected persona:
Architect:
- Codex CLI (latest default) - Recommended: deep reasoning
- Gemini CLI (latest default)
- Claude (sub-agent)
Security Researcher:
- Codex CLI (latest default)
- Gemini CLI (latest default) - Recommended: can search latest CVEs
- Claude (sub-agent)
Code Reviewer:
- Codex CLI (latest default)
- Gemini CLI (latest default)
- Claude (sub-agent) - Recommended: fast iteration
Q3: Select Workflow Mode
Select workflow mode:
1. Parallel - All agents work independently, synthesize at end (Recommended for cross-review)
2. Sequential - Each agent builds on previous results
3. Pipeline - Implementation → Test → Review flow
4. Adversarial - Agents critically challenge each other's findings
Phase 3: Execution
The orchestrator executes the configured workflow.
Parallel Mode (Cross-Review)
```
┌─────────────┐   ┌─────────────┐   ┌─────────────┐
│   Agent A   │   │   Agent B   │   │   Agent C   │
│   (Codex)   │   │  (Gemini)   │   │  (Claude)   │
└──────┬──────┘   └──────┬──────┘   └──────┬──────┘
       │                 │                 │
       ▼                 ▼                 ▼
   Result A          Result B          Result C
       │                 │                 │
       └─────────────────┼─────────────────┘
                         ▼
                  ┌─────────────┐
                  │  Synthesis  │
                  └─────────────┘
```
Execution Commands:
```shell
# Codex CLI (Architect persona) - omit --model to use the latest default
codex exec "You are a Senior Software Architect. Analyze the following code for:
- Modularity and separation of concerns
- Dependency management
- Extensibility and maintainability
- Design pattern usage

[CODE_CONTENT]

Provide findings with severity (Critical/High/Medium/Low) and recommendations."
```
```shell
# Gemini CLI (Security persona) - omit -m to use the latest default
gemini -p "You are a Security Researcher. Analyze the following code for:
- OWASP Top 10 vulnerabilities
- Authentication/authorization issues
- Input validation and sanitization
- Data protection and encryption

[CODE_CONTENT]

Provide vulnerabilities with CVSS scores and remediation steps."
```
Claude sub-agent (QA persona): use the environment-appropriate subagent tool (Claude Code Task / Codex spawn_agent) with this prompt:

```
You are a QA Engineer. Based on the code, design:
- Required test cases (unit, integration, e2e)
- Edge cases and boundary conditions
- Security test scenarios
- Performance test considerations

[CODE_CONTENT]
```
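A sketch of dispatching the commands above concurrently from an orchestrator script; the prompt strings are abbreviated, and the `runner` parameter is an addition here so the subprocess call can be swapped out (e.g., for a stub in tests or for a different execution tool):

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

# Persona → argv for the corresponding CLI; prompts abbreviated for brevity.
COMMANDS = {
    "Architect": ["codex", "exec", "You are a Senior Software Architect. ... [CODE_CONTENT]"],
    "Security": ["gemini", "-p", "You are a Security Researcher. ... [CODE_CONTENT]"],
}

def run_cli(argv):
    """Run one agent CLI and return its stdout; raises if the CLI fails."""
    return subprocess.run(argv, capture_output=True, text=True, check=True).stdout

def cross_review(commands, runner=run_cli):
    """Run all agent commands concurrently and collect {persona: output}."""
    with ThreadPoolExecutor(max_workers=len(commands)) as pool:
        futures = {p: pool.submit(runner, argv) for p, argv in commands.items()}
        return {p: f.result() for p, f in futures.items()}
```

Passing a stub `runner` makes the fan-out testable without any real CLI installed.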
Sequential Mode
```
Agent A ──→ Agent B ──→ Agent C ──→ Synthesis
   │           │           │
   └───────────┴───────────┴── pass results to the next agent
```
Each agent receives the previous agent's findings and builds upon them.
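A minimal sketch of that hand-off, assuming each agent is exposed as a `runner(persona, prompt)` callable that returns its findings as text:

```python
def sequential_review(prompts, runner):
    """Chain agents: each prompt is augmented with the previous agent's findings."""
    findings = ""
    for persona, prompt in prompts:
        if findings:
            prompt = f"{prompt}\n\nPrevious findings:\n{findings}"
        findings = runner(persona, prompt)
    return findings
```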
Pipeline Mode
```
Implementer ──→ Tester ──→ Reviewer
     │            │           │
   Code         Tests       Review
     │            │           │
     └────────────┴───────────┴──→ Quality-assured output
```
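The three stages can be sketched as plain function composition; the stage callables are hypothetical stand-ins for the actual agent invocations:

```python
def pipeline(implement, write_tests, review, spec):
    """Implementation → Test → Review: each stage consumes the prior artifacts."""
    code = implement(spec)           # stage 1: produce code from the spec
    tests = write_tests(code)        # stage 2: produce tests for that code
    verdict = review(code, tests)    # stage 3: review both artifacts
    return {"code": code, "tests": tests, "review": verdict}
```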
Adversarial Mode (Generator/Critic Pattern)
This mode implements the Generator and Critic pattern from Google ADK for iterative refinement:
```
        ┌─────────────┐   Proposal    ┌─────────────┐
        │  Generator  │ ────────────→ │   Critic    │
        │  (Agent A)  │               │  (Agent B)  │
        └─────────────┘               └──────┬──────┘
               ▲                             │ Feedback
               │                             ▼
               │                    ┌──────────────────┐
               └────── Continue ────│ Evaluate Quality │
                                    └────────┬─────────┘
                                             │
                         No: continue loop ──┤
                                             │ Yes: exit
                                             ▼
                                  ┌─────────────────────┐
                                  │   Final Decision    │
                                  │  (User Input Tool)  │
                                  └─────────────────────┘
```
Configuration Parameters:
- max_iterations: Maximum number of generate-critique cycles (default: 3)
- quality_threshold: Criteria for acceptable output (e.g., no Critical issues)
- escalate_on_deadlock: Whether to involve the user when agents cannot converge
Iteration Cycle:
1. Generate Phase: Generator agent produces a proposal/analysis
2. Critique Phase: Critic agent evaluates and challenges the proposal
3. Evaluate Phase: Check termination conditions:
   - Quality threshold met (no Critical/High severity issues remain)
   - Maximum iterations reached
   - Agents have converged on consensus
4. Refine or Exit: Either continue with a refined proposal or exit to the final decision
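The cycle and its termination conditions can be sketched as a loop; `generate` and `critique` are hypothetical callables standing in for the two agents, and the `(severity, note)` feedback tuples are an assumed format:

```python
def adversarial_loop(generate, critique, max_iterations=3, blocking=("Critical", "High")):
    """Generator/Critic cycle: refine until no blocking issues remain
    or the iteration budget is spent. `critique` returns [(severity, note), ...]."""
    proposal = generate(None)                      # initial proposal, no feedback yet
    feedback = []
    for iteration in range(1, max_iterations + 1):
        if iteration > 1:
            proposal = generate(feedback)          # refine using the last critique
        feedback = critique(proposal)
        if not any(sev in blocking for sev, _ in feedback):
            return proposal, feedback, iteration   # quality threshold met
    return proposal, feedback, max_iterations      # budget exhausted; escalate if needed
```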
Example Adversarial Flow:
```
Iteration 1:
  Generator (Codex/Architect): "Propose microservices architecture"
  Critic (Gemini/Security): "Challenges: service-to-service auth gaps, data consistency risks"
  Quality: Critical issues found → Continue

Iteration 2:
  Generator: "Refined proposal with OAuth2 service mesh, saga pattern for consistency"
  Critic: "Medium concerns: observability gaps, no circuit breaker"
  Quality: No Critical issues → Continue (optional refinement)

Iteration 3:
  Generator: "Added distributed tracing, circuit breaker with fallbacks"
  Critic: "Low concerns: consider rate limiting for external APIs"
  Quality: Acceptable → Exit

Final: Present converged proposal to user for approval
```
Termination Conditions:

| Condition | Action |
|---|---|
| Quality threshold met | Exit with approved proposal |
| max_iterations reached | Exit with best proposal + unresolved concerns |
| Agents deadlocked | Escalate to user via User Input Tool |
| Critical regression | Revert to previous iteration's proposal |
Phase 4: Synthesis
The orchestrator consolidates all results:
1. Collect results from all agents
2. Identify consensus - points all agents agree on
3. Identify divergence - points where agents disagree
4. Prioritize actions - create actionable items with priority
5. Handle conflicts - use the environment-appropriate user input tool for unresolved disagreements
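Steps 2 and 3 can be sketched as simple set operations, assuming each agent's findings are normalized to comparable strings (a real implementation would need fuzzier matching):

```python
def synthesize(results):
    """Split normalized findings into consensus (raised by every agent)
    and divergence ({finding: [agents that raised it]})."""
    sets = {agent: set(findings) for agent, findings in results.items()}
    consensus = set.intersection(*sets.values()) if sets else set()
    divergence = {
        finding: sorted(agent for agent, s in sets.items() if finding in s)
        for s in sets.values()
        for finding in s - consensus
    }
    return sorted(consensus), divergence
```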
Personas Reference
🏗️ Architect
Focus Areas:
- Modularity and separation of concerns
- Dependency direction and management
- Extensibility for future changes
- Design pattern appropriateness
- Public API/interface design

Output Format:
- Architecture assessment summary
- Issues (Critical/High/Medium/Low)
- Improvement recommendations
- Diagrams if needed
🔒 Security Researcher
Focus Areas:
- OWASP Top 10 compliance
- Authentication and authorization
- Input validation and sanitization
- Cryptography and data protection
- Error handling information leakage

Output Format:
- Vulnerability summary
- Findings with CVSS scores
- Attack scenarios
- Remediation steps
🧪 QA Engineer
Focus Areas:
- Test case design (unit/integration/e2e)
- Edge cases and boundary conditions
- Regression test needs
- Test coverage gaps
- Security testing requirements

Output Format:
- Test strategy overview
- Required test cases
- Edge cases identified
- Coverage recommendations
👁️ Code Reviewer
Focus Areas:
- Code readability and clarity
- Naming conventions
- Error handling patterns
- Code duplication
- Best practices adherence

Output Format:
- Review summary
- Issues by category
- Specific line-level feedback
- Improvement suggestions
⚡ Performance Engineer
Focus Areas:
- Time complexity analysis
- Memory usage patterns
- N+1 query problems
- Caching opportunities
- Resource management

Output Format:
- Performance assessment
- Bottleneck identification
- Optimization recommendations
- Benchmarking suggestions
🔍 Analyzer
Focus Areas:
- Bug patterns and anti-patterns
- Dead code detection
- Type safety issues
- Null/undefined handling
- Race conditions

Output Format:
- Static analysis results
- Bug risk assessment
- Code smell identification
- Refactoring suggestions
📝 Documentarian
Focus Areas:
- API documentation completeness
- Code comment quality
- README accuracy
- Type definitions
- Usage examples

Output Format:
- Documentation gaps
- Improvement areas
- Template suggestions
- Priority updates
🧠 Domain Expert
Focus Areas:
- Business logic correctness
- Requirements alignment
- Use case coverage
- Domain terminology
- Edge case handling

Output Format:
- Requirements fit analysis
- Business rule verification
- Missing functionality
- Domain-specific recommendations
CLI Command Reference
Codex CLI
```shell
# Basic invocation (latest default)
codex exec "prompt"

# With explicit model (only if you must pin it)
codex exec --config model='"<latest-codex-model>"' "prompt"

# Reading from a file (latest default)
codex exec "Review this code: $(cat src/file.ts)"
```
Gemini CLI
```shell
# Basic invocation (latest default)
gemini -p "prompt"

# With JSON output
gemini -p "prompt" --output-format json

# Non-interactive mode (required for scripting)
gemini -p "prompt"
```
Claude Code (Sub-agent)
For Claude Code, use the Task tool with subagent_type: general-purpose (default/latest model):

```
Task tool parameters:
  subagent_type: general-purpose
  prompt: "[Persona prompt with task]"
  model: omit to use the latest default, or specify only if you must pin it
```
For other AI agent CLIs invoking Claude:
```shell
# Non-interactive mode
claude -p "prompt" --output-format json

# With tool restrictions
claude -p "prompt" --allowedTools Read,Grep,Glob

# With turn limit
claude -p "prompt" --max-turns 5
```
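A sketch of calling Claude non-interactively from an orchestrator script; the `result` field in the parsed JSON is an assumption about the output shape, so verify it against your installed version, and the `runner` parameter is an addition here so the subprocess call can be stubbed out:

```python
import json
import subprocess

def ask_claude(prompt, runner=None):
    """Invoke `claude -p` non-interactively and parse its JSON output."""
    argv = ["claude", "-p", prompt, "--output-format", "json", "--max-turns", "5"]
    if runner is None:
        runner = lambda a: subprocess.run(a, capture_output=True, text=True, check=True).stdout
    reply = json.loads(runner(argv))
    return reply.get("result")   # "result" field: assumed output shape, verify locally
```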
Output Template
🎭 Multi-AI Collaboration Report
Executive Summary
[1-2 sentence summary of findings]
Team Configuration
| Persona | AI Agent | Model | Focus |
|---|---|---|---|
| 🏗️ Architect | Codex CLI | latest default | Design & Structure |
| 🔒 Security | Gemini CLI | latest default | Vulnerabilities |
| 🧪 QA | Claude (sub) | latest default | Test Design |
Workflow: Parallel (Cross-Review)
Target: [files/directories]
Agent Results
🏗️ Architect (Codex CLI)
Assessment: [Overall status]
Findings:
- [Finding] - Severity: [Level]
- [Finding] - Severity: [Level]
Recommendations:
- [Recommendation]
🔒 Security Researcher (Gemini CLI)
Assessment: [Overall status]
Vulnerabilities:
| ID | Type | Severity | Location |
|---|---|---|---|
| SEC-001 | [Type] | [Severity] | [Location] |
Remediation:
- [Steps]
🧪 QA Engineer (Claude)
Test Strategy:
- [Strategy overview]
Required Tests:
- [Test case]
Edge Cases:
- [Edge case]
Synthesis
✅ Consensus
- [Points all agents agree on]
⚠️ Divergence
| Topic | Architect | Security | QA | Resolution |
|---|---|---|---|---|
| [Topic] | [View] | [View] | [View] | [Status] |
❓ User Decisions Required
- [Decision item]
- Agent A recommends: [X]
- Agent B recommends: [Y]
Priority Actions
🔴 Critical (P0)
- [Action]
🟠 High (P1)
- [Action]
🟡 Medium (P2)
- [Action]
🟢 Low (P3)
- [Action]
Next Steps
- [Step]
- [Step]
Usage Examples
Example 1: Cross-Review a Pull Request
User: Review the authentication module changes in this PR
Orchestrator:
- Identifies target files (src/auth/*)
- Detects available agents (codex, gemini, claude)
- Recommends personas: Architect, Security, Code Reviewer
User Input Tool: "Select personas for this review"
User: 1, 2, 4 (Architect, Security, Code Reviewer)

User Input Tool: "Assign agents to personas"
User: Codex→Architect, Gemini→Security, Claude→Reviewer

User Input Tool: "Select workflow mode"
User: 1 (Parallel)
Execution:
- Codex analyzes architecture
- Gemini checks security
- Claude reviews code quality
- Orchestrator synthesizes results
Example 2: Implementation with QA Split
User: Implement user profile feature with tests
Orchestrator:
- Analyzes requirements
- Recommends Pipeline mode: Implementer → QA → Reviewer
Execution:
- Orchestrator (Claude) implements feature
- Codex creates comprehensive tests
- Gemini reviews implementation and tests
Example 3: Security Audit
User: Perform security audit on payment module
Orchestrator:
- Identifies payment-related files
- Recommends personas: Security, Analyzer, Performance
Execution:
- Gemini (Security): OWASP analysis, CVE search
- Codex (Analyzer): Static analysis, bug patterns
- Claude (Performance): DoS vulnerability, resource limits
Best Practices
- Start with Parallel mode for unbiased cross-review
- Use Codex for deep reasoning tasks (architecture, complex bugs)
- Use Gemini for research tasks (latest vulnerabilities, best practices)
- Use Claude sub-agents for speed (quick iterations, implementation)
- Always synthesize divergent opinions - don't just merge results
- Escalate to the user when agents fundamentally disagree
- Limit personas to 3-4 per task to avoid information overload
Troubleshooting
Agent CLI not found

```shell
# Check installation
which codex gemini claude
```

Install missing CLIs:
- Codex: follow OpenAI instructions
- Gemini: npm install -g @google/gemini-cli
- Claude: download from anthropic.com
Agent timeout
- Reduce the scope of analysis
- Split into smaller tasks
- Use simpler prompts
Conflicting results
- Use Adversarial mode for deeper analysis
- Escalate to the user via the environment-appropriate user input tool
- Document the disagreement in the report
Environment-Specific Notes
Codex CLI Environment
- Use request_user_input for persona selection and workflow mode
- Use spawn_agent for sub-agents (latest default model)
- Use exec_command to invoke external CLIs (gemini, claude)
Claude Code Environment
- Use AskUserTool for user interactions
- Use the Task tool with subagent_type: general-purpose for Claude sub-agents (latest default model)
- Use the Bash tool to invoke external CLIs (codex, gemini)
Gemini CLI Environment
- Use numbered prompt options for user selection (no tool calls)
- Use gemini -p directly for execution (latest default model)
- For sub-agents, invoke other CLIs directly (codex/claude) with latest defaults
Other AI Agent Environments
- Use the platform's equivalents of the user input, subagent, and shell execution tools
- Default to each CLI's latest model unless explicitly pinned