multi-model-research

Orchestrate multiple frontier LLMs (Claude, GPT-5.1, Gemini 3.0 Pro, Perplexity Sonar, Grok 4.1) for comprehensive research, using the LLM Council pattern with peer review and chairman synthesis.

Safety Notice

This listing is imported from the skills.sh public index metadata. Review the upstream SKILL.md and repository scripts before running.

Install skill "multi-model-research" with this command: npx skills add krishagel/geoffrey/krishagel-geoffrey-multi-model-research

Multi-Model Research Agent

Implements Karpathy's LLM Council pattern for superior research through parallel queries, peer review, and chairman synthesis.

Architecture

Geoffrey/Claude (Native Council Member):

  • Routes simple vs complex queries
  • Calls external API orchestrator (research.py)
  • Provides my own research response
  • Conducts peer review phase
  • Requests GPT-5.1 synthesis (chairman)
  • Saves final report to Obsidian

Python External API Orchestrator:

  • Fetches responses from GPT-5.1, Gemini 3.0 Pro, Perplexity Sonar, Grok 4.1
  • Returns JSON with all external responses
  • I (Claude) handle all orchestration and synthesis
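The parallel fan-out the script performs can be sketched roughly as follows. This is a minimal sketch with stubbed model calls standing in for the real network requests; the function names and JSON shape are illustrative assumptions, not the actual research.py API (which uses httpx.AsyncClient against each provider's endpoint):

```python
import asyncio
import json

# Hypothetical stand-in for a provider call; the real script would issue
# an httpx.AsyncClient request to the vendor's API here.
async def query_model(name: str, query: str) -> dict:
    await asyncio.sleep(0)  # placeholder for the network round-trip
    return {"model": name, "response": f"[{name} answer to: {query}]"}

async def gather_responses(query: str, models: list[str]) -> str:
    # Fan out to all external models in parallel, then bundle the
    # results as a single JSON payload for the caller to read.
    results = await asyncio.gather(*(query_model(m, query) for m in models))
    return json.dumps({"query": query, "responses": results})

payload = asyncio.run(
    gather_responses("What is RAG?", ["gpt", "gemini", "perplexity", "grok"])
)
```

The key design point is that the script only fetches and bundles; ranking and synthesis stay with the council orchestrator.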

When to Use This Skill

Use multi-model research when:

  • Complex analysis needed - Multiple perspectives valuable
  • Factual verification critical - Cross-model validation
  • Comprehensive coverage required - No single model sufficient
  • Current information essential - Perplexity provides web grounding
  • Contested topics - Benefit from diverse model perspectives

Simple vs Council Mode

Simple Mode (Perplexity only):

  • Factual lookups
  • Current events
  • Quick research with citations
  • Completes in <15 seconds

Council Mode (Full council):

  • Comparative analysis
  • Deep research
  • Multiple perspectives needed
  • Strategic questions
  • Completes in <90 seconds
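A routing heuristic along these lines could back the simple-vs-council decision. The trigger words below are assumptions chosen for illustration, not the actual routing rules in config.yaml:

```python
# Hypothetical trigger terms suggesting a comparative or strategic query.
COUNCIL_TRIGGERS = ("compare", "versus", " vs ", "trade-off",
                    "strategy", "analyze", "evaluate")

def route(query: str) -> str:
    """Return "council" for comparative/strategic queries, else "simple"."""
    q = query.lower()
    return "council" if any(t in q for t in COUNCIL_TRIGGERS) else "simple"
```

In practice the routing decision is made by the native council member, so a keyword list like this would only be a fallback.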

Workflow

Simple Query

User: "What are the latest developments in quantum computing?"
     ↓
I decide: Simple query (factual, current)
     ↓
I call: uv run scripts/research.py --query "..." --models perplexity
     ↓
I read: JSON response from Perplexity
     ↓
I format: Markdown report with citations
     ↓
I save: To Obsidian Geoffrey/Research folder
     ↓
I return: Summary to user with Obsidian link

Council Query

User: "Compare the AI strategies of OpenAI, Anthropic, and Google"
     ↓
I decide: Council query (comparative, complex)
     ↓
I call: uv run scripts/research.py --query "..." --models gpt,gemini,perplexity,grok
     ↓
I read: JSON with all external responses
     ↓
I provide: My own (Claude) research response
     ↓
I conduct: Peer review (each model ranks others)
     ↓
I request: GPT-5.1 chairman synthesis
     ↓
I format: Comprehensive markdown report
     ↓
I save: To Obsidian Geoffrey/Research folder
     ↓
I return: Summary with Obsidian link
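The peer-review step in the workflow above might assemble its prompt roughly like this. This is a sketch: the anonymized-ranking format is an assumption about how prompts/peer_review.md is structured, not its actual contents:

```python
def build_peer_review_prompt(query: str, responses: dict) -> tuple:
    # Label responses A, B, C, ... so each reviewer ranks peers without
    # knowing which model wrote what (assumed, per the Council pattern).
    labeled = [(f"Response {chr(65 + i)}", model, text)
               for i, (model, text) in enumerate(sorted(responses.items()))]
    body = "\n\n".join(f"## {label}\n{text}" for label, _, text in labeled)
    prompt = (f"Research question: {query}\n\n"
              "Rank the anonymized responses below by accuracy, depth, "
              "and insight.\n\n" + body)
    # Keep a key so rankings can be mapped back to real model names.
    key = {label: model for label, model, _ in labeled}
    return prompt, key
```

Each council member would receive this prompt, and the chairman synthesis would then see both the rankings and the de-anonymization key.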

Output Format

All research reports saved to Obsidian include:

  • Executive Summary (2-3 paragraphs)
  • Key Findings (organized by theme, inline citations)
  • Confidence Assessment (what's certain vs debated)
  • References Section (all sources with URLs and dates)

Citations use numeric format: [1], [2], etc.
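Numbering citations in first-seen order can be done with a small helper. This is a sketch that assumes sources arrive as dicts with a "url" key, which is a hypothetical shape, not the script's documented output:

```python
def number_citations(sources: list) -> tuple:
    # Assign [1], [2], ... to each distinct URL in first-seen order.
    refs = {}
    for s in sources:
        refs.setdefault(s["url"], len(refs) + 1)
    # Render the References section lines in numeric order.
    references = "\n".join(f"[{n}] {url}" for url, n in refs.items())
    return refs, references
```

The returned mapping can then be used to rewrite inline markers so the same source always resolves to the same number across the report.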

Technical Details

Python Script:

cd skills/multi-model-research
uv run scripts/research.py --query "Your question" --models perplexity --output /tmp/responses.json

Config:

  • config.yaml - Model settings, routing rules
  • prompts/system_prompts.yaml - Per-model system prompts
  • prompts/peer_review.md - Peer review template
  • prompts/chairman_synthesis.md - GPT-5.1 synthesis template

Dependencies:

  • httpx (async HTTP client)
  • pyyaml (config parsing)
  • python-dotenv (env vars)
  • python-frontmatter (Obsidian frontmatter)

API Keys Required:

  • OPENAI_API_KEY (GPT-5.1)
  • GEMINI_API_KEY (Gemini 3.0 Pro)
  • PERPLEXITY_API_KEY (Sonar Pro)
  • XAI_API_KEY (Grok 4.1)

All keys are configured in the ~/.env file.
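A quick startup check for those four keys might look like this. It is a sketch: the real skill loads ~/.env via python-dotenv, while this helper only inspects whichever mapping it is given (os.environ by default):

```python
import os

REQUIRED_KEYS = ("OPENAI_API_KEY", "GEMINI_API_KEY",
                 "PERPLEXITY_API_KEY", "XAI_API_KEY")

def missing_keys(env=None):
    # Report any required key that is absent or empty, so the skill can
    # fail fast before issuing parallel API calls.
    env = os.environ if env is None else env
    return [k for k in REQUIRED_KEYS if not env.get(k)]
```

Running this before council mode avoids burning a partial round of API calls when one provider's key is missing.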

Examples

Simple Research:

User: "What is RAG in AI?"

I route to: Simple mode (Perplexity)
Output: Concise explanation with current examples and citations
Time: ~10 seconds

Council Research:

User: "Compare serverless vs containers for production ML workloads"

I route to: Council mode (all 4 external + me)
Process:
  1. GPT-5.1: Provides comprehensive technical comparison
  2. Gemini 3.0: Analyzes cost and performance trade-offs
  3. Perplexity: Current industry trends and case studies
  4. Grok 4.1: Developer sentiment from X/Twitter
  5. Claude (me): Synthesize with nuanced analysis
  6. Peer review: Each model ranks others
  7. GPT-5.1 (chairman): Final synthesis

Output: Multi-perspective analysis with citations
Time: ~60 seconds

Limitations

  • Cost: Council mode uses 4-5 API calls per query
  • Latency: Council mode takes 60-90 seconds
  • API Limits: Rate limits may throttle parallel requests
  • Citation Quality: Non-Perplexity models require URL extraction
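The URL-extraction step for non-Perplexity responses could be approximated with a regex pass. This is a rough sketch; production citation handling would likely need more care with markdown links and edge-case punctuation:

```python
import re

# Match http(s) URLs, stopping at whitespace and common closing delimiters.
URL_RE = re.compile(r"https?://[^\s)\]>\"']+")

def extract_urls(text: str) -> list:
    # Strip common trailing punctuation, then de-duplicate while
    # preserving first-seen order.
    urls = [u.rstrip(".,;") for u in URL_RE.findall(text)]
    return list(dict.fromkeys(urls))
```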

Future Enhancements

  • Streaming responses during deliberation
  • Cost tracking and budget limits
  • Query history and versioning
  • Custom model weights based on topic
  • Integration with Geoffrey's knowledge base

This skill implements Karpathy's LLM Council pattern released November 22, 2025.

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

  • knowledge-manager (Research)
  • research (Research)
  • morning-briefing (General)

No summaries are provided by the upstream source for these skills.