autoresearchclaw-autonomous-research

Fully autonomous research pipeline that turns a topic idea into a complete academic paper with real citations, experiments, and conference-ready LaTeX.

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

To install via an AI assistant, copy the following and send it as a prompt:

Install skill "autoresearchclaw-autonomous-research" with this command: npx skills add aradotso/trending-skills/aradotso-trending-skills-autoresearchclaw-autonomous-research

AutoResearchClaw — Autonomous Research Pipeline

Skill by ara.so — Daily 2026 Skills collection.

AutoResearchClaw is a fully autonomous 23-stage research pipeline that takes a natural language topic and produces a complete academic paper: real arXiv/Semantic Scholar citations, sandboxed experiments, statistical analysis, multi-agent peer review, and conference-ready LaTeX (NeurIPS/ICML/ICLR). No hallucinated references. No human babysitting.


Installation

# Clone and install
git clone https://github.com/aiming-lab/AutoResearchClaw.git
cd AutoResearchClaw
python3 -m venv .venv && source .venv/bin/activate
pip install -e .

# Verify CLI is available
researchclaw --help

Requirements: Python 3.11+
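To confirm your interpreter meets the requirement before installing, a quick check (just a convenience snippet, not part of the project) is:

```python
import sys

# The project requires Python 3.11+; warn early instead of failing mid-install.
meets = sys.version_info >= (3, 11)
print(f"Python {sys.version.split()[0]}: {'OK' if meets else 'needs 3.11+'}")
```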


Configuration

cp config.researchclaw.example.yaml config.arc.yaml

Minimum config (config.arc.yaml)

project:
  name: "my-research"

research:
  topic: "Your research topic here"

llm:
  provider: "openai"
  base_url: "https://api.openai.com/v1"
  api_key_env: "OPENAI_API_KEY"
  primary_model: "gpt-4o"
  fallback_models: ["gpt-4o-mini"]

experiment:
  mode: "sandbox"
  sandbox:
    python_path: ".venv/bin/python"

Then export the key referenced by api_key_env:

export OPENAI_API_KEY="$YOUR_OPENAI_KEY"

OpenRouter config (200+ models)

llm:
  provider: "openrouter"
  api_key_env: "OPENROUTER_API_KEY"
  primary_model: "anthropic/claude-3.5-sonnet"
  fallback_models:
    - "google/gemini-pro-1.5"
    - "meta-llama/llama-3.1-70b-instruct"

export OPENROUTER_API_KEY="$YOUR_OPENROUTER_KEY"

ACP (Agent Client Protocol) — no API key needed

llm:
  provider: "acp"
  acp:
    agent: "claude"   # or: codex, gemini, opencode, kimi
    cwd: "."

The agent CLI (e.g. claude) handles its own authentication.

OpenClaw bridge (optional advanced capabilities)

openclaw_bridge:
  use_cron: true              # Scheduled research runs
  use_message: true           # Progress notifications
  use_memory: true            # Cross-session knowledge persistence
  use_sessions_spawn: true    # Parallel sub-sessions
  use_web_fetch: true         # Live web search in literature review
  use_browser: false          # Browser-based paper collection

Key CLI Commands

# Basic run — fully autonomous, no prompts
researchclaw run --topic "Your research idea" --auto-approve

# Run with explicit config file
researchclaw run --config config.arc.yaml --topic "Mixture-of-experts routing efficiency" --auto-approve

# Run with topic defined in config (omit --topic flag)
researchclaw run --config config.arc.yaml --auto-approve

# Interactive mode — pauses at gate stages for approval
researchclaw run --config config.arc.yaml --topic "Your topic"

# Check pipeline status / resume a run
researchclaw status --run-id rc-20260315-120000-abc123

# List past runs
researchclaw list

Gate stages (5, 9, 20) pause for human approval in interactive mode. Pass --auto-approve to skip all gates.


Python API

from researchclaw.pipeline import Runner
from researchclaw.config import load_config

# Load config and run
config = load_config("config.arc.yaml")
config.research.topic = "Efficient attention mechanisms for long-context LLMs"
config.auto_approve = True

runner = Runner(config)
result = runner.run()

# Access outputs
print(result.artifact_dir)          # artifacts/rc-YYYYMMDD-HHMMSS-<hash>/
print(result.deliverables_dir)      # .../deliverables/
print(result.paper_draft_path)      # .../deliverables/paper_draft.md
print(result.latex_path)            # .../deliverables/paper.tex
print(result.bibtex_path)           # .../deliverables/references.bib
print(result.verification_report)   # .../deliverables/verification_report.json

# Run specific stages only
from researchclaw.pipeline import Runner, StageRange

runner = Runner(config)
result = runner.run(stages=StageRange(start="LITERATURE_COLLECT", end="KNOWLEDGE_EXTRACT"))

# Access knowledge base after a run
from researchclaw.knowledge import KnowledgeBase

kb = KnowledgeBase.load(result.artifact_dir)
findings = kb.get("findings")
literature = kb.get("literature")
decisions = kb.get("decisions")

Output Structure

After a run, all outputs land in artifacts/rc-YYYYMMDD-HHMMSS-<hash>/:

artifacts/rc-20260315-120000-abc123/
├── deliverables/
│   ├── paper_draft.md          # Full academic paper (Markdown)
│   ├── paper.tex               # Conference-ready LaTeX
│   ├── references.bib          # Real BibTeX — auto-pruned to inline citations
│   ├── verification_report.json # 4-layer citation integrity report
│   └── reviews.md              # Multi-agent peer review
├── experiment_runs/
│   ├── run_001/
│   │   ├── code/               # Generated experiment code
│   │   ├── results.json        # Structured metrics
│   │   └── sandbox_output.txt  # Execution logs
├── charts/
│   └── *.png                   # Auto-generated comparison charts
├── evolution/
│   └── lessons.json            # Self-learning lessons for future runs
└── knowledge_base/
    ├── decisions.json
    ├── experiments.json
    ├── findings.json
    ├── literature.json
    ├── questions.json
    └── reviews.json
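Because run IDs embed a sortable timestamp (rc-YYYYMMDD-HHMMSS-&lt;hash&gt;), finding the most recent run directory is a one-liner. A small helper along these lines (hypothetical, not part of the researchclaw API) might look like:

```python
from pathlib import Path

def latest_run_dir(root="artifacts"):
    """Return the most recent rc-* run directory under `root`, or None.

    The timestamp in the run ID makes lexicographic order equal to
    chronological order, so max/sorted picks the newest run.
    """
    runs = sorted(Path(root).glob("rc-*"))
    return runs[-1] if runs else None
```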

Pipeline Stages Reference

Phase  Stage  Name                Notes
A      1      TOPIC_INIT          Parse and scope research topic
A      2      PROBLEM_DECOMPOSE   Break into sub-problems
B      3      SEARCH_STRATEGY     Build search queries
B      4      LITERATURE_COLLECT  Real API calls to arXiv + Semantic Scholar
B      5      LITERATURE_SCREEN   Gate: approve/reject literature
B      6      KNOWLEDGE_EXTRACT   Extract structured knowledge
C      7      SYNTHESIS           Synthesize findings
C      8      HYPOTHESIS_GEN      Multi-agent debate to form hypotheses
D      9      EXPERIMENT_DESIGN   Gate: approve/reject design
D      10     CODE_GENERATION     Generate experiment code
D      11     RESOURCE_PLANNING   GPU/MPS/CPU auto-detection
E      12     EXPERIMENT_RUN      Sandboxed execution
E      13     ITERATIVE_REFINE    Self-healing on failure
F      14     RESULT_ANALYSIS     Multi-agent analysis
F      15     RESEARCH_DECISION   PROCEED / REFINE / PIVOT
G      16     PAPER_OUTLINE       Structure paper
G      17     PAPER_DRAFT         Write full paper
G      18     PEER_REVIEW         Evidence-consistency check
G      19     PAPER_REVISION      Incorporate review feedback
H      20     QUALITY_GATE        Gate: final approval
H      21     KNOWLEDGE_ARCHIVE   Save lessons to KB
H      22     EXPORT_PUBLISH      Emit LaTeX + BibTeX
H      23     CITATION_VERIFY     4-layer anti-hallucination check

Common Patterns

Pattern: Quick paper on a topic

export OPENAI_API_KEY="$YOUR_OPENAI_KEY"
researchclaw run \
  --topic "Self-supervised learning for protein structure prediction" \
  --auto-approve

Pattern: Reproducible run with full config

# config.arc.yaml
project:
  name: "protein-ssl-research"

research:
  topic: "Self-supervised learning for protein structure prediction"

llm:
  provider: "openai"
  api_key_env: "OPENAI_API_KEY"
  primary_model: "gpt-4o"
  fallback_models: ["gpt-4o-mini"]

experiment:
  mode: "sandbox"
  sandbox:
    python_path: ".venv/bin/python"
  max_iterations: 3
  timeout_seconds: 300

# Run with the full config
researchclaw run --config config.arc.yaml --auto-approve

Pattern: Use Claude via OpenRouter for best reasoning

export OPENROUTER_API_KEY="$YOUR_OPENROUTER_KEY"

cat > config.arc.yaml << 'EOF'
project:
  name: "my-research"
llm:
  provider: "openrouter"
  api_key_env: "OPENROUTER_API_KEY"
  primary_model: "anthropic/claude-3.5-sonnet"
  fallback_models: ["google/gemini-pro-1.5"]
experiment:
  mode: "sandbox"
  sandbox:
    python_path: ".venv/bin/python"
EOF

researchclaw run --config config.arc.yaml \
  --topic "Efficient KV cache compression for transformer inference" \
  --auto-approve

Pattern: Resume after a failed run

# List runs to find the run ID
researchclaw list

# Resume from last completed stage
researchclaw run --resume rc-20260315-120000-abc123

Pattern: Programmatic batch research

import asyncio
from researchclaw.pipeline import Runner
from researchclaw.config import load_config

topics = [
    "LoRA fine-tuning on limited hardware",
    "Speculative decoding for LLM inference",
    "Flash attention variants comparison",
]

config = load_config("config.arc.yaml")
config.auto_approve = True

for topic in topics:
    config.research.topic = topic
    runner = Runner(config)
    result = runner.run()
    print(f"[{topic}] → {result.deliverables_dir}")

Pattern: OpenClaw one-liner (if using OpenClaw agent)

Share the repo URL with OpenClaw, then say:
"Research mixture-of-experts routing efficiency"

OpenClaw auto-reads RESEARCHCLAW_AGENTS.md, clones, installs, configures, and runs the full pipeline.


Compile the LaTeX Output

# Navigate to deliverables
cd artifacts/rc-*/deliverables/

# Compile (requires a LaTeX distribution)
pdflatex paper.tex
bibtex paper
pdflatex paper.tex
pdflatex paper.tex

# Or upload paper.tex + references.bib directly to Overleaf

Troubleshooting

researchclaw: command not found

# Make sure the venv is active and package is installed
source .venv/bin/activate
pip install -e .
which researchclaw

API key errors

# Verify env var is set
echo $OPENAI_API_KEY
# Should print your key (not empty)

# Set it explicitly for the session
export OPENAI_API_KEY="sk-..."

Experiment sandbox failures

The pipeline self-heals at Stage 13 (ITERATIVE_REFINE). If it keeps failing:

# Increase timeout and iterations in config
experiment:
  max_iterations: 5
  timeout_seconds: 600
  sandbox:
    python_path: ".venv/bin/python"
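Conceptually, the self-healing behavior is a bounded retry loop over the sandboxed command. A minimal sketch of that idea (illustrative only; the pipeline's actual implementation and error feedback differ) could be:

```python
import subprocess
import sys

def run_with_refine(cmd, max_iterations=3, timeout_seconds=300):
    """Re-run a sandboxed command up to max_iterations times,
    mirroring how a refine loop might treat failures and timeouts."""
    for attempt in range(1, max_iterations + 1):
        try:
            proc = subprocess.run(cmd, capture_output=True, text=True,
                                  timeout=timeout_seconds)
        except subprocess.TimeoutExpired:
            continue  # a timeout counts as a failed attempt
        if proc.returncode == 0:
            return attempt, proc.stdout
        # a real pipeline would feed proc.stderr back to the LLM here
    return None, ""

attempt, out = run_with_refine([sys.executable, "-c", "print('ok')"],
                               max_iterations=2, timeout_seconds=10)
```

Raising max_iterations in the config simply widens this loop; raising timeout_seconds gives each attempt more time before it counts as failed.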

Citation hallucination warnings

Stage 23 (CITATION_VERIFY) runs a 4-layer check. If references are pruned:

  • This is expected behaviour — fake citations are removed automatically
  • Check verification_report.json for details on which citations were rejected and why
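To inspect the report programmatically, a minimal sketch assuming a hypothetical schema (the actual field names in verification_report.json may differ) could be:

```python
import json

# Hypothetical report shape; check a real verification_report.json
# from Stage 23 for the actual schema.
report = json.loads("""
{
  "citations": [
    {"key": "smith2024", "status": "verified", "layers_passed": 4},
    {"key": "ghost2025", "status": "rejected",
     "reason": "no arXiv/CrossRef/DataCite match"}
  ]
}
""")

# Collect the citations that were pruned, with the rejection reason.
rejected = [c for c in report["citations"] if c["status"] == "rejected"]
for c in rejected:
    print(f'{c["key"]}: {c.get("reason", "unknown")}')
```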

PIVOT loop running indefinitely

Stage 15 (RESEARCH_DECISION) may pivot multiple times. To cap iterations:

research:
  max_pivots: 2
  max_refines: 3

LaTeX compilation errors

# Check for missing packages
pdflatex paper.tex 2>&1 | grep "File.*not found"

# Install missing packages (TeX Live)
tlmgr install <package-name>

Out of memory during experiments

# Force CPU mode in config
experiment:
  sandbox:
    device: "cpu"
    max_memory_gb: 4

Key Concepts

  • PIVOT/REFINE Loop: Stage 15 autonomously decides PROCEED, REFINE (tweak params), or PIVOT (new hypothesis direction). All artifacts are versioned.
  • Multi-Agent Debate: Stages 8, 14, 18 use structured multi-perspective debate — not a single LLM pass.
  • Self-Learning: Each run extracts lessons with 30-day time decay. Future runs on similar topics benefit from past mistakes.
  • Sentinel Watchdog: Background monitor detects NaN/Inf in results, checks paper-evidence consistency, scores citation relevance, and guards against fabrication throughout the run.
  • 4-Layer Citation Verification: arXiv lookup → CrossRef lookup → DataCite lookup → LLM relevance scoring. A citation must pass all layers to survive.
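The all-layers-must-pass rule can be sketched as a short-circuit AND over the verification layers. The layer functions below are stand-ins for illustration, not the pipeline's real checkers:

```python
def verify_citation(citation, layers):
    """A citation survives only if every layer accepts it."""
    return all(layer(citation) for layer in layers)

# Stub layers standing in for arXiv, CrossRef, DataCite, and LLM scoring.
layers = [
    lambda c: c.get("arxiv_id") is not None,   # arXiv lookup (stub)
    lambda c: c.get("doi") is not None,        # CrossRef lookup (stub)
    lambda c: c.get("doi") is not None,        # DataCite lookup (stub)
    lambda c: c.get("relevance", 0) >= 0.5,    # LLM relevance score (stub)
]

good = {"arxiv_id": "2403.00001", "doi": "10.1234/x", "relevance": 0.9}
fake = {"arxiv_id": None, "doi": None, "relevance": 0.2}
```

A single failing layer is enough to prune the citation, which is why hallucinated references cannot survive to the final references.bib.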

