Research Ideation

Generate structured research questions, testable hypotheses, and empirical strategies from a topic, phenomenon, or dataset.

Input: $ARGUMENTS — a topic (e.g., "minimum wage effects on employment"), a phenomenon (e.g., "why do firms cluster geographically?"), or a dataset description (e.g., "panel of US counties with pollution and health outcomes, 2000-2020").

Steps

Understand the input. Read $ARGUMENTS and any referenced files. Check master_supporting_docs/ for related papers. Check .claude/rules/ for domain conventions.

Generate 3-5 research questions ordered from descriptive to causal:

Descriptive: What are the patterns? (e.g., "How has X evolved over time?")
Correlational: What factors are associated? (e.g., "Is X correlated with Y after controlling for Z?")
Causal: What is the effect? (e.g., "What is the causal effect of X on Y?")
Mechanism: Why does the effect exist? (e.g., "Through what channel does X affect Y?")
Policy: What are the implications? (e.g., "Would policy X improve outcome Y?")

For each research question, develop:

Hypothesis: A testable prediction with expected sign/magnitude
Identification strategy: How to establish causality (DiD, IV, RDD, synthetic control, etc.)
Data requirements: What data would be needed? Is it available?
Key assumptions: What must hold for the strategy to be valid?
Potential pitfalls: Common threats to identification
Related literature: 2-3 papers using similar approaches

Rank the questions by feasibility and contribution.

Save the output to quality_reports/research_ideation_[sanitized_topic].md

Output Format

Research Ideation: [Topic]

Date: [YYYY-MM-DD] Input: [Original input]

Overview

[1-2 paragraphs situating the topic and why it matters]

Research Questions

RQ1: [Question] (Feasibility: High/Medium/Low)

Type: Descriptive / Correlational / Causal / Mechanism / Policy

Hypothesis: [Testable prediction]

Identification Strategy:

Method: [e.g., Difference-in-Differences]
Treatment: [What varies and when]
Control group: [Comparison units]
Key assumption: [e.g., Parallel trends]

Data Requirements:

[Dataset 1 — what it provides]
[Dataset 2 — what it provides]

Potential Pitfalls:

[Threat 1 and possible mitigation]
[Threat 2 and possible mitigation]

Related Work: [Author (Year)], [Author (Year)]

[Repeat for RQ2-RQ5]

Ranking

RQ	Feasibility	Contribution	Priority
1	High	Medium	...
2	Medium	High	...

Suggested Next Steps

[Most promising direction and immediate action]
[Data to obtain]
[Literature to review deeper]

Principles

Be creative but grounded. Push beyond obvious questions, but every suggestion must be empirically feasible.
Think like a referee. For each causal question, immediately identify the identification challenge.
Consider data availability. A brilliant question with no available data is not actionable.
Suggest specific datasets where possible (FRED, Census, PSID, administrative data, etc.).