# Deep Researcher v2.0
Comprehensive research methodology with file-based tracking, parallel execution, and context management for investigations requiring 5+ sources.
**CRITICAL:** All medical evidence and citations must come from PubMed MCP. No exceptions.
## Research Modes

- **Quick Research (1-4 sources):** Work in-context; no file structure needed.
- **Deep Research (5+ sources):** Use the file-based tracking below.
## Research Sources (STRICT POLICY)
### ALLOWED for Medical Citations

| Source | Tool | Use Case |
|---|---|---|
| PubMed MCP | `pubmed_search_articles`, `pubmed_fetch_contents`, `pubmed_article_connections` | ALL medical evidence, trials, mechanisms |
| Official Guidelines | `web_fetch` to ACC/ESC/ADA/AHA URLs only | Guideline recommendations |
| AstraDB RAG | Knowledge pipeline | Textbook references, pre-loaded guidelines |
### NOT ALLOWED for Medical Citations

| Source | Why Excluded | Allowed Use |
|---|---|---|
| OpenAlex | Variable quality | REMOVED |
| Perplexity | Not peer-reviewed | Trend discovery only, NEVER cite |
| General web search | Unreliable | Topic discovery only, NEVER cite |
| News articles | Not primary evidence | Background context only |
### PubMed Quality Filters
**Prefer (Tier 1):**

- Randomized Controlled Trials (RCTs)
- Meta-analyses and systematic reviews
- Guidelines from ACC/ESC/ADA/AHA

**Accept (Tier 2):**

- Large observational studies from Q1 journals
- Cohort studies with >1000 patients
- Registry data from established registries

**Use Cautiously (Tier 3):**

- Case series (only if no better evidence)
- Expert consensus statements
- Narrative reviews (as background, not primary evidence)

**Reject:**

- Case reports (except for rare conditions)
- Letters to the editor
- Preprints without peer review
- Animal studies (unless specifically about mechanisms)
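The tier rules above can be sketched as a small lookup. This is only an illustration of the policy, not part of the skill's tooling; the study-type labels (`"RCT"`, `"Preprint"`, etc.) are hypothetical tags, not actual PubMed metadata fields.

```python
# Minimal sketch of the evidence-tier policy above.
# Study-type labels are hypothetical tags, not PubMed fields.
TIERS = {
    1: {"RCT", "Meta-analysis", "Systematic Review", "ACC/ESC/ADA/AHA Guideline"},
    2: {"Large observational (Q1 journal)", "Cohort (>1000 patients)", "Registry"},
    3: {"Case series", "Expert consensus", "Narrative review"},
}
REJECT = {"Case report", "Letter", "Preprint", "Animal study"}

def evidence_tier(study_type: str) -> int | None:
    """Return 1-3 for usable evidence, None for rejected study types."""
    if study_type in REJECT:
        return None
    for tier, types in TIERS.items():
        if study_type in types:
            return tier
    return 3  # default: treat unknown types with caution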
## Deep Research Workflow
### Progress Tracking
Create this checklist and update after each step:
```
Deep Research Progress:
- [ ] Step 1: Initialize research project
- [ ] Step 2: Define scope and plan
- [ ] Step 3: Execute research threads (parallel when possible)
- [ ] Step 4: Validate and cross-reference
- [ ] Step 5: Synthesize from files
- [ ] Step 6: Generate final report
```
### Step 1: Initialize Research Project
For research requiring 5+ sources, create a project structure:
```bash
mkdir -p ~/research_{topic}/sources
mkdir -p ~/research_{topic}/threads
```
**Project Structure:**

```
~/research_{topic}/
├── plan.md          # Research questions, scope, thread assignments
├── progress.md      # Living checklist, updated throughout
├── sources/
│   └── pubmed.md    # PubMed search results and abstracts
├── threads/
│   ├── thread_1.md  # Independent research thread
│   ├── thread_2.md  # Another thread
│   └── ...
├── validation.md    # Cross-reference and credibility check
├── synthesis.md     # Cross-thread analysis
└── report.md        # Final deliverable
```
**Why file-based?** Context windows fill up. Writing findings to files lets you:

- Continue researching without context pressure
- Synthesize from persistent storage, not memory
- Produce larger, more comprehensive reports
- Resume if interrupted
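The Step 1 setup can equally be sketched in Python. A minimal illustration, assuming the project lives under the home directory as in the tree above; the function name `init_research_project` is hypothetical.

```python
from pathlib import Path

# Tracking files from the project tree above.
SKELETON = ["plan.md", "progress.md", "validation.md", "synthesis.md", "report.md"]

def init_research_project(topic: str, base: Path = Path.home()) -> Path:
    """Create the research directory skeleton and empty tracking files."""
    root = base / f"research_{topic}"
    (root / "sources").mkdir(parents=True, exist_ok=True)
    (root / "threads").mkdir(parents=True, exist_ok=True)
    for name in SKELETON:
        (root / name).touch(exist_ok=True)
    return root
```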
### Step 2: Define Scope and Research Plan
Write `plan.md` with:
```markdown
# Research Plan: {Topic}

## Primary Question
[The main thing we're trying to answer]

## Scope
- Include: [what's in scope]
- Exclude: [what's explicitly out]
- Depth: [overview | detailed | exhaustive]
- Deliverable: [report type and length]

## Research Threads

### Thread 1: {Subtopic A}
- Questions to answer: ...
- PubMed search strategy: [MeSH terms, filters]
- Expected study types: RCTs, meta-analyses, etc.
- Can run parallel? Yes/No

### Thread 2: {Subtopic B}
- Questions to answer: ...
- PubMed search strategy: ...
- Can run parallel? Yes/No

[Continue for 2-5 threads]

## Thread Dependencies
- Thread 3 depends on Thread 1 findings
- Threads 1, 2, 4 can run in parallel

## Synthesis Strategy
How will threads combine into the final answer?
```
**Planning Guidelines:**
| Research Type | Threads | Pattern |
|---|---|---|
| Simple fact-finding | 1-2 | Sequential |
| Drug comparison | 1 per drug (max 5) | Parallel |
| Complex investigation | 3-5 thematic | Mixed |
| Literature review | By time period or theme | Sequential |
### Step 3: Execute Research Threads
#### PubMed Search Strategy
For each thread, use structured PubMed queries:
```python
# Example search for SGLT2 CV outcomes
pubmed_search_articles(
    queryTerm="SGLT2 inhibitor cardiovascular outcomes randomized controlled trial",
    maxResults=20,
    sortBy="relevance"
)

# Then fetch full details for top results
pubmed_fetch_contents(pmids=["PMID1", "PMID2", ...])

# Find related articles for key papers
pubmed_article_connections(
    sourcePmid="key_paper_pmid",
    relationshipType="pubmed_similar_articles"
)
```
#### Parallel Execution Pattern
For independent threads, execute PubMed searches in parallel (multiple tool calls in one turn), then write each to its thread file.
**Example: Comparing SGLT2 Inhibitors**

```
Thread 1: Empagliflozin → pubmed_search "empagliflozin cardiovascular RCT" → threads/empagliflozin.md
Thread 2: Dapagliflozin → pubmed_search "dapagliflozin cardiovascular RCT" → threads/dapagliflozin.md
Thread 3: Canagliflozin → pubmed_search "canagliflozin cardiovascular RCT" → threads/canagliflozin.md
```
Execute all three searches, then write findings to respective files.
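The fan-out above can be sketched as code. In the actual skill, parallelism means issuing multiple tool calls in one turn; this sketch only models the shape of that pattern. `run_thread`, `run_parallel`, and `search_fn` (a stand-in for the PubMed MCP call) are all hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

def run_thread(name: str, query: str, search_fn, out_dir: Path) -> Path:
    """One research thread: search, then persist findings to its own file."""
    results = search_fn(query)  # search_fn stands in for the PubMed MCP call
    body = [f"# Thread: {name}", "", f"Query: {query}", ""]
    body += [f"- {r}" for r in results]
    out_file = out_dir / f"{name}.md"
    out_file.write_text("\n".join(body) + "\n")
    return out_file

def run_parallel(threads: dict[str, str], search_fn, out_dir: Path) -> list[Path]:
    """Execute independent threads concurrently, one output file per thread."""
    out_dir.mkdir(parents=True, exist_ok=True)
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(run_thread, name, query, search_fn, out_dir)
                   for name, query in threads.items()]
        return [f.result() for f in futures]
```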
#### Sequential Execution Pattern
For dependent threads, complete each fully before starting the next.
#### Thread File Format
Each `threads/thread_N.md` should contain:
```markdown
# Thread: {Subtopic}

## PubMed Searches Executed
- Query: [exact query] → [N results] → Top PMIDs: [list]
- Query: [exact query] → [N results] → Top PMIDs: [list]

## Key Findings

### Finding 1: [Title]
- PMID: [number]
- Citation: [Authors, Journal, Year]
- Study type: RCT / Meta-analysis / Cohort / etc.
- Population: [N patients, characteristics]
- Key result: [HR/OR with 95% CI, p-value]
- Quality: High / Medium / Low [+ brief justification]

### Finding 2: [Title]
- PMID: [number]
...

## Contradictions Found
- PMID X says [claim], PMID Y says [different claim]
- Potential explanation: [patient population, endpoints, timing, etc.]

## Gaps Identified
- No RCT data on [specific question]
- Limited evidence in [patient subgroup]

## Thread Summary
[2-3 sentence synthesis of this thread's findings with key PMIDs cited]
```
#### Context Offloading
After every 5-7 tool calls:

- Write current findings to the appropriate file
- Update progress.md with status
- Continue with fresh context

**Triggers for offload:**

- Context feeling "full" (responses slowing, losing track)
- Switching between threads
- Before any synthesis step
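The offload step can be sketched as a single append-and-log helper. Purely illustrative, assuming the project layout from Step 1; the function name `offload` and the progress-line format are assumptions.

```python
from datetime import date
from pathlib import Path

def offload(root: Path, thread: str, findings: str, status: str) -> None:
    """Append findings to the thread file and log a status line in progress.md."""
    thread_file = root / "threads" / f"{thread}.md"
    thread_file.parent.mkdir(parents=True, exist_ok=True)
    with thread_file.open("a") as f:
        f.write(findings.rstrip() + "\n")
    with (root / "progress.md").open("a") as f:
        f.write(f"- [{thread}] {status} ({date.today().isoformat()})\n")
```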
### Step 4: Validate and Cross-Reference
Read all thread files, then create `validation.md`:
```markdown
# Validation Report

## Facts Requiring Cross-Reference
| Claim | Thread Source | PMID | Verification Status | Confidence |
|---|---|---|---|---|
| SGLT2i reduces HF hospitalization | Thread 1 | 12345678 | Confirmed by PMIDs 23456789, 34567890 | High |
| Benefit extends to HFpEF | Thread 2 | 45678901 | Conflicting: PMID 56789012 shows null | Investigate |

## Contradictions Analysis

### Contradiction 1: [Description]
- Position A: PMID [X], [study name], found [result]
- Position B: PMID [Y], [study name], found [result]
- Resolution: [Population difference / endpoint difference / timing / unresolved]

## Source Quality Assessment
| PMID | Study | Type | N | Quality | Notes |
|---|---|---|---|---|---|
| 12345678 | EMPA-REG | RCT | 7,020 | High | Industry-funded but well-designed |
| 23456789 | Meta-analysis | MA | 45,000 | High | Published in Lancet |

## Validated Knowledge Base
[List of facts we're confident in, with PMIDs]
- SGLT2 inhibitors reduce CV death in T2DM with established CVD (PMID: 12345678, 23456789)
- Benefit on HF hospitalization is consistent across the class (PMID: 34567890, 45678901)
- ...
```
### Step 5: Synthesize from Files
**Critical:** Read from files, not memory.
```bash
# Read all thread files
cat ~/research_{topic}/threads/*.md

# Read validation
cat ~/research_{topic}/validation.md
```
Write `synthesis.md`:
```markdown
# Synthesis: {Topic}

## Cross-Thread Patterns
[What themes emerge across multiple threads?]

## Key Insights
- [Insight that required combining multiple threads]
- [Insight that wasn't obvious in any single thread]
- ...

## The Answer
[Direct response to the primary research question, with PMID citations]

## Evidence Strength Assessment
- Strong evidence (multiple RCTs): [claims]
- Moderate evidence (single RCT or consistent observational): [claims]
- Limited evidence (observational only): [claims]
- Expert opinion / guideline extrapolation: [claims]

## Remaining Gaps
[What we still don't know and would need to investigate further]
```
### Step 6: Generate Final Report
Write `report.md` using the synthesis:
```markdown
# {Title}

## Executive Summary
[3-5 sentences: question, key finding, main conclusion with strongest PMID]

## Research Question and Scope
[From plan.md]

## Methodology
- Database: PubMed via NCBI MCP
- Search date: [date]
- Total articles screened: [N]
- Articles included: [N]
- Study types: [breakdown]

## Findings

### {Theme 1}
[Narrative synthesis with inline PMID citations]

### {Theme 2}
...

## Analysis
[Patterns, implications, connections]

## Conclusions
- [Primary conclusion with evidence level]
- [Secondary conclusions]

## Clinical Implications
[If applicable: what this means for practice]

## Limitations
- [Search limitations]
- [Evidence gaps]
- [Potential biases]

## References
[Full reference list with PMIDs and DOIs]
- Author A, Author B, et al. Title. Journal. Year;Vol:Pages. PMID: XXXXXXXX. DOI: XX.XXXX/XXXXX
- ...
```
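The reference format above can be produced by a one-line formatter. A minimal sketch; the function name `format_reference` and its parameters are illustrative, not part of the skill.

```python
# Sketch of the reference format shown above; parameter names are illustrative.
def format_reference(authors: str, title: str, journal: str, year: int,
                     vol_pages: str, pmid: str, doi: str) -> str:
    """Format one entry as: Authors. Title. Journal. Year;Vol:Pages. PMID. DOI."""
    return (f"{authors}. {title}. {journal}. {year};{vol_pages}. "
            f"PMID: {pmid}. DOI: {doi}")
```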
## Parallel Research Patterns
### Pattern A: Drug/Entity Comparison

**Use when:** Comparing 2-5 similar entities (drugs, devices, techniques).

```
User: "Compare CV outcomes of GLP-1 agonists"
→ Thread per drug (semaglutide, tirzepatide, liraglutide)
→ All threads parallel (same PubMed structure)
→ Comparison matrix synthesis
```
### Pattern B: Pro/Con Analysis

**Use when:** The topic has debate or controversy.

```
User: "Analyze the evidence on aggressive LDL lowering"
→ Thread 1: Evidence FOR aggressive targets (PubMed: LDL <55 outcomes)
→ Thread 2: Evidence AGAINST/concerns (PubMed: LDL lowering adverse effects)
→ Thread 3: Current guidelines (fetch ACC/ESC guideline URLs)
→ Threads 1-2 parallel, Thread 3 after
```
### Pattern C: Evidence + Guidelines

**Use when:** You need both primary evidence and clinical guidance.

```
User: "What's the evidence on TAVR durability?"
→ Thread 1: Trial data (PubMed: TAVR long-term outcomes RCT)
→ Thread 2: Registry data (PubMed: TAVR registry durability)
→ Thread 3: Guidelines (fetch ACC/ESC valve guidelines)
→ All parallel
```
### Pattern D: Historical Evolution

**Use when:** Understanding how the evidence has evolved.

```
User: "How has heart failure treatment evolved?"
→ Thread 1: Pre-neurohormonal era (PubMed: heart failure treatment 1980-1990)
→ Thread 2: ACE/ARB/BB era (PubMed: heart failure ACE inhibitor landmark)
→ Thread 3: Modern era ARNI/SGLT2 (PubMed: heart failure SGLT2 ARNI)
→ Sequential (each builds context for the next)
```
## Quality Checkpoints
### After Step 2 (Planning)

- [ ] Research question is specific and answerable
- [ ] PubMed search strategies are defined for each thread
- [ ] Threads are independent where marked parallel
- [ ] Expected study types are specified
### After Step 3 (Execution)

- [ ] Each thread has 3+ credible PubMed sources
- [ ] Key claims have specific data (HR, CI, p-value)
- [ ] All citations have PMIDs
- [ ] Gaps and contradictions are documented
- [ ] Thread summaries are written
### After Step 4 (Validation)

- [ ] Key facts cross-referenced across threads
- [ ] Contradictions analyzed with potential explanations
- [ ] Source quality assessed for each major citation
- [ ] Validated knowledge base compiled
### After Step 5 (Synthesis)

- [ ] Cross-thread patterns identified
- [ ] Primary question directly answered
- [ ] Evidence strength honestly assessed
- [ ] Insights go beyond any single thread
### Before Delivery

- [ ] Report structure matches the user's requested format
- [ ] All claims have PMID citations
- [ ] Executive summary is truly executive (skimmable)
- [ ] Reference list is complete with DOIs
## Common Research Pitfalls
| Pitfall | Symptom | Fix |
|---|---|---|
| Context overflow | Losing track of earlier findings | Write to files every 5-7 tool calls |
| Confirmation bias | All sources agree suspiciously | Explicitly search for contradicting evidence |
| Recency bias | Only 2023-2024 sources | Include landmark trials regardless of date |
| Source homogeneity | All RCTs, no guidelines | Add a guideline thread for clinical context |
| Scope creep | Research expanding endlessly | Return to plan.md; enforce boundaries |
| Premature synthesis | Concluding before validation | Complete Step 4 before Step 5 |
| Memory-based synthesis | Citing from recall | Read files explicitly during Step 5 |
| Non-PubMed citations | Citing Perplexity/web | Delete and replace with a PubMed source |
## Example: Full Research Session
**User:** "Research the current evidence on colchicine for cardiovascular prevention"
**Step 1: Initialize**

```bash
mkdir -p ~/research_colchicine_cv/sources
mkdir -p ~/research_colchicine_cv/threads
```
**Step 2: Plan** (write to plan.md)

- Primary question: What is the evidence for colchicine in CV prevention?
- Thread 1: Major RCTs (COLCOT, LoDoCo2, CLEAR SYNERGY)
  - PubMed: "colchicine cardiovascular randomized controlled trial"
- Thread 2: Mechanisms and the anti-inflammatory hypothesis
  - PubMed: "colchicine inflammation atherosclerosis mechanism"
- Thread 3: Guidelines and clinical adoption
  - Fetch: ACC/ESC guideline URLs for stable CAD
- Thread 4: Safety and practical considerations
  - PubMed: "colchicine adverse effects cardiovascular"
- Threads 1, 2, and 4 parallel; Thread 3 after Thread 1 completes
**Step 3: Execute**

```python
# Parallel searches
pubmed_search_articles(queryTerm="colchicine cardiovascular randomized controlled trial", maxResults=15)
pubmed_search_articles(queryTerm="colchicine inflammation atherosclerosis mechanism", maxResults=10)
pubmed_search_articles(queryTerm="colchicine adverse effects cardiovascular", maxResults=10)

# Fetch top results
pubmed_fetch_contents(pmids=["31733140", "32865377", "37634428"])  # COLCOT, LoDoCo2, CLEAR

# Write to thread files
```
**Step 4: Validate**

- Read all thread files
- Cross-reference mortality data across trials
- Note: CLEAR SYNERGY neutral vs. positive COLCOT/LoDoCo2
- Analyze: patient population differences (post-ACS vs. chronic CAD)
- Write validation.md
**Step 5: Synthesize**

- Read from files
- Pattern: inflammation hypothesis supported, but patient selection matters
- Insight: post-ACS (COLCOT) benefit is clear; chronic stable CAD (CLEAR) is less certain
- Write synthesis.md
**Step 6: Report**

- Structured report with evidence summary
- Clear recommendation by patient type
- All PMIDs cited
- Complete reference list
## Integration with Other Skills

This skill provides the research foundation for:

- cardiology-editorial → use research output for trial analysis
- cardiology-newsletter-writer → research before writing
- youtube-script-master → research for the script's evidence base
- x-post-creator-skill → research before tweet generation
**Workflow:**

1. User requests content on a topic
2. Run deep-researcher first (this skill)
3. Pass validated findings to the writing skill
4. The writing skill cites PMIDs from the research output
## When NOT to Use This Skill

- Simple factual questions (use PubMed MCP directly)
- Trend discovery (use Perplexity, but don't cite it)
- Non-medical topics (this skill is optimized for PubMed)
- Quick content needs (use a writing skill directly with inline research)
Use this skill when you need:

- 5+ sources synthesized
- Complex, multi-faceted questions
- Rigorous evidence assessment
- Comprehensive literature coverage