
Paper-to-Skill Extractor

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.


Install skill "paper-to-skill extractor" with this command: npx skills add haoxuanlithuai/awesome_cognitive_and_neuroscience_skills/haoxuanlithuai-awesome-cognitive-and-neuroscience-skills-paper-to-skill-extractor


An interactive skill for extracting research paradigms and methodological techniques from cognitive science and neuroscience papers. The output is a well-structured skill conforming to this project's SKILL.md format.

Focus: Strict extraction of reproducible methods — experimental designs, data acquisition parameters, processing pipelines, analysis procedures, and stimulus specifications. This is NOT about summarizing a paper's novelty or theoretical contributions.

Trigger Conditions

Activate this skill when the user:

  • Provides a paper (PDF path, file, or pasted text) and asks to extract research skills/methods

  • Uses phrases like "extract skills from this paper", "turn this paper into a skill", "what methods can I reuse from this paper"

Research Planning Protocol

Before extracting skills from a paper, you MUST:

  • Clarify the extraction goal — What type of methodological knowledge is the user looking for?

  • Justify the source — Is this paper a suitable source (empirical, methods, review)? What type-specific extraction strategy applies?

  • Declare expected outputs — What kind of skill(s) do you expect to generate (paradigm design, analysis pipeline, modeling)?

  • Note limitations — Are there missing parameters, ambiguous descriptions, or domain gaps in this paper?

  • Present the extraction plan to the user and WAIT for confirmation before proceeding.

For detailed methodology guidance, see the research-literacy skill.

⚠️ Verification Notice

This skill was generated by AI from academic literature. All parameters, thresholds, and citations require independent verification before use in research. If you find errors, please open an issue.

Interactive Workflow

Phase 1: Paper Ingestion

  • Read the paper provided by the user (PDF path, file content, or pasted sections).

PDF Reading Guidance — Claude Code's Read tool natively supports PDF files. Use the following strategy:

  • Short PDFs (up to ~10 pages): Read the entire file in a single call with no pages parameter.

  • Long PDFs (more than 10 pages): Read in chunks using the pages parameter (maximum 20 pages per request). Example sequence: pages: "1-10", then pages: "11-20", and so on.

  • Recommended reading order:

    1. Read pages 1-2 first (abstract + introduction) to identify the paper type and decide whether full extraction is warranted.

    2. Read the Methods section in detail (locate the relevant page range from the table of contents or section headers).

    3. Read Results and Discussion selectively for reported parameter values not stated in Methods.
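The long-PDF strategy above amounts to computing sequential page ranges before reading. A minimal sketch (the helper name chunk_page_ranges is illustrative, not part of any tool's API; the 10-page chunk and 20-page cap come from the guidance above):

```python
def chunk_page_ranges(total_pages: int, chunk_size: int = 10, max_chunk: int = 20) -> list[str]:
    """Compute page-range strings for sequential chunked reads.

    chunk_size is capped at max_chunk (the per-request limit noted above).
    """
    size = min(chunk_size, max_chunk)
    ranges = []
    start = 1
    while start <= total_pages:
        end = min(start + size - 1, total_pages)
        ranges.append(f"{start}-{end}")
        start = end + 1
    return ranges
```

For a 25-page paper this yields "1-10", "11-20", "21-25", matching the example sequence above.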

  • Identify the paper type — this determines the extraction strategy:

  • Experimental paper — contains original experiments with participants

  • Methods paper — introduces or validates an analysis technique/pipeline

  • Computational modeling paper — builds or tests formal models of cognition

  • Review/theoretical paper — synthesizes literature or proposes theoretical frameworks

  • Confirm the paper type with the user before proceeding.

See references/extraction-guide.md for detailed extraction strategies per paper type.

Phase 2: Content Scanning and Candidate Identification

Scan the paper and identify all extractable methodological content organized into these categories:

| Category | What to Look For |
|---|---|
| Experimental Design | Paradigm name, trial structure, timing parameters, condition setup, counterbalancing scheme, block design |
| Data Acquisition | Sampling rate, electrode montage, imaging parameters, eye-tracking settings, physiological recording setup |
| Data Processing | Preprocessing steps with parameters, artifact handling methods, data cleaning criteria, epoching parameters |
| Analysis Methods | Statistical models, multiple comparison corrections, effect size calculations, visualization methods, decoding approaches |
| Stimulus Materials | Construction rules, control variables, norming standards, presentation parameters, response mappings |

Present candidates to the user in the following format:

I identified the following extractable methods from this paper:

Experimental Design

  • [1] Paradigm: <name> — <brief description>
  • [2] Trial structure: <summary of trial flow and timing>

Data Acquisition

  • [3] <Modality> recording setup: <key parameters>

Data Processing

  • [4] Preprocessing pipeline: <step summary>
  • [5] Artifact rejection: <method and criteria>

Analysis Methods

  • [6] <Analysis name>: <brief description>
  • [7] <Analysis name>: <brief description>

Stimulus Materials

  • [8] <Material type>: <construction approach>

Which items would you like me to extract into skills? (Enter numbers, ranges like 1-4, or "all")
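The selection prompt above accepts numbers, ranges, or "all"; parsing such a reply is straightforward. A sketch, assuming candidates are numbered consecutively as presented (parse_selection is a hypothetical helper, not part of any existing tool):

```python
def parse_selection(reply: str, candidate_ids: list[int]) -> list[int]:
    """Parse a user reply like "1, 3, 4-6" or "all" into candidate numbers."""
    reply = reply.strip().lower()
    if reply == "all":
        return list(candidate_ids)
    selected = set()
    for token in reply.replace(",", " ").split():
        if "-" in token:
            lo, hi = token.split("-", 1)
            selected.update(range(int(lo), int(hi) + 1))
        else:
            selected.add(int(token))
    # Keep only numbers that were actually offered, in presentation order
    return [i for i in candidate_ids if i in selected]
```

For example, parse_selection("1, 3, 4-6", list(range(1, 9))) returns [1, 3, 4, 5, 6].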

Phase 2.5: Suitability Gate

Before presenting candidates, apply this strict suitability filter to each one:

SUITABLE — include if the candidate:

| Criterion | Examples |
|---|---|
| Describes an experimental paradigm or design with specifics | Trial structure, timing parameters, condition definitions, counterbalancing |
| Describes a data processing pipeline with parameters | Preprocessing steps, filter cutoffs, software settings |
| Describes an analysis method with concrete steps | Statistical model specification, time-frequency decomposition, classification pipeline |
| Contains specific numerical parameters or settings | Thresholds, epoch windows, stimulus dimensions, sample sizes |
| Describes stimulus construction norms | Norming procedures, controlled variables, material selection criteria |
| Describes a computational model with equations/parameters | Model fitting procedure, parameter priors, model comparison strategy |
| Provides actionable methodological recommendations with specific values | "Use minimum 30 trials per condition", "Set high-pass filter no lower than 0.1 Hz" |

NOT SUITABLE — filter out if the candidate:

| Criterion | Examples |
|---|---|
| Is narrative or historical overview | "The study of attention began with William James..." |
| Is a definition without actionable parameters | "Working memory is defined as..." |
| Is theoretical debate without methods | "The modularity hypothesis predicts..." |
| Is motivation or background only | "Previous studies have shown that..." leading to no method |
| Contains only results without methodological detail | "The ANOVA revealed a significant main effect..." |

Decision rule: "Does this candidate contain enough specific, actionable detail that a researcher could REPRODUCE a method, pipeline, or paradigm from it?" If YES → [SUITABLE]. If NO or UNCERTAIN → [FILTERED — reason].

Mark each candidate when presenting to the user. Filtered candidates are shown but de-prioritized — the user can override any filter decision.
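The decision rule ultimately requires reading the candidate, but a crude first-pass screen for concrete numerical parameters (digits followed by a unit or count word) can pre-sort candidates before manual judgment. A sketch under that assumption — the pattern and labels are illustrative, not a substitute for the rule above:

```python
import re

# Units/terms that often signal reproducible parameters (illustrative, not exhaustive)
PARAM_PATTERN = re.compile(
    r"\d+(\.\d+)?\s*(ms|s|hz|khz|mm|trials?|participants?|%|db)", re.IGNORECASE
)

def screen_candidate(text: str) -> str:
    """First-pass suitability label; the user can override either way."""
    if PARAM_PATTERN.search(text):
        return "SUITABLE"
    return "FILTERED — no concrete numerical parameters detected"
```

A candidate like "Stimuli appeared for 513 ms" passes the screen; a purely historical sentence does not.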

Phase 3: User Selection and Confirmation

  • Receive the user's selection of which items to extract.

  • For each selected item, perform deep extraction (see extraction depth requirements below).

  • Present the extracted detail for user review before generating the final skill file.

Phase 4: Skill Generation

  • Generate the skill file(s) using the standard template (see references/skill-template.md).

  • Each generated skill must:

    • Have valid YAML frontmatter with name, description, and papers fields

    • Include all numerical parameters with their citations from the source paper

    • Stay under 500 lines, with overflow content placed in the references/ subdirectory

    • Pass the domain-knowledge litmus test: "Would a competent programmer who has never taken a cognitive science course get this wrong?"

  • Present the generated skill to the user and ask for confirmation before saving.

Phase 5: Self-Verification (Hallucination Check)

After generating the skill but before saving, perform a systematic verification of every numerical parameter and specific factual claim against the source paper.

Verification procedure — for each numerical value or specific claim in the generated skill:

  • Locate in source — Find the corresponding statement in the original paper. Use the source location recorded during extraction.

  • Verify value — Confirm exact numerical match, correct units, and complete context (e.g., "0.1-30 Hz bandpass" must not be truncated to "0.1 Hz").

  • Classify any issues found:

| Issue Type | Description | Severity |
|---|---|---|
| not_found | Claim appears in the skill but cannot be found in the source — likely hallucinated | High |
| value_mismatch | Value exists in source but differs (e.g., skill says "250 ms", source says "200 ms") | High |
| unit_error | Numerical value matches but units are wrong or missing | High |
| context_distortion | Value is technically present but used in a misleading context | Medium |
| location_wrong | Value is correct but the claimed source location is wrong | Low |
| incomplete | Skill presents a partial version of a parameter that has important qualifiers | Low |

Reporting — Present the verification results to the user:

Self-Verification Results:

  • Claims checked: N
  • Verified: M
  • Issues found: K
  • [HIGH] <claim> — <issue type>: <details>
  • [LOW] <claim> — <issue type>: <details>

Rules:

  • High-severity issues (not_found, value_mismatch, unit_error) must be corrected before saving.

  • Medium/low-severity issues are flagged but the skill can be saved with them annotated.

  • Do NOT flag reasonable paraphrasing, organizational differences, or standard terminology substitutions.
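The severity rules above can be encoded directly; a minimal sketch (SEVERITY and can_save are hypothetical names, but the type-to-severity mapping is exactly the classification table from this phase):

```python
SEVERITY = {
    "not_found": "High",
    "value_mismatch": "High",
    "unit_error": "High",
    "context_distortion": "Medium",
    "location_wrong": "Low",
    "incomplete": "Low",
}

def can_save(issue_types: list[str]) -> bool:
    """A skill may be saved only when no high-severity issues remain;
    medium/low issues are annotated rather than blocking."""
    return all(SEVERITY[t] != "High" for t in issue_types)
```

So a skill with only an incomplete flag may be saved with an annotation, while a single value_mismatch blocks saving until corrected.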

Extraction Depth Requirements

For every extracted item, the following cross-cutting rules apply to ALL categories:

Cross-Cutting Extraction Rules

  • Preserve exact numbers — Never round. If the paper says "513 ms", write "513 ms", not "~500 ms".

  • Track source location — For every extracted numerical value, record where it appears in the paper: "Section X.Y, paragraph N", "Table N", "Figure N caption", or "Supplementary Materials, page N". This enables downstream verification.

  • Flag missing information — If a standard parameter for this method type is not reported in the paper, explicitly note its absence (e.g., "Filter order: not reported").

  • Capture rationale — When the authors explain WHY they chose a parameter value, include that justification alongside the value.

  • Note deviations from convention — When authors explicitly deviate from field conventions, capture both what they did and their stated reason.

These rules apply to every category below. The parameter tables in generated skills must include a Source Location column (see references/skill-template.md ).

Experimental Design Parameters

  • Paradigm name and classification (e.g., "oddball paradigm", "visual world paradigm")

  • Number of conditions and their operational definitions

  • Trial sequence: fixation → stimulus → ISI → response window (with exact ms values)

  • Number of trials per condition and total

  • Block structure and rest intervals

  • Counterbalancing method (Latin square, full counterbalancing, pseudo-randomization constraints)

  • Practice trial specifications

  • Participant exclusion criteria applied at the design level

Data Acquisition Parameters

  • EEG: Sampling rate (Hz), electrode count and montage system, reference electrode, ground electrode, impedance threshold, amplifier model, filter settings during recording

  • fMRI: TR, TE, voxel size, slice count, slice order, field strength, coil type, number of volumes, dummy scans discarded

  • Eye-tracking: Sampling rate, calibration procedure (5-point, 9-point), fixation definition criteria, saccade velocity threshold

  • Behavioral: Response device, response mapping, timeout duration, feedback presence/absence

  • MEG: Sampling rate, sensor type and count, head position indicator settings, noise reduction method

Data Processing Pipeline

  • Software used (with version numbers)

  • Step-by-step sequence with order preserved

  • Filter parameters: type (FIR/IIR), cutoff frequencies, order/transition bandwidth, causal vs. zero-phase

  • Re-referencing scheme

  • Epoching: time window relative to event, baseline correction window

  • Artifact rejection: method (threshold, ICA, regression), specific thresholds and criteria

  • Trial/participant exclusion rates reported

  • Interpolation method for bad channels

Analysis Method Details

  • Statistical test name and implementation

  • Model specification (for regression/mixed models: fixed effects, random effects, link function)

  • Multiple comparison correction: method, parameters (e.g., cluster-forming threshold, number of permutations)

  • Region of interest definitions (coordinates, anatomical labels, time windows)

  • Effect size measure used and interpretation benchmarks

  • Visualization methods with axis specifications

Stimulus Material Specifications

  • Material type (words, images, sounds, videos)

  • Total number of stimuli and per-condition counts

  • Controlled variables and matching criteria (frequency, length, luminance, valence)

  • Norming source and database (e.g., "SUBTLEX-US word frequency", "IAPS valence ratings")

  • Presentation parameters: duration, size/visual angle, position, contrast

  • Randomization constraints applied to stimulus ordering

Computational Modeling Parameters

  • Model name and class (e.g., "drift-diffusion model", "Bayesian ideal observer", "recurrent neural network")

  • Model equations and architecture: All equations with variable definitions, relationship between equations, boundary/initial conditions

  • Free vs. fixed parameters: List each with cognitive interpretation and role

  • Parameter constraints and priors:

    • Constraint bounds (lower, upper) for each free parameter

    • Prior distribution family and hyperparameters (if Bayesian), with justification

    • Starting values and number of starting points (if frequentist optimization)

  • Fitting methods:

    • Objective function (maximum likelihood, least squares, Bayesian posterior)

    • Optimization algorithm and implementation (software, package, version)

    • For MCMC: number of chains, samples per chain, burn-in period, thinning, convergence diagnostic (R-hat threshold)

    • For MLE: convergence criteria, number of random restarts

    • Data summary statistics used for fitting (if not raw trial data)

  • Model comparison: Comparison metric (AIC, BIC, WAIC, Bayes factor), how group-level comparison was performed, model recovery/confusion matrix results

  • Simulation procedures: Parameter settings used, number of simulated datasets, random seed handling, what predictions were generated

Methodological Recommendations (for Reviews/Textbooks)

When extracting from review papers, meta-analyses, or textbook chapters, capture:

  • Specific parameter recommendations with justification and evidence strength

  • Recommended analysis pipelines with step-by-step parameter values

  • Decision trees or flowcharts for method selection (e.g., "if X, use method A; if Y, use method B")

  • Meta-analytic effect sizes with confidence intervals and moderator results

  • Sample size recommendations based on reported effect sizes and power analyses

  • Common methodological pitfalls identified across studies, with concrete examples

Quality Checks Before Output

Before presenting the final skill, verify both structural compliance and content quality.

Structural Compliance Checklist

Every generated skill must pass these checks before saving:

  • File name: The core file is named exactly SKILL.md (uppercase) — not skill.md, Skill.md, or any other variant

  • Directory name: Uses kebab-case (lowercase, hyphen-separated) — e.g., mmn-oddball-paradigm/, not MMN_Oddball_Paradigm/

  • YAML frontmatter: Contains at minimum name (human-readable) and description (one-sentence summary) fields

  • Papers field: Frontmatter includes a papers field listing the source paper(s) in "Author, Year" format

  • Dependencies field: Frontmatter includes dependencies.required: [research-literacy] (all domain skills require this)

  • Research Planning Protocol: A customized version of the standard preamble is included after the "When to Use" section and before the first domain-specific logic section (see the research-literacy skill for the template)

  • Line count: SKILL.md is under 500 lines; overflow content is placed in references/ subdirectory

  • References directory: If supplementary files exist, they live in references/ and are explicitly referenced from SKILL.md

  • Encoding: UTF-8, LF line endings, 2-space indentation for YAML
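Several of the checks above are mechanical and can be automated. A sketch of a structural validator, assuming the YAML frontmatter has already been parsed into a dict (the function name and failure messages are illustrative):

```python
import re

def check_structure(dir_name: str, file_name: str, line_count: int,
                    frontmatter: dict) -> list[str]:
    """Return a list of structural-compliance failures (empty list = pass)."""
    failures = []
    if file_name != "SKILL.md":
        failures.append("core file must be named exactly SKILL.md")
    if not re.fullmatch(r"[a-z0-9]+(-[a-z0-9]+)*", dir_name.rstrip("/")):
        failures.append("directory name must be kebab-case")
    for field in ("name", "description", "papers"):
        if field not in frontmatter:
            failures.append(f"frontmatter missing required field: {field}")
    deps = frontmatter.get("dependencies", {}).get("required", [])
    if "research-literacy" not in deps:
        failures.append("dependencies.required must include research-literacy")
    if line_count >= 500:
        failures.append("SKILL.md must be under 500 lines")
    return failures
```

The remaining checks (preamble placement, references/ linkage, encoding) still require inspecting file contents.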

Content Quality Checklist

  • Completeness — Every numerical parameter mentioned in the paper's methods section is captured

  • Citation accuracy — All values cite the source paper (Author, Year) and page/table number where possible

  • Reproducibility — Another researcher could implement this method from the skill alone, without reading the original paper

  • Domain specificity — Every item passes the litmus test: "Would a competent programmer who has never taken a cognitive science course get this wrong?"

  • Parameter precision — No rounding or approximation of reported values; use exact figures from the paper

  • Source traceability — Every numerical parameter includes a source location (Section/Table/Figure reference)

Required Structured Sections in Generated Skills

Every generated skill must include these sections (may be empty if no items apply, but must be explicitly checked):

  • Missing Information — List standard parameters for this method type that the paper does not report. Format: "- [Parameter name]: Not reported. Standard value from [field/reference] is [value]." This section helps users know what they must determine independently.

  • Deviations from Convention — List any methodological choices that deviate from field conventions, with the authors' stated rationale. Format: "- [Choice]: Authors used [X] instead of conventional [Y] because [reason]." This section alerts users to non-standard decisions.

Handling Ambiguity

When the paper is unclear or omits details:

  • Missing parameters: Flag explicitly — "The paper does not report [X]. This must be determined empirically or sourced from [suggested reference]."

  • Ambiguous descriptions: Present both plausible interpretations and ask the user to select one.

  • Non-standard methods: Note deviations from field conventions and flag whether the deviation is intentional (per authors' justification) or potentially an error.

  • Supplementary materials: Ask the user if supplementary materials are available, as critical method details are often reported there.

Multi-Skill Extraction

When a paper contains multiple independent methods worth extracting:

  • Generate separate skills for each method that can stand alone (e.g., a paradigm skill and an analysis skill from the same paper).

  • Cross-reference between skills using relative paths when methods are interdependent.

  • Each skill must be independently usable — no skill should require reading another skill to function.

Batch Extraction Mode

When the user provides multiple PDFs or a directory of papers, apply the following workflow:

Triggering Batch Mode

Batch mode activates when the user:

  • Provides two or more PDF paths in a single message

  • Points to a directory containing multiple papers

  • Uses phrases like "extract skills from all these papers" or "process this folder"

Batch Processing Steps

  • Inventory the inputs — List all papers found (file names + page counts if determinable) and present the list to the user for confirmation before reading anything.

  • Process each paper sequentially — Run each paper through the full workflow (Ingestion → Scanning → Selection → Generation → Self-Verification). Apply the PDF reading strategy from Phase 1 to every paper.

  • Present candidates grouped by paper — After scanning all papers, show all extractable candidates together, clearly grouped under each paper's title:

Paper 1: <Title / filename>

  • [1] Paradigm: ...
  • [2] Analysis: ...

Paper 2: <Title / filename>

  • [3] Paradigm: ...
  • [4] Data Acquisition: ...

Which items would you like to extract? (Enter numbers, ranges, "all", or "all from paper 1")

  • Allow cross-paper skill merging — If two or more papers describe the same or highly overlapping methods (e.g., both use the same EEG preprocessing pipeline with the same parameters), flag the overlap and offer to merge them into a single skill that cites all source papers. Only merge when the core parameters and decision logic are genuinely shared; keep skills separate when parameter choices differ.
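One crude way to operationalize "genuinely shared" is to compare the core parameters each paper reports as key-value pairs and require exact agreement on every parameter both report. A heuristic sketch only — it cannot judge shared decision logic, which still needs human review:

```python
def merge_eligible(params_a: dict, params_b: dict) -> bool:
    """Two candidate skills may be merged only when every parameter they
    both report has the identical value; any difference keeps them separate."""
    shared = set(params_a) & set(params_b)
    return bool(shared) and all(params_a[k] == params_b[k] for k in shared)
```

For example, two pipelines both reporting a 0.1 Hz high-pass and 30 Hz low-pass pass the check; differing cutoffs fail it.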

  • Generate skills independently — Each generated skill must be fully self-contained. No skill may depend on another skill generated from a different paper in the same batch. Cross-reference between skills using relative paths only for closely related methods from the same paper (as in Multi-Skill Extraction above).

Batch Quality Checks

Before finalizing batch output, verify:

  • Every skill cites its specific source paper(s), not just the batch as a whole.

  • Merged skills list all contributing papers in the papers frontmatter field.

  • Skill directory names remain unique across the batch; if two papers generate a similar skill, append a disambiguating suffix (e.g., mmn-oddball-paradigm-smith2019 vs. mmn-oddball-paradigm-jones2021).
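The suffix convention above (first author lowercased plus year) can be sketched as a tiny helper (disambiguate is a hypothetical name):

```python
def disambiguate(base_name: str, first_author: str, year: int) -> str:
    """Append an author-year suffix when two papers yield the same skill name,
    keeping the result in kebab-case."""
    return f"{base_name}-{first_author.lower()}{year}"
```

So disambiguate("mmn-oddball-paradigm", "Smith", 2019) yields mmn-oddball-paradigm-smith2019, matching the example above.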

Output Structure

Generated skills follow the directory convention:

```
skills/<generated-skill-name>/
  SKILL.md                  # Core skill content (<500 lines)
  references/
    parameter-table.yaml    # Full parameter values (if needed)
    pipeline-steps.md       # Extended pipeline details (if needed)
```

The <generated-skill-name> uses kebab-case and is descriptive of the specific method, not the paper. For example:

  • A paper on ERP mismatch negativity → skills/mmn-oddball-paradigm/

  • A paper on fMRI preprocessing → skills/fmri-preprocessing-pipeline/

  • A paper on drift-diffusion modeling → skills/drift-diffusion-fitting/

Example Interaction Summary

User: Extract skills from this paper. [provides PDF path]

Agent: [Reads paper] This is an experimental paper (EEG study on language processing). Does that match your understanding?

User: Yes.

Agent: I identified the following extractable methods: [Lists candidates by category with numbers] Which items would you like me to extract?

User: 1, 3, 4-6

Agent: Here is the detailed extraction for your selected items: [Shows extracted parameters for review] Does this look correct? Any corrections needed?

User: Looks good, but the sampling rate was 512 Hz not 500 Hz.

Agent: Corrected. Here is the generated skill: [Shows SKILL.md content] Shall I save this to skills/<name>/SKILL.md?

User: Yes.

Agent: [Saves file] Skill saved to skills/<name>/SKILL.md.
