Cognitive Science Power Analysis

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Copy the following command and send it to your AI assistant to install this skill:

Install skill "cognitive science power analysis" with this command: npx skills add haoxuanlithuai/awesome_cognitive_and_neuroscience_skills/haoxuanlithuai-awesome-cognitive-and-neuroscience-skills-cognitive-science-power-analysis

Purpose

This skill encodes domain-specific knowledge for planning adequately powered studies in cognitive science and neuroscience. It provides:

  • Effect size priors calibrated to specific paradigms and modalities (behavioral, EEG/ERP, fMRI, clinical/developmental)

  • Sample size recommendations grounded in empirical meta-analyses rather than arbitrary conventions

  • Power analysis workflow guidance tailored to the design complexities of cognitive neuroscience (repeated measures, multilevel, neuroimaging-specific tools)

An AI agent needs this because generic power analysis advice (e.g., "use G*Power with d = 0.5") fails to capture the enormous variability in effect sizes across cognitive science paradigms, and because neuroimaging modalities have unique statistical considerations.

When to Use This Skill

  • A researcher is designing a new behavioral, EEG, or fMRI experiment and needs sample size justification

  • A grant proposal requires a power analysis section

  • A preregistration document needs effect size justification and sample size rationale

  • Someone asks "how many participants do I need?" for a cognitive/neuroscience study

  • Reviewing whether a published study was adequately powered

Research Planning Protocol

Before executing the domain-specific steps below, you MUST:

  • State the research question — What study is being planned and what effect is being powered for?

  • Justify the method choice — Why this design and analysis approach? What alternatives were considered?

  • Declare expected outcomes — What is the smallest effect size of interest (SESOI)?

  • Note assumptions and limitations — What assumptions does this power analysis make? Where could it mislead?

  • Present the plan to the user and WAIT for confirmation before proceeding.

For detailed methodology guidance, see the research-literacy skill.

⚠️ Verification Notice

This skill was generated by AI from academic literature. All parameters, thresholds, and citations require independent verification before use in research. If you find errors, please open an issue.

Core Workflow

Step 1: Identify the Research Modality and Design

Determine which modality and design type apply:

| Modality | Common Designs | Key Consideration |
| --- | --- | --- |
| Behavioral | Between-groups, within-subjects, mixed | Effect sizes vary enormously by paradigm |
| EEG/ERP | Within-subjects repeated measures | Trial count matters as much as participant count |
| fMRI (task) | Within-subjects block/event-related | Whole-brain vs. ROI analysis affects power |
| fMRI (individual differences) | Correlational, between-subjects | Requires much larger N than task contrasts |
| Clinical/Developmental | Case-control, longitudinal | Recruitment constraints often limit N; adjust design |

Step 2: Obtain an Effect Size Prior

Do not use generic benchmarks (Cohen's "small/medium/large"). Instead:

  • Best option: Use a meta-analytic estimate for the specific paradigm. See references/effect-sizes.md for a curated library organized by modality.

  • Second option: Use the smallest effect size of interest (SESOI) — the minimum effect that would be theoretically or practically meaningful (Lakens, 2022).

  • Third option: Use pilot data, but apply shrinkage correction — pilot studies systematically overestimate effect sizes (Albers & Lakens, 2018).

  • Last resort: Use the modality-specific median effect sizes from large-scale meta-analyses (see below).
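The shrinkage idea behind the pilot-data option can be made concrete. The sketch below is illustrative Python (the function names are mine, not from any package): Hedges' small-sample bias correction, and an approximate lower confidence bound for a two-sample d using the common large-sample variance approximation. Both yield the kind of conservative planning value Albers & Lakens (2018) recommend.

```python
from math import sqrt
from statistics import NormalDist

def hedges_g(d, df):
    """Small-sample bias correction: g = d * (1 - 3 / (4*df - 1)).

    df = n1 + n2 - 2 for a two-sample design, n - 1 for paired/one-sample.
    """
    return d * (1 - 3 / (4 * df - 1))

def d_ci_lower(d, n1, n2, conf=0.95):
    """Approximate lower confidence bound for a two-sample d.

    Uses the standard large-sample variance approximation
    var(d) ~= (n1 + n2)/(n1*n2) + d^2 / (2*(n1 + n2)).
    """
    se = sqrt((n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2)))
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return d - z * se

# A pilot with n = 12 per group and observed d = 0.60:
print(round(hedges_g(0.60, df=22), 2))     # 0.58 (bias-corrected)
print(round(d_ci_lower(0.60, 12, 12), 2))  # -0.22 (CI includes zero)
```

Note that the lower bound is negative: with 12 per group the pilot cannot even rule out a null effect, which is exactly why pilot effect sizes make poor planning inputs.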

Modality-level median effect sizes (use only when paradigm-specific estimates are unavailable):

| Modality | Median Effect Size | Source |
| --- | --- | --- |
| Behavioral (cognitive psychology) | d = 0.40 | Brysbaert, 2019 |
| EEG/ERP component differences | d = 0.50-1.00 | Boudewyn et al., 2018; Clayson et al., 2019 |
| fMRI task activation | d = 0.75-1.00 (within-subject) | Poldrack et al., 2017 |
| fMRI brain-behavior correlation | r = 0.10-0.20 | Marek et al., 2022 |
| Clinical group differences | d = 0.30-0.80 | Leucht et al., 2015; Button et al., 2013 |

Critical warning: The median statistical power in neuroscience has been estimated at only 21% (Button et al., 2013, Nature Reviews Neuroscience). Many published effect sizes are inflated by publication bias. Always apply skepticism to effect sizes from underpowered, unreplicated studies.

Step 3: Conduct the Power Analysis

Choose method based on design complexity:

Simple Designs (t-test, one-way ANOVA, correlation)

Use analytic solutions via G*Power or pwr (R):

Target power: 80% (minimum) or 90% (recommended)
Alpha: 0.05 (two-tailed unless a directional hypothesis is justified)

  • Two-sample t-test: pwr.t.test(d = effect_size, power = 0.80, sig.level = 0.05, type = "two.sample")

  • Within-subjects t-test: pwr.t.test(d = effect_size_dz, power = 0.80, sig.level = 0.05, type = "paired")

  • Correlation: pwr.r.test(r = effect_size, power = 0.80, sig.level = 0.05)
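For a quick check without R, the two-sample calculation above can be approximated in a few lines of Python using the normal approximation (the function name is mine; G*Power and pwr solve the exact noncentral-t problem, so expect their answers to run about one participant higher):

```python
from math import ceil, sqrt
from statistics import NormalDist

def n_per_group_two_sample(d, power=0.80, alpha=0.05):
    """Approximate n per group for a two-sample t-test.

    Normal approximation: n ~= 2 * ((z_{1-alpha/2} + z_{power}) / d)^2.
    Slightly undershoots the exact t-based answer from pwr/G*Power.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-tailed critical value
    z_beta = z.inv_cdf(power)           # quantile for target power
    return ceil(2 * ((z_alpha + z_beta) / d) ** 2)

# d = 0.40 (behavioral median, Brysbaert 2019) at 80% power:
print(n_per_group_two_sample(0.40))  # 99 (pwr.t.test gives ~100)
```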

Complex Designs (mixed ANOVA, multilevel, mediation)

Use simulation-based power analysis:

  • simr (R package): For linear mixed-effects models (Green & MacLeod, 2016)

  • Superpower (R/Shiny): For factorial ANOVA designs (Lakens & Caldwell, 2021)

  • Monte Carlo simulation: For non-standard designs — simulate data under the expected effect, run analysis, repeat 10,000+ times
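The Monte Carlo bullet can be sketched in stdlib Python for a within-subjects design (illustrative only; a real study would simulate the full design and fit the planned model, e.g. with simr). The normal critical value stands in for the t critical value, which is reasonable for n ≥ ~30:

```python
import random
from statistics import NormalDist, mean, stdev

def simulated_power(n, d_z, n_sims=5000, alpha=0.05, seed=1):
    """Monte Carlo power for a paired design.

    Simulates standardized difference scores with true effect d_z,
    tests them against zero, and returns the rejection rate.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    hits = 0
    for _ in range(n_sims):
        diffs = [rng.gauss(d_z, 1.0) for _ in range(n)]
        t = mean(diffs) / (stdev(diffs) / n ** 0.5)  # one-sample t statistic
        hits += abs(t) > z_crit
    return hits / n_sims

# d_z = 0.4 at n = 52 should land near the 80% target:
print(round(simulated_power(52, 0.4), 2))
```

The same skeleton extends to any design: replace the data-generating step with the expected effect structure and the test with the planned analysis.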

Neuroimaging-Specific

  • fMRIpower: Power for fMRI group analyses (Mumford & Nichols, 2008)

  • NeuroPowerTools: Web-based fMRI power calculator (Durnez et al., 2016)

  • For EEG/ERP: No standard tool; use simulation with expected component amplitudes and noise levels. See references/sample-size-guide.md for worked examples.

Step 4: Apply Modality-Specific Rules of Thumb

Use these as sanity checks, not replacements for formal power analysis:

| Modality | Minimum N (per group/condition) | Basis |
| --- | --- | --- |
| Behavioral (medium effect, d ≈ 0.5) | n = 30-50 per group | Brysbaert, 2019 |
| Behavioral (small effect, d ≈ 0.2) | n = 80-100 per group | Brysbaert, 2019 |
| Behavioral (within-subjects, d_z ≈ 0.4) | n = 50-65 | Brysbaert, 2019 |
| EEG/ERP (within-subjects) | n = 25-40 | Boudewyn et al., 2018 |
| fMRI (task activation, within-subjects) | n = 30-50 | Cremers et al., 2017; Poldrack et al., 2017 |
| fMRI (individual differences / brain-behavior) | n = 100+ (ideally 200+) | Marek et al., 2022 |
| fMRI (clinical group comparison) | n = 30-50 per group | Button et al., 2013 |
| Clinical/patient studies | n = 20-30 per group (minimum) | Leucht et al., 2015 |
| Developmental (cross-sectional age groups) | n = 25-40 per age group | Mills & Tamnes, 2014 |

Step 5: Document and Report

For preregistration and manuscripts, the power analysis section must include:

  • Effect size used and its source (meta-analysis, pilot, SESOI)

  • Power analysis method (analytic, simulation-based, tool used)

  • Target power level (80% or 90%) and alpha level

  • Resulting sample size and any adjustments (attrition, exclusion rate)

  • Sensitivity analysis: What is the minimum detectable effect at the planned N?
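The sensitivity analysis in the last bullet can be computed directly by inverting the power formula. A Python sketch under the same normal approximation (the function name is mine; G*Power's "sensitivity" mode gives the exact t-based value):

```python
from math import sqrt
from statistics import NormalDist

def minimum_detectable_d(n_per_group, power=0.80, alpha=0.05):
    """Smallest two-sample d detectable at a fixed n per group.

    Inverts n = 2 * ((z_{1-alpha/2} + z_{power}) / d)^2 for d.
    """
    z = NormalDist()
    return (z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) * sqrt(2 / n_per_group)

# With 40 per group, only medium-to-large effects are detectable:
print(round(minimum_detectable_d(40), 2))  # 0.63
```

Reporting this number lets reviewers judge whether the planned N is adequate even if they disagree with the effect size prior.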

Template language:

"Based on the meta-analytic effect size of d = [X] reported by [Author, Year], a power analysis using [tool] indicated that N = [X] participants per group would be needed to detect this effect with [80/90]% power at alpha = .05 (two-tailed). Anticipating a [X]% attrition/exclusion rate, we plan to recruit N = [adjusted X]."

Common Pitfalls

Using Cohen's generic benchmarks as effect size priors: Cohen (1988) himself warned these were rough guidelines. Cognitive science effects range from d = 0.1 to d = 3.0+ depending on the paradigm. Always use paradigm-specific estimates (Brysbaert, 2019).

Ignoring the distinction between d and d_z: Between-subjects Cohen's d and within-subjects d_z are not interchangeable. Within-subjects designs typically yield larger d_z due to reduced error variance. Confusing them leads to incorrect sample size estimates (Lakens, 2013).
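When only a between-style d is available, it can be converted to d_z under the equal-variance assumption, where r is the correlation between the repeated measures (Lakens, 2013). A hypothetical helper:

```python
from math import sqrt

def dz_from_d(d, r):
    """Convert Cohen's d to within-subjects d_z.

    Assumes equal variances across conditions and correlation r
    between repeated measures: d_z = d / sqrt(2 * (1 - r)).
    """
    return d / sqrt(2 * (1 - r))

# The same raw effect (d = 0.5) with r = 0.7 between conditions:
print(round(dz_from_d(0.5, 0.7), 2))  # 0.65
```

Note that d_z exceeds d whenever r > 0.5, which is why within-subjects designs typically need fewer participants for the same raw effect.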

Powering for whole-brain fMRI but reporting ROI results (or vice versa): Whole-brain analyses with multiple comparison correction require larger effects to survive thresholding. Power calculations must match the planned analysis (Mumford & Nichols, 2008).

Treating pilot effect sizes as population estimates: Pilot studies with N = 10-20 produce wildly variable effect size estimates. Apply a correction factor or use the lower bound of the CI (Albers & Lakens, 2018).

Ignoring trial count in EEG/ERP power: For ERP analyses, both participant N and trial count per condition affect statistical power. Insufficient trials per condition reduces signal-to-noise ratio regardless of participant count (Boudewyn et al., 2018; Luck, 2014).

Assuming brain-behavior correlations are large: Marek et al. (2022) demonstrated that brain-wide association studies require thousands of participants for reliable effects. Planning an fMRI individual-differences study with N = 30 is almost certainly underpowered.

Quick Reference Decision Table

| Question | Answer | Recommended Action |
| --- | --- | --- |
| "How many subjects for a Stroop study?" | Within-subjects Stroop effect is very large (d ≈ 1.0-1.5) | N = 15-25 likely sufficient (Brysbaert, 2019) |
| "How many for an ERP study of N400?" | N400 semantic violation effect d ≈ 0.8-1.5 | N = 20-30 (Boudewyn et al., 2018) |
| "How many for fMRI brain-behavior correlation?" | True r likely 0.10-0.20 | N = 200+ minimum (Marek et al., 2022) |
| "How many for a patient vs. control comparison?" | Effects vary widely (d ≈ 0.3-0.8) | N = 30-80 per group, depending on expected effect |
| "Can I use my pilot N = 12 effect size?" | Pilot effect is unreliable | Use a meta-analytic estimate instead; if unavailable, use the lower CI bound of the pilot |

References

  • Albers, C., & Lakens, D. (2018). When power analyses based on pilot data are biased. Journal of Experimental Social Psychology, 74, 187-195.

  • Boudewyn, M. A., Luck, S. J., Farrens, J. L., & Kappenman, E. S. (2018). How many trials does it take to get a significant ERP effect? Psychophysiology, 55(6), e13049.

  • Brysbaert, M. (2019). How many participants do we really need? Journal of Cognition, 2(1), 16.

  • Button, K. S., Ioannidis, J. P. A., Mokrysz, C., Nosek, B. A., Flint, J., Robinson, E. S. J., & Munafò, M. R. (2013). Power failure: Why small sample size undermines the reliability of neuroscience. Nature Reviews Neuroscience, 14(5), 365-376.

  • Clayson, P. E., Carbine, K. A., Baldwin, S. A., & Larson, M. J. (2019). Methodological reporting behavior, sample sizes, and statistical power in studies of event-related potentials. Psychophysiology, 56(11), e13437.

  • Cremers, H. R., Wager, T. D., & Yarkoni, T. (2017). The relation between statistical power and inference in fMRI. PLoS ONE, 12(11), e0184923.

  • Green, P., & MacLeod, C. J. (2016). SIMR: An R package for power analysis of generalized linear mixed models by simulation. Methods in Ecology and Evolution, 7(4), 493-498.

  • Lakens, D. (2013). Calculating and reporting effect sizes to facilitate cumulative science. Frontiers in Psychology, 4, 863.

  • Lakens, D. (2022). Sample size justification. Collabra: Psychology, 8(1), 33267.

  • Lakens, D., & Caldwell, A. R. (2021). Simulation-based power analysis for factorial ANOVA designs. Advances in Methods and Practices in Psychological Science, 4(1).

  • Leucht, S., Hierl, S., Kissling, W., Dold, M., & Davis, J. M. (2015). Putting the efficacy of psychiatric and general medicine medication into perspective. British Journal of Psychiatry, 200(2), 97-106.

  • Luck, S. J. (2014). An Introduction to the Event-Related Potential Technique (2nd ed.). MIT Press.

  • Marek, S., Tervo-Clemmens, B., Calabro, F. J., et al. (2022). Reproducible brain-wide association studies require thousands of individuals. Nature, 603, 654-660.

  • Mills, K. L., & Tamnes, C. K. (2014). Methods and considerations for longitudinal structural brain imaging analysis across development. Developmental Cognitive Neuroscience, 9, 172-190.

  • Mumford, J. A., & Nichols, T. E. (2008). Power calculation for group fMRI studies accounting for arbitrary design and temporal autocorrelation. NeuroImage, 39(1), 261-268.

  • Poldrack, R. A., Baker, C. I., Durnez, J., et al. (2017). Scanning the horizon: Towards transparent and reproducible neuroimaging research. Nature Reviews Neuroscience, 18(2), 115-126.

See references/effect-sizes.md for the full effect size reference library and references/sample-size-guide.md for detailed sample size guidance by modality.

