fMRI GLM Analysis Guide
Purpose
The General Linear Model (GLM) is the standard statistical framework for task-based fMRI analysis. It models the observed BOLD time series as a linear combination of expected signal components (task regressors convolved with the hemodynamic response function) plus confound regressors plus noise (Poline & Brett, 2012; Poldrack et al., 2011, Ch. 4).
This skill encodes the domain-specific judgment needed to correctly specify a GLM for fMRI data. A competent programmer without neuroimaging training would get many of these decisions wrong -- choosing the wrong HRF model, setting an inappropriate high-pass filter cutoff, omitting critical confound regressors, or applying invalid statistical thresholds. Each decision described here requires understanding the biophysics of BOLD signal, the noise characteristics of fMRI data, and the statistical assumptions of the model.
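The core of the GLM is ordinary least squares on a design matrix. As a minimal sketch (names, shapes, and the simulated effect sizes below are illustrative, not tied to any specific package):

```python
import numpy as np

# Minimal GLM sketch: y = X @ beta + noise, solved by ordinary least squares.
rng = np.random.default_rng(0)
n_scans = 200

# Design matrix: one task regressor (boxcar) plus an intercept.
task = np.zeros(n_scans)
task[20:40] = 1.0
task[100:120] = 1.0
X = np.column_stack([task, np.ones(n_scans)])

# Simulate a voxel time series with a known task effect of 2.0 over a baseline of 10.0.
y = X @ np.array([2.0, 10.0]) + rng.normal(0, 0.5, n_scans)

# OLS estimate: beta_hat = (X'X)^-1 X'y (via lstsq for numerical stability).
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat)  # task effect near 2.0, baseline near 10.0
```

Everything in the rest of this guide — HRF convolution, filtering, confounds, prewhitening — is about constructing X and the noise model so that this fit is valid.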
When to Use This Skill
- Specifying a first-level (single-subject) GLM for task fMRI
- Choosing HRF models, confound regressors, and temporal filtering parameters
- Defining contrasts to test experimental hypotheses
- Setting up second-level (group) analyses
- Selecting appropriate multiple comparison correction methods
- Reviewing or troubleshooting an existing fMRI analysis pipeline
Research Planning Protocol
Before executing the domain-specific steps below, you MUST:
- State the research question — What specific brain-behavior hypothesis is this GLM testing?
- Justify the method choice — Why task fMRI with GLM? What alternatives (resting-state, EEG, behavior-only) were considered?
- Declare expected outcomes — Which regions/contrasts do you expect to show effects, and in what direction?
- Note assumptions and limitations — What does the GLM assume? Where could it mislead?
- Present the plan to the user and WAIT for confirmation before proceeding.
For detailed methodology guidance, see the research-literacy skill.
⚠️ Verification Notice
This skill was generated by AI from academic literature. All parameters, thresholds, and citations require independent verification before use in research. If you find errors, please open an issue.
First-Level Analysis Decision Tree
Step 1: HRF Model Selection
The hemodynamic response function (HRF) models the neurovascular coupling delay between neural activity and the measured BOLD signal. The canonical HRF peaks at approximately 5-6 seconds post-stimulus (Glover, 1999).
| HRF Model | When to Use | Trade-off |
|---|---|---|
| Canonical (double-gamma) | Default for most task designs when timing is well-established | Assumes fixed HRF shape; highest statistical power (1 parameter per condition) (Lindquist et al., 2009) |
| Canonical + temporal derivative | When peak latency may vary by ~1 s across conditions or regions | Captures timing shifts; 2 parameters per condition (Friston et al., 1998; Henson et al., 2002) |
| Canonical + temporal + dispersion derivatives | When both latency and width of HRF may vary | Maximum flexibility with basis functions; 3 parameters per condition, reduced power (Henson et al., 2002) |
| Finite Impulse Response (FIR) | When HRF shape is unknown or expected to deviate substantially from canonical form | No shape assumption; many parameters (one per time bin); requires many trials for stable estimation (Glover, 1999; Dale, 1999) |
Decision logic:
Is the HRF shape well-established for your task and population?
|
+-- YES --> Is timing precision critical to your hypothesis?
| |
| +-- YES --> Canonical + temporal derivative
| |
| +-- NO --> Canonical HRF (default)
|
+-- NO --> Do you have enough trials (>40 per condition) for stable estimation?
|
+-- YES --> FIR model (exploratory) or canonical + derivatives
|
+-- NO --> Canonical + temporal derivative (safest compromise)
Domain warning: When using derivative basis functions, the contrast for the main condition should weight only the canonical regressor (not the derivatives). An F-test across all basis functions tests whether any component differs from zero (Calhoun et al., 2004). See references/design-matrix-guide.md for details.
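The canonical double-gamma shape can be sketched directly. The shape parameters below (peak gamma with shape 6, undershoot with shape 16, undershoot ratio 1/6) are the commonly cited defaults; treat them as an illustration rather than a verified match to any specific software release:

```python
import numpy as np
from scipy.stats import gamma

def double_gamma_hrf(tr=1.0, duration=32.0):
    """Canonical double-gamma HRF sketch: a positive gamma response
    minus a scaled, later gamma undershoot, peak-normalized."""
    t = np.arange(0, duration, tr)
    peak = gamma.pdf(t, 6)          # main response, mode at t = 5 s
    undershoot = gamma.pdf(t, 16)   # post-stimulus undershoot, mode at t = 15 s
    hrf = peak - undershoot / 6.0
    return hrf / hrf.max()

hrf = double_gamma_hrf(tr=0.5)
t = np.arange(0, 32, 0.5)
print(t[np.argmax(hrf)])  # peaks near 5 s post-stimulus
```

Convolving each condition's onset train with this kernel (and optionally its temporal derivative) yields the task regressors of the design matrix.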
Step 2: High-Pass Filter Cutoff
Low-frequency drifts from scanner instability, subject physiology, and slow head motion must be removed. This is implemented as a discrete cosine transform (DCT) basis set added to the design matrix (Poldrack et al., 2011, Ch. 5).
- Default cutoff period: 128 seconds (0.0078 Hz) -- the SPM default, appropriate for most event-related and block designs (Poldrack et al., 2011, Ch. 5)
- Rule: The cutoff period must be at least 2x the longest interval between events of the same condition type (Poldrack et al., 2011, Ch. 5)
- Block designs: For a block design with 30-second blocks alternating between two conditions, the full cycle is 60 seconds. A 128-second cutoff safely preserves this signal
- Slow event-related designs: If trials of the same condition occur every 90 seconds, a 128-second cutoff is too aggressive. Use at least 180 seconds (2 x 90)
Domain warning: Setting the cutoff frequency too low (period too long) leaves drift in the data, inflating noise estimates. Setting it too high (period too short) attenuates your experimental signal. Always verify that your design's fundamental frequency is preserved by the filter.
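A DCT drift basis is straightforward to construct. The regressor count below follows the usual rule of roughly 2 × scan duration / cutoff period; exact conventions vary by package, so treat this as a sketch:

```python
import numpy as np

def dct_highpass_basis(n_scans, tr, cutoff_s=128.0):
    """SPM-style discrete cosine drift basis (sketch). Each column is a
    slow cosine; adding these to the design matrix removes fluctuations
    with periods longer than cutoff_s."""
    duration = n_scans * tr
    n_basis = int(np.floor(2.0 * duration / cutoff_s))
    t = np.arange(n_scans)
    basis = [np.cos(np.pi * k * (2 * t + 1) / (2 * n_scans))
             for k in range(1, n_basis + 1)]
    return np.column_stack(basis) if basis else np.zeros((n_scans, 0))

# 300 scans at TR = 2 s -> a 600 s run; a 128 s cutoff yields 9 drift columns.
drift = dct_highpass_basis(300, 2.0, 128.0)
print(drift.shape)  # (300, 9)
```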
Step 3: Confound Regression
Confound regressors model variance from non-neural sources. Omitting them inflates false positive rates; including too many reduces statistical power.
Motion Parameters
Head motion is the single largest source of structured artifact in fMRI (Power et al., 2012).
| Model | Parameters | When to Use | Source |
|---|---|---|---|
| Standard 6-parameter | 3 translation + 3 rotation | Minimum acceptable model | Friston et al., 1996 |
| 24-parameter (Friston) | 6 current + 6 prior timepoint + 12 squared terms | Recommended default for task fMRI | Friston et al., 1996 |
| 6-parameter + derivatives | 6 current + 6 temporal derivatives | Intermediate model | Satterthwaite et al., 2013 |
Domain insight: The 24-parameter model (Friston et al., 1996) captures both linear and nonlinear motion effects, including spin-history artifacts from previous-timepoint head positions. The squared terms model the nonlinear relationship between motion and BOLD signal changes.
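The 24-parameter expansion is a mechanical transform of the 6 realignment parameters. In this sketch, padding the one-scan-back terms with a zero row is one common convention:

```python
import numpy as np

def friston24(motion):
    """Expand 6 rigid-body parameters (n_scans x 6) to the 24-parameter
    Friston model: current params, one-scan-back params, and the squares
    of both (sketch)."""
    shifted = np.vstack([np.zeros((1, motion.shape[1])), motion[:-1]])
    block = np.hstack([motion, shifted])          # 12 linear terms
    return np.hstack([block, block ** 2])         # + 12 squared terms

rng = np.random.default_rng(1)
mp = rng.normal(0, 0.1, size=(180, 6))            # stand-in realignment output
conf = friston24(mp)
print(conf.shape)  # (180, 24)
```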
Framewise Displacement and Spike Regressors
- Calculate framewise displacement (FD) as the sum of absolute values of the six motion derivatives, with rotations converted to mm on a 50-mm radius sphere (Power et al., 2012)
- Flag volumes with FD > 0.5 mm as motion spikes (Power et al., 2012); for more stringent control, use FD > 0.2 mm (Power et al., 2014)
- Model each spike as a separate indicator regressor (one column per flagged volume). For ordinary least squares estimation this is equivalent to censoring/scrubbing the flagged volumes, while preserving the temporal structure of the design matrix
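The FD calculation and spike-regressor construction above can be sketched as follows (assuming motion columns 0-2 are translations in mm and 3-5 are rotations in radians):

```python
import numpy as np

def framewise_displacement(motion, radius=50.0):
    """FD per Power et al. (2012): sum of absolute backward differences
    of the six parameters, rotations converted to mm as arc length on a
    sphere of the given radius."""
    d = np.diff(motion, axis=0)
    d[:, 3:] *= radius                      # angle (rad) * radius (mm)
    return np.concatenate([[0.0], np.abs(d).sum(axis=1)])

def spike_regressors(fd, threshold=0.5):
    """One indicator column per volume exceeding the FD threshold."""
    flagged = np.flatnonzero(fd > threshold)
    spikes = np.zeros((fd.size, flagged.size))
    spikes[flagged, np.arange(flagged.size)] = 1.0
    return spikes

motion = np.zeros((100, 6))
motion[40, 0] = 0.8                          # a single 0.8 mm translation jump
fd = framewise_displacement(motion)
print(spike_regressors(fd).shape)            # (100, 2): the jump in and out both flag
```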
Physiological Noise: CompCor
When physiological recordings (pulse, respiration) are unavailable:
- aCompCor (anatomical CompCor): Extract top 5-6 principal components from white matter and CSF masks (Behzadi et al., 2007; Muschelli et al., 2014)
- tCompCor (temporal CompCor): Extract components from high-temporal-variance voxels (Behzadi et al., 2007)
- aCompCor is generally preferred because it uses anatomically defined noise ROIs and is less susceptible to removing neural signal (Ciric et al., 2017)
Domain warning -- global signal regression: Regressing out the global mean signal is controversial. It improves motion artifact removal but introduces artifactual anticorrelations in functional connectivity analyses (Murphy & Fox, 2017). For task-based GLM, global signal regression is generally not recommended unless specifically justified.
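At its core, aCompCor is a PCA of time series from anatomically defined noise masks. A minimal sketch via SVD (variance-normalizing each voxel first is one common choice; real pipelines also detrend and apply the same temporal filtering as the main model):

```python
import numpy as np

def acompcor(noise_ts, n_components=5):
    """aCompCor sketch: top principal component time courses of a
    noise-ROI data matrix (n_scans x n_voxels), e.g. WM/CSF voxels."""
    X = noise_ts - noise_ts.mean(axis=0)
    sd = X.std(axis=0)
    X = X / np.where(sd > 0, sd, 1.0)
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    return U[:, :n_components]   # orthonormal component time courses

rng = np.random.default_rng(2)
wm_csf = rng.normal(size=(200, 500))   # stand-in for WM/CSF voxel time series
comps = acompcor(wm_csf, n_components=5)
print(comps.shape)  # (200, 5): five confound columns for the design matrix
```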
Step 4: Serial Autocorrelation Correction
fMRI time series exhibit temporal autocorrelation due to hemodynamic smoothing and physiological noise. Ignoring this inflates t-statistics and false positive rates (Woolrich et al., 2001).
| Method | Implementation | Software |
|---|---|---|
| AR(1) prewhitening | Models autocorrelation as first-order autoregressive process | SPM (default), Nilearn |
| ARMA(1,1) | Autoregressive moving-average; more flexible | AFNI (3dREMLfit) |
| Tukey taper prewhitening | Nonparametric spectral smoothing of autocorrelation | FSL FILM (Woolrich et al., 2001) |
Domain insight: Recent work has shown that AR(1) may be insufficient for modern multiband acquisitions with short TRs (< 1 s), where higher-order autocorrelation structure is present (Olszowy et al., 2019). For short-TR data, consider ARMA(1,1) or FSL's FILM approach.
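The idea behind AR(1) prewhitening can be shown in a few lines. This is a bare Cochrane-Orcutt-style sketch — estimate rho from OLS residuals, transform both sides, refit; real packages pool and regularize the autocorrelation estimate across voxels:

```python
import numpy as np

def ar1_prewhiten(y, X):
    """AR(1) prewhitening sketch: estimate rho as the lag-1
    autocorrelation of OLS residuals, then apply the transform
    y_t - rho * y_{t-1} to data and design (first scan dropped)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    rho = (r[:-1] @ r[1:]) / (r @ r)
    yw = y[1:] - rho * y[:-1]
    Xw = X[1:] - rho * X[:-1]
    return yw, Xw, rho

rng = np.random.default_rng(3)
n = 500
e = np.zeros(n)
for i in range(1, n):                 # simulate AR(1) noise with rho = 0.4
    e[i] = 0.4 * e[i - 1] + rng.normal()
X = np.ones((n, 1))                   # intercept-only design
yw, Xw, rho = ar1_prewhiten(e + 1.0, X)
print(rho)                            # should recover a value close to 0.4
```

OLS on (yw, Xw) then yields t-statistics with approximately correct null distributions, which raw OLS on (y, X) would not.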
Step 5: Construct and Verify the Design Matrix
Before fitting the model:
- Visualize the design matrix -- Check for empty or near-empty columns, which indicate conditions with no events or events lost to the high-pass filter
- Check condition-regressor correlation -- Correlation > 0.5 between task regressors indicates collinearity, reducing power and making contrasts uninterpretable (Mumford et al., 2015)
- Verify the high-pass filter preserves signal -- Overlay the filter cutoff on the power spectrum of each task regressor
- Inspect the variance inflation factor (VIF) -- VIF > 5 for any regressor indicates problematic multicollinearity (Poldrack et al., 2011, Ch. 4)
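The VIF check can be computed from the inverse correlation matrix of the regressors. A sketch, assuming X contains only the (mean-centerable) task and confound columns, not the intercept:

```python
import numpy as np

def vif(X):
    """Variance inflation factor per column: the diagonal of the inverse
    of the columns' correlation matrix. VIF_j = 1 / (1 - R^2_j), where
    R^2_j comes from regressing column j on all the others."""
    R = np.corrcoef(X - X.mean(axis=0), rowvar=False)
    return np.diag(np.linalg.inv(R))

rng = np.random.default_rng(4)
a = rng.normal(size=300)
b = rng.normal(size=300)
c = a + 0.1 * rng.normal(size=300)   # nearly collinear with a
X = np.column_stack([a, b, c])
print(vif(X))  # VIFs for a and c exceed 5; b stays near 1
```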
See references/design-matrix-guide.md for detailed guidance on design matrix construction.
Contrast Specification
Contrasts define the specific hypotheses you test within the fitted GLM.
t-Contrasts (Directional)
A t-contrast is a single row vector of weights applied to the parameter estimates. It tests a directional hypothesis.
| Contrast Type | Weight Vector Example | Tests |
|---|---|---|
| Activation vs. baseline | [1 0 0 ...] | Is condition A > 0? |
| Condition difference | [1 -1 0 ...] | Is condition A > condition B? |
| Linear trend | [-1 0 1 ...] | Does activation increase linearly across 3 levels? |
| Interaction (2x2) | [1 -1 -1 1 ...] | Does the difference A1-A2 differ from B1-B2? |
Rules for valid t-contrasts (Poline & Brett, 2012):
- Weights must sum to zero for difference contrasts
- Only task regressors should receive non-zero weights (never confound regressors)
- When using derivative basis functions, contrast only the canonical regressor
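A t-contrast reduces to the textbook formula t = c'beta / sqrt(sigma2 * c'(X'X)^-1 c). The sketch below applies it to an OLS fit; in practice it should run on prewhitened data and design (Step 4):

```python
import numpy as np

def t_contrast(X, y, c):
    """t-statistic for contrast vector c on an OLS fit (sketch)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    df = X.shape[0] - np.linalg.matrix_rank(X)
    resid = y - X @ beta
    sigma2 = resid @ resid / df
    xtx_inv = np.linalg.pinv(X.T @ X)
    return (c @ beta) / np.sqrt(sigma2 * c @ xtx_inv @ c)

rng = np.random.default_rng(5)
n = 200
cond_a = (np.arange(n) % 40 < 10).astype(float)          # interleaved blocks
cond_b = ((np.arange(n) + 20) % 40 < 10).astype(float)
X = np.column_stack([cond_a, cond_b, np.ones(n)])
y = X @ np.array([3.0, 1.0, 10.0]) + rng.normal(0, 1.0, n)

# [1, -1, 0]: A > B, with zero weight on the intercept (a confound column).
print(t_contrast(X, y, np.array([1.0, -1.0, 0.0])))      # large positive t
```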
F-Contrasts (Omnibus)
An F-contrast tests whether any of several effects are non-zero. It is specified as a matrix (multiple rows).
| Use Case | When to Use |
|---|---|
| Main effect of factor | Testing whether any level of a factor differs from any other |
| HRF model with derivatives | Testing whether the canonical + derivative basis set captures any response |
| Any-difference test | Testing whether any condition differs from baseline |
Domain insight: An F-test for the full basis set (canonical + derivatives) tests whether there is any evoked response, regardless of its exact timing or shape. This is more sensitive than a t-test on the canonical regressor alone when the true HRF deviates from canonical form (Calhoun et al., 2004).
Second-Level Analysis (Group Inference)
First-level contrast maps (one per subject) serve as input to the group model.
Fixed Effects vs. Mixed Effects
| Approach | Models | Generalizability | When to Use | Source |
|---|---|---|---|---|
| Fixed effects (FFX) | Within-subject variance only | Only to the scanned subjects | Multiple runs within one subject | Poldrack et al., 2011, Ch. 8 |
| Mixed effects (MFX) | Within- + between-subject variance | To the population | Group inference across subjects | Mumford & Nichols, 2009 |
| OLS (summary statistics) | Between-subject variance only | To the population (if homogeneity holds) | Standard group analysis; valid and near-optimal under moderate variance heterogeneity | Mumford & Nichols, 2009 |
Decision logic:
Are you combining runs within one subject?
|
+-- YES --> Fixed effects (concatenation or run-by-run with FFX)
|
+-- NO --> Are you making group-level inferences?
|
+-- YES --> Mixed effects (FLAME in FSL) or OLS summary statistics
OLS is valid and near-optimal for balanced designs
(Mumford & Nichols, 2009)
Domain warning: Using a fixed-effects analysis for group inference treats between-subject variability as zero, dramatically inflating false positive rates. Results would apply only to the specific subjects scanned, not to the population (Friston et al., 2005; Mumford & Nichols, 2009).
Common Second-Level Designs
- One-sample t-test: Design matrix = intercept only (column of ones). Tests group mean activation against zero
- Two-sample t-test: Compares two groups (e.g., patients vs. controls). Include group-mean confounds (age, motion) as covariates
- Paired t-test: Within-subject comparison of two conditions at the group level
- ANOVA / Flexible factorial: Multiple factors; requires careful specification of variance components
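For the one-sample case, the summary-statistics approach at a single voxel reduces to a standard one-sample t-test on the per-subject contrast values (the intercept-only design column of ones makes the GLM and the t-test identical):

```python
import numpy as np
from scipy import stats

# One first-level contrast value per subject at a given voxel; simulated
# here with a true group mean of 0.5 (illustrative values).
rng = np.random.default_rng(6)
con_values = rng.normal(loc=0.5, scale=1.0, size=25)   # 25 subjects

t, p = stats.ttest_1samp(con_values, popmean=0.0)
print(t, p)
```

Whole-brain group analysis applies this (or its mixed-effects refinement) at every voxel, which is what makes the multiple comparison correction in the next section necessary.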
Multiple Comparison Correction
With approximately 100,000 voxels tested simultaneously, correction for multiple comparisons is essential. See references/statistical-inference.md for detailed guidance.
Quick Reference
| Method | Controls | Recommended Threshold | When to Use |
|---|---|---|---|
| Voxelwise FWE (RFT) | Family-wise error | p < 0.05 FWE | Highly localized effects expected (Worsley et al., 1996) |
| FDR (Benjamini-Hochberg) | False discovery rate | q < 0.05 | Distributed effects; moderate correction (Genovese et al., 2002) |
| Cluster-based (RFT) | Cluster-level FWE | CDT p < 0.001, then cluster p < 0.05 FWE | Standard approach; use CDT of p < 0.001 (Eklund et al., 2016) |
| TFCE | Voxelwise FWE via permutation | p < 0.05 FWE-corrected | No arbitrary CDT; good sensitivity (Smith & Nichols, 2009) |
| Permutation testing | FWE (nonparametric) | p < 0.05 FWE | Gold standard; no distributional assumptions (Nichols & Holmes, 2002) |
Critical domain knowledge: Cluster-based inference with a cluster-defining threshold (CDT) of p < 0.01 produces inflated false positive rates (up to 70% instead of the nominal 5%). Always use CDT of p < 0.001 or stricter (Eklund et al., 2016). For permutation tests, use at least 5,000-10,000 permutations for publication-quality results (Nichols & Holmes, 2002).
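Of the methods above, the Benjamini-Hochberg FDR step-up procedure is simple enough to sketch directly (this is the standard version for independent or positively dependent tests):

```python
import numpy as np

def fdr_bh(p, q=0.05):
    """Benjamini-Hochberg FDR (sketch): find the largest k with
    p_(k) <= (k/m) * q over the sorted p-values, then reject every
    p-value at or below p_(k). Returns a boolean mask over the input."""
    p = np.asarray(p)
    m = p.size
    order = np.argsort(p)
    below = p[order] <= (np.arange(1, m + 1) / m) * q
    mask = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.flatnonzero(below))
        mask[order[: k + 1]] = True
    return mask

pvals = np.array([0.001, 0.008, 0.039, 0.041, 0.2, 0.6])
print(fdr_bh(pvals, q=0.05))  # only the first two survive at q < 0.05
```

For FWE via random field theory, TFCE, or permutation testing, use the established implementations (SPM, FSL randomise, etc.) rather than hand-rolled code.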
Common Pitfalls
- Double-dipping / circular analysis: Selecting voxels or ROIs based on the effect of interest, then testing that same effect in the selected data. This inflates effect sizes and produces invalid statistics (Kriegeskorte et al., 2009). Use independent data for selection and testing, or use a priori ROIs
- Collinear regressors: Highly correlated task regressors make individual contrasts uninterpretable. Check the correlation matrix of the design matrix; correlations > 0.5 between task regressors are problematic (Mumford et al., 2015)
- HRF misspecification: Using the canonical HRF when the true response deviates substantially (e.g., in pediatric or clinical populations) produces biased estimates and reduced power (Lindquist et al., 2009). Consider adding temporal derivatives or using FIR
- High-pass filter too aggressive: If the filter cutoff period is shorter than the longest inter-event interval, the task signal is attenuated. Always verify cutoff against your experimental timing
- Ignoring temporal autocorrelation: Using ordinary least squares (OLS) without prewhitening inflates t-statistics and false positive rates (Woolrich et al., 2001)
- Omitting motion regressors: Motion artifacts are the largest source of false positives in fMRI. At minimum, include 6 rigid-body parameters; the 24-parameter model is preferred (Power et al., 2012; Friston et al., 1996)
- Using uncorrected thresholds for inference: p < 0.001 uncorrected is insufficient for publication. Always apply FWE, FDR, or permutation-based correction (Eklund et al., 2016)
- Fixed-effects for group inference: Treats between-subject variance as zero; results do not generalize beyond the scanned sample (Mumford & Nichols, 2009)
Minimum Reporting Checklist
Based on the OHBM COBIDAS guidelines (Nichols et al., 2017) and Poldrack et al. (2008):
- Software package and version used for GLM estimation
- HRF model (canonical, derivatives, FIR) and any assumed parameters
- High-pass filter cutoff (in seconds or Hz) and implementation (DCT, Butterworth)
- Motion regressors included (6-parameter, 24-parameter, spike regressors)
- Additional confound regressors (CompCor components, global signal, etc.)
- Serial autocorrelation model (AR(1), ARMA(1,1), prewhitening approach)
- Contrast specification (weight vectors for each contrast tested)
- Second-level model type (fixed effects, mixed effects, OLS) and covariates
- Multiple comparison correction method, thresholds, and software
- For cluster-based inference: cluster-defining threshold and cluster-level threshold
- Smoothing kernel FWHM applied before group analysis (typically 6-8 mm for standard 2-mm voxels; Poldrack et al., 2011, Ch. 5)
References
- Behzadi, Y., Restom, K., Liau, J., & Liu, T. T. (2007). A component based noise correction method (CompCor) for BOLD and perfusion based fMRI. NeuroImage, 37(1), 90-101.
- Calhoun, V. D., Stevens, M. C., Pearlson, G. D., & Kiehl, K. A. (2004). fMRI analysis with the general linear model: Removal of latency-induced amplitude bias by incorporation of hemodynamic derivative terms. NeuroImage, 22(1), 252-257.
- Ciric, R., Wolf, D. H., Power, J. D., et al. (2017). Benchmarking of participant-level confound regression strategies for the control of motion artifact in studies of functional connectivity. NeuroImage, 154, 174-187.
- Dale, A. M. (1999). Optimal experimental design for event-related fMRI. Human Brain Mapping, 8(2-3), 109-114.
- Eklund, A., Nichols, T. E., & Knutsson, H. (2016). Cluster failure: Why fMRI inferences for spatial extent have inflated false-positive rates. PNAS, 113(28), 7900-7905.
- Friston, K. J., Fletcher, P., Josephs, O., Holmes, A., Rugg, M. D., & Turner, R. (1998). Event-related fMRI: Characterizing differential responses. NeuroImage, 7(1), 30-40.
- Friston, K. J., Stephan, K. E., Lund, T. E., Morcom, A., & Kiebel, S. (2005). Mixed-effects and fMRI studies. NeuroImage, 24(1), 244-252.
- Friston, K. J., Williams, S., Howard, R., Frackowiak, R. S. J., & Turner, R. (1996). Movement-related effects in fMRI time-series. Magnetic Resonance in Medicine, 35(3), 346-355.
- Genovese, C. R., Lazar, N. A., & Nichols, T. (2002). Thresholding of statistical maps in functional neuroimaging using the false discovery rate. NeuroImage, 15(4), 870-878.
- Glover, G. H. (1999). Deconvolution of impulse response in event-related BOLD fMRI. NeuroImage, 9(4), 416-429.
- Henson, R. N. A., Price, C. J., Rugg, M. D., Turner, R., & Friston, K. J. (2002). Detecting latency differences in event-related BOLD responses. NeuroImage, 15(1), 83-97.
- Kriegeskorte, N., Simmons, W. K., Bellgowan, P. S., & Baker, C. I. (2009). Circular analysis in systems neuroscience: The dangers of double dipping. Nature Neuroscience, 12(5), 535-540.
- Lindquist, M. A. (2008). The statistical analysis of fMRI data. Statistical Science, 23(4), 439-464.
- Lindquist, M. A., Loh, J. M., Atlas, L. Y., & Wager, T. D. (2009). Modeling the hemodynamic response function in fMRI: Efficiency, bias and mis-modeling. NeuroImage, 45(1 Suppl), S187-S198.
- Mumford, J. A., & Nichols, T. E. (2009). Simple group fMRI modeling and inference. NeuroImage, 47(4), 1469-1475.
- Mumford, J. A., Poline, J. B., & Poldrack, R. A. (2015). Orthogonalization of regressors in fMRI models. PLoS ONE, 10(4), e0126255.
- Murphy, K., & Fox, M. D. (2017). Towards a consensus regarding global signal regression for resting state functional connectivity MRI. NeuroImage, 154, 169-173.
- Muschelli, J., Nebel, M. B., Caffo, B. S., et al. (2014). Reduction of motion-related artifacts in resting state fMRI using aCompCor. NeuroImage, 96, 22-35.
- Nichols, T. E., Das, S., Eickhoff, S. B., et al. (2017). Best practices in data analysis and sharing in neuroimaging using MRI (COBIDAS). Nature Neuroscience, 20(3), 299-303.
- Nichols, T. E., & Holmes, A. P. (2002). Nonparametric permutation tests for functional neuroimaging: A primer with examples. Human Brain Mapping, 15(1), 1-25.
- Olszowy, W., Aston, J., Rua, C., & Williams, G. B. (2019). Accurate autocorrelation modeling substantially improves fMRI reliability. Nature Communications, 10, 1220.
- Poldrack, R. A., Fletcher, P. C., Henson, R. N., Worsley, K. J., Brett, M., & Nichols, T. E. (2008). Guidelines for reporting an fMRI study. NeuroImage, 40(2), 409-414.
- Poldrack, R. A., Mumford, J. A., & Nichols, T. E. (2011). Handbook of Functional MRI Data Analysis. Cambridge University Press.
- Poline, J. B., & Brett, M. (2012). The general linear model and fMRI: Does love last forever? NeuroImage, 62(2), 871-880.
- Power, J. D., Barnes, K. A., Snyder, A. Z., Schlaggar, B. L., & Petersen, S. E. (2012). Spurious but systematic correlations in functional connectivity MRI networks arise from subject motion. NeuroImage, 59(3), 2142-2154.
- Power, J. D., Mitra, A., Laumann, T. O., Snyder, A. Z., Schlaggar, B. L., & Petersen, S. E. (2014). Methods to detect, characterize, and remove motion artifact in resting state fMRI. NeuroImage, 84, 320-341.
- Satterthwaite, T. D., Elliott, M. A., Gerraty, R. T., et al. (2013). An improved framework for confound regression and filtering for control of motion artifact in the preprocessing of resting-state functional connectivity data. NeuroImage, 64, 240-256.
- Smith, S. M., & Nichols, T. E. (2009). Threshold-free cluster enhancement: Addressing problems of smoothing, threshold dependence and localisation in cluster inference. NeuroImage, 44(1), 83-98.
- Woolrich, M. W., Ripley, B. D., Brady, M., & Smith, S. M. (2001). Temporal autocorrelation in univariate linear modeling of FMRI data. NeuroImage, 14(6), 1370-1386.
- Worsley, K. J., Marrett, S., Neelin, P., Vandal, A. C., Friston, K. J., & Evans, A. C. (1996). A unified statistical approach for determining significant signals in images of cerebral activation. Human Brain Mapping, 4(1), 58-73.
See references/ for detailed design matrix construction guide and statistical inference methods.