Exploring predictive and reproducible modeling with the single-subject FIAC dataset

Xu Chen¹, Francisco Pereira, Wayne Lee, Stephen Strother, Tom Mitchell

PMID: 16565951
PMCID: PMC6871419
DOI: 10.1002/hbm.20243

Hum Brain Mapp. 2006 May;27(5):452-61.

Affiliation

¹ Rotman Research Institute, Baycrest, Toronto, Ontario, Canada. xchen@rotman-baycrest.on.ca

Abstract

Predictive modeling of functional magnetic resonance imaging (fMRI) has the potential to expand the amount of information extracted and to enhance our understanding of brain systems by predicting brain states, rather than emphasizing the standard spatial mapping. Based on the block datasets of Functional Imaging Analysis Contest (FIAC) Subject 3, we demonstrate the potential and pitfalls of predictive modeling in fMRI analysis by investigating the performance of five models (linear discriminant analysis, logistic regression, linear support vector machine, Gaussian naive Bayes, and a variant) as a function of preprocessing steps and feature selection methods. We found that: (1) independent of the model, temporal detrending and feature selection assisted in building a more accurate predictive model; (2) the linear support vector machine and logistic regression often performed better than either of the Gaussian naive Bayes models in terms of the optimal prediction accuracy; and (3) the optimal prediction accuracy obtained in a feature space using principal components was typically lower than that obtained in a voxel space, given the same model and same preprocessing. We show that due to the existence of artifacts from different sources, high prediction accuracy alone does not guarantee that a classifier is learning a pattern of brain activity that might be usefully visualized, although cross-validation methods do provide fairly unbiased estimates of true prediction accuracy. The trade-off between the prediction accuracy and the reproducibility of the spatial pattern should be carefully considered in predictive modeling of fMRI. We suggest that unless the experimental goal is brain-state classification of new scans on well-defined spatial features, prediction alone should not be used as an optimization procedure in fMRI data analysis.
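The role of feature selection and cross-validation described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are synthetic, the variance-ratio voxel ranking is a stand-in for the paper's selection methods, and all function names are hypothetical. The point it shows is that features are chosen from the training run only, so the accuracy on the held-out run is not inflated by selection.

```python
import numpy as np

def fit_gnb(X, y):
    """Fit Gaussian naive Bayes: per-class mean and variance for each feature."""
    classes = np.unique(y)
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    variances = np.array([X[y == c].var(axis=0) for c in classes]) + 1e-6
    priors = np.array([np.mean(y == c) for c in classes])
    return classes, means, variances, priors

def predict_gnb(model, X):
    """Assign each scan to the class with the highest log posterior."""
    classes, means, variances, priors = model
    diff = X[:, None, :] - means[None, :, :]          # (scans, classes, features)
    log_lik = -0.5 * np.sum(
        np.log(2 * np.pi * variances)[None] + diff ** 2 / variances[None], axis=2
    )
    return classes[np.argmax(log_lik + np.log(priors)[None, :], axis=1)]

def select_features(X, y, k):
    """Rank voxels by between-class / within-class variance (assumed criterion)."""
    classes = np.unique(y)
    class_means = np.array([X[y == c].mean(axis=0) for c in classes])
    between = class_means.var(axis=0)
    within = np.mean([X[y == c].var(axis=0) for c in classes], axis=0) + 1e-6
    return np.argsort(between / within)[-k:]

def two_fold_accuracy(runs, labels, k):
    """Train on one run, test on the other, then swap; average the two accuracies."""
    accs = []
    for train, test in [(0, 1), (1, 0)]:
        idx = select_features(runs[train], labels[train], k)  # training run only
        model = fit_gnb(runs[train][:, idx], labels[train])
        pred = predict_gnb(model, runs[test][:, idx])
        accs.append(np.mean(pred == labels[test]))
    return float(np.mean(accs))

# Synthetic stand-in for two runs with four conditions (not FIAC data).
rng = np.random.default_rng(0)
n_scans, n_voxels, n_informative = 80, 500, 20

def make_run():
    y = np.repeat(np.arange(4), n_scans // 4)
    X = rng.normal(size=(n_scans, n_voxels))
    X[:, :n_informative] += y[:, None]                # condition-dependent signal
    return X, y

(X1, y1), (X2, y2) = make_run(), make_run()
accuracy = two_fold_accuracy([X1, X2], [y1, y2], k=20)
print(f"two-fold cross-validated accuracy: {accuracy:.2f}")
```

A high accuracy from such a procedure says only that the selected voxels discriminate the conditions; as the abstract cautions, it does not by itself show that the selected pattern is reproducible or free of artifacts.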

Figures

**Figure 1**
The first‐dimension results of 4‐class LDA analysis on the whole brain of Subject 3. a: Axial slices 10–13 of the Z‐score rSPI (see the Resampling Framework and Cross‐Validation section). b: Plot of canonical variates score (CVS) as a function of the condition. The data were preprocessed by 2D smoothing and detrending with a 1‐cycle cosine‐basis‐function cutoff. The LDA was performed on the first 10 principal components (PCs) of each run. In the CVS plot of each dimension, “%” is the percentage of total variance accounted for, “e” is the canonical eigenvalue, and “cc” is the canonical correlation coefficient (image right = brain left).
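As a rough illustration of the quantities named in this legend, the sketch below runs an LDA on the first principal components of synthetic data and reports canonical variate scores, eigenvalues, and canonical correlations. It assumes that "e" is an eigenvalue of W⁻¹B (within-class and between-class scatter), in which case the canonical correlation is sqrt(e / (1 + e)); the paper's exact definitions may differ, and the function name `lda_on_pcs` is hypothetical.

```python
import numpy as np

def lda_on_pcs(X, y, n_pcs):
    """LDA in a reduced PC space; returns CV scores, eigenvalues, canonical correlations."""
    # Project centered data onto the first n_pcs principal components via SVD.
    Xc = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_pcs] * s[:n_pcs]                 # (n_scans, n_pcs)

    classes = np.unique(y)
    grand_mean = scores.mean(axis=0)
    W = np.zeros((n_pcs, n_pcs))                      # within-class scatter
    B = np.zeros((n_pcs, n_pcs))                      # between-class scatter
    for c in classes:
        Sc = scores[y == c]
        mc = Sc.mean(axis=0)
        W += (Sc - mc).T @ (Sc - mc)
        d = (mc - grand_mean)[:, None]
        B += len(Sc) * (d @ d.T)

    # Eigen-decomposition of W^{-1} B; at most (n_classes - 1) nonzero eigenvalues.
    evals, evecs = np.linalg.eig(np.linalg.solve(W, B))
    evals, evecs = evals.real, evecs.real
    order = np.argsort(evals)[::-1][: len(classes) - 1]
    e = np.clip(evals[order], 0, None)
    cvs = scores @ evecs[:, order]                    # canonical variate scores
    cc = np.sqrt(e / (1 + e))                         # assumed e-to-cc relation
    return cvs, e, cc

# Synthetic stand-in: 80 scans, 200 voxels, 4 conditions (not FIAC data).
rng = np.random.default_rng(1)
y = np.repeat(np.arange(4), 20)
X = rng.normal(size=(80, 200))
X[:, :10] += 2.0 * y[:, None]

cvs, e, cc = lda_on_pcs(X, y, n_pcs=10)
print("eigenvalues:", np.round(e, 3))
print("canonical correlations:", np.round(cc, 3))
```

Plotting each column of `cvs` against the condition label gives a plot of the same kind as panel b.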

**Figure 2**
The spatial patterns corresponding to the SVM analysis of the detrended data with features selected in either voxel space (a) or PC space (b) (image right = brain left). In voxel space, 200 voxels were selected by the intensity level method (ILFS) for each run. Panel a highlights the selected voxels in slices 14–18 when run1 was used as a training set. The overlap voxels (those that were also selected when run2 was used as a training set) are highlighted in yellow, the others in red. In PC space, 10 PCs were selected by nested cross‐validation for each run and then passed to the SVM model. The resultant Z‐score rSPI is shown in panel b. See Figure 1a for the color scale of Figure 2b.

**Figure 3**
The first three dimensions of 4‐class LDA analysis on the masked brain (lower 13 slices removed to avoid artifacts). The masked brain was smoothed and detrended with a 2‐cycle cosine‐basis‐function cutoff. Five principal components were passed to the LDA. Selected slices (14–18) of different dimensional rSPIs are shown in panel a (row A: 1st dimension; row B: 2nd dimension; row C: 3rd dimension) (image right = brain left). The dotted black circles in panel a indicate the regions whose peak Z‐score locations are reported in Table III. Corresponding plots of canonical variates score (CVS) as a function of the condition are shown in panel b (from left to right). The “%,” “e,” and “cc” headings on the CVS plots are defined in the legend of Figure 1.
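Detrending with a cosine-basis cutoff, as mentioned in this legend, can be sketched as a least-squares regression of each voxel time series on low-frequency cosines. The basis below is an assumption: DCT-II style functions in half-cycle steps up to `n_cycles` full cycles per run, plus a constant term. The paper's exact basis and cutoff convention are not given here, and both function names are hypothetical.

```python
import numpy as np

def cosine_basis(n_scans, n_cycles):
    """Constant column plus cosines with 0.5, 1.0, ..., n_cycles cycles over the run."""
    t = np.arange(n_scans)
    n_funcs = int(2 * n_cycles)                       # half-cycle steps (assumed)
    cols = [np.cos(np.pi * k * (2 * t + 1) / (2 * n_scans))
            for k in range(1, n_funcs + 1)]
    return np.column_stack([np.ones(n_scans)] + cols)

def detrend(data, n_cycles):
    """Remove the least-squares fit of the cosine basis from every voxel time series."""
    basis = cosine_basis(data.shape[0], n_cycles)
    beta, *_ = np.linalg.lstsq(basis, data, rcond=None)
    return data - basis @ beta

# Synthetic stand-in: 100 scans, 50 voxels, slow linear drift plus noise.
rng = np.random.default_rng(2)
n_scans, n_voxels = 100, 50
drift = np.linspace(0.0, 5.0, n_scans)[:, None]
data = drift + rng.normal(size=(n_scans, n_voxels))

detrended = detrend(data, n_cycles=2)
print("variance before:", round(float(data.var()), 3))
print("variance after: ", round(float(detrended.var()), 3))
```

Because the basis includes a constant column, this sketch also removes each voxel's mean. A higher cutoff removes more slow drift but risks removing task-related signal if the block design has power at those frequencies.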

**Figure 4**
The spatial patterns for the GLM using the Gamma HRF model. a: Axial slices 12–16 of the Z‐score rSPI for the overall effect (four conditions vs. baseline). b: Axial slices 12–16 of the t‐statistic SPI for the main sentence effect with concatenated runs (row A) and the t‐statistic SPI for (Condition4–Condition1) with run1 (row B) (image right = brain left).
