Neuroimage. 2021 Aug 1;236:118044.
doi: 10.1016/j.neuroimage.2021.118044. Epub 2021 Apr 10.

Resample aggregating improves the generalizability of connectome predictive modeling

David O'Connor et al. Neuroimage.

Abstract

It is a longstanding goal of neuroimaging to produce reliable, generalizable models of brain–behavior relationships. More recently, data-driven predictive models have become popular. However, overfitting is a common problem with statistical models and impedes model generalization. Cross-validation (CV) is often used to estimate expected model performance within-sample. Yet the best way to generate brain–behavior models, and to apply them out-of-sample to an unseen dataset, remains unclear. As a solution, this study proposes an ensemble learning method, in this case resample aggregating, that encompasses both model parameter estimation and feature selection. Here we investigate the use of resample-aggregated models to estimate fluid intelligence (fIQ) from fMRI-based functional connectivity (FC) data. We take advantage of two large, openly available datasets: the Human Connectome Project (HCP) and the Philadelphia Neurodevelopmental Cohort (PNC). We generate aggregated and non-aggregated models of fIQ in the HCP using the Connectome Predictive Modeling (CPM) framework. Over various train–test splits, these models are evaluated within-sample, on left-out HCP data, and out-of-sample, on PNC data. We find that a resample-aggregated model performs best both within- and out-of-sample. We also find that feature selection can vary substantially within-sample. More robust feature selection methods, as detailed here, are needed to improve the cross-sample performance of CPM-based brain–behavior models.
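The resample-aggregating idea described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names (`fit_cpm`, `bagged_cpm_predict`), the correlation-based edge-selection cutoff, and the single summed-edge predictor are simplifying assumptions loosely modeled on the CPM framework — each bootstrap resample gets its own feature selection and model fit, and the test-set predictions are averaged across resamples.

```python
import numpy as np

def fit_cpm(X, y, t_cut=2.6):
    """Toy CPM fit: select edges correlated with behavior, fit a linear model.

    X : (n_subjects, n_edges) connectivity matrix, y : (n_subjects,) behavior.
    Returns a boolean edge-selection mask and (slope, intercept).
    """
    n = len(y)
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    # Pearson correlation of each edge with behavior
    r = (Xc * yc[:, None]).sum(axis=0) / (
        np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum()) + 1e-12)
    # t-statistic on r; |t| > ~2.6 approximates p < 0.01 two-sided for large n
    t = r * np.sqrt((n - 2) / (1 - r ** 2 + 1e-12))
    sel = np.abs(t) > t_cut
    # collapse selected edges into one summed predictor, fit y = a*s + b
    s = X[:, sel].sum(axis=1)
    a, b = np.polyfit(s, y, 1)
    return sel, (a, b)

def bagged_cpm_predict(X_train, y_train, X_test, n_boot=100, rng=None):
    """Resample-aggregated CPM: refit on bootstrap resamples, average predictions."""
    rng = np.random.default_rng(rng)
    n = len(y_train)
    preds = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)  # bootstrap resample with replacement
        sel, (a, b) = fit_cpm(X_train[idx], y_train[idx])
        if sel.sum() == 0:
            continue  # no edges survived selection in this resample
        preds.append(a * X_test[:, sel].sum(axis=1) + b)
    return np.mean(preds, axis=0)
```

Because both feature selection and the regression fit are repeated inside every resample, edges that are only spuriously correlated in one draw of the data contribute little to the aggregated prediction.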


Figures

Fig. 1.
Analytic workflow. The PNC data is only used for out-of-sample testing. The HCP data is split into train and test samples. The train sample (400 subjects) is used to train 3 types of models: (1) resample aggregated models, (2) CV models, and (3) train-only models. All models are then tested within-sample on the test HCP sample, and out-of-sample on the PNC dataset.
Fig. 2.
Within- and out-of-sample model performance, stratified by data split. In the left panel (purple), the first three columns show the performance of resample-aggregated models within-sample, columns 4–7 show the CV models, and the eighth column shows the performance of the train-only models. Each column has 20 boxplots, color-coded (in rainbow) by train/test split. All models are tested within-sample on 100 random subsamples of 200 subjects from the HCP test sample. The second panel (green) shows the performance of the same models (same order as the left panel) out-of-sample, using random subsamples of 200 subjects from the PNC data set.
Fig. 3.
Within and out-of-sample model performance. Column one (shaded in purple) shows performance of all models, across all data splits, within-sample. All models are tested on 100 random subsamples of 200 subjects from the test sample of the HCP data set. The second column (shaded in green) shows the performance of the same models tested on random subsamples of 200 subjects from the PNC data set. The box and whisker plots show the median, interquartile range, and 5%–95% markers of the performance distribution. The underlying shaded violin plots show the shape of the model performance distribution.
Fig. 4.
Distribution of feature (edge) occurrence across subsamples for the ensemble models. For the bagged model (left), 11,851 features occur in between 0% and 10% of bootstraps, compared to 6056 for the subsample 200 model (right) and 2839 for the subsample 300 model (center).
Fig. 5.
Relationship between effect size and feature occurrence for each aggregated model, across all resamples. The bagged models are shown in blue, the subsample 200 in orange, and the subsample 300 in green. The subplots on the right and top show probability density plots of the feature occurrence and effect size respectively.
Fig. 6.
Resample-aggregated model performance within-sample (white boxplots) and out-of-sample (gray boxplots) across feature frequency thresholds. This reflects the performance as tested on subsamples of 200 participants. The box and whisker plots show the median, interquartile range, and 5%–95% markers of the performance distribution. The underlying shaded distribution shows the individual data points. The top panel shows the performance of the bagged models as the feature threshold is increased (reducing the number of features included). The middle panel shows the performance of the subsample 300 models, and the bottom panel the subsample 200 models, as the feature threshold is increased. In the background of all plots, a density-based histogram of the percent of features included (as a function of all features selected for a given model) is shown in blue.
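The feature-frequency thresholding explored in Fig. 6 can be sketched as follows. This is an illustrative helper, not the paper's code: the name `threshold_by_frequency` and the mask layout are assumptions. Each row of the mask records which edges survived feature selection in one resample, and an edge is retained only if it was selected in at least a given fraction of resamples.

```python
import numpy as np

def threshold_by_frequency(selection_masks, min_frac=0.5):
    """Keep only edges selected in at least `min_frac` of the resamples.

    selection_masks : bool array of shape (n_resamples, n_edges); True where
    an edge passed feature selection in that bootstrap/subsample.
    Returns (boolean keep-mask over edges, per-edge selection frequency).
    """
    freq = selection_masks.mean(axis=0)  # fraction of resamples selecting each edge
    return freq >= min_frac, freq
```

Raising `min_frac` discards edges that appear in only a few resamples, which is the horizontal axis being swept in Fig. 6.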
