Neuroimage. 2019 Nov 1;201:116038. doi: 10.1016/j.neuroimage.2019.116038. Epub 2019 Jul 20.

Combining multiple connectomes improves predictive modeling of phenotypic measures

Siyuan Gao et al. Neuroimage. 2019.

Abstract

Resting-state and task-based functional connectivity matrices, or connectomes, are powerful predictors of individual differences in phenotypic measures. However, most current state-of-the-art algorithms build predictive models from only a single connectome per individual. This approach neglects the complementary information contained in connectomes from different sources and reduces prediction performance. In order to combine different task connectomes into a single predictive model in a principled way, we propose a novel prediction framework, termed multidimensional connectome-based predictive modeling. Two specific algorithms are developed and implemented under this framework. Using two large open-source datasets with multiple tasks, the Human Connectome Project and the Philadelphia Neurodevelopmental Cohort, we validate and compare our framework against performing connectome-based predictive modeling (CPM) on each task connectome independently, CPM on a general functional connectivity matrix created by averaging all task connectomes for an individual, and CPM with a naïve extension to multiple connectomes in which each edge for each task is selected independently. Our framework exhibits superior prediction performance compared with the other competing methods. We found that different tasks contribute differentially to the final predictive model, suggesting that the battery of tasks used in prediction is an important consideration. This work makes two major contributions: first, two methods for combining multiple connectomes from different task conditions in one predictive model are demonstrated; second, we show that these models outperform a previously validated single-connectome predictive modeling approach.
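The contrast between the averaging baseline and the multidimensional framework can be illustrated with a minimal sketch. The code below is not the authors' implementation; it assumes each connectome has already been vectorized into an array of shape (subjects × tasks × edges) and uses a synthetic stand-in for the phenotype. It compares a general functional connectivity (GFC) baseline, which averages the task connectomes, with an rCPM-flavored model that keeps every task's edges as separate features for ridge regression.

```python
# Minimal sketch (not the authors' code): GFC averaging vs. keeping all task
# connectomes as features for a ridge model.
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)
n_subjects, n_tasks, n_edges = 200, 7, 500             # toy sizes, not HCP/PNC dimensions
connectomes = rng.standard_normal((n_subjects, n_tasks, n_edges))
y = rng.standard_normal(n_subjects)                     # stand-in for a phenotype such as gF

# Baseline: average all task connectomes into one general functional connectivity matrix.
gfc_features = connectomes.mean(axis=1)                 # shape (n_subjects, n_edges)

# Multidimensional approach: concatenate every task's edges into one feature vector.
multi_features = connectomes.reshape(n_subjects, n_tasks * n_edges)

ridge = RidgeCV(alphas=np.logspace(-3, 3, 13))
pred_gfc = cross_val_predict(ridge, gfc_features, y, cv=10)
pred_multi = cross_val_predict(ridge, multi_features, y, cv=10)
```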

Keywords: Elastic net; Functional connectivity; Lasso; Machine learning; Neural networks; fMRI.


Figures

Figure 1. Algorithm flow charts for the three major models.
a) The original CPM flow chart. b) cCPM extends CPM to handle multiple connectomes per individual by replacing the correlation step in CPM with a canonical correlation analysis (CCA) step. c) rCPM extends CPM to handle multiple connectomes per individual by replacing the pooling (i.e., averaging) and linear regression steps with a single ridge regression step.
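As a rough sketch of the cCPM step in panel b), under the assumption that each edge's strengths across tasks form a small multivariate feature per subject, the per-edge canonical correlation against the phenotype could look like the following. Variable names and the scoring rule are illustrative, not the authors' code; edges with the strongest canonical correlations would then be retained, analogous to the correlation thresholding in standard CPM.

```python
# Hedged sketch of a per-edge CCA step (cCPM idea): replace the single Pearson
# correlation of CPM with a canonical correlation between one edge's values
# across all tasks and the phenotype.
import numpy as np
from sklearn.cross_decomposition import CCA

def cca_edge_scores(connectomes, y):
    """connectomes: (n_subjects, n_tasks, n_edges); y: (n_subjects,)."""
    n_subjects, n_tasks, n_edges = connectomes.shape
    scores = np.zeros(n_edges)
    cca = CCA(n_components=1)
    for e in range(n_edges):
        X_edge = connectomes[:, :, e]                      # one edge, all tasks
        u, v = cca.fit_transform(X_edge, y.reshape(-1, 1))
        scores[e] = abs(np.corrcoef(u.ravel(), v.ravel())[0, 1])
    return scores                                          # rank or threshold these for edge selection
```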
Figure 2. Comparison of the predictive modeling approaches’ ability to predict an individual’s gF.
a) HCP dataset. b) PNC dataset. Purple box plots show results from CPM on a single task. The orange, green, red, and blue box plots show results from combining multiple task connectomes using GFC-CPM, GFC-ridge, CPM, cCPM, and rCPM, respectively. Box plots show cross-validated R² (R²_CV), with the error bars representing the 25th and 75th percentiles; values below the 25th or above the 75th percentile are shown as *. The best results across different edge selection thresholds are shown. Task acronyms: GAM: Gambling, LAN: Language, MOT: Motor, REL: Relational, SOC: Social, WM: Working Memory, EMO: Emotion.
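For reference, a common way to compute the cross-validated R² plotted here is given below; the paper's exact convention may differ in detail.

```python
# Cross-validated R^2 over held-out predictions: 1 - SSE/SST.
import numpy as np

def r2_cv(y_true, y_pred_heldout):
    sse = np.sum((y_true - y_pred_heldout) ** 2)
    sst = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - sse / sst
```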
Figure 3. Different tasks’ contributions to the model.
a) Visualization of the selected edges for different tasks in the model. The top row shows 2% and the bottom row 20% of the total number of selected edges; 81.7% and 99.8% of the feature contribution in the regression (the combined sum of each feature’s regression coefficient times its standard deviation) is carried by these networks, respectively. Anatomical acronyms: PFC = Prefrontal, MOT = Motor Strip, INS = Insula, PAR = Parietal, TEM = Temporal, OCC = Occipital, LIM = Limbic, CER = Cerebellum, SUB = Subcortical, BSM = Brainstem. b) Different tasks’ average contribution fraction to the cCPM and rCPM models. c) Different tasks’ contributions to the model, summarized at the network level. Network acronyms: MF = Medial Frontal, FP = Frontoparietal, DMN = Default Mode Network, MOT = Motor Cortex, V1 = Visual I, V2 = Visual II, VA = Visual Association, SA = Salience.
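A minimal sketch of the feature-contribution measure described above, assuming it is each selected edge's regression coefficient scaled by that edge's standard deviation and expressed as a fraction of the total (the absolute value and the normalization are assumptions):

```python
import numpy as np

def contribution_fractions(coefficients, features):
    # Contribution per feature: |regression coefficient| x feature standard deviation,
    # normalized to a fraction of the total (normalization is an assumption).
    raw = np.abs(coefficients) * features.std(axis=0)
    return raw / raw.sum()
```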
Figure 4. Forward task selection for cCPM and rCPM.
a) shows the results for cCPM, while b) shows the results for rCPM. The optimal task combination for both algorithms uses 6 tasks: cCPM excludes the Language task, while rCPM excludes the Emotion task. However, performance with all 7 available tasks is not significantly worse than with 6 tasks (cCPM: p = 0.38; rCPM: p = 0.40), and overall, including more tasks significantly improves prediction.
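The forward task selection shown here can be sketched as a greedy loop: start with no tasks and repeatedly add the task whose inclusion most improves cross-validated prediction. The ridge-based scoring below is a stand-in for illustration, not the exact procedure from the paper.

```python
# Hedged sketch of greedy forward task selection.
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import cross_val_score

def forward_task_selection(connectomes, y, task_names):
    """connectomes: (n_subjects, n_tasks, n_edges); y: (n_subjects,)."""
    n_subjects, n_tasks, _ = connectomes.shape

    def score(tasks):
        # Concatenate the chosen tasks' edges and evaluate by cross-validated R^2.
        X = connectomes[:, tasks, :].reshape(n_subjects, -1)
        model = RidgeCV(alphas=np.logspace(-3, 3, 7))
        return cross_val_score(model, X, y, cv=5, scoring="r2").mean()

    selected, history = [], []
    remaining = list(range(n_tasks))
    while remaining:
        best = max(remaining, key=lambda t: score(selected + [t]))
        selected.append(best)
        remaining.remove(best)
        history.append(([task_names[t] for t in selected], score(selected)))
    return history
```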
Figure 5. Models’ performance with various hyperparameters.
a) Varying edge selection threshold for the HCP dataset. b) Varying edge selection threshold for the PNC dataset. c) Varying penalty weighting parameter λ for rCPM. In a) and b), an edge selection threshold of 1.0 represents no edge selection. In c), the horizontal line indicates prediction accuracy with λ chosen by inner cross-validation.
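A minimal sketch of choosing λ by inner cross-validation, with an inner RidgeCV grid search nested inside an outer evaluation split; the grid values and fold counts are illustrative assumptions.

```python
# Hedged sketch of nested cross-validation: the inner RidgeCV picks lambda,
# the outer loop produces held-out predictions for evaluation.
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import KFold

def nested_cv_predictions(X, y, n_outer=10):
    preds = np.zeros_like(y, dtype=float)
    outer = KFold(n_splits=n_outer, shuffle=True, random_state=0)
    for train, test in outer.split(X):
        inner = RidgeCV(alphas=np.logspace(-3, 3, 13), cv=5)   # inner CV selects lambda
        inner.fit(X[train], y[train])
        preds[test] = inner.predict(X[test])
    return preds
```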
Figure 6. Different models’ generalizability to independent, external datasets.
a) Trained on HCP and applied to PNC. b) Trained on PNC and applied to HCP. The results are presented as the Pearson correlation between predicted and actual measures. Models trained on either the HCP or PNC dataset can significantly predict gF in the other dataset.
Figure 7. Comparison of ridge regression with Elastic Net and Lasso.
α, the weighting parameter between ridge and lasso-type regularization, is varied across different values. α = 0 is the same as ridge regression while α = 1 is the same as lasso. Ridge regression generates the most accurate prediction in both the HCP and PNC datasets.
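In scikit-learn terms, this comparison can be sketched by sweeping ElasticNet's l1_ratio parameter, which plays the role of the α described in the caption (l1_ratio near 0 behaves like ridge, l1_ratio = 1 is lasso). Scikit-learn recommends using Ridge rather than l1_ratio = 0, so the illustrative sweep below starts just above zero; the penalty strength and grid are assumptions.

```python
# Hedged sketch of the ridge / elastic net / lasso comparison.
import numpy as np
from sklearn.linear_model import ElasticNet
from sklearn.model_selection import cross_val_score

def sweep_l1_ratio(X, y, ratios=(0.01, 0.1, 0.5, 0.9, 1.0)):
    results = {}
    for r in ratios:
        model = ElasticNet(alpha=1.0, l1_ratio=r, max_iter=10000)
        results[r] = cross_val_score(model, X, y, cv=5, scoring="r2").mean()
    return results   # mean cross-validated R^2 for each mixing weight
```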
