Netw Neurosci. 2024 Jul 1;8(2):576-596.

doi: 10.1162/netn_a_00363. eCollection 2024.

Comparing the stability and reproducibility of brain-behavior relationships found using canonical correlation analysis and partial least squares within the ABCD sample

Hajer Nakua¹ ², Ju-Chi Yu¹, Hervé Abdi³, Colin Hawco¹ ⁴, Aristotle Voineskos¹ ⁴, Sean Hill¹ ⁴, Meng-Chuan Lai¹ ² ⁵ ⁴, Anne L Wheeler⁵ ⁴, Anthony Randal McIntosh⁶, Stephanie H Ameis¹ ⁴

Affiliations

¹ Campbell Family Mental Health Research Institute, Centre for Addiction and Mental Health, Toronto, Ontario, Canada.
² Institute of Medical Science, University of Toronto, Toronto, Ontario, Canada.
³ The University of Texas at Dallas, Richardson, TX, USA.
⁴ Department of Psychiatry, Temerty Faculty of Medicine, University of Toronto, Toronto, Ontario, Canada.
⁵ Program in Neurosciences and Mental Health, The Hospital for Sick Children, Ontario, Canada.
⁶ Simon Fraser University, Vancouver, British Columbia, Canada.

PMID: 38952810
PMCID: PMC11168718
DOI: 10.1162/netn_a_00363

Hajer Nakua et al. Netw Neurosci. 2024.

Abstract

Canonical correlation analysis (CCA) and partial least squares correlation (PLS) detect linear associations between two data matrices by computing latent variables (LVs) having maximal correlation (CCA) or covariance (PLS). This study compared the similarity and generalizability of CCA- and PLS-derived brain-behavior relationships. Data were accessed from the baseline Adolescent Brain Cognitive Development (ABCD) dataset (N > 9,000, 9-11 years). The brain matrix consisted of cortical thickness estimates from the Desikan-Killiany atlas. Two phenotypic scales were examined separately as the behavioral matrix: the Child Behavior Checklist (CBCL) subscale scores and NIH Toolbox performance scores. Resampling methods were used to assess significance and generalizability of LVs. LV₁ for the CBCL brain relationships was found to be significant, yet not consistently stable or reproducible, across CCA and PLS models (singular value: CCA = .13, PLS = .39, p < .001). LV₁ for the NIH brain relationships showed similar relationships between CCA and PLS and was found to be stable and reproducible (singular value: CCA = .21, PLS = .43, p < .001). The current study suggests that stability and reproducibility of brain-behavior relationships identified by CCA and PLS are influenced by the statistical characteristics of the phenotypic measure used when applied to a large population-based pediatric sample.
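
The distinction between maximal correlation (CCA) and maximal covariance (PLS) can be illustrated with a minimal NumPy sketch. It assumes both methods are computed by singular value decomposition of a cross-product matrix, with CCA additionally whitening each block by its own correlation matrix. The function names (`zscore`, `inv_sqrt`, `pls_correlation`, `cca`) and the simulated data are hypothetical and are not taken from the study's code.

```python
import numpy as np

def zscore(a):
    """Column-wise standardization (mean 0, unit variance)."""
    return (a - a.mean(axis=0)) / a.std(axis=0, ddof=1)

def inv_sqrt(m):
    """Inverse square root of a symmetric positive-definite matrix."""
    vals, vecs = np.linalg.eigh(m)
    return vecs @ np.diag(vals ** -0.5) @ vecs.T

def pls_correlation(x, y):
    """PLS correlation: SVD of the cross-correlation matrix X'Y / (n - 1)."""
    zx, zy = zscore(x), zscore(y)
    r = zx.T @ zy / (x.shape[0] - 1)
    u, s, vt = np.linalg.svd(r, full_matrices=False)
    return u, s, vt.T  # brain loadings, singular values, behavior loadings

def cca(x, y):
    """CCA: SVD of the whitened cross-correlation matrix."""
    zx, zy = zscore(x), zscore(y)
    n = x.shape[0]
    rxx = zx.T @ zx / (n - 1)
    ryy = zy.T @ zy / (n - 1)
    rxy = zx.T @ zy / (n - 1)
    wx, wy = inv_sqrt(rxx), inv_sqrt(ryy)
    u, s, vt = np.linalg.svd(wx @ rxy @ wy, full_matrices=False)
    # Reweight the singular vectors by the whitening matrices.
    return wx @ u, s, wy @ vt.T

# Simulated example (assumption): 6 "brain" and 4 "behavior" variables
# that share a single latent source.
rng = np.random.default_rng(0)
brain = rng.standard_normal((500, 6))
behavior = rng.standard_normal((500, 4))
shared = rng.standard_normal(500)
brain[:, 0] += shared
behavior[:, 0] += shared

_, s_pls, _ = pls_correlation(brain, behavior)
_, s_cca, _ = cca(brain, behavior)
print("PLS singular values:", np.round(s_pls, 3))
print("CCA singular values:", np.round(s_cca, 3))
```

Under these assumptions, each CCA singular value equals the correlation between the paired latent scores, whereas each PLS singular value equals their covariance, so the two sets of values are not on the same scale.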

Keywords: Brain-behavior relationships; Cortical thickness; Multivariate modeling; Population-based samples.

Plain language summary

Clinical neuroscience research is going through a translational crisis largely due to the challenges of producing meaningful and generalizable results. Two critical limitations within clinical neuroscience research are the use of univariate statistics and between-study methodological variation. Univariate statistics may not be sensitive enough to detect complex relationships between several variables, and methodological variation poses challenges to the generalizability of the results. We compared two widely used multivariate statistical approaches, canonical correlation analysis (CCA) and partial least squares correlation (PLS), to determine the generalizability and stability of their solutions. We show that the properties of the measures entered into the analysis likely play a more substantial role in the generalizability and stability of results than the specific approach applied (i.e., CCA or PLS).

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1.
Overview of the analytical pipeline used in the current study. (A) The main CCA and PLS analysis: the decomposition of the cross-product matrices to produce singular values and loadings (reweighted singular vectors in the case of CCA). (B) The sum of squares permutation analysis to assess the statistical significance of each LV. (C) The split-half analysis used to assess the similarity between respective loadings from each split-half. In this analysis, the brain and behavioral data are both split into two different matrices (resulting in X₁, X₂, Y₁, Y₂) for each of the 10,000 iterations. Each respective split pair (e.g., X₁ and Y₁) then undergoes the CCA or PLS analysis shown in panel A. The resulting loadings from the respective split pairs are correlated (e.g., X₁ and X₂) to provide a distribution of Pearson correlation coefficients between each respective brain and behavior loading across the 10,000 iterations. (D) The train-test resampling analysis that assesses how well the singular values from the training sample can predict the singular values of the test sample. For each of the 10,000 iterations, the dataset was split into an 80% train set (X₈₀, Y₈₀) and 20% test set (X₂₀, Y₂₀). The train set underwent the CCA or PLS analysis shown in panel A. The respective loadings from the SVD of the training set were used to solve for the singular values (S) of the test set. The resulting predicted singular values of each of the 10,000 train/test analyses were plotted as a distribution. Importantly, each analysis is independent from one another (i.e., the results of one analysis are not used in another) and not sequential (i.e., the permutation test in panel B did not necessarily need to happen prior to the split-half or train-test analysis). The bootstrap confidence interval estimation, not shown here, was used to assess the stability of the parameter estimates of variable weights in the loadings. SVD = singular value decomposition.
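
Panels B and D can be sketched in NumPy as follows. This is a simplified illustration restricted to PLS. The permutation test shuffles the rows of the behavioral matrix and compares singular values directly, which is a generic row-permutation test and not necessarily the sum of squares variant named in the caption. The train-test step projects the test-set cross-correlation matrix onto the training loadings. Summarizing the predicted singular values as mean divided by standard deviation is an assumption about how the Z-scores were formed, and all function names are hypothetical.

```python
import numpy as np

def cross_corr(x, y):
    """Cross-correlation matrix between the columns of x and y."""
    zx = (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)
    zy = (y - y.mean(axis=0)) / y.std(axis=0, ddof=1)
    return zx.T @ zy / (x.shape[0] - 1)

def pls_svd(x, y):
    """PLS correlation via SVD; returns loadings U, singular values, loadings V."""
    u, s, vt = np.linalg.svd(cross_corr(x, y), full_matrices=False)
    return u, s, vt.T

def permutation_pvalues(x, y, n_perm=1000, seed=0):
    """Row-permutation p-value for each LV's singular value."""
    rng = np.random.default_rng(seed)
    _, s_obs, _ = pls_svd(x, y)
    exceed = np.zeros_like(s_obs)
    for _ in range(n_perm):
        # Shuffling rows of y breaks the pairing between brain and behavior.
        _, s_perm, _ = pls_svd(x, y[rng.permutation(len(y))])
        exceed += s_perm >= s_obs
    return (exceed + 1) / (n_perm + 1)

def train_test_singular_values(x, y, n_iter=200, train_frac=0.8, seed=0):
    """Predict test-set singular values from training-set loadings."""
    rng = np.random.default_rng(seed)
    n = len(x)
    n_train = int(train_frac * n)
    predicted = []
    for _ in range(n_iter):
        idx = rng.permutation(n)
        train, test = idx[:n_train], idx[n_train:]
        u, _, v = pls_svd(x[train], y[train])
        # Solve for S in R_test = U S V' using the training loadings.
        predicted.append(np.diag(u.T @ cross_corr(x[test], y[test]) @ v))
    predicted = np.array(predicted)
    # Assumed summary: mean over iterations divided by standard deviation.
    z = predicted.mean(axis=0) / predicted.std(axis=0, ddof=1)
    return predicted, z
```

With `x` as a participants-by-regions array and `y` as a participants-by-scores array, `permutation_pvalues(x, y)` returns one p-value per LV, and `train_test_singular_values(x, y)` returns the per-iteration predicted singular values together with one Z-score per LV.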

Figure 2.
Unthresholded behavior and brain loadings from the PLS and CCA analyses performed in the CBCL brain analysis (N = 9,191). The highest PLS-derived behavior loadings were aggressive behavior, thought problems, and stress problems. The highest PLS-derived brain loadings were right pars triangularis, right inferior parietal cortex, and left posterior cingulate cortex. The highest CCA-derived behavior loadings were social problems, anxious/depressive symptoms, and stress problems. The highest CCA-derived brain loadings were the right superior temporal gyrus, left fusiform gyrus, and right lingual gyrus. Panel B shows the latent scores between XU and YV for LV₁. Prior to calculating the latent scores, the brain and behavioral loadings have been standardized by the singular values. Panel C depicts the train-test distributions of the predicted singular values of the test sample for each iteration. Asterisks indicate the LVs that showed a distribution with a Z-score greater than 1.96. LV₁ from CCA was found to be reliable (i.e., LV₁ of the training sample can reliably predict the singular values of LV₁ from the test sample). The lack of any other significant distributions of predicted singular values suggests that the 80% train set does not reliably and consistently predict the singular values from the 20% test set. OCD = obsessive-compulsive symptoms; withdep = withdrawn/depression symptoms; sct = sluggish-cognitive-tempo; anxdep = anxious/depressive symptoms; rulebreak = rule-breaking behavior.

Figure 3.
Unthresholded behavior and brain loadings of LV₁ from the PLS and CCA analyses performed between NIH Cognitive Toolbox scores and cortical thickness. The largest PLS- and CCA-derived loadings were found for the list sorting task and the picture vocabulary task. The highest CCA-derived brain loadings were the left pars opercularis, superior frontal gyrus, and parahippocampal gyrus. The highest PLS-derived brain loadings were the left pars opercularis, parahippocampal gyrus, and medial orbitofrontal gyrus. Panel B shows the latent scores between XU and YV for LV₁. Prior to calculating the latent scores, the brain and behavioral loadings have been standardized by the singular values. Overall, there is a similar relationship between the brain and behavioral latent scores when comparing CCA and PLS. Panel C shows the results of the train-test resampling analysis. Asterisks indicate the LVs that showed a distribution with a Z-score greater than 1.96. LV₁ (Z-score: PLS = 4.8, CCA = 7.8) and LV₃ (Z-score: PLS = 2.6, CCA = 2.2) for both PLS and CCA, and LV₂ (Z-score: CCA = 2.02) for CCA, were found to be reliable (i.e., singular values of these LVs of the training sample can reliably predict the singular values from the test sample). Flanker = Flanker task; pattern = pattern comparison processing speed task; cardsort = dimensional change card sort task; reading = oral reading recognition task; picture = picture vocabulary task; list = list sorting working memory task; picvocab = picture vocabulary task.

Figure 4.
Distributions of the resampled loadings from the split-half analysis for the CBCL brain and NIH brain analyses. The x-axis of each split-half distribution shows the Pearson correlation coefficients between respective loadings from each split-half analysis (e.g., U₁ and U₂ from the analysis comparing X₁ and Y₁ and, separately, X₂ and Y₂). In the CBCL brain analysis, the distributions of Pearson correlation coefficients were centered around 0 for the majority of LVs, indicating minimal correspondence between respective loadings from the split-halves. This suggests that characteristics of participants are highly influential in the loadings derived from CCA or PLS models in the CBCL brain analysis. In the NIH brain analysis, the distributions of Pearson correlation coefficients are centered around r = 0.6–0.8, indicating high correspondence between respective split-halves and that loadings from CCA and PLS models remain similar regardless of which participants are included in each iteration. Asterisks indicate the LVs that showed a distribution with a Z-score greater than 1.96.
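
The split-half comparison described in this caption can be sketched in NumPy as follows, shown for PLS only. Singular vectors are defined only up to sign, so the sketch fixes each LV's sign by making its largest-magnitude behavior loading positive. That convention is an assumption and may differ from how the study aligned loadings. The function names are hypothetical.

```python
import numpy as np

def pls_loadings(x, y):
    """PLS correlation loadings (U for brain, V for behavior) with a fixed sign."""
    zx = (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)
    zy = (y - y.mean(axis=0)) / y.std(axis=0, ddof=1)
    r = zx.T @ zy / (x.shape[0] - 1)
    u, _, vt = np.linalg.svd(r, full_matrices=False)
    v = vt.T
    # Assumed sign convention: the largest-magnitude behavior loading
    # of every LV is made positive; U is flipped together with V.
    top = np.abs(v).argmax(axis=0)
    flip = np.sign(v[top, np.arange(v.shape[1])])
    return u * flip, v * flip

def split_half_correlations(x, y, n_iter=200, seed=0):
    """Pearson r between matching loadings from two random halves."""
    rng = np.random.default_rng(seed)
    n = len(x)
    half = n // 2
    k = min(x.shape[1], y.shape[1])
    r_brain = np.empty((n_iter, k))
    r_behavior = np.empty((n_iter, k))
    for i in range(n_iter):
        idx = rng.permutation(n)
        first, second = idx[:half], idx[half:]
        u1, v1 = pls_loadings(x[first], y[first])
        u2, v2 = pls_loadings(x[second], y[second])
        for lv in range(k):
            r_brain[i, lv] = np.corrcoef(u1[:, lv], u2[:, lv])[0, 1]
            r_behavior[i, lv] = np.corrcoef(v1[:, lv], v2[:, lv])[0, 1]
    return r_brain, r_behavior
```

Each column of the returned arrays is the distribution for one LV. A distribution centered near 0 corresponds to the unstable pattern reported for the CBCL brain analysis, and one centered well above 0 corresponds to the stable pattern reported for the NIH brain analysis.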

Update of

Comparing the stability and reproducibility of brain-behaviour relationships found using Canonical Correlation Analysis and Partial Least Squares within the ABCD Sample.
Nakua H, Yu JC, Abdi H, Hawco C, Voineskos A, Hill S, Lai MC, Wheeler AL, McIntosh AR, Ameis SH. bioRxiv [Preprint]. 2023 Mar 9:2023.03.08.531763. doi: 10.1101/2023.03.08.531763. Update in: Netw Neurosci. 2024 Jul 1;8(2):576-596. doi: 10.1162/netn_a_00363. PMID: 36945610. Preprint.

Cited by

Examining Relationships between Functional and Structural Brain Network Architecture, Age, and Attention Skills in Early Childhood.
Rokos L, Bray SL, Neudorf J, Samson AD, Shen K, McIntosh AR. eNeuro. 2025 Jul 30;12(7):ENEURO.0430-24.2025. doi: 10.1523/ENEURO.0430-24.2025. Print 2025 Jul. PMID: 40659509.
Contrastive functional connectivity defines neurophysiology-informed symptom dimensions in major depression.
Zhu H, Tong X, Carlisle NB, Xie H, Keller CJ, Oathes DJ, Liu F, Nemeroff CB, Fonzo GA, Zhang Y. Cell Rep Med. 2025 Jun 17;6(6):102151. doi: 10.1016/j.xcrm.2025.102151. Epub 2025 May 28. PMID: 40441140.
Multivariate Resting-State Functional Connectivity Features Linked to Transdiagnostic Psychopathology in Early Psychosis.
Wang HR, Liu ZQ, Nomi JS, Schleifer CH, Bearden CE, Misic B, Uddin LQ, Karlsgodt KH. bioRxiv [Preprint]. 2025 Jun 9:2025.06.04.654984. doi: 10.1101/2025.06.04.654984. PMID: 40661622. Preprint.
Interpretable and integrative deep learning for discovering brain-behaviour associations.
Ambroise C, Grigis A, Houenou J, Frouin V. Sci Rep. 2025 Jan 17;15(1):2312. doi: 10.1038/s41598-024-85032-5. PMID: 39824899.
A Shared Multivariate Brain-Behavior Relationship in a Transdiagnostic Sample of Adolescents.
Bashford-Largo J, Nakua H, Blair RJR, Dominguez A, Hatch M, Blair KS, Dobbertin M, Ameis S, Bajaj S. Biol Psychiatry Cogn Neurosci Neuroimaging. 2024 Apr;9(4):377-386. doi: 10.1016/j.bpsc.2023.07.015. Epub 2023 Aug 11. PMID: 37572936.
