Netw Neurosci. 2024 Jul 1;8(2):576-596.
doi: 10.1162/netn_a_00363. eCollection 2024.

Comparing the stability and reproducibility of brain-behavior relationships found using canonical correlation analysis and partial least squares within the ABCD sample

Hajer Nakua et al. Netw Neurosci. 2024.

Abstract

Canonical correlation analysis (CCA) and partial least squares correlation (PLS) detect linear associations between two data matrices by computing latent variables (LVs) having maximal correlation (CCA) or covariance (PLS). This study compared the similarity and generalizability of CCA- and PLS-derived brain-behavior relationships. Data were accessed from the baseline Adolescent Brain Cognitive Development (ABCD) dataset (N > 9,000, ages 9-11 years). The brain matrix consisted of cortical thickness estimates from the Desikan-Killiany atlas. Two phenotypic scales were examined separately as the behavioral matrix: the Child Behavior Checklist (CBCL) subscale scores and the NIH Toolbox performance scores. Resampling methods were used to assess the significance and generalizability of LVs. LV1 for the CBCL-brain relationships was found to be significant, yet not consistently stable or reproducible, across CCA and PLS models (singular value: CCA = .13, PLS = .39, p < .001). LV1 for the NIH-brain relationships was similar between CCA and PLS and was found to be stable and reproducible (singular value: CCA = .21, PLS = .43, p < .001). The current study suggests that, in a large population-based pediatric sample, the stability and reproducibility of brain-behavior relationships identified by CCA and PLS are influenced by the statistical characteristics of the phenotypic measure used.
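The distinction between maximal covariance (PLS) and maximal correlation (CCA) can be illustrated with a minimal NumPy sketch. It is not the authors' code: the function names (`zscore`, `inv_sqrt`, `pls_svd`, `cca_svd`) are hypothetical, and the sketch assumes both matrices are column-standardized and that the within-set correlation matrices are invertible (more participants than variables, no collinear columns).

```python
import numpy as np

def zscore(a):
    """Column-standardize a participants-by-variables matrix."""
    return (a - a.mean(axis=0)) / a.std(axis=0, ddof=1)

def inv_sqrt(m):
    """Inverse square root of a symmetric positive-definite matrix."""
    w, v = np.linalg.eigh(m)
    return v @ np.diag(1.0 / np.sqrt(w)) @ v.T

def pls_svd(X, Y):
    """PLS correlation: SVD of the cross-correlation matrix.

    Returns X loadings, singular values, and Y loadings. Each singular
    value is the covariance between the paired latent scores.
    """
    Xz, Yz = zscore(X), zscore(Y)
    R = Xz.T @ Yz / (X.shape[0] - 1)
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    return U, s, Vt.T

def cca_svd(X, Y):
    """CCA: SVD of the cross-correlation matrix after whitening each set.

    Returns reweighted X weights, canonical correlations, and reweighted
    Y weights. Each singular value is the correlation between the paired
    latent scores.
    """
    Xz, Yz = zscore(X), zscore(Y)
    n = X.shape[0]
    Kx = inv_sqrt(Xz.T @ Xz / (n - 1))  # whitening transform for X
    Ky = inv_sqrt(Yz.T @ Yz / (n - 1))  # whitening transform for Y
    Rxy = Xz.T @ Yz / (n - 1)
    U, s, Vt = np.linalg.svd(Kx @ Rxy @ Ky, full_matrices=False)
    return Kx @ U, s, Ky @ Vt.T
```

In this sketch the only difference between the two methods is the whitening step, which removes within-set correlations before the decomposition. That is why CCA singular values are bounded by 1 while PLS singular values are not.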

Keywords: Brain-behavior relationships; Cortical thickness; Multivariate modeling; Population-based samples.

Plain language summary

Clinical neuroscience research is going through a translational crisis, largely due to the challenges of producing meaningful and generalizable results. Two critical limitations within clinical neuroscience research are the use of univariate statistics and between-study methodological variation. Univariate statistics may not be sensitive enough to detect complex relationships between several variables, and methodological variation poses challenges to the generalizability of the results. We compared two widely used multivariate statistical approaches, canonical correlation analysis (CCA) and partial least squares correlation (PLS), to determine the generalizability and stability of their solutions. We show that the properties of the measures entered into the analysis likely play a more substantial role in the generalizability and stability of results than the specific approach applied (i.e., CCA or PLS).

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1.
Overview of the analytical pipeline used in the current study. (A) The main CCA and PLS analysis: the decomposition of the cross-product matrices to produce singular values and loadings (reweighted singular vectors in the case of CCA). (B) The sum of squares permutation analysis used to assess the statistical significance of each LV. (C) The split-half analysis used to assess the similarity between respective loadings from each split-half. In this analysis, the brain and behavioral data are each split into two matrices (resulting in X1, X2, Y1, Y2) for each of the 10,000 iterations. Each respective split pair (e.g., X1 and Y1) then undergoes the CCA or PLS analysis shown in panel A. The resulting loadings from the respective split pairs are correlated (e.g., the loadings derived from X1 and X2) to provide a distribution of Pearson correlation coefficients between each respective brain and behavior loading across the 10,000 iterations. (D) The train-test resampling analysis, which assesses how well the singular values from the training sample can predict the singular values of the test sample. For each of the 10,000 iterations, the dataset was split into an 80% train set (X80, Y80) and a 20% test set (X20, Y20). The train set underwent the CCA or PLS analysis shown in panel A. The respective loadings from the SVD of the training set were used to solve for the singular values (S) of the test set. The resulting predicted singular values of each of the 10,000 train/test analyses were plotted as a distribution. Importantly, the analyses are independent of one another (i.e., the results of one analysis are not used in another) and not sequential (i.e., the permutation test in panel B did not need to happen prior to the split-half or train-test analysis). The bootstrap confidence interval estimation, not shown here, was used to assess the stability of the parameter estimates of variable weights in the loadings. SVD = singular value decomposition.
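The permutation logic in panel B can be sketched as follows. This is a simplified illustration under stated assumptions, not the study's procedure: it compares each observed singular value directly against its permuted counterparts rather than using the sum of squares statistic named in the caption, it uses PLS only, and the function names and the default of 1,000 permutations are hypothetical.

```python
import numpy as np

def pls_singular_values(X, Y):
    """Singular values of the cross-correlation matrix between X and Y."""
    Xz = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    Yz = (Y - Y.mean(axis=0)) / Y.std(axis=0, ddof=1)
    R = Xz.T @ Yz / (X.shape[0] - 1)
    return np.linalg.svd(R, compute_uv=False)

def permutation_pvalues(X, Y, n_perm=1000, seed=0):
    """Permutation p value for each LV.

    Shuffling the rows of Y breaks the pairing between brain and
    behavior while preserving the structure within each matrix.
    """
    rng = np.random.default_rng(seed)
    observed = pls_singular_values(X, Y)
    exceed = np.zeros_like(observed)
    for _ in range(n_perm):
        perm = rng.permutation(X.shape[0])
        exceed += pls_singular_values(X, Y[perm]) >= observed
    # Add-one correction so that no p value is exactly zero (an assumption).
    return (exceed + 1) / (n_perm + 1)
```

An LV whose observed singular value is rarely matched under permutation receives a small p value. An LV that reflects only sampling noise should receive a p value spread across the unit interval.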
Figure 2.
Unthresholded behavior and brain loadings from the PLS and CCA analysis performed in the CBCL brain analysis (N = 9,191). The highest PLS-derived behavior loadings were aggressive behavior, thought problems, and stress problems. The highest PLS-derived brain loadings were right pars triangularis, right inferior parietal cortex, and left posterior cingulate cortex. The highest CCA-derived behavior loadings were social problems, anxious/depressive symptoms, and stress problems. The highest CCA-derived brain loadings were the right superior temporal gyrus, left fusiform gyrus, and right lingual gyrus. Panel B shows the latent scores between XU and YV for LV1. Prior to calculating the latent scores, the brain and behavioral loadings have been standardized by the singular values. Panel C depicts the train-test distributions of the predicted singular values of the test sample for each iteration. Asterisks indicate the LVs that showed a distribution with a Z-score greater than 1.96. LV1 from CCA was found to be reliable (i.e., LV1 of the training sample can reliably predict the singular values of LV1 from the test sample). The lack of any other significant distributions of predicted singular values suggests that the 80% train set does not reliably and consistently predict the singular values from the 20% test set. OCD = obsessive compulsive symptoms; withdep = withdrawn/depression symptoms; sct = sluggish-cognitive-tempo; anxdep = anxious/depressive symptoms; rulebreak = rule-breaking behavior.
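The train-test procedure behind panel C can be sketched as below, using PLS for brevity. Several details are assumptions not stated in the captions: the test set is standardized separately from the training set, the Z-score of each distribution is taken as its mean divided by its standard deviation, and the function names and iteration count are hypothetical.

```python
import numpy as np

def cross_corr(X, Y):
    """Cross-correlation matrix between the columns of X and Y."""
    Xz = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    Yz = (Y - Y.mean(axis=0)) / Y.std(axis=0, ddof=1)
    return Xz.T @ Yz / (X.shape[0] - 1)

def train_test_singular_values(X, Y, n_iter=200, test_frac=0.2, seed=0):
    """Predicted test-set singular values for each random 80/20 split."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    n_test = int(round(test_frac * n))
    k = min(X.shape[1], Y.shape[1])
    predicted = np.empty((n_iter, k))
    for i in range(n_iter):
        idx = rng.permutation(n)
        test, train = idx[:n_test], idx[n_test:]
        # Loadings come from the training set only.
        U, _, Vt = np.linalg.svd(cross_corr(X[train], Y[train]),
                                 full_matrices=False)
        # Project the test cross-correlation matrix onto the training
        # loadings; the diagonal holds the predicted singular values.
        predicted[i] = np.diag(U.T @ cross_corr(X[test], Y[test]) @ Vt.T)
    return predicted

def distribution_z(predicted):
    """Z-score of each LV's distribution (mean over SD, an assumption)."""
    return predicted.mean(axis=0) / predicted.std(axis=0, ddof=1)
```

Because the paired training loadings share a sign, the projected value is unaffected by the sign ambiguity of the SVD. A distribution centered well away from zero then indicates that the training loadings capture covariance that is also present in held-out participants.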
Figure 3.
Unthresholded behavior and brain loadings of LV1 from the PLS and CCA analysis performed between NIH Cognitive Toolbox scores and cortical thickness. The largest PLS- and CCA-derived loadings were found for the list sorting task and the picture vocabulary task. The highest CCA-derived brain loadings were the left pars opercularis, superior frontal gyrus, and parahippocampal gyrus. The highest PLS-derived brain loadings were the left pars opercularis, parahippocampal gyrus, and medial orbitofrontal gyrus. Panel B shows the latent scores between XU and YV for LV1. Prior to calculating the latent scores, the brain and behavioral loadings have been standardized by the singular values. Overall, there is a similar relationship between the brain and behavioral latent scores when comparing CCA and PLS. Panel C shows the results of the train-test resampling analysis. Asterisks indicate the LVs that showed a distribution with a Z-score greater than 1.96. LV1 (Z-score: PLS = 4.8, CCA = 7.8) and LV3 (Z-score: PLS = 2.6, CCA = 2.2) for both PLS and CCA, and LV2 (Z-score: CCA = 2.02) for CCA, were found to be reliable (i.e., singular values of these LVs of the training sample can reliably predict the singular values from the test sample). Flanker = Flanker task; pattern = pattern comparison processing speed task; cardsort = dimensional change card sort task; reading = oral reading recognition task; picture = picture vocabulary task; list = list sorting working memory task; picvocab = picture vocabulary task.
Figure 4.
This figure depicts the distributions of the resampled loadings from the split-half analysis for the CBCL brain and NIH brain analyses. The x-axis of each split-half distribution shows the Pearson correlation coefficients between respective loadings from each split-half analysis (e.g., U1 and U2 from the analysis comparing X1 and Y1 and, separately, X2 and Y2). In the CBCL brain analysis, the distribution of Pearson correlation coefficients is centered around 0 for the majority of LVs, indicating minimal correspondence between respective loadings from the split-halves. This suggests that the characteristics of the participants in each half strongly influence the loadings derived from CCA or PLS models in the CBCL brain analysis. In the NIH brain analysis, the distribution of Pearson correlation coefficients is centered around r = 0.6–0.8, indicating high correspondence between respective split-halves and that loadings from CCA and PLS models remain similar regardless of which participants are included in each iteration. Asterisks indicate the LVs that showed a distribution with a Z-score greater than 1.96.
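The split-half comparison can be sketched as below for the brain (X) loadings under PLS. The captions do not say how the sign ambiguity of singular vectors was handled, so the alignment step is an assumption: each second-half LV is flipped when its behavior (Y) loading points opposite to the first half's. The function names and iteration count are hypothetical.

```python
import numpy as np

def pls_loadings(X, Y):
    """X and Y loadings from the SVD of the cross-correlation matrix."""
    Xz = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    Yz = (Y - Y.mean(axis=0)) / Y.std(axis=0, ddof=1)
    R = Xz.T @ Yz / (X.shape[0] - 1)
    U, _, Vt = np.linalg.svd(R, full_matrices=False)
    return U, Vt.T

def split_half_correlations(X, Y, n_iter=200, seed=0):
    """Pearson r between X loadings of two random halves, per LV."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    k = min(X.shape[1], Y.shape[1])
    r = np.empty((n_iter, k))
    for i in range(n_iter):
        idx = rng.permutation(n)
        h1, h2 = idx[: n // 2], idx[n // 2:]
        U1, V1 = pls_loadings(X[h1], Y[h1])
        U2, V2 = pls_loadings(X[h2], Y[h2])
        # Assumed sign alignment: match the direction of the Y loadings,
        # then compare the X loadings.
        signs = np.sign(np.sum(V1 * V2, axis=0))
        signs[signs == 0] = 1.0
        U2 = U2 * signs
        for j in range(k):
            r[i, j] = np.corrcoef(U1[:, j], U2[:, j])[0, 1]
    return r
```

Aligning on the Y side and correlating on the X side keeps the X-loading correlations symmetric around zero when there is no reproducible signal. The Y loadings could be compared analogously by aligning on the X side.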
