IEEE/ACM Trans Comput Biol Bioinform. 2021 Jan-Feb;18(1):227-239.
doi: 10.1109/TCBB.2019.2947428. Epub 2021 Feb 3.

Multi-Task Sparse Canonical Correlation Analysis with Application to Multi-Modal Brain Imaging Genetics


Lei Du et al. IEEE/ACM Trans Comput Biol Bioinform. 2021 Jan-Feb.

Abstract

Brain imaging genetics studies the genetic basis of brain structure and function by integrating genotypic data, such as single nucleotide polymorphisms (SNPs), with imaging quantitative traits (QTs). In this area, both multi-task learning (MTL) and sparse canonical correlation analysis (SCCA) methods are widely used because they outperform independent and pairwise univariate analyses. MTL methods generally incorporate only a few QTs and cannot select features from multiple QTs, while SCCA methods typically use a single modality of QTs to study its association with SNPs. Both MTL and SCCA become computationally expensive as the number of SNPs increases. In this paper, we propose a novel multi-task SCCA (MTSCCA) method to identify bi-multivariate associations between SNPs and multi-modal imaging QTs. MTSCCA can make use of the complementary information carried by different imaging modalities. It enforces sparsity at the group level via the $G_{2,1}$-norm, and jointly selects features across multiple tasks for SNPs and QTs via the $\ell_{2,1}$-norm. A fast optimization algorithm is proposed that exploits the grouping information of SNPs. Compared with conventional SCCA methods, MTSCCA obtains better correlation coefficients and canonical weight patterns. In addition, MTSCCA runs very fast and is easy to implement, indicating its potential power in genome-wide, brain-wide imaging genetics.
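
The two sparsity penalties named in the abstract can be illustrated with a minimal sketch. The abstract does not give their exact definitions, so the code assumes the forms commonly used in group-sparse multi-task learning: the l2,1-norm sums the Euclidean norms of the rows of a weight matrix (features by tasks), and the G2,1-norm sums the Frobenius norms of the row blocks that belong to each group. The function names, the toy matrix `W`, and the grouping `groups` are hypothetical.

```python
import numpy as np

def l21_norm(W):
    """Sum of the Euclidean norms of the rows of W (features x tasks).

    A row that is zero across all tasks contributes nothing, so penalising
    this quantity encourages selecting the same features for every task.
    """
    return float(np.sum(np.linalg.norm(W, axis=1)))

def g21_norm(W, groups):
    """Sum over groups of the Frobenius norm of the rows in each group.

    Assumed definition: each group (e.g. an LD block of SNPs) is treated as
    one unit, so whole groups are kept or dropped together.
    """
    return float(sum(np.linalg.norm(W[idx, :]) for idx in groups))

# Hypothetical toy weights: 4 features, 2 tasks, 2 groups of 2 features each.
W = np.array([[3.0, 0.0],
              [4.0, 0.0],
              [0.0, 0.0],
              [0.0, 0.0]])
groups = [[0, 1], [2, 3]]

row_penalty = l21_norm(W)            # 3 + 4 + 0 + 0
group_penalty = g21_norm(W, groups)  # sqrt(3^2 + 4^2) + 0
```

In this toy case the row-wise penalty is 7 and the group-wise penalty is 5, which shows that the group penalty charges less for weights concentrated inside a single group.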


Figures

Fig. 1.
Illustration of the pairwise correlation coefficients and LD values ($r^2 \geq 0.2$) of SNPs from chromosome 19 of an ADNI database. (1) The three subfigures in the top row show the correlation coefficients $r$ among 1,000, 5,000, and 13,000 SNPs. (2) The three subfigures in the bottom row show the corresponding LD values. All figures show that SNPs clearly form groups and that the block-diagonal structure persists as the number of SNPs increases.
Fig. 2.
Illustration of the simplified covariance matrix $X^\top X$, where $X_{g_k}$ and $X_{g_{k+1}}$ are two LD blocks, and $X_{g_k}^\top X_{g_k}$ is abbreviated as $(X^\top X)_{g_k}$. Since the correlation between the two blocks is very low ($X_{g_k}^\top X_{g_{k+1}} \approx 0$ and $X_{g_{k+1}}^\top X_{g_k} \approx 0$), their covariance can be ignored.
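
The block-diagonal simplification in this caption can be sketched as follows. This is an illustration under stated assumptions, not the authors' algorithm: it only shows that when cross-block products are treated as zero, multiplying a weight vector by the covariance matrix reduces to independent per-block products. The function name `blockwise_gram_apply` and the toy data are hypothetical, and the toy blocks are constructed to be exactly orthogonal so that the approximation is exact.

```python
import numpy as np

def blockwise_gram_apply(X, groups, u):
    """Approximate X^T X u by ignoring covariance between different blocks.

    For each block g, only X_g^T X_g u_g is computed, so the full
    p x p covariance matrix is never formed.
    """
    out = np.zeros_like(u, dtype=float)
    for idx in groups:
        Xg = X[:, idx]                  # columns (SNPs) in this block
        out[idx] = Xg.T @ (Xg @ u[idx])
    return out

# Hypothetical data: two blocks whose columns do not overlap in support,
# so the cross-block products are exactly zero.
X = np.array([[1.0, 2.0, 0.0, 0.0],
              [3.0, 4.0, 0.0, 0.0],
              [0.0, 0.0, 5.0, 6.0],
              [0.0, 0.0, 7.0, 8.0]])
groups = [[0, 1], [2, 3]]
u = np.array([1.0, -1.0, 0.5, 2.0])

approx = blockwise_gram_apply(X, groups, u)
exact = X.T @ (X @ u)
```

With real genotype data the cross-block terms are small rather than zero, so `approx` would only be close to `exact`; the cost per product falls from quadratic in the total number of SNPs to the sum of the squared block sizes.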
Fig. 3.
Canonical weights u (mean value) estimated on synthetic data. The first row is the ground truth, and each remaining row corresponds to an SCCA method: (1) Two-view SCCA, (2) mSCCA (Multi-view SCCA), (3) MTSCCA (Multi-task SCCA). In each subfigure, the horizontal axis represents the indices of each u, and the vertical axis represents the estimated weight value.
Fig. 4.
Canonical weights V (mean value) estimated on synthetic data. The first row is the ground truth, and each remaining row corresponds to an SCCA method: (1) Two-view SCCA, (2) mSCCA (Multi-view SCCA), (3) MTSCCA (Multi-task SCCA). In each subfigure, the horizontal axis represents the indices of $v_j$ ($j = 1, 2$), and the vertical axis represents the estimated weight value.
Fig. 5.
Performance comparison: The mean and standard deviation (SD) of the canonical correlation coefficients (CCCs) obtained from 5-fold cross-validation trials are plotted, where each error bar indicates ±0.5SD. The subtitle SNPs-AV45 means the CCCs are calculated between the SNPs data and the AV45-PET data.
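
The quantities plotted in this figure can be sketched as below, assuming the usual CCA convention that a canonical correlation coefficient is the Pearson correlation between the two projected views, Xu and Yv. The caption does not say whether the SD is the sample or population version; the sketch assumes the sample SD (`ddof=1`). The function names and the per-fold values are hypothetical placeholders, not results from the paper.

```python
import numpy as np

def ccc(X, u, Y, v):
    """Pearson correlation between the projections X @ u and Y @ v."""
    return float(np.corrcoef(X @ u, Y @ v)[0, 1])

def error_bar(fold_cccs, k=0.5):
    """Return (mean, lower, upper) with the bar spanning mean +/- k * SD."""
    vals = np.asarray(fold_cccs, dtype=float)
    mean = float(vals.mean())
    sd = float(vals.std(ddof=1))  # assumption: sample SD
    return mean, mean - k * sd, mean + k * sd

# Hypothetical CCCs from five cross-validation folds (illustrative only).
fold_cccs = [0.30, 0.32, 0.28, 0.31, 0.29]
mean, lower, upper = error_bar(fold_cccs)

# Sanity check of ccc on toy data where the projections are proportional.
X = np.array([[1.0, 0.0], [2.0, 1.0], [3.0, 1.0], [4.0, 3.0]])
Y = 2.0 * X
w = np.array([1.0, 1.0])
perfect = ccc(X, w, Y, w)
```

Because each bar spans half an SD on either side of the mean, its total length equals one SD of the fold-wise CCCs.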
Fig. 6.
Comparison of canonical weights in terms of each imaging modality across five trials. Each row corresponds to an SCCA method: (1) Two-view SCCA; (2) mSCCA; (3) MTSCCA. Within each panel, there are three rows corresponding to three types of imaging QTs, i.e., AV45, FDG, and VBM.

