Biostatistics. 2016 Jul;17(3):468-83.
doi: 10.1093/biostatistics/kxw001. Epub 2016 Feb 9.

Canonical variate regression


Chongliang Luo et al. Biostatistics. 2016 Jul.

Abstract

In many fields, multi-view datasets, measuring multiple distinct but interrelated sets of characteristics on the same set of subjects, together with data on certain outcomes or phenotypes, are routinely collected. The objective in such a problem is often two-fold: both to explore the association structures of multiple sets of measurements and to develop a parsimonious model for predicting the future outcomes. We study a unified canonical variate regression framework to tackle the two problems simultaneously. The proposed criterion integrates multiple canonical correlation analysis with predictive modeling, balancing between the association strength of the canonical variates and their joint predictive power on the outcomes. Moreover, the proposed criterion seeks multiple sets of canonical variates simultaneously to enable the examination of their joint effects on the outcomes, and is able to handle multivariate and non-Gaussian outcomes. An efficient algorithm based on variable splitting and Lagrangian multipliers is proposed. Simulation studies show the superior performance of the proposed approach. We demonstrate the effectiveness of the proposed approach in an F2 intercross mice study and an alcohol dependence study.

Keywords: Canonical correlation analysis; Integrative analysis; Reduced-rank regression; Supervised learning.
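
The balance described in the abstract, between association strength among the canonical variates and their predictive power, can be illustrated with a minimal sketch. It assumes a Gaussian outcome with squared-error loss. The function names, the weight `eta`, the normalization step, and the exact form of both terms are assumptions made for illustration; the abstract does not state the criterion, so this should not be read as the authors' formulation or algorithm.

```python
import numpy as np


def normalize_directions(X, W):
    """Rescale W so the variates X @ W have orthonormal columns.

    Hypothetical helper: uses a QR factorization of X @ W, assuming it has
    full column rank, so that (X @ W_new).T @ (X @ W_new) is the identity.
    """
    _, R = np.linalg.qr(X @ W)
    return W @ np.linalg.inv(R)


def cvr_objective(Xs, Ws, y, alpha, beta, eta):
    """Evaluate an illustrative weighted criterion (assumed form).

    Xs    : list of (n, p_k) data matrices, one per view
    Ws    : list of (p_k, r) direction matrices, one per view
    y     : (n,) outcome vector
    alpha : scalar intercept
    beta  : (r,) regression coefficients shared across views
    eta   : weight in [0, 1]; larger values favor association
    """
    variates = [X @ W for X, W in zip(Xs, Ws)]
    # Association term: squared distances between variates of each pair of views.
    assoc = sum(
        np.sum((variates[k] - variates[j]) ** 2)
        for k in range(len(variates))
        for j in range(k + 1, len(variates))
    )
    # Prediction term: squared-error loss of each view's variates on the outcome.
    pred = sum(np.sum((y - alpha - Z @ beta) ** 2) for Z in variates)
    return eta * assoc + (1.0 - eta) * pred


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X1 = rng.normal(size=(50, 4))
    X2 = rng.normal(size=(50, 6))
    W1 = normalize_directions(X1, rng.normal(size=(4, 2)))
    W2 = normalize_directions(X2, rng.normal(size=(6, 2)))
    y = rng.normal(size=50)
    print(cvr_objective([X1, X2], [W1, W2], y, 0.0, np.zeros(2), eta=0.5))
```

Under these assumptions, `eta = 1` reduces the criterion to a purely association-driven one and `eta = 0` to a purely predictive one; intermediate values trade the two off.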


Figures

Fig. 1.
Simulation: boxplots of the estimated regression coefficients corresponding to the true canonical variate directions, reported for one simulation setting under the Gaussian response model and one under the binary response model. (a) Gaussian case. (b) Binary case.
Fig. 2.
Applications: pairs of canonical variate directions extracted by canonical variate regression (CVR). The left panel shows the mouse body weight data application, in which the correlation coefficients within the first and second pairs are 0.83 and 0.51, respectively. The right panel shows the alcohol dependence data application, in which the correlation coefficients within the first and second pairs are 0.90 and 0.83, respectively. (a) Mouse body weight data. (b) Alcohol dependence data.
