Bioinformatics. 2021 Nov 5;37(21):3889-3895.
doi: 10.1093/bioinformatics/btab576.

Predicting correlated outcomes from molecular data

Armin Rauschenberger et al. Bioinformatics. 2021.

Abstract

Motivation: Multivariate (multi-target) regression has the potential to outperform univariate (single-target) regression at predicting correlated outcomes, which frequently occur in biomedical and clinical research. Here we implement multivariate lasso and ridge regression using stacked generalization.

Results: Our flexible approach leads to predictive and interpretable models in high-dimensional settings, with a single estimate for each input-output effect. In the simulation, we compare the predictive performance of several state-of-the-art methods for multivariate regression. In the application, we use clinical and genomic data to predict multiple motor and non-motor symptoms in Parkinson's disease patients. We conclude that stacked multivariate regression, with our adaptations, is a competitive method for predicting correlated outcomes.

Availability and implementation: The R package joinet is available on GitHub (https://github.com/rauschenberger/joinet) and CRAN (https://cran.r-project.org/package=joinet).

Supplementary information: Supplementary data are available at Bioinformatics online.

Figures

Fig. 1.
To estimate the effect of input j on the linear predictor for output k, we first estimate the effects of input j on the linear predictor for each output (base layer) and then estimate the effects of all cross-validated linear predictors on the linear predictor for output k (meta layer)
Fig. 2.
Spearman’s rank correlation coefficients. Left: correlation between outputs from the same tool at different visits (averaged across combinations of visits). Right: correlation between outputs from different tools at the same visit (averaged across visits)
Fig. 3.
Percentage change in cross-validated mean squared error from univariate to multivariate regression. Left: we support the prediction for one tool at one visit with the same tool at other visits. Right: we support the prediction for one tool at one visit (row) with another tool at the same visit (column). All values are averaged across visits (1/2/3), regularization methods (lasso/ridge) and data types (clinical/omics/both), i.e. 18 settings
Fig. 4.
Percentage change in cross-validated mean squared error from prediction by the mean to univariate (first row) and multivariate regression (second row: support from other visits; third row: support from other tool). All values are averaged across visits (1/2/3), regularization methods (lasso/ridge), data types (clinical/omics/both) and coaching variables (third row only), i.e. 18 settings (first and second row) or 18 × 7 = 126 settings (third row)
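
The two layers described in the Fig. 1 caption can be sketched roughly as follows. This is a minimal illustration in Python with NumPy, not the joinet implementation: it swaps in closed-form ridge regression for both layers (the abstract mentions both lasso and ridge), assumes centred inputs and outputs so that intercepts can be dropped, and uses fixed penalties instead of tuned ones. The names `ridge_fit`, `stacked_ridge`, `pct_change_mse`, `lam_base` and `lam_meta` are hypothetical. The last function shows the percentage change in mean squared error that Figs. 3 and 4 report.

```python
import numpy as np


def ridge_fit(X, y, lam):
    """Closed-form ridge coefficients (no intercept; data assumed centred)."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)


def stacked_ridge(X, Y, lam_base=1.0, lam_meta=1.0, n_folds=5, seed=0):
    """Two-layer stacking sketch.

    Returns a (p, q) matrix holding one combined coefficient
    for each input-output pair.
    """
    n, p = X.shape
    q = Y.shape[1]
    rng = np.random.default_rng(seed)
    folds = rng.permutation(n) % n_folds  # random, balanced fold labels

    # Base layer: effect of every input on every output,
    # one ridge model per output, fitted on all samples.
    B = np.column_stack([ridge_fit(X, Y[:, k], lam_base) for k in range(q)])

    # Cross-validated linear predictors: each sample is predicted
    # by base models that were fitted without it.
    Z = np.empty((n, q))
    for f in range(n_folds):
        held_out = folds == f
        for k in range(q):
            b = ridge_fit(X[~held_out], Y[~held_out, k], lam_base)
            Z[held_out, k] = X[held_out] @ b

    # Meta layer: regress each output on all q cross-validated predictors.
    W = np.column_stack([ridge_fit(Z, Y[:, k], lam_meta) for k in range(q)])

    # Predictions are X @ B @ W, so B @ W collapses both layers
    # into a single estimate per input-output effect.
    return B @ W


def pct_change_mse(mse_reference, mse_new):
    """Percentage change in mean squared error relative to a reference."""
    return 100.0 * (mse_new - mse_reference) / mse_reference
```

Fitting the meta layer on cross-validated predictors, not on in-sample fitted values, keeps it from simply rewarding the base model that overfits most. A negative value from `pct_change_mse` means the new model has a lower error than the reference.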
