Multiset correlation and factor analysis enables exploration of multi-omics data
- PMID: 37601969
- PMCID: PMC10435377
- DOI: 10.1016/j.xgen.2023.100359
Abstract
Multi-omics datasets are becoming more common, necessitating better integration methods to realize their revolutionary potential. Here, we introduce multi-set correlation and factor analysis (MCFA), an unsupervised integration method tailored to the unique challenges of high-dimensional genomics data that enables fast inference of shared and private factors. We used MCFA to integrate methylation markers, protein expression, RNA expression, and metabolite levels in 614 diverse samples from the Trans-Omics for Precision Medicine/Multi-Ethnic Study of Atherosclerosis multi-omics pilot. Samples cluster strongly by ancestry in the shared space, even in the absence of genetic information, while private spaces frequently capture dataset-specific technical variation. Finally, we integrated genetic data by conducting a genome-wide association study (GWAS) of our inferred factors, observing that several factors are enriched for GWAS hits and trans-expression quantitative trait loci. Two of these factors appear to be related to metabolic disease. Our study provides a foundation and framework for further integrative analysis of ever larger multi-modal genomic datasets.
© 2023 The Authors.
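As a rough illustration of what "shared" and "private" factors mean, the sketch below separates the two kinds of variation with a generic construction. It is not the authors' MCFA implementation. It assumes that each dataset is first reduced to unit-variance principal components, that shared factors are taken from an SVD of the concatenated components (a MAXVAR-style multiset CCA), and that private factors are taken from a PCA of what remains after the shared factors are regressed out. The function names, parameters, and simulated data are all hypothetical.

```python
import numpy as np

def whiten_top_pcs(X, k):
    """Top-k principal component scores of X, each scaled to unit sample variance."""
    Xc = X - X.mean(axis=0)
    U, _, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :k] * np.sqrt(X.shape[0] - 1)

def shared_and_private(datasets, k_pcs, d_shared, d_private):
    """Illustrative split into shared and private factors (not the MCFA algorithm).

    datasets: list of (n_samples x n_features) arrays over the same samples.
    """
    # Shared space: leading left singular vectors of the concatenated whitened PCs.
    Z = np.hstack([whiten_top_pcs(X, k_pcs) for X in datasets])
    U, _, _ = np.linalg.svd(Z, full_matrices=False)
    shared = U[:, :d_shared] * np.sqrt(Z.shape[0] - 1)

    # Private spaces: PCA of each dataset after regressing out the shared factors.
    private = []
    for X in datasets:
        Xc = X - X.mean(axis=0)
        beta, *_ = np.linalg.lstsq(shared, Xc, rcond=None)
        resid = Xc - shared @ beta
        Ur, sr, _ = np.linalg.svd(resid, full_matrices=False)
        private.append(Ur[:, :d_private] * sr[:d_private])
    return shared, private

# Simulated example: three "omics" blocks driven by two common latent variables
# plus one block-specific latent variable each (all values are made up).
rng = np.random.default_rng(0)
n = 200
z = rng.normal(size=(n, 2))
blocks = []
for p in (30, 40, 50):
    own = rng.normal(size=(n, 1))
    X = z @ rng.normal(size=(2, p)) + own @ rng.normal(size=(1, p))
    blocks.append(X + 0.5 * rng.normal(size=(n, p)))

shared, private = shared_and_private(blocks, k_pcs=10, d_shared=2, d_private=1)
print(shared.shape, [P.shape for P in private])
```

In this construction the private scores are orthogonal to the shared factors by design, which mirrors the abstract's distinction between variation common to all data types and variation specific to one of them.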
Conflict of interest statement
T.L. is a paid adviser or consultant of GSK, Pfizer, and Goldfinch Bio and has equity in Variant Bio. F.A. is an employee and shareholder of Illumina, Inc.
Grants and funding
- U54 HG003067/HG/NHGRI NIH HHS/United States
- 75N92020D00001/HL/NHLBI NIH HHS/United States
- R01 MH106842/MH/NIMH NIH HHS/United States
- N01 HC095167/HL/NHLBI NIH HHS/United States
- R01 AG057422/AG/NIA NIH HHS/United States
- P30 DK063491/DK/NIDDK NIH HHS/United States
- R01 HL121270/HL/NHLBI NIH HHS/United States
- R01 HL142028/HL/NHLBI NIH HHS/United States
- UL1 TR000040/TR/NCATS NIH HHS/United States
- N01 HC095166/HL/NHLBI NIH HHS/United States
- N01 HC095160/HL/NHLBI NIH HHS/United States
- 75N92020D00002/HL/NHLBI NIH HHS/United States
- HHSN268201500003C/HL/NHLBI NIH HHS/United States
- P30 DK040561/DK/NIDDK NIH HHS/United States
- N01 HC095161/HL/NHLBI NIH HHS/United States
- 75N92020D00005/HL/NHLBI NIH HHS/United States
- N01 HC095168/HL/NHLBI NIH HHS/United States
- R01 HL120393/HL/NHLBI NIH HHS/United States
- UL1 TR001079/TR/NCATS NIH HHS/United States
- N01 HC095169/HL/NHLBI NIH HHS/United States
- R01 HL077612/HL/NHLBI NIH HHS/United States
- K99 HG012373/HG/NHGRI NIH HHS/United States
- N01 HC095159/HL/NHLBI NIH HHS/United States
- 75N92020D00003/HL/NHLBI NIH HHS/United States
- R01 HL093081/HL/NHLBI NIH HHS/United States
- R01 HL105756/HL/NHLBI NIH HHS/United States
- UL1 TR001420/TR/NCATS NIH HHS/United States
- 75N92020D00004/HL/NHLBI NIH HHS/United States
- N01 HC095163/HL/NHLBI NIH HHS/United States
- 75N92020D00007/HL/NHLBI NIH HHS/United States
- U01 AG068880/AG/NIA NIH HHS/United States
- HHSN268201500003I/HL/NHLBI NIH HHS/United States
- 75N92020D00006/HL/NHLBI NIH HHS/United States
- R01 HL117626/HL/NHLBI NIH HHS/United States
- N01 HC095162/HL/NHLBI NIH HHS/United States
- UL1 TR001881/TR/NCATS NIH HHS/United States
- N01 HC095165/HL/NHLBI NIH HHS/United States
- N01 HC095164/HL/NHLBI NIH HHS/United States