PLoS Genet. 2025 Jan 7;21(1):e1011519.
doi: 10.1371/journal.pgen.1011519. eCollection 2025 Jan.

Improving polygenic prediction from summary data by learning patterns of effect sharing across multiple phenotypes

Deborah Kunkel et al. PLoS Genet. 2025.

Abstract

Polygenic prediction of complex trait phenotypes has become important in human genetics, especially in the context of precision medicine. The recently introduced mr.mash is a flexible and computationally efficient method that models multiple phenotypes jointly and leverages sharing of effects across those phenotypes to improve prediction accuracy. However, a drawback of mr.mash is that it requires individual-level data, which are often not publicly available. In this work, we introduce mr.mash-rss, an extension of the mr.mash model that requires only summary statistics from genome-wide association studies (GWAS) and linkage disequilibrium (LD) estimates from a reference panel. By using summary data, we achieve the twin goals of increasing the applicability of the mr.mash model to data sets that are not publicly available and making it scalable to biobank-size data. Through simulations, we show that mr.mash-rss is competitive with, and often outperforms, current state-of-the-art methods for single- and multi-phenotype polygenic prediction in a variety of scenarios that differ in the pattern of effect sharing across phenotypes, the number of phenotypes, the number of causal variants, and the genomic heritability. We also present a real data analysis of 16 blood cell phenotypes in the UK Biobank, showing that mr.mash-rss achieves higher prediction accuracy than competing methods for the majority of traits, especially when the data set has a smaller sample size.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Prediction accuracy in simulations with different patterns of effect sharing across phenotypes.
Each panel summarizes the accuracy of the test set predictions in 20 simulations. The thick, black line in each box gives the median R². The dotted and dashed lines give the maximum accuracy achievable, i.e., the simulated h_g².
Fig 2. Prediction accuracy in simulations with different genetic architecture.
Each panel summarizes the accuracy of the test set predictions in 20 simulations. The thick, black line in each box gives the median R². The dotted lines give the maximum accuracy achievable, i.e., the simulated h_g².
Fig 3. Prediction accuracy for the 16 blood cell traits in the full UK Biobank data.
The thick, black line in each box gives the median R².
Fig 4. Relationship between improvement in prediction accuracy and genomic heritability in the full UK Biobank data.
Phenotypes are plotted along the x-axis by their genomic heritability (h_g²) and along the y-axis by the change in R² relative to LDpred2-auto (Panel A) and SBayesR (Panel B); that is, (R²(mr.mash-rss) - R²(other method)) / R²(other method). The blue line represents the linear regression fit with 95% confidence bands.
Fig 5. Prediction accuracy for the 16 blood cell traits in the sampled UK Biobank data.
The thick, black line in each box gives the median R².
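
The y-axis quantity in Fig 4, the change in R² relative to a competing method, can be written out as a short computation. The following is a minimal sketch only: the function name and the example R² values are hypothetical and are not taken from the paper or from any mr.mash software.

```python
def relative_r2_change(r2_method: float, r2_other: float) -> float:
    """Relative change in R^2, as in the Fig 4 caption:
    (R2(method) - R2(other method)) / R2(other method).
    """
    if r2_other <= 0:
        # The ratio is undefined (or misleading) for a non-positive baseline.
        raise ValueError("r2_other must be positive")
    return (r2_method - r2_other) / r2_other


# Hypothetical values for illustration only (not results from the paper).
example_change = relative_r2_change(r2_method=0.33, r2_other=0.30)
print(f"relative change in R^2: {example_change:.3f}")
```

A positive value means the first method predicted more accurately than the baseline; dividing by the baseline R² makes the improvement comparable across phenotypes with different heritability.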
