Testing for differences in polygenic scores in the presence of confounding

Jennifer Blanc¹, Jeremy J Berg¹

PMID: 40233174
PMCID: PMC12135188 (available on 2026-04-15)
DOI: 10.1093/genetics/iyaf071

Jennifer Blanc et al. Genetics. 2025 Jun 4;230(2):iyaf071. doi: 10.1093/genetics/iyaf071.

Affiliation

¹ Department of Human Genetics, University of Chicago, 920 E 58th St CLSC, Chicago, IL 60637, USA.

Abstract

Polygenic scores have become an important tool in human genetics, enabling the prediction of individuals' phenotypes from their genotypes. Understanding how the pattern of differences in polygenic score predictions across individuals intersects with variation in ancestry can provide insights into the evolutionary forces acting on the trait in question and is important for understanding health disparities. However, because most polygenic scores are computed using effect estimates from population samples, they are susceptible to confounding by both genetic and environmental effects that are correlated with ancestry. The extent to which this confounding drives patterns in the distribution of polygenic scores depends on the patterns of population structure in both the original estimation panel and in the prediction/test panel. Here, we use theory from population and statistical genetics, together with simulations, to study the procedure of testing for an association between polygenic scores and axes of ancestry variation in the presence of confounding. We use a general model of genetic relatedness to describe how confounding in the estimation panel biases the distribution of polygenic scores in ways that depend on the degree of overlap in population structure between panels. We then show how this confounding can bias tests for associations between polygenic scores and important axes of ancestry variation in the test panel. Specifically, for any given test, there exists a single axis of population structure in the genome-wide association study (GWAS) panel that needs to be controlled for in order to protect the test. In the context of this result, we study the behavior of multiple approaches to control for stratification along this axis, including standard methods such as using principal components as fixed covariates in the GWAS, linear mixed models, and a novel approach for directly estimating the axis using the test panel genotypes. Our analyses highlight the role of estimation noise in the models of population structure as a plausible source of residual confounding in polygenic score analyses.
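As a rough illustration of the kind of test the abstract describes, the following Python sketch computes polygenic scores in a test panel and measures their covariance with a chosen axis of ancestry variation. It is a minimal sketch under stated assumptions, not the authors' procedure: the function name `pgs_ancestry_test`, its arguments, and the sign-flipping null are all hypothetical choices made here for illustration, and the sketch does nothing to correct for confounding in the effect estimates.

```python
import numpy as np

def pgs_ancestry_test(genotypes, effect_estimates, test_vector, n_perm=1000, seed=0):
    """Hypothetical helper: association between polygenic scores and an ancestry axis.

    genotypes        : (n_individuals, n_sites) array of test-panel genotypes
    effect_estimates : (n_sites,) array of GWAS effect estimates
    test_vector      : (n_individuals,) array encoding the ancestry axis of interest
                       (e.g. a population label contrast or a geographic coordinate)
    """
    rng = np.random.default_rng(seed)
    # Mean-center genotypes per site and mean-center the test vector.
    g = genotypes - genotypes.mean(axis=0)
    t = test_vector - test_vector.mean()
    # Polygenic scores are a weighted sum of centered genotypes.
    scores = g @ effect_estimates
    # Test statistic: covariance between scores and the ancestry axis.
    stat = float(t @ scores) / len(t)
    # Assumed null: randomly flip the sign of each effect estimate.
    null = np.empty(n_perm)
    for i in range(n_perm):
        signs = rng.choice([-1.0, 1.0], size=effect_estimates.shape[0])
        null[i] = float(t @ (g @ (signs * effect_estimates))) / len(t)
    # Two-sided empirical p-value with the usual +1 correction.
    p_value = (1 + np.sum(np.abs(null) >= abs(stat))) / (n_perm + 1)
    return stat, p_value

# Usage on simulated data (all values here are made up for illustration).
rng = np.random.default_rng(42)
G = rng.binomial(2, 0.3, size=(200, 500)).astype(float)
beta_hat = rng.normal(scale=0.05, size=500)
T = np.r_[np.ones(100), -np.ones(100)]  # contrast between two hypothetical groups
stat, p = pgs_ancestry_test(G, beta_hat, T)
print(stat, p)
```

A test like this is only as trustworthy as the effect estimates it is given: if stratification in the GWAS panel along the relevant axis was not controlled, the statistic can be biased, which is the situation the abstract analyzes.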

Keywords: confounding; polygenic scores; population structure.

© The Author(s) 2025. Published by Oxford University Press on behalf of The Genetics Society of America. All rights reserved. For commercial re-use, please contact reprints@oup.com for reprints and translation rights for reprints. All other permissions can be obtained through our RightsLink service via the Permissions link on the article page on our site—for further information please contact journals.permissions@oup.com.

Conflict of interest statement

Conflicts of interest: The author(s) declare no conflicts of interest.

Update of

Testing for differences in polygenic scores in the presence of confounding.
Blanc J, Berg JJ. bioRxiv [Preprint]. 2024 Jun 26:2023.03.12.532301. doi: 10.1101/2023.03.12.532301. Update in: Genetics. 2025 Jun 4;230(2):iyaf071. doi: 10.1093/genetics/iyaf071. PMID: 36993707.
