A resource-efficient tool for mixed model association analysis of large-scale data
- PMID: 31768069
- DOI: 10.1038/s41588-019-0530-8
A resource-efficient tool for mixed model association analysis of large-scale data
Abstract
The genome-wide association study (GWAS) has been widely used as an experimental design to detect associations between genetic variants and a phenotype. Two major confounding factors, population stratification and relatedness, could potentially lead to inflated GWAS test statistics and hence to spurious associations. Mixed linear model (MLM)-based approaches can be used to account for sample structure. However, genome-wide association (GWA) analyses in biobank samples such as the UK Biobank (UKB) often exceed the capability of most existing MLM-based tools especially if the number of traits is large. Here, we develop an MLM-based tool (fastGWA) that controls for population stratification by principal components and for relatedness by a sparse genetic relationship matrix for GWA analyses of biobank-scale data. We demonstrate by extensive simulations that fastGWA is reliable, robust and highly resource-efficient. We then apply fastGWA to 2,173 traits on array-genotyped and imputed samples from 456,422 individuals and to 2,048 traits on whole-exome-sequenced samples from 46,191 individuals in the UKB.
References
-
- Buniello, A. et al. The NHGRI-EBI GWAS Catalog of published genome-wide association studies, targeted arrays and summary statistics 2019. Nucleic Acids Res. 47, D1005–D1012 (2019). - PubMed
-
- DeWan, A. et al. HTRA1 promoter polymorphism in wet age-related macular degeneration. Science 314, 989–992 (2006). - PubMed
-
- Burton, P. R. et al. Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature 447, 661–678 (2007).
Publication types
MeSH terms
Grants and funding
LinkOut - more resources
Full Text Sources
Other Literature Sources