A generalized linear mixed model association tool for biobank-scale data

Longda Jiang^#¹ ², Zhili Zheng^#¹, Hailing Fang² ³, Jian Yang⁴ ⁵ ⁶

Affiliations

¹ Institute for Molecular Bioscience, University of Queensland, Brisbane, Queensland, Australia.
² School of Life Sciences, Westlake University, Hangzhou, China.
³ Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, China.
⁴ Institute for Molecular Bioscience, University of Queensland, Brisbane, Queensland, Australia. jian.yang@westlake.edu.cn.
⁵ School of Life Sciences, Westlake University, Hangzhou, China. jian.yang@westlake.edu.cn.
⁶ Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, China. jian.yang@westlake.edu.cn.

^# Contributed equally.

PMID: 34737426
DOI: 10.1038/s41588-021-00954-4

Longda Jiang et al. Nat Genet. 2021 Nov.

Nat Genet. 2021 Nov;53(11):1616-1621.

doi: 10.1038/s41588-021-00954-4. Epub 2021 Nov 4.

Abstract

Compared with linear mixed model-based genome-wide association (GWA) methods, generalized linear mixed model (GLMM)-based methods have better statistical properties when applied to binary traits but are computationally much slower. In the present study, leveraging efficient sparse matrix-based algorithms, we developed a GLMM-based GWA tool, fastGWA-GLMM, that is severalfold to orders of magnitude faster than the state-of-the-art tools when applied to the UK Biobank (UKB) data and scalable to cohorts with millions of individuals. We show by simulation that the fastGWA-GLMM test statistics of both common and rare variants are well calibrated under the null, even for traits with extreme case-control ratios. We applied fastGWA-GLMM to the UKB data of 456,348 individuals, 11,842,647 variants and 2,989 binary traits (full summary statistics available at http://fastgwa.info/ukbimpbin), and identified 259 rare variants associated with 75 traits, demonstrating the use of imputed genotype data in a large cohort to discover rare variants for binary complex traits.
