Nat Genet. 2022 Mar;54(3):263-273.
doi: 10.1038/s41588-021-00997-7. Epub 2022 Mar 7.

Assessing the contribution of rare variants to complex trait heritability from whole-genome sequence data

Pierrick Wainschtein et al. Nat Genet. 2022 Mar.

Abstract

Analyses of data from genome-wide association studies on unrelated individuals have shown that, for human traits and diseases, approximately one-third to two-thirds of heritability is captured by common SNPs. However, it is not known whether the remaining heritability is due to the imperfect tagging of causal variants by common SNPs, in particular whether the causal variants are rare, or whether it is overestimated due to bias in inference from pedigree data. Here we estimated heritability for height and body mass index (BMI) from whole-genome sequence data on 25,465 unrelated individuals of European ancestry. The estimated heritability was 0.68 (standard error 0.10) for height and 0.30 (standard error 0.10) for body mass index. Low minor allele frequency variants in low linkage disequilibrium (LD) with neighboring variants were enriched for heritability, to a greater extent for protein-altering variants, consistent with negative selection. Our results imply that rare variants, in particular those in regions of low linkage disequilibrium, are a major source of the still missing heritability of complex traits and disease.

Conflict of interest statement

The remaining authors declare no competing interests.

Figures

Figure 1:
GREML-LDMS estimates with 8 bins (2 LD bins for each of the 4 MAF bins), correcting for 20 PCs (calculated from LD-pruned HM3 SNPs), after imputing SNPs from Illumina InfiniumCore24, GSA 24 and Affymetrix Axiom arrays using Haplotype Reference Consortium reference panels for N=25,465 samples. (A) Estimates of $h^2_{G+IMP}$ for height are between 0.50 and 0.56 (SE 0.06-0.07). (B) Estimates for BMI are between 0.16 and 0.21 (SE 0.07). The large SEs of the estimates for variants with MAF between 0.0001 and 0.001 can be explained by the large number of imputed variants in this MAF bin, because the sampling variance of a SNP-based heritability estimate is proportional to the effective number of independent variants. Between ~19.0M and ~20.0M variants in total are included in the analysis. The number of variants in each of the 4 MAF bins (twice the number in each LD bin) can be found in Supplementary Figure 8.
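The proportionality between sampling variance and the effective number of independent variants can be illustrated with a commonly used large-sample approximation for a single genetic component, var(h2) ≈ 2·M_e / N². This formula is not stated in the caption; it is brought in here as an assumption, and it ignores the multi-component structure of GREML-LDMS. The M_e values below are hypothetical and chosen only to show the trend.

```python
import math


def approx_se_h2(n_samples: int, m_effective: float) -> float:
    """Rough SE of a single-component SNP-heritability estimate.

    Assumes var(h2_hat) ~= 2 * M_e / N^2 (an approximation for unrelated
    individuals, not a formula taken from the caption).
    """
    if n_samples <= 0 or m_effective <= 0:
        raise ValueError("n_samples and m_effective must be positive")
    return math.sqrt(2.0 * m_effective) / n_samples


N = 25_465  # sample size reported in the caption

# Hypothetical effective numbers of independent variants (illustration only).
for m_e in (5e4, 5e5, 5e6):
    print(f"M_e = {m_e:.0e}  approx SE = {approx_se_h2(N, m_e):.3f}")
```

Under this approximation, a bin with ten times more effectively independent variants has a standard error larger by a factor of about √10 at fixed N, which is the direction of the effect the caption describes for the rarest imputed MAF bin.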
Figure 2:
GREML-LDMS estimates for height and BMI for N=25,465 samples, using 3 or 4 LD groups for each MAF bin and correcting for 48, 160 or 320 PCs computed from WGS variants. Each variant was allocated to a tertile or a quartile according to its LD score. (A) Estimates using 3 LD bins for height: 0.68 (SE 0.09-0.10). (B) Estimates using 3 LD bins for BMI: 0.30-0.32 (SE 0.10). (C) Estimates using 4 LD bins for height: 0.67-0.68 (SE 0.10). (D) Estimates using 4 LD bins for BMI: 0.28-0.30 (SE 0.10).
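The allocation of variants to MAF bins and LD-score tertiles or quartiles can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `assign_ldms_bins`, the MAF bin edges (in particular the 0.01 edge) and the choice to compute LD-score quantiles within each MAF bin are assumptions, and the input data are simulated.

```python
import numpy as np

# Assumed MAF bin edges; 0.0001, 0.001 and 0.1 appear in the captions,
# the 0.01 edge is hypothetical.
ASSUMED_MAF_EDGES = [0.0001, 0.001, 0.01, 0.1, 0.5]


def assign_ldms_bins(maf, ld_score, maf_edges, n_ld_groups=3):
    """Return (maf_bin, ld_bin) index arrays, one entry per variant.

    maf_bin runs from 0 to len(maf_edges) - 2; ld_bin runs from 0 to
    n_ld_groups - 1 and is computed by rank within each MAF bin.
    Variants below the first edge are not filtered; they fall in bin 0.
    """
    maf = np.asarray(maf, dtype=float)
    ld_score = np.asarray(ld_score, dtype=float)
    # Interior edges only: bin i holds edge[i] < maf <= edge[i + 1].
    maf_bin = np.digitize(maf, maf_edges[1:-1], right=True)
    ld_bin = np.empty(maf.shape, dtype=int)
    for b in np.unique(maf_bin):
        idx = np.flatnonzero(maf_bin == b)
        # Rank LD scores inside this MAF bin, then cut ranks into
        # n_ld_groups groups of (nearly) equal size.
        order = np.argsort(ld_score[idx], kind="stable")
        ranks = np.empty(idx.size, dtype=int)
        ranks[order] = np.arange(idx.size)
        ld_bin[idx] = ranks * n_ld_groups // idx.size
    return maf_bin, ld_bin


# Simulated variants (illustration only).
rng = np.random.default_rng(0)
maf = 10 ** rng.uniform(-4, np.log10(0.5), size=1000)
ld = rng.gamma(2.0, 50.0, size=1000)
maf_bin, ld_bin = assign_ldms_bins(maf, ld, ASSUMED_MAF_EDGES, n_ld_groups=3)
for b in range(len(ASSUMED_MAF_EDGES) - 1):
    print("MAF bin", b, "LD group sizes", np.bincount(ld_bin[maf_bin == b], minlength=3))
```

Setting `n_ld_groups=4` gives the quartile version; each (MAF bin, LD group) pair would then correspond to one genetic component in the analysis.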
Figure 3:
Variance explained per variant (the estimate of genetic variance divided by the number of variants in each bin) from GREML-LDMS, with the low-LD, low-MAF (< 0.1) variants partitioned into 2 distinct categories according to the SnpEff putative effect of the variant (protein-altering or non-protein-altering), correcting for 48 PCs from WGS variants for N=25,465 samples. There are 11 genetic components in total in this analysis. There is an apparent enrichment of heritability in the protein-altering (low-LD) groupings over the non-protein-altering (low-LD) or high-LD variants for height (A) and for BMI (B), although the standard errors for BMI are large.
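The per-variant quantity plotted in this figure is a simple ratio, sketched below. The function name and all numeric inputs are hypothetical and are not values from the paper; the standard error is scaled by the same constant because the number of variants in a bin is treated as fixed.

```python
def per_variant_variance(var_component, se_component, n_variants):
    """Divide a bin's genetic variance estimate (and its SE) by the
    number of variants in that bin."""
    if n_variants <= 0:
        raise ValueError("n_variants must be positive")
    return var_component / n_variants, se_component / n_variants


# Hypothetical bins: (variance estimate, SE, number of variants).
# These numbers are made up for illustration only.
bins = {
    "protein-altering, low LD": (0.02, 0.01, 50_000),
    "non-protein-altering, low LD": (0.10, 0.03, 5_000_000),
}

per_variant = {
    name: per_variant_variance(v, se, m) for name, (v, se, m) in bins.items()
}
for name, (v_pv, se_pv) in per_variant.items():
    print(f"{name}: {v_pv:.2e} per variant (SE {se_pv:.2e})")

# Ratio of per-variant variance between the two hypothetical bins.
enrichment = (
    per_variant["protein-altering, low LD"][0]
    / per_variant["non-protein-altering, low LD"][0]
)
print(f"enrichment ratio: {enrichment:.1f}")
```

A bin can explain little total variance and still show a high per-variant value when it contains few variants, which is why the figure normalizes by bin size before comparing annotation categories.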
