Nat Commun. 2022 May 9;13(1):2532.
doi: 10.1038/s41467-022-30208-8.

Contribution of rare whole-genome sequencing variants to plasma protein levels and the missing heritability

Marcin Kierczak et al. Nat Commun. 2022.

Abstract

Despite the success of genome-wide association studies (GWAS), much of the genetic contribution to complex traits remains unexplained. Here, we analyse high-coverage whole-genome sequencing data to evaluate the contribution of rare genetic variants to the levels of 414 plasma proteins. The frequency distribution of genetic variants is skewed towards the rare spectrum, and damaging variants are more often rare. We estimate that less than 4.3% of the narrow-sense heritability is expected to be explained by rare variants in our cohort. Using a gene-based approach, we identify Cis-associations for 237 of the proteins, slightly more than with a GWAS (N = 213), and we identify 34 associated loci in Trans. Several associations are driven by rare variants, which have larger effects on average. We therefore conclude that rare variants could be of importance for precision medicine applications, but have a more limited contribution to the missing heritability of complex diseases.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. Distribution of MAF, CADD/Eigen values, and fraction of variances across MAF-bins for SNVs and indels.
In all panels, the dark grey area indicates the rare variants, as defined in our analyses (MAF ≤ 0.0239), and the light grey area indicates the common variants (MAF > 0.0239). a The MAF distribution of the variants identified in the NSPHS. The bars represent the proportion of variants with a MAF within each frequency bin. b Distribution of per-sample allele counts for different MAF-bins. The bars represent the proportion of alleles per sample belonging to the different MAF-bins. Averages across all samples are shown (the N = 1021 samples with WGS data were used to derive statistics), and the error bars represent the 95% width of the distribution in the cohort. c The fraction of the SNVs and indels being rare vs. common, for different CADD and Eigen values. d The proportion of genotype variance that can be attributed to variants within the MAF-bins. Each bar represents the sum of all genotype variances for variants within the MAF-bin divided by the sum of genotype variances across all variants. e, f Proportion of additive genetic variance (narrow-sense heritability) that can be attributed to variants in different MAF-bins when allelic effect sizes are weighted by e CADD values and f Eigen values.
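The ratio described for panel d can be sketched in a few lines of Python. This is only an illustration, not the authors' pipeline: it assumes Hardy-Weinberg equilibrium, so that the genotype variance of a biallelic variant is 2p(1 − p), and the function names and example MAF values are hypothetical. Only the rare/common cut-off of 0.0239 comes from the caption.

```python
from bisect import bisect_left

RARE_THRESHOLD = 0.0239  # rare/common cut-off stated in the caption


def genotype_variance(maf):
    """Variance of a 0/1/2 genotype, assuming Hardy-Weinberg equilibrium."""
    return 2.0 * maf * (1.0 - maf)


def variance_fraction_by_bin(mafs, upper_edges):
    """Share of the total genotype variance falling in each MAF bin.

    upper_edges holds the inclusive upper bound of each bin in ascending
    order; the last edge should be 0.5, the largest possible MAF.
    """
    totals = [0.0] * len(upper_edges)
    for maf in mafs:
        # First bin whose upper bound is >= maf, so a variant exactly at
        # the threshold is counted as rare (MAF <= 0.0239).
        totals[bisect_left(upper_edges, maf)] += genotype_variance(maf)
    grand_total = sum(totals)
    return [t / grand_total for t in totals]


# Made-up MAF values, for illustration only.
example_mafs = [0.001, 0.01, 0.0239, 0.1, 0.3, 0.5]
fractions = variance_fraction_by_bin(example_mafs, [RARE_THRESHOLD, 0.5])
print(dict(zip(["rare", "common"], fractions)))
```

Because 2p(1 − p) grows with p up to 0.5, each rare variant contributes little variance, even though rare variants are more numerous.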
Fig. 2. MAFs and effect sizes for the lead GWAS SNVs and indels.
a Distribution of MAFs for the lead GWAS-significant (Wald-test, p < 5.00 × 10⁻⁸) primary and conditional hits. A MAF threshold of 0.01 was used in the GWAS and, consequently, no GWAS hits had a MAF below 0.01. b Effect sizes from the GWAS, in relation to MAF for the primary and conditional GWAS hits. All effect estimates are reported as absolute values.
Fig. 3. Overview of the SNV-sets, and SKAT tests performed, as well as overlap between the results.
a Overview of the SNV-sets, SKAT tests performed, and overlap between the results for the different SNV-sets. The Venn diagram shows the number of overlapping loci with any significant SKAT association between the different SNV-sets. For Cis-associations, a p-value of 5.88 × 10⁻⁶ was used as the significance threshold, while for Trans-associations a threshold p-value of 4.67 × 10⁻¹⁰ was adopted. b Fraction of loci identified in the different models within each SNV-set. A total of 198, 190, 182, 33, and 27 loci were identified with the five SNV-sets, respectively (N in the legend). Each bar represents the fraction of these N loci that were significant for the different SKAT models. The seven models are: (1) Unweighted; (2) CADD or Eigen weighted; (3) MAF weighted, β(1, 25); (4) MAF weighted, β(1, 5); (5) MAF weighted, β(0.5, 0.5); (6) CommonRare; (7) Rare only.
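To give a feel for the three MAF-weighted models, the sketch below evaluates a beta density at a few MAF values. It assumes the common SKAT convention in which a variant's weight is the Beta(a, b) density evaluated at its MAF; the caption does not spell out the implementation, so treat this as an assumption. The helper name `beta_density` and the example MAFs are hypothetical.

```python
import math


def beta_density(x, a, b):
    """Beta(a, b) probability density at x, for 0 < x < 1."""
    # Work on the log scale so the normalising constant stays stable.
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))


# The three beta parameterisations named in the caption.
schemes = {"beta(1, 25)": (1, 25), "beta(1, 5)": (1, 5), "beta(0.5, 0.5)": (0.5, 0.5)}

# Made-up MAF values, for illustration only.
for maf in (0.001, 0.0239, 0.1, 0.4):
    weights = {name: round(beta_density(maf, a, b), 4) for name, (a, b) in schemes.items()}
    print(maf, weights)
```

Under this convention, β(1, 25) strongly up-weights rare variants, β(1, 5) does so more mildly, and β(0.5, 0.5) still gives rarer variants somewhat larger weights but is much flatter over the common range.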
Fig. 4. Overlap between the identified loci for the different SNV-sets and the GWAS.
The number of loci is the total number of independent loci identified with each SNV-set or GWAS, and the overlap size is the number of loci that overlap between SNV-sets and GWAS for: a Cis-associations, and b Trans-associations. For the Cis-associations, a p-value (SKAT-test) of 5.88 × 10⁻⁶ was used as the significance threshold for the SKAT analyses, and a p-value (Wald-test) of 3.00 × 10⁻⁸ for the GWAS. For Trans-associations, a p-value (SKAT-test) of 4.67 × 10⁻¹⁰ was used as the significance threshold for the SKAT analyses, and a p-value (Wald-test) of 3.92 × 10⁻¹¹ for the GWAS.
Fig. 5. Regional plots for three proteins.
The grey circles are the −log10 p-values (Wald-test) from the GWAS. Horizontal lines indicate the −log10 p-values (SKAT-test) for the Trans-Flank-sets in black and Trans-CDS-sets in blue. Multiple, partly overlapping Trans-Flank-sets were analysed. a Results for TNFRSF10C levels, b VWC2 levels, and c NEP levels.
Fig. 6. Overlap between the total number of associations for the different SKAT models.
The bars to the left represent the total number of associations per model, and the bars at the top (overlap size) represent the number of associations that overlap between the different models. a The small SNV-sets (CDS-sets and Reg-sets) in the NSPHS, b the larger Flank-SNV-sets, and c the CDS-sets in the UKB.