Nat Commun. 2022 May 9;13(1):2532.

doi: 10.1038/s41467-022-30208-8.

Contribution of rare whole-genome sequencing variants to plasma protein levels and the missing heritability

Affiliations

¹ Department of Cell and Molecular Biology, National Bioinformatics Infrastructure Sweden, Science for Life Laboratory, Uppsala University, Uppsala, Sweden.
² Department of Medical Biochemistry and Microbiology, National Bioinformatics Infrastructure Sweden, Science for Life Laboratory, Uppsala University, Uppsala, Sweden.
³ Department of Immunology, Genetics and Pathology, Science for Life Laboratory, Uppsala University, Uppsala, Sweden.
⁴ Department of Biochemistry and Biophysics, National Bioinformatics Infrastructure Sweden, Science for Life Laboratory, Stockholm University, Stockholm, Sweden.
⁵ Department of Immunology, Genetics and Pathology, Science for Life Laboratory, Uppsala University, Uppsala, Sweden. asa.johansson@igp.uu.se.

^# Contributed equally.

PMID: 35534486
PMCID: PMC9085767
DOI: 10.1038/s41467-022-30208-8

Marcin Kierczak et al. Nat Commun. 2022.

Abstract

Despite the success of genome-wide association studies, much of the genetic contribution to complex traits remains unexplained. Here, we analyse high-coverage whole-genome sequencing data to evaluate the contribution of rare genetic variants to the levels of 414 plasma proteins. The frequency distribution of genetic variants is skewed towards the rare spectrum, and damaging variants are more often rare. We estimate that less than 4.3% of the narrow-sense heritability is expected to be explained by rare variants in our cohort. Using a gene-based approach, we identify Cis-associations for 237 of the proteins, slightly more than in a GWAS (N = 213), and we identify 34 associated loci in Trans. Several associations are driven by rare variants, which have larger effects on average. We therefore conclude that rare variants could be important for precision medicine applications but make a more limited contribution to the missing heritability of complex diseases.

Conflict of interest statement

The authors declare no competing interests.

Figures

**Fig. 1. Distribution of MAF, CADD/Eigen values, and fraction of variances across MAF-bins for SNVs and indels.**
In all figures, the dark grey area indicates the rare variants, as defined in our analyses (MAF ≤ 0.0239), and the light grey area indicates the common variants (MAF > 0.0239). a The MAF distribution of the variants identified in the NSPHS. The bars represent the proportion of variants with a MAF within each frequency bin. b Distribution of per-sample allele counts for different MAF-bins. The bars represent the proportion of alleles per sample belonging to the different MAF-bins. Averages across all samples are shown (the N = 1021 samples with WGS data were used to derive the statistics), and the error bars represent the 95% width of the distribution in the cohort. c The fraction of the SNVs and indels that are rare vs. common, for different CADD and Eigen values. d The proportion of genotype variance that can be attributed to variants within the MAF-bins. Each bar represents the sum of all genotype variances for variants within the MAF-bin divided by the sum of genotype variances across all variants. e, f Proportion of additive genetic variance (narrow-sense heritability) that can be attributed to variants in different MAF-bins when allelic effect sizes are weighted by e CADD values and f Eigen values.
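The proportion described for panel d can be illustrated with a short sketch. It assumes (this is not stated in the caption) that the genotype variance of a biallelic variant is computed as 2p(1 − p) under Hardy-Weinberg equilibrium, where p is the MAF. The function names and the example bin edges are hypothetical and chosen only for illustration.

```python
def genotype_variance(maf):
    """Variance of a 0/1/2 genotype, assuming Hardy-Weinberg equilibrium."""
    return 2.0 * maf * (1.0 - maf)


def variance_fraction_by_bin(mafs, bin_upper_edges):
    """Fraction of the summed genotype variance that falls in each MAF bin.

    bin_upper_edges must be ascending; a variant is assigned to the first
    bin whose upper edge is >= its MAF. Variants above the last edge are
    ignored (MAF cannot exceed 0.5, so a last edge of 0.5 covers everything).
    """
    totals = [0.0] * len(bin_upper_edges)
    for maf in mafs:
        for i, upper in enumerate(bin_upper_edges):
            if maf <= upper:
                totals[i] += genotype_variance(maf)
                break
    grand_total = sum(totals)
    if grand_total == 0.0:
        raise ValueError("no variance to distribute across bins")
    return [t / grand_total for t in totals]


# Illustrative input only: two rare variants and one common variant,
# split at the rare/common cut-off given in the caption (MAF 0.0239).
example_mafs = [0.001, 0.001, 0.5]
fractions = variance_fraction_by_bin(example_mafs, [0.0239, 0.5])
print(fractions)
```

Under this assumption a single common variant contributes far more genotype variance than many rare ones, which is why the variance share of the rare bins can be small even though most variants are rare.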

**Fig. 2. MAFs and effect sizes for the lead GWAS SNVs and indels.**
a Distribution of MAFs for the lead GWAS-significant (Wald-test, p < 5.00 × 10⁻⁸) primary and conditional hits. A MAF threshold of 0.01 was used in the GWAS and, consequently, no GWAS hits had a MAF below 0.01. b Effect sizes from the GWAS, in relation to MAF for the primary and conditional GWAS hits. All effect estimates are reported as absolute values.

**Fig. 3. Overview of the SNV-sets, and SKAT tests performed, as well as overlap between the results.**
a Overview of the SNV-sets, SKAT tests performed, and overlap between the results for the different SNV-sets. The Venn diagram shows the number of overlapping loci with any significant SKAT association between the different SNV-sets. For *Cis*-associations, a p-value of 5.88 × 10⁻⁶ was used as the significance threshold, while for *Trans*-associations a threshold p-value of 4.67 × 10⁻¹⁰ was adopted. b Fraction of loci identified in the different models within each SNV-set. A total of 198, 190, 182, 33, and 27 loci were identified with the five SNV-sets, respectively (N in the legend). Each bar represents the fraction of these N loci that were significant for the different SKAT models. The seven models are: (1) Unweighted; (2) CADD or Eigen weighted; (3) MAF weighted, β(1, 25); (4) MAF weighted, β(1, 5); (5) MAF weighted, β(0.5, 0.5); (6) CommonRare; (7) Rare only.
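A minimal sketch of the MAF-based weights listed as models (3) to (5), assuming the β(a, b) notation denotes the Beta density evaluated at each variant's MAF, as is conventional for SKAT. The caption itself does not spell this out, and the function names are hypothetical.

```python
import math


def beta_pdf(x, a, b):
    """Beta(a, b) probability density at x, for 0 < x < 1."""
    if not 0.0 < x < 1.0:
        raise ValueError("x must lie strictly between 0 and 1")
    # log of the normalising constant 1 / B(a, b)
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(
        log_norm + (a - 1.0) * math.log(x) + (b - 1.0) * math.log(1.0 - x)
    )


def maf_weights(mafs, a, b):
    """One weight per variant: the Beta(a, b) density at that variant's MAF."""
    return [beta_pdf(maf, a, b) for maf in mafs]


# Illustrative MAFs only: one rare, one low-frequency, one common variant.
example_mafs = [0.001, 0.02, 0.3]
for a, b in [(1, 25), (1, 5), (0.5, 0.5)]:
    print((a, b), maf_weights(example_mafs, a, b))
```

With β(1, 25) the weight drops steeply as MAF increases, so rare variants dominate the test; β(1, 5) declines more gently, and β(0.5, 0.5) is comparatively flat across the intermediate MAF range.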

**Fig. 4. Overlap between the identified loci for the different SNV-sets and the GWAS.**
The number of loci is the total number of independent loci identified with each SNV-set or GWAS, and the overlap size is the number of loci that overlap between SNV-sets and GWAS for: a *Cis*-associations, and b *Trans*-associations. For the *Cis*-associations, a p-value (SKAT-test) of 5.88 × 10⁻⁶ was used as the significance threshold for the SKAT analyses, and a p-value (Wald-test) of 3.00 × 10⁻⁸ for the GWAS. For *Trans*-associations, a p-value (SKAT-test) of 4.67 × 10⁻¹⁰ was used as the significance threshold for the SKAT analyses, and a p-value (Wald-test) of 3.92 × 10⁻¹¹ for the GWAS.

**Fig. 5. Regional plots for three proteins.**
The grey circles are the –log10 p-values (Wald-test) from the GWAS. Horizontal lines indicate the –log10 p-values (SKAT-test) for the *Trans*-Flank-sets in black and *Trans*-CDS-sets in blue. Multiple, partly overlapping *Trans*-Flank-sets were analysed. a Results for TNFRSF10C levels, b VWC2 levels, and c NEP levels.

**Fig. 6. Overlap between the total number of associations for the different SKAT models.**
The bars to the left represent the total number of associations per model, and the bars at the top (overlap size) represent the number of associations that overlap between the different models. a The small SNV-sets (CDS-sets and Reg-sets) in the NSPHS, b the larger Flank-SNV-sets, and c the CDS-sets in the UKB.

Cited by

Common host variation drives malaria parasite fitness in healthy human red cells.
Ebel ER, Kuypers FA, Lin C, Petrov DA, Egan ES. Elife. 2021 Sep 23;10:e69808. doi: 10.7554/eLife.69808. PMID: 34553687.
Unravelling the genetic architecture of human complex traits through whole genome sequencing.
Bocher O, Willer CJ, Zeggini E. Nat Commun. 2023 Jun 14;14(1):3520. doi: 10.1038/s41467-023-39259-x. PMID: 37316478.
Genetic determinants of plasma protein levels in the Estonian population.
Kalnapenkis A, Jõeloo M, Lepik K, Kukuškina V, Kals M, Alasoo K; Estonian Biobank Research Team; Mägi R, Esko T, Võsa U. Sci Rep. 2024 Apr 2;14(1):7694. doi: 10.1038/s41598-024-57966-3. PMID: 38565889.
Rare variant associations with plasma protein levels in the UK Biobank.
Dhindsa RS, Burren OS, Sun BB, Prins BP, Matelska D, Wheeler E, Mitchell J, Oerton E, Hristova VA, Smith KR, Carss K, Wasilewski S, Harper AR, Paul DS, Fabre MA, Runz H, Viollet C, Challis B, Platt A; AstraZeneca Genomics Initiative; Vitsios D, Ashley EA, Whelan CD, Pangalos MN, Wang Q, Petrovski S. Nature. 2023 Oct;622(7982):339-347. doi: 10.1038/s41586-023-06547-x. Epub 2023 Oct 4. PMID: 37794183.
Copy number variations and their effect on the plasma proteome.
Schmitz D, Li Z, Lo Faro V, Rask-Andersen M, Ameur A, Rafati N, Johansson Å. Genetics. 2023 Dec 6;225(4):iyad179. doi: 10.1093/genetics/iyad179. PMID: 37793096.
