Mol Biol Evol. 2019 Aug 1;36(8):1701-1710.
doi: 10.1093/molbev/msz092.

Applicability of the Mutation-Selection Balance Model to Population Genetics of Heterozygous Protein-Truncating Variants in Humans

Donate Weghorn et al. Mol Biol Evol. 2019.

Abstract

The fate of alleles in the human population is believed to be highly affected by the stochastic force of genetic drift. Estimation of the strength of natural selection in humans generally necessitates a careful modeling of drift including complex effects of the population history and structure. Protein-truncating variants (PTVs) are expected to evolve under strong purifying selection and to have a relatively high per-gene mutation rate. Thus, it is appealing to model the population genetics of PTVs under a simple deterministic mutation-selection balance, as has been proposed earlier (Cassa et al. 2017). Here, we investigated the limits of this approximation using both computer simulations and data-driven approaches. Our simulations rely on a model of demographic history estimated from 33,370 individual exomes of the Non-Finnish European subset of the ExAC data set (Lek et al. 2016). Additionally, we compared the African and European subsets of the ExAC study and analyzed de novo PTVs. We show that the mutation-selection balance model is applicable to the majority of human genes, but not to genes under the weakest selection.
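As a rough illustration of the deterministic approximation discussed above, the sketch below uses the classical mutation-selection balance result that a deleterious allele class with per-gene mutation rate U and heterozygous selection coefficient s_het settles at a cumulative frequency of about U / s_het. Inverting that relation gives a simple point estimate from an observed allele count. This is not the authors' inference procedure; the function name, the mutation rate, and the PTV count are hypothetical, and only the sample size (33,370 exomes, so 66,740 chromosomes) comes from the abstract.

```python
def deterministic_s_het(ptv_count, mutation_rate, n_chromosomes):
    """Point estimate of s_het under deterministic mutation-selection balance.

    At equilibrium the cumulative PTV allele frequency is x = U / s_het, so
    the expected allele count in a sample is k = n_chromosomes * U / s_het.
    Solving for s_het gives the estimate returned here.
    """
    if ptv_count <= 0:
        # With no observed PTVs the simple inversion is undefined.
        raise ValueError("ptv_count must be positive for this point estimate")
    estimate = mutation_rate * n_chromosomes / ptv_count
    # A selection coefficient cannot exceed 1, so cap the estimate.
    return min(estimate, 1.0)


# Hypothetical gene: U = 1e-6 per chromosome per generation, 10 PTV alleles
# observed among 2 * 33,370 = 66,740 sampled chromosomes.
s_hat = deterministic_s_het(ptv_count=10, mutation_rate=1e-6, n_chromosomes=66740)
print(f"deterministic s_het estimate: {s_hat:.6f}")
```

The estimate ignores genetic drift entirely, which is the simplification whose limits the study examines; for genes under weak selection the abstract indicates that this approximation is not applicable.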

Keywords: genetic drift; protein-truncating variants; selection inference.

Figures

Fig. 1.
Comparison of the deterministic mutation–selection balance model with the model that includes the effects of genetic drift, in the NFE demography. (a) Fold change in the coefficient of variation (squares) and the mean (crosses) of the number of PTV mutations, $k$, relative to the deterministic case, obtained from simulations of a realistic demography of the ExAC NFE sample for different values of heterozygous selection strength $s_\mathrm{het}$. (b) Heat map of gene-specific estimates for all 16,279 tested genes from the NFE sample, showing deterministic (x axis) and drift-inclusive (y axis) $s_\mathrm{het}$ estimates. Note the double-logarithmic axes in both panels.
Fig. 2.
In the strong selection limit, $s_\mathrm{het}$ is a predictor of the fraction of de novo PTVs, $f$. De novo fraction of PTV mutations was estimated for 6,203 (out of 16,279) genes with at least one PTV (de novo or transmitted) in an ASD cohort of ∼4,000 parent–child trios (y axis) and compared with the deterministic $\hat{s}_\mathrm{het}$ derived from the NFE sample (x axis). Red dots denote individual genes (genes with $\hat{f}=0$ were assigned $\hat{f}=2\times10^{-4}$ for illustration purposes). Black squares connected by black lines denote the mean in bins along the x axis of logarithmic width $\Delta\log[\hat{s}_\mathrm{het}]=0.25$ (number of genes per bin from left to right: {1, 10, 43, 148, 400, 811, 1,117, 1,158, 870, 597, 359, 360, 236, 90, 3}). Vertical error bars show the standard error of the mean per bin for $\hat{f}$. Corresponding error bars for $\hat{s}_\mathrm{het}$ are smaller than the marker size. Gray line denotes the diagonal.
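The relationship between the selection coefficient and the de novo fraction in Fig. 2 can be motivated by a short textbook-style calculation. The following is a sketch under simplifying assumptions (rare alleles, no drift, no back mutation, selection acting only in heterozygotes), not a derivation taken from the paper; the symbols $x$, $x^{*}$, and $U$ (cumulative PTV allele frequency, its equilibrium value, and the per-gene PTV mutation rate) are notation introduced here for illustration.

```latex
\begin{align}
x' &= (1 - s_{\mathrm{het}})\,x + U
   && \text{(one generation of selection, then mutation)} \\
x^{*} &= \frac{U}{s_{\mathrm{het}}}
   && \text{(equilibrium, setting } x' = x\text{)} \\
f &= \frac{U}{(1 - s_{\mathrm{het}})\,x^{*} + U}
   = \frac{U}{x^{*}}
   = s_{\mathrm{het}}
   && \text{(de novo share of PTV alleles in offspring)}
\end{align}
```

Under these assumptions the expected de novo fraction equals the heterozygous selection coefficient, which is why the diagonal serves as the reference line in the figure.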
Fig. 3.
Comparison of per-gene selection estimates, $\hat{s}_\mathrm{het}$, with a measure of probability of loss-of-function intolerance, pLI (Lek et al. 2016). Shown is the correlation with independent measures of gene importance. (a–c) Data on disease severity, penetrance, and age of onset (x axes) for a set of 113 haploinsufficient disease-associated genes of high confidence (ClinGen Dosage Sensitivity Project) were compared with deterministic NFE $\hat{s}_\mathrm{het}$ (top row) and pLI (bottom row) predictions (y axis). (d) The fraction of de novo PTVs ($\hat{f}$), shown in bins of width 0.25 for 5,930 genes with at least one transmitted or de novo PTV and a pLI annotation (x axis), was derived from an ASD trio-sequencing data set (top: NFE $\hat{s}_\mathrm{het}$, bottom: pLI). Note that due to the small number of PTVs per gene in the ASD cohort, the distribution of $\hat{f}$ on the range [0, 1] is not smooth. Red dots denote individual genes, gray boxes enclose the central quartiles of the distribution in each category, and black horizontal bars through gray boxes show the median. Note the logarithmic y axis in the top row, whereas the bottom row has a linear y axis.

References

    1. Browning SR, Browning BL. 2015. Accurate non-parametric estimation of recent effective population size from segments of identity by descent. Am J Hum Genet. 97(3):404–418.
    2. Bürger R, Wagner GP, Stettinger F. 1989. How much heritable variation can be maintained in finite populations by mutation–selection balance? Evolution 43(8):1748–1766.
    3. Cassa CA, Weghorn D, Balick DJ, Jordan DM, Nusinow D, Samocha KE, O’Donnell-Luria A, MacArthur DG, Daly MJ, Beier DR, et al. 2017. Estimating the selective effects of heterozygous protein-truncating variants from human exome data. Nat Genet. 49(5):806.
    4. Cassa CA, Weghorn D, Balick DJ, Jordan DM, Nusinow D, Samocha KE, O’Donnell-Luria A, MacArthur DG, Daly MJ, Beier DR, et al. 2019. Reply to selective effects of heterozygous protein-truncating variants. Nat Genet. 51(1):3.
    5. Charlesworth B, Hill WG. 2019. Selective effects of heterozygous protein-truncating variants. Nat Genet. 51(1):2.
