PLoS Genet. 2025 Apr 21;21(4):e1011427.
doi: 10.1371/journal.pgen.1011427. eCollection 2025 Apr.

Isolating selective from non-selective forces using site frequency ratios

Jody Hey et al. PLoS Genet. 2025.

Abstract

A new method is introduced for estimating the distribution of mutation fitness effects using site frequency spectra (SFSs). Unlike previous methods, which either make assumptions about non-selective factors or try to incorporate such factors into the underlying model, this new method mostly avoids non-selective effects by working with the ratios of counts of selected sites to counts of neutral sites. An expression for the likelihood of a set of selected/neutral ratios is found by treating the ratio of two Poisson random variables as the ratio of two Gaussian random variables. This approach also avoids the need to estimate the relative mutation rates of selected and neutral sites. Simulations over a wide range of demographic models, with linked selection effects, show that the new SFRatios method performs well both for statistical tests of selection and for estimating the distribution of selection effects. Performance was better with weak selection models, and better for expansion and structured demographic models than for bottleneck models. Applications to two populations of Drosophila melanogaster reveal clear but very weak selection on synonymous sites. For nonsynonymous sites, selection was found to be consistent with previous estimates and stronger for an African population than for one from North Carolina.
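
The Gaussian treatment of a Poisson ratio described in the abstract can be checked numerically. The following is a minimal sketch, not the SFRatios likelihood itself: it draws selected and neutral counts for a single SFS bin from Poisson distributions, draws matching Gaussian variates with the same mean and variance, and compares quantiles of the two ratios. The bin means (30 and 100), the number of draws, and all function names are illustrative assumptions, not values from the paper.

```python
import math
import random
import statistics


def poisson_draw(rng, mean):
    """Draw a Poisson variate by counting unit-rate exponential arrivals before time `mean`."""
    count = 0
    elapsed = rng.expovariate(1.0)
    while elapsed < mean:
        count += 1
        elapsed += rng.expovariate(1.0)
    return count


def simulate_ratios(mean_selected, mean_neutral, n_draws=10000, seed=1):
    """Return (Poisson ratios, Gaussian ratios) for one hypothetical SFS bin."""
    rng = random.Random(seed)
    poisson_ratios, gaussian_ratios = [], []
    for _ in range(n_draws):
        selected = poisson_draw(rng, mean_selected)
        neutral = poisson_draw(rng, mean_neutral)
        if neutral > 0:  # the ratio is undefined when the neutral count is zero
            poisson_ratios.append(selected / neutral)
        # Gaussian stand-ins: a Poisson variable has variance equal to its mean
        g_selected = rng.gauss(mean_selected, math.sqrt(mean_selected))
        g_neutral = rng.gauss(mean_neutral, math.sqrt(mean_neutral))
        gaussian_ratios.append(g_selected / g_neutral)
    return poisson_ratios, gaussian_ratios


def quantiles_5_50_95(values):
    """Return the 5%, 50%, and 95% quantiles of a sample."""
    cuts = statistics.quantiles(values, n=20)  # 19 cut points at 5% steps
    return cuts[0], cuts[9], cuts[18]


# Assumed expected counts for one bin: 30 selected sites, 100 neutral sites
poisson_ratios, gaussian_ratios = simulate_ratios(30.0, 100.0)
q_poisson = quantiles_5_50_95(poisson_ratios)
q_gaussian = quantiles_5_50_95(gaussian_ratios)
print("Poisson ratio quantiles: ", q_poisson)
print("Gaussian ratio quantiles:", q_gaussian)
```

With expected counts of this size the two sets of quantiles should lie close together; the approximation would be expected to degrade for bins with very small expected counts, where the Poisson distribution is strongly skewed.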

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Comparison of SFSs and Selected/Neutral ratio for Wright-Fisher (WF) and other demographic models.
Selection model: 2Ns~Lognormal(3.0,1.2) (see methods), with expectation -40.3. For all simulations, θSθ=1.0. A. Folded SFS values for a sample of 200 chromosomes, scaled to the value for allele count 1. B. The ratio of selected to neutral counts for folded SFSs.
Fig 2. Statistical performance for Wright-Fisher Poisson random field (WF-PRF) and SFRatios.
Top row (panels A, B): The probability of rejecting the null (neutral) model when the alternative (selected) model is true, for different false-positive probabilities (α) and varying strengths of 2Ns. Middle row (panels C, D): Receiver operating characteristic (ROC) curves, with area under the curve (AUC). Bottom row (panels E, F): Cumulative observed distributions of the likelihood ratio test statistic, with a χ² (1 df) comparison, for sets of 500 simulations.
Fig 3. Simulation results for fixed γ values.
Boxplots of γ values estimated for simulated data generated under different demographic models. For each box, an arrow indicates the location of the true value. A. Population with a constant size. B. Population expansion. C. Population bottleneck. D. Two divergent subpopulations.
Fig 4. Estimator performance for γ lognormal distribution parameters.
For each simulated data set, γ values were drawn from one of 4 lognormal distributions. 20 data sets were simulated for each demographic model and each γ distribution. A. Lognormal distributions, for random variable x (1 < x < ∞), 2Ns = 1 - x. B. Constant population size Wright-Fisher. C. Population expansion. D. Population bottleneck. E. Two populations. F. African Origin model.
Fig 5. Estimator performance for the ratio of mutation rates, ρ.
For each simulated data set, 2Ns values were drawn from one of 5 lognormal distributions (see Fig 4). 20 data sets were simulated for each demographic model and each 2Ns distribution, each with a true mutation rate ratio of 0.35. A. Constant population size Wright-Fisher. B. Population expansion. C. Population bottleneck. D. Two populations. E. African Origin model.
Fig 6. SFSs and ratios for two Drosophila populations.
A. SFS counts normalized to the singleton bin count to enable comparisons. B. SFS ratios for both populations, for nonsynonymous and synonymous SFSs divided by the short intron SFS. Expected values generated under the best-fit models are shown with dashed lines.
Fig 7. Best-fit estimated 2Ns densities for Drosophila populations.
A. Nonsynonymous variation, Zambia. B. Nonsynonymous variation, North Carolina. C. Synonymous variation, Zambia. D. Synonymous variation, North Carolina.
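
The expectation of -40.3 quoted in the Fig 1 caption can be reproduced from the parameterization given in the Fig 4 caption (2Ns = 1 - x, with x lognormal). The sketch below assumes that the two arguments of Lognormal(3.0, 1.2) are the mean and standard deviation of log x; that reading is an assumption, though it does recover the quoted value. The function names are illustrative.

```python
import math


def lognormal_mean(mu, sigma):
    """Mean of a lognormal variable whose logarithm has mean mu and standard deviation sigma."""
    return math.exp(mu + sigma ** 2 / 2.0)


def expected_2Ns(mu, sigma, shift=1.0):
    """Expectation of 2Ns = shift - x, where x ~ Lognormal(mu, sigma)."""
    return shift - lognormal_mean(mu, sigma)


value = expected_2Ns(3.0, 1.2)
print(round(value, 1))  # -40.3
```

Here E[x] = exp(3.0 + 1.2²/2) = exp(3.72) ≈ 41.26, so E[2Ns] ≈ 1 - 41.26 ≈ -40.3.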
