PLoS Genet. 2025 Apr 21;21(4):e1011427.

doi: 10.1371/journal.pgen.1011427. eCollection 2025 Apr.

Isolating selective from non-selective forces using site frequency ratios

Jody Hey¹, Vitor A C Pavinato¹

PMID: 40258089
PMCID: PMC12064048
DOI: 10.1371/journal.pgen.1011427

Jody Hey et al. PLoS Genet. 2025.

Affiliation

¹ Department of Biology, Temple University, Philadelphia, Pennsylvania, United States of America.

Abstract

A new method is introduced for estimating the distribution of mutation fitness effects using site frequency spectra (SFSs). Unlike previous methods, which either make assumptions about non-selective factors or try to incorporate such factors into the underlying model, this new method mostly avoids non-selective effects by working with the ratios of counts of selected sites to counts of neutral sites. An expression for the likelihood of a set of selected/neutral ratios is found by treating the ratio of two Poisson random variables as the ratio of two Gaussian random variables. This approach also avoids the need to estimate the relative mutation rates of selected and neutral sites. Simulations over a wide range of demographic models, with linked selection effects, show that the new SFRatios method performs well both for statistical tests of selection and for estimating the distribution of selection effects. Performance was better with weak selection models, and better for expansion and structured demographic models than for bottleneck models. Applications to two populations of Drosophila melanogaster reveal clear but very weak selection on synonymous sites. For nonsynonymous sites, selection was found to be consistent with previous estimates and stronger for an African population than for one from North Carolina.
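To make the Poisson-to-Gaussian idea concrete, here is a minimal sketch of a normal approximation to a ratio of two Poisson counts. It is not the likelihood expression derived in the paper: it swaps in a simpler delta-method approximation, in which the ratio of counts with expectations `mu_sel` and `mu_neu` is treated as Gaussian with mean `mu_sel / mu_neu` and relative variance `1/mu_sel + 1/mu_neu`. The function names and the expected counts are hypothetical.

```python
import math

def ratio_moments(mu_sel, mu_neu):
    """Delta-method mean and standard deviation of Poisson(mu_sel) / Poisson(mu_neu).

    Assumption: both expectations are large enough for a Gaussian
    approximation to be reasonable.
    """
    mean = mu_sel / mu_neu
    sd = mean * math.sqrt(1.0 / mu_sel + 1.0 / mu_neu)
    return mean, sd

def ratio_loglik(observed_ratios, expected_sel, expected_neu):
    """Sum of Gaussian log-densities of observed selected/neutral ratios,
    one term per SFS bin, given expected counts for each bin."""
    total = 0.0
    for z, ms, mn in zip(observed_ratios, expected_sel, expected_neu):
        mean, sd = ratio_moments(ms, mn)
        total += -0.5 * math.log(2.0 * math.pi * sd ** 2) - 0.5 * ((z - mean) / sd) ** 2
    return total

# Hypothetical expected counts for three SFS bins
expected_sel = [400.0, 150.0, 60.0]
expected_neu = [1000.0, 500.0, 300.0]
print(ratio_loglik([0.41, 0.29, 0.21], expected_sel, expected_neu))
```

In this sketch, any factor that multiplies the selected and neutral expectations equally leaves the mean of the ratio unchanged, which is the intuition behind working with ratios rather than raw counts.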

Copyright: © 2025 Hey. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

**Fig 1. Comparison of SFSs and Selected/Neutral ratio for Wright-Fisher (WF) and other demographic models.**
Selection model: 2Ns ~ Lognormal(3.0, 1.2) (see Methods), with expectation -40.3. For all simulations, $\frac{θ_{S}}{θ} = 1.0$. A. Folded SFS values for a sample of 200 chromosomes, scaled to the value for allele count 1. B. The ratio of selected to neutral counts for folded SFSs.
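The stated expectation of -40.3 can be checked with a short calculation. This sketch assumes that 3.0 and 1.2 are the mean and standard deviation on the log scale, and that 2Ns = 1 - x for a lognormal variable x, as given in the Fig 4 caption. The function name `expected_2Ns` is hypothetical.

```python
import math

def expected_2Ns(mu, sigma):
    """Expected 2Ns when 2Ns = 1 - x and x ~ Lognormal(mu, sigma).

    Assumption: mu and sigma are the log-scale mean and standard deviation,
    so E[x] = exp(mu + sigma^2 / 2).
    """
    mean_x = math.exp(mu + sigma ** 2 / 2.0)
    return 1.0 - mean_x

print(round(expected_2Ns(3.0, 1.2), 1))  # -40.3
```

The agreement with the caption's value suggests the log-scale reading of the two parameters is the intended one.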

**Fig 2. Statistical performance for Wright-Fisher Poisson random field (WF-PRF) and SFRatios.**
Top row (panels A, B): The probability of rejecting the null (neutral) model when the alternative (selected) model is true, for different false-positive probabilities (α) and varying strengths of 2Ns. Middle row (panels C, D): Receiver operating characteristic (ROC) curves, with area under the curve (AUC). Bottom row (panels E, F): Cumulative observed distributions of the likelihood ratio test statistic, with a χ² (1 df) comparison, for sets of 500 simulations.
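As a reminder of how a likelihood ratio test statistic is compared with a χ² distribution with 1 degree of freedom, here is a minimal sketch. It assumes nested models that differ by one free parameter and uses the identity that the χ² (1 df) upper-tail probability equals erfc(√(x/2)). The log-likelihood values are hypothetical, and whether the χ² reference holds for a given method is exactly what the bottom-row panels examine.

```python
import math

def lrt_statistic(lnL_null, lnL_alt):
    """Likelihood ratio test statistic: twice the gain in log-likelihood."""
    return 2.0 * (lnL_alt - lnL_null)

def chi2_1df_pvalue(stat):
    """Upper-tail probability of a chi-square distribution with 1 df.

    Uses P(Z^2 > x) = erfc(sqrt(x / 2)) for a standard normal Z.
    """
    if stat <= 0.0:
        return 1.0
    return math.erfc(math.sqrt(stat / 2.0))

# Hypothetical log-likelihoods for a neutral (null) and a selected (alternative) fit
stat = lrt_statistic(-1050.0, -1045.0)
print(stat, chi2_1df_pvalue(stat))
```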

**Fig 3. Simulation results for fixed $γ$ values.**
Boxplots of γ values estimated for simulated data generated under different demographic models. For each box, an arrow indicates the location of the true value. A. Population with a constant size. B. Population expansion. C. Population bottleneck. D. Two divergent subpopulations.

**Fig 4. Estimator performance for $γ$ lognormal distribution parameters.**
For each simulated data set, γ values were drawn from one of 4 lognormal distributions. 20 data sets were simulated for each demographic model and each γ distribution. A. Lognormal distributions for random variable x (1 < x < ∞), with $2Ns = 1 - x$. B. Constant population size Wright-Fisher. C. Population expansion. D. Population bottleneck. E. Two populations. F. African Origin model.

**Fig 5. Estimator performance for the ratio of mutation rates, ρ.**
For each simulated data set, 2Ns values were drawn from one of 5 lognormal distributions (see Fig 4). 20 data sets were simulated for each demographic model and each 2Ns distribution, each with a true mutation rate ratio of 0.35. A. Constant Wright-Fisher. B. Population expansion. C. Population bottleneck. D. Two populations. E. African Origin model.

**Fig 6. SFSs and ratios for two Drosophila populations.**
A. SFS counts normalized to the singleton bin count to enable comparisons. B. SFS ratios for both populations, for nonsynonymous and synonymous SFSs divided by the short intron SFS. Expected values generated under the best fit models are shown with dashed lines.
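The two transformations described in this caption can be sketched as follows. The counts are hypothetical, and the convention that index 0 holds the singleton (allele count 1) bin is an assumption of this example, not something stated in the source.

```python
def normalize_to_singletons(sfs):
    """Scale SFS counts so that the singleton bin (index 0) equals 1.0."""
    return [count / sfs[0] for count in sfs]

def sfs_ratio(selected, neutral):
    """Per-bin selected/neutral ratio; None where the neutral bin is empty."""
    return [s / n if n > 0 else None for s, n in zip(selected, neutral)]

# Hypothetical counts: a selected class (e.g. nonsynonymous) and a neutral class (e.g. short intron)
selected = [50, 20, 5]
neutral = [200, 100, 0]

print(normalize_to_singletons(neutral[:2]))  # [1.0, 0.5]
print(sfs_ratio(selected, neutral))          # [0.25, 0.2, None]
```

Returning `None` for empty neutral bins keeps the bin positions aligned; a real analysis would need its own rule for such bins.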

**Fig 7. Best-fit estimated 2Ns densities for Drosophila populations.**
A. Nonsynonymous variation, Zambia. B. Nonsynonymous variation, North Carolina. C. Synonymous variation, Zambia. D. Synonymous variation, North Carolina.

Grants and funding

R01 GM144468/GM/NIGMS NIH HHS/United States
