On the Unfounded Enthusiasm for Soft Selective Sweeps III: The Supervised Machine Learning Algorithm That Isn't
- PMID: 33916341
- PMCID: PMC8066263
- DOI: 10.3390/genes12040527
Abstract
In the last 15 years or so, soft selective sweep mechanisms have been catapulted from a curiosity of little evolutionary importance to a ubiquitous mechanism claimed to explain most adaptive evolution and, in some cases, most evolution. This transformation was aided by a series of articles by Daniel Schrider and Andrew Kern. Within this series, a paper entitled "Soft sweeps are the dominant mode of adaptation in the human genome" (Schrider and Kern, Mol. Biol. Evolut. 2017, 34(8), 1863-1877) attracted a great deal of attention, in particular in conjunction with another paper (Kern and Hahn, Mol. Biol. Evolut. 2018, 35(6), 1366-1371), for purporting to discredit the Neutral Theory of Molecular Evolution (Kimura 1968). Here, we address an alleged novelty in Schrider and Kern's paper, i.e., the claim that their study involved an artificial intelligence technique called supervised machine learning (SML). SML is predicated upon the existence of a training dataset in which the correspondence between the input and output is known empirically to be true. Curiously, Schrider and Kern did not possess a training dataset of genomic segments known a priori to have evolved either neutrally or through soft or hard selective sweeps. Thus, their claim of using SML is thoroughly and utterly misleading. In the absence of legitimate training datasets, Schrider and Kern used: (1) simulations that employ many manipulatable variables and (2) a system of data cherry-picking rivaling the worst excesses in the literature. These two factors, in addition to the lack of negative controls and the irreproducibility of their results due to incomplete methodological detail, lead us to conclude that all evolutionary inferences derived from so-called SML algorithms (e.g., S/HIC) should be taken with a huge shovel of salt.
Keywords: artificial intelligence (AI); evolutionary biology; molecular and genome evolution; population size; selective sweeps; supervised machine learning (SML).
Conflict of interest statement
The authors declare no conflict of interest.
Similar articles
- S/HIC: Robust Identification of Soft and Hard Sweeps Using Machine Learning. PLoS Genet. 2016 Mar 15;12(3):e1005928. doi: 10.1371/journal.pgen.1005928. eCollection 2016 Mar. PMID: 26977894. Free PMC article.
- Soft Sweeps Are the Dominant Mode of Adaptation in the Human Genome. Mol Biol Evol. 2017 Aug 1;34(8):1863-1877. doi: 10.1093/molbev/msx154. PMID: 28482049. Free PMC article.
- Soft shoulders ahead: spurious signatures of soft and partial selective sweeps result from linked hard sweeps. Genetics. 2015 May;200(1):267-84. doi: 10.1534/genetics.115.174912. Epub 2015 Feb 25. PMID: 25716978. Free PMC article.
- Selective Sweeps. Genetics. 2019 Jan;211(1):5-13. doi: 10.1534/genetics.118.301319. PMID: 30626638. Free PMC article. Review.
- Population genomics of rapid adaptation by soft selective sweeps. Trends Ecol Evol. 2013 Nov;28(11):659-69. doi: 10.1016/j.tree.2013.08.003. Epub 2013 Sep 25. PMID: 24075201. Free PMC article. Review.
Cited by
- Principal Component Analyses (PCA)-based findings in population genetic studies are highly biased and must be reevaluated. Sci Rep. 2022 Aug 29;12(1):14683. doi: 10.1038/s41598-022-14395-4. PMID: 36038559. Free PMC article.
- Not by Selection Alone: Expanding the Scope of Gene-Culture Coevolution. Evol Anthropol. 2025 Sep;34(3):e70007. doi: 10.1002/evan.70007. PMID: 40682589. Free PMC article. Review.