Review
G3 (Bethesda). 2022 Sep 30;12(10):jkac206.
doi: 10.1093/g3journal/jkac206.

impMKT: the imputed McDonald and Kreitman test, a straightforward correction that significantly increases the evidence of positive selection of the McDonald and Kreitman test at the gene level


Jesús Murga-Moreno et al. G3 (Bethesda).

Abstract

The McDonald and Kreitman test (MKT) is one of the most powerful and widely used methods to detect and quantify recurrent natural selection in DNA sequence data. One of its main limitations is that it underestimates positive selection when slightly deleterious variants segregate at low frequencies. Although several approaches have been developed to overcome this limitation, most of them work only on gene-pooled analyses. Here, we present the imputed McDonald and Kreitman test (impMKT), a new, straightforward approach for detecting positive selection and other selection components of the distribution of fitness effects at the gene level. We compare the impMKT with other widely used MKT approaches on both simulated and empirical data. By applying the impMKT to human and Drosophila data at the gene level, we substantially increase the statistical evidence of positive selection with respect to previous approaches (e.g. by 50% and 157% compared with the MKT in Drosophila and humans, respectively). Finally, we review the minimum number of genes required to obtain a reliable estimate of the proportion of adaptive substitutions (α) in gene-pooled analyses with the impMKT compared with other MKT implementations. Because of its simplicity and increased power to detect recurrent positive selection on genes, we propose the impMKT as the first straightforward approach for testing specific evolutionary hypotheses at the gene level. The software implementation and population genomics data are available at the web server imkt.uab.cat.
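As a rough illustration of the quantities the abstract refers to, the sketch below computes the standard MKT estimate of α from counts of nonsynonymous and synonymous polymorphisms (Pn, Ps) and substitutions (Dn, Ds), and then a variant in which an excess of low-frequency nonsynonymous polymorphisms is imputed as slightly deleterious and removed before α is computed. The abstract does not give the impMKT formula, so the imputation step, the function names, and the split of counts into "low" and "high" frequency classes are assumptions made for this example, not the authors' implementation.

```python
def mkt_alpha(pn, ps, dn, ds):
    """Standard MKT estimate: alpha = 1 - (Pn / Ps) * (Ds / Dn)."""
    if ps == 0 or dn == 0:
        raise ValueError("alpha is undefined when Ps or Dn is zero")
    return 1.0 - (pn / ps) * (ds / dn)


def imputed_alpha(pn_low, pn_high, ps_low, ps_high, dn, ds):
    """Illustrative imputation of slightly deleterious polymorphisms.

    Assumed form (not stated in the abstract): the Pn/Ps ratio observed
    above a frequency cutoff is taken as the neutral ratio, and any
    low-frequency nonsynonymous excess over that ratio is discarded.
    The caller chooses the cutoff when binning the counts.
    """
    if ps_high == 0:
        raise ValueError("need synonymous polymorphisms above the cutoff")
    # Low-frequency nonsynonymous count expected if all were neutral.
    expected_neutral_low = ps_low * pn_high / ps_high
    # Imputed slightly deleterious count, floored at zero.
    p_wd = max(0.0, pn_low - expected_neutral_low)
    pn_neutral = pn_low + pn_high - p_wd
    return mkt_alpha(pn_neutral, ps_low + ps_high, dn, ds)


# Hypothetical counts for a single gene.
standard = mkt_alpha(pn=20, ps=40, dn=30, ds=30)
imputed = imputed_alpha(pn_low=15, pn_high=5, ps_low=20, ps_high=20,
                        dn=30, ds=30)
print(standard, imputed)
```

With these made-up counts, removing the imputed low-frequency excess lowers Pn/Ps and therefore raises the α estimate, which is the direction of correction the abstract describes.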

Keywords: McDonald and Kreitman test; natural selection; nucleotide variation; positive selection; protein-coding genes.


Figures

Fig. 1.
Hypothetical SFS and fixed differences from Hahn (2018).
Fig. 2.
α estimates from the different MKT approaches under SLiM-simulated scenarios that differ in the simulated fraction of adaptive mutations. Equivalent results under other SLiM-simulated scenarios are available in Supplementary Fig. 1.
Fig. 3.
Bias in the α estimates for all scenarios and MKT approaches.
Fig. 4.
α estimates in simulations accounting for weak adaptation. None of the MKT approaches compared here corrects for linkage and weak adaptation in the estimates. Although the method proposed by Uricchio et al. (2019) can overcome linkage and weak adaptation, α estimation at the gene level remains unexplored and new approaches are required.
Fig. 5.
a) Estimated number of genes under positive selection in the D. melanogaster Zambian population detected by each MKT approach. b) Estimated number of genes under positive selection in African populations of the human lineage detected by each MKT approach.
Fig. 6.
Gene pooled analysis. A total of 3,500 random protein-coding genes were sampled from the ZI dataset. We pooled the genes to obtain SFS of 1, 2, 5, 10, 25, 50, 75, 100, 250, 750, and 1,000 genes by resampling them 1,000 times with replacement. a) α estimates with MKT correction. b) Proportion of analysis performed by impMKT. c) Proportion of analysis performed by aMKT. d) Proportion of analysis performed by aMKT. e) Proportion of analysis performed by Grapes.
Fig. 7.
Gene pooled analysis. A total of 3,500 random protein-coding genes were sampled from the human dataset. We pooled the genes to obtain SFS of 1, 2, 5, 10, 25, 50, 75, 100, 250, 750, and 1,000 genes by resampling them 1,000 times with replacement. a) α estimates with MKT correction. b) Proportion of analysis performed by impMKT. c) Proportion of analysis performed by aMKT. d) Proportion of analysis performed by aMKT. e) Proportion of analysis performed by Grapes.
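The gene-pooled resampling described in the captions of Figs. 6 and 7 can be sketched roughly as follows: draw a pool of genes with replacement, sum their polymorphism and divergence counts, and estimate α on the pooled counts, repeating this many times for each pool size. This is a minimal sketch that uses the uncorrected MKT estimator on total counts; the tuple layout `(pn, ps, dn, ds)`, the function name, and the choice to skip pools with an undefined α are assumptions for illustration, not the procedure used in the paper.

```python
import random


def pooled_alpha_samples(genes, pool_size, n_resamples=1000, seed=0):
    """Resample pools of genes with replacement and estimate alpha per pool.

    `genes` is a list of (pn, ps, dn, ds) count tuples, one per gene
    (a hypothetical layout chosen for this example).
    """
    rng = random.Random(seed)
    alphas = []
    for _ in range(n_resamples):
        # Sampling with replacement: the same gene may appear twice.
        pool = rng.choices(genes, k=pool_size)
        pn = sum(g[0] for g in pool)
        ps = sum(g[1] for g in pool)
        dn = sum(g[2] for g in pool)
        ds = sum(g[3] for g in pool)
        if ps == 0 or dn == 0:
            continue  # alpha is undefined for this pool; skip it
        alphas.append(1.0 - (pn / ps) * (ds / dn))
    return alphas


# Hypothetical per-gene counts.
example_genes = [(2, 5, 4, 3), (0, 3, 1, 2), (3, 4, 6, 5), (1, 6, 2, 4)]
for size in (1, 2, 5, 10):
    samples = pooled_alpha_samples(example_genes, size, n_resamples=200)
    print(size, len(samples))
```

The fraction of resamples that yield a defined estimate is one way to read the "proportion of analysis performed" panels, although the captions do not spell out how that proportion was computed.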
