Genetics. 2008 Nov;180(3):1767-71.
doi: 10.1534/genetics.108.091850. Epub 2008 Sep 14.

Controlling type-I error of the McDonald-Kreitman test in genomewide scans for selection on noncoding DNA

Peter Andolfatto. Genetics. 2008 Nov.

Abstract

Departures from the assumption of homogeneously interdigitated neutral and putatively selected sites in the McDonald-Kreitman test can lead to false rejections of the neutral model in the presence of intermediate levels of recombination. This problem is exacerbated by small sample sizes, nonequilibrium demography, recombination rate variation, and in comparisons involving more recently diverged species. I propose that establishing significance levels by coalescent simulation with recombination can improve the fidelity of the test in genomewide scans for selection on noncoding DNA.
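The proposal in the abstract, setting significance levels by simulation rather than relying on the nominal test, can be sketched as follows. This is a minimal illustration, not the author's code: it assumes the nominal MK-test p-values from neutral coalescent replicates have already been computed elsewhere, and the function names `type_one_error_rate` and `empirical_p_value` are hypothetical.

```python
def type_one_error_rate(null_p_values, alpha=0.05):
    """Fraction of neutral replicates that the nominal test rejects at level alpha."""
    if not null_p_values:
        raise ValueError("need at least one simulated replicate")
    return sum(p <= alpha for p in null_p_values) / len(null_p_values)


def empirical_p_value(observed_p, null_p_values):
    """Simulation-based p-value for an observed nominal p-value.

    Counts neutral replicates whose nominal p-value is at least as small as
    the observed one; the +1 terms keep the estimate strictly above zero.
    """
    if not null_p_values:
        raise ValueError("need at least one simulated replicate")
    as_extreme = sum(p <= observed_p for p in null_p_values)
    return (as_extreme + 1) / (len(null_p_values) + 1)
```

If `type_one_error_rate` comes out well above `alpha` for neutral replicates, the nominal test is anticonservative under that simulated scenario, and `empirical_p_value` gives a calibrated alternative for the observed table.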

Figures

Figure 1.
The MK test applied to genomic regions with different spatial organization of neutral (n) and selected (X) polymorphic and divergent mutations. D shows the spatial organization of neutral and selected sites in the study of Jeong et al. (2008). The tan gene cis-regulatory element, tMSE, regulates the expression of abdominal pigmentation in D. yakuba and is inferred to have an excess of divergence relative to polymorphism when compared to synonymous sites from two neighboring genes.
Figure 2.
Performance of the MK test when comparing neutral sites to closely linked putatively selected sites (Figure 1B). Tests were only performed on MK tables where each marginal sum exceeded five counts. ρ/θ is the ratio of the population recombination rate, ρ = 4NerL, to the population mutation rate, θ = 4NeμL, where Ne is the species effective population size, L is the locus length in base pairs, and r and μ are the recombination and mutation rates per site per generation. The standard neutral model is simulated with θ = 1.5 (light shading), θ = 3 (shaded), θ = 6 (dark shading), and θ = 12 (solid). The sample size, n, was set to 12 and the divergence time, T, to 2 (in units of 4Ne generations) in each case. All points are based on 10,000 replicates of the neutral coalescent with recombination implemented using the program ms, except θ = 1.5 (100,000 replicates).
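The table filter described in this caption can be illustrated with a short sketch. It assumes the MK table is tested with a two-sided Fisher's exact test, which the text here does not specify; the function name `mk_fisher_p` and its argument names are hypothetical.

```python
from math import comb


def mk_fisher_p(d_neutral, p_neutral, d_selected, p_selected, min_margin=5):
    """Two-sided Fisher's exact test on a 2x2 MK table.

    Rows are site classes (neutral, selected); columns are divergent and
    polymorphic counts. Returns None when any marginal sum does not exceed
    min_margin, mirroring the filter in the caption.
    """
    row1 = d_neutral + p_neutral
    row2 = d_selected + p_selected
    col1 = d_neutral + d_selected
    col2 = p_neutral + p_selected
    if min(row1, row2, col1, col2) <= min_margin:
        return None

    total = row1 + row2

    def prob(a):
        # Hypergeometric probability of a divergent neutral sites given the margins.
        return comb(col1, a) * comb(col2, row1 - a) / comb(total, row1)

    p_obs = prob(d_neutral)
    lo = max(0, row1 - col2)
    hi = min(row1, col1)
    # Sum every table that is no more probable than the observed one.
    tail = sum(prob(a) for a in range(lo, hi + 1) if prob(a) <= p_obs * (1 + 1e-9))
    return min(1.0, tail)
```

Returning `None` for sparse tables keeps them out of any rejection-rate calculation instead of counting them as non-significant.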
Figure 3.
The effect of various factors on the performance of the MK test in the context of Figure 1B. Tests were only performed on MK tables where each marginal sum exceeded five counts. (A) The effect of sample size. For open squares, θ = 3 (shaded line) and θ = 12 (solid line) are plotted for ρ/θ = 1. For solid squares, θ is adjusted with sample size such that the mean observed number of polymorphisms, E(S), is the same as for n = 12. (B) The effect of demography. The sample size is n = 12. θ and T are varied such that the observed average numbers of polymorphic and divergent sites [E(S) ∼ 4.5 and E(D) ∼ 12, respectively] are similar for all models and correspond to expectations for θ = 1.5 and T = 2 under the SNM. Models: SNM, the standard neutral model (light shading, ms command line -t 1.5); PS, a two-deme population structure model with all individuals sampled from one deme (shaded, ms command line -t 0.825 -I 3 12 0 1 -ma x 1 0 1 x 0 0 0 x -ej 4 2 1 -ej 4 3 1); BN, a bottleneck model (dark shading, ms command line -t 4.3 -en 1 0.0041 0.03 -en 1 0.019 1 -ej 0.45 2 1 -ej 0.45 3 1) based on parameters estimated for D. melanogaster by Thornton and Andolfatto (2006); BN + PS, a combined population structure and bottleneck model (solid, with parameters as above, except -t 2.4 -ej 0.7 2 1 -ej 0.7 3 1). (C) The effect of divergence depth. All parameters are as in B, except species divergence time is varied. For SNM, T = 2 and for BN + PS, T = 0.7. (D) The effect of an intron or a recombination hotspot. The two demographic models shown are the SNM with θ = 6 (squares) and the BN + PS model (circles) from B. For the intron model (solid lines), a 1-kb intron is added in the middle of a 1200-bp sequence. Note that recombination rate is uniform and the intron is not included in the MK test but changes the cumulative genetic distance between the tested regions.
For the hotspot model (dark shaded lines), a 100-bp hotspot with 10-fold intensity relative to the background in the middle of the surveyed region was modeled (implemented with msHOT (Hellenthal and Stephens 2007), with locus length = 1200 bp and command line: -v 1 550 650 10). In the hotspot model, background levels of recombination were set such that the cumulative genetic distance of the surveyed region is the same in the presence and absence of the hotspot. For each demographic model, results for “no intron” and “no hotspot” are indicated with light shaded lines for comparison.
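The rescaling of background recombination in the hotspot model can be worked through numerically. The sketch below assumes that "cumulative genetic distance" means the sum of per-site recombination rates across the locus, and that 550 and 650 in the msHOT command line are the hotspot start and end positions; the function name `background_scale` is hypothetical.

```python
def background_scale(locus_len, hot_start, hot_end, intensity):
    """Factor by which the background rate must be multiplied so that the
    locus-wide genetic distance matches the uniform (no-hotspot) case.

    With uniform per-site rate r the total is r * locus_len. With a hotspot,
    the total is b * (cold_len + intensity * hot_len), where b is the new
    background rate. Setting the totals equal gives b / r.
    """
    hot_len = hot_end - hot_start
    cold_len = locus_len - hot_len
    return locus_len / (cold_len + intensity * hot_len)


# Values taken from the caption: 1200-bp locus, 100-bp hotspot, 10-fold intensity.
scale = background_scale(1200, 550, 650, 10)
print(f"background rate = {scale:.4f} x uniform rate")
```

Under these assumptions the background rate is 1200/2100 = 4/7 of the uniform rate, so flanking regions recombine less than in the no-hotspot comparison even though the total genetic distance is unchanged.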

References

    1. Andolfatto, P., 2005. Adaptive evolution of non-coding DNA in Drosophila. Nature 437: 1149–1152.
    2. Andolfatto, P., and M. Przeworski, 2000. A genomewide departure from the standard neutral model in natural populations of Drosophila. Genetics 156: 257–268.
    3. Bachtrog, D., K. Thornton, A. Clark and P. Andolfatto, 2006. Extensive introgression of mitochondrial DNA relative to nuclear gene flow in the Drosophila yakuba species group. Evol. Int. J. Org. Evol. 60: 292–302.
    4. Baudry, E., N. Derome, M. Huet and M. Veuille, 2006. Contrasted polymorphism patterns in a large sample of populations from the evolutionary genetics model Drosophila simulans. Genetics 173: 759–767.
    5. Begun, D., A. Holloway, K. Stevens, L. Hillier, Y. Poh et al., 2007. Population genomics: whole-genome analysis of polymorphism and divergence in Drosophila simulans. PLoS Biol. 5: e310.
