Mol Biol Evol. 2018 Jun 1;35(6):1536-1546.
doi: 10.1093/molbev/msy054.

New Methods for Inferring the Distribution of Fitness Effects for INDELs and SNPs

Henry J Barton et al. Mol Biol Evol.

Abstract

Small insertions and deletions (INDELs; ≤50 bp) are the most common type of variation after single nucleotide polymorphisms (SNPs). However, compared with SNPs, we know little about the distribution of fitness effects (DFE) of new INDEL mutations and how prevalent adaptive INDEL substitutions are. Studying INDELs has been difficult partly because identifying ancestral states at these sites is error-prone and misidentification can lead to severely biased estimates of the strength of selection. To solve these problems, we develop new maximum likelihood methods, which use polymorphism data to simultaneously estimate the DFE, the mutation rate, and the misidentification rate. These methods are applicable to both INDELs and SNPs. Simulations show that they can provide highly accurate results. We applied the methods to an INDEL polymorphism data set in Drosophila melanogaster. We found that the DFE for polymorphic INDELs in protein-coding regions is bimodal, with the variants being either nearly neutral or strongly deleterious. Based on the DFE, we estimated that 71.5–83.7% of the INDEL substitutions that took place along the D. melanogaster lineage were fixed by positive selection, which is comparable with the prevalence of adaptive substitutions at nonsynonymous sites. The new methods have been implemented in the software package anavar.


Figures

Fig. 1.
The SFSs for insertions and deletions may be affected to different extents by polarisation errors. We assume that the population size is constant, that INDELs are neutral, and that the sample size is 10. In the genomic region under consideration, the total scaled mutation rate toward insertions, 4Neum, is 10, where Ne is the effective population size, u is the insertion mutation rate per site per generation, and m is the size of the focal region. The total scaled mutation rate toward deletions is 20. The expected SFSs were generated using standard neutral theory. The SFSs with polarisation errors were generated by assuming that the ancestral state of an INDEL was wrongly identified with probability 0.1.
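
The setup in this caption can be sketched numerically. The snippet below is an illustrative reconstruction, not the authors' code or the anavar implementation. It assumes the standard neutral expectation that the number of derived variants present in i of n sampled chromosomes is θ/i, and it assumes that a mispolarised insertion at derived count i is recorded as a deletion at count n − i (and vice versa). The function names `neutral_sfs` and `add_polarisation_error` are hypothetical.

```python
def neutral_sfs(theta, n):
    """Expected unfolded SFS under the standard neutral model.

    Entry at index i-1 is the expected number of variants whose derived
    allele appears in i of the n sampled chromosomes (i = 1 .. n-1).
    """
    return [theta / i for i in range(1, n)]


def add_polarisation_error(focal, other, eps):
    """Mix two SFSs under a polarisation error rate eps (assumed model).

    A correctly polarised focal variant (probability 1 - eps) keeps its
    derived count i. A mispolarised variant of the other type with derived
    count n - i (probability eps) is recorded as a focal variant at count i.
    """
    k = len(focal)  # k = n - 1 frequency classes
    return [(1 - eps) * focal[i] + eps * other[k - 1 - i] for i in range(k)]


n = 10            # sample size from the caption
theta_ins = 10.0  # total scaled mutation rate toward insertions
theta_del = 20.0  # total scaled mutation rate toward deletions
eps = 0.1         # probability of misidentifying the ancestral state

ins = neutral_sfs(theta_ins, n)
dels = neutral_sfs(theta_del, n)
ins_obs = add_polarisation_error(ins, dels, eps)
dels_obs = add_polarisation_error(dels, ins, eps)

for i in range(n - 1):
    print(i + 1, round(ins[i], 3), round(ins_obs[i], 3),
          round(dels[i], 3), round(dels_obs[i], 3))
```

Under these assumptions, the highest-frequency class (derived count 9) rises from about 1.11 to 3.0 for insertions but only from about 2.22 to 3.0 for deletions. The type with the lower mutation rate is distorted more because it receives mispolarised singletons from the more abundant type.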
