Mol Biol Evol. 2013 Jun;30(6):1315-25.
doi: 10.1093/molbev/mst036. Epub 2013 Feb 27.

Prevalence of multinucleotide replacements in evolution of primates and Drosophila

Nadezhda V Terekhanova et al. Mol Biol Evol. 2013 Jun.

Abstract

Evolution of sequences mostly involves independent changes at different sites. However, substitutions at neighboring sites may co-occur as multinucleotide replacement events (MNRs). Here, we compare noncoding sequences of several species of primates, and of three species of Drosophila fruit flies, in a phylogenetic analysis of the replacements that occurred between species at nearby nucleotide sites. Both in primates and in Drosophila, the frequency of single-nucleotide replacements is substantially elevated within 10 nucleotides of other replacements that occurred on the same lineage, but not of replacements that occurred on another lineage. The data imply that dinucleotide replacements (DNRs) affecting sites at distances of up to 10 nucleotides from each other are responsible for 2.3% of single-nucleotide replacements in primate genomes and for 5.6% in Drosophila genomes. Among these DNRs, 26% and 69%, respectively, are in fact parts of replacements of three or more nucleotides (trinucleotide replacements, TNRs). The plurality of MNRs affect nearby nucleotides, so that at least six times as many DNRs affect two adjacent nucleotide sites as affect sites 10 nucleotides apart. Still, approximately 60% of DNRs and approximately 90% of TNRs span distances of more than two and three nucleotides, respectively. MNRs make a major contribution to the observed clustering of substitutions: in the human–chimpanzee comparison, DNRs are responsible for 50% of cases when two nearby replacements are observed on the human lineage, and TNRs are responsible for 83% of cases when three replacements at three immediately adjacent sites are observed on the human lineage. The prevalence of MNRs matches that observed in data on de novo mutations and is also observed in the regions with the lowest sequence conservation, suggesting that MNRs mainly have a mutational origin; however, epistatic selection and/or gene conversion may also play a role.

Keywords: D. melanogaster; H. sapiens; complex mutations; multinucleotide replacements; mutagenesis.

Figures

Fig. 1.
Inferring the frequencies of DNRs (A,B) and of TNRs (C–E). The schematic phylogenetic tree at the left represents, from top to bottom, the two sister species, focal and proxy, and the outgroup; the adjacent horizontal lines represent multiple sequence alignments for the corresponding species. (A,B) The frequencies of substitutions on the focal lineage are measured within the set of sites (vertical ovals) such that another substitution (vertical line) is observed at distance k from them on the proxy (A) or on the focal (B) lineage (dd(k) and sd(k), respectively). (C,D) The frequencies of pairs of substitutions on the focal lineage are measured within the set of sites (pairs of vertical ovals at distance l from each other) such that another substitution (vertical line) is observed at distance k from them on the proxy (C) or on the focal (D) lineage (dt(k,l) and st(k,l), respectively). (E) Five scenarios that can give rise to three adjacent substitutions on the same lineage, from top to bottom: three distinct mutations; two distinct mutations (an arc connects positions involved in an MNR), one being a DNR involving the second and the third, the first and the second, or the first and the third of the three nucleotides; and one TNR.
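The two conditional frequencies defined in panels A and B can be illustrated with a minimal sketch. It assumes each lineage is represented as a per-site sequence of 0/1 flags marking inferred substitutions, and that a site counts as "at distance k" if it lies exactly k positions to either side of a substitution. The function name `conditional_frequency` and the toy data are hypothetical and are not taken from the paper, whose actual pipeline (alignment filtering, CpG exclusion, bootstrap) is not reproduced here.

```python
from typing import Sequence


def conditional_frequency(focal: Sequence[int], anchor: Sequence[int], k: int) -> float:
    """Fraction of sites carrying a focal-lineage substitution, among sites
    located exactly k positions away from a substitution in `anchor`.

    Passing the proxy lineage as `anchor` gives a dd(k)-like value;
    passing the focal lineage itself gives an sd(k)-like value.
    """
    n = len(focal)
    # Collect every site that sits k positions to the left or right
    # of at least one anchor substitution.
    near = set()
    for i, hit in enumerate(anchor):
        if hit:
            for j in (i - k, i + k):
                if 0 <= j < n:
                    near.add(j)
    if not near:
        return float("nan")
    return sum(focal[j] for j in near) / len(near)


# Toy data (hypothetical): 1 marks a substitution at that alignment column.
focal = [1, 1, 0, 0, 0, 0, 1, 0, 0, 0]
proxy = [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]

sd_1 = conditional_frequency(focal, focal, k=1)  # same-lineage neighbors
dd_1 = conditional_frequency(focal, proxy, k=1)  # different-lineage neighbors
print(sd_1, dd_1)
```

In this toy example the same-lineage frequency exceeds the different-lineage one, which is the kind of excess the abstract attributes to multinucleotide replacements; real data would need many sites and confidence intervals before drawing any conclusion.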
Fig. 2.
Frequencies of DNRs in primates for different distances between sites. dd(k) (red), sd(k) (blue), and αd(k) (green) are shown for distances k between the sites (horizontal axis). Left column (A–D), introns; right column (E–H), intergenic regions. (A,E) Homo sapiens and Pan troglodytes (Gorilla gorilla as outgroup), (B,F) H. sapiens and G. gorilla (Pongo pygmaeus as outgroup), (C,G) H. sapiens and P. pygmaeus (Macaca mulatta as outgroup), and (D,H) H. sapiens and M. mulatta (Callithrix jacchus as outgroup). CpG dinucleotides are excluded, which leads to underestimation of dd(1) and sd(1) (see Materials and Methods).
Fig. 3.
Frequencies of TNRs, such that two of the three replacements affect adjacent sites (l = 1), for different distances k from the third site in primates. dt(k,l) (red), st(k,l) (blue), and αt(k,l) (green) are shown for distances k (horizontal axis). The panels correspond to the panels in figure 2.
Fig. 4.
Frequencies of TNRs. αt(k,l) for all values of k and l between 1 and 10. (A) Human–chimpanzee comparison and (B) D. melanogaster–D. simulans comparison. Only intronic sites are shown.
Fig. 5.
Dependence of the contribution of DNRs on the length of the phylogenetic lineage of the focal species in introns. (A) formula image as a function of formula image and (B) formula image as a function of formula image. In each panel, the four points, from left to right, correspond to the substitutions on the Homo sapiens lineage after its divergence from Pan troglodytes, Gorilla gorilla, Pongo pygmaeus, and Macaca mulatta, respectively.
Fig. 6.
Frequencies of DNRs in the D. melanogaster–D. simulans comparison for different distances between sites. (A–C) dd(k) (red), sd(k) (blue), and αd(k) (green) are shown for distances k between the sites (horizontal axis). (A) Introns; (B) intergenic regions; and (C) positions 8–30 in introns with lengths up to 120 nucleotides. (D) αd(k) for all intronic sites (gray) and for positions 8–30 in introns with lengths up to 120 nucleotides (orange), plotted together. The differences between the two curves are not significant for any k. Error bars for αd(k) correspond to 95% confidence intervals obtained from 1,000 bootstrap simulations; for data from all intronic sites and intergenic regions, they are not shown because they would be barely visible.
Fig. 7.
Frequencies of TNRs in the D. melanogaster–D. simulans comparison, such that two of the three replacements affect adjacent sites (l = 1), for different distances k from the third site. dt(k,l) (red), st(k,l) (blue), and αt(k,l) (green) are shown for distances k between sites (horizontal axis). (A) Introns and (B) intergenic regions.
