Nat Commun. 2016 Sep 7:7:12790.
doi: 10.1038/ncomms12790.

DNA transposon activity is associated with increased mutation rates in genes of rice and other grasses

Thomas Wicker et al. Nat Commun. 2016.

Abstract

DNA (class 2) transposons are mobile genetic elements that move within their 'host' genome by excising and re-inserting elsewhere. Although the rice genome contains tens of thousands of such elements, their actual role in evolution is still unclear. Analysing over 650 transposon polymorphisms in the rice species Oryza sativa and Oryza glaberrima, we find that DNA repair following transposon excisions is associated with an increased number of mutations in the sequences neighbouring the transposon. Indeed, the 3,000 bp flanking the excised transposons can contain over 10 times more mutations than the genome-wide average. Since DNA transposons preferentially insert near genes, this is correlated with increases in mutation rates in coding sequences and regulatory regions. Most importantly, we also find this phenomenon in maize, wheat and barley. These findings suggest that DNA transposon activity is a major evolutionary force in grasses, which provide the basis of most food consumed by humankind.

Figures

Figure 1. Example of a DNA transposon excision with numerous nucleotide substitutions in its flanking region.
A DTH_Harbinger transposon was excised from the genome of O. sativa (Osat) while it remained present in O. glaberrima (Ogla). In this particular event, the transposon was excised almost perfectly, only losing 2 bp of the target site duplication and replacing one of them with a mismatching base. The 211 bp upstream and 120 bp downstream of the excision contain 25 mutations (InDels >1 bp are counted as one mutation), resulting in <93% sequence identity and thus making the mutation rate over 15 times higher than for the genome overall. Outside of the region with the mutations, O. sativa and O. glaberrima sequences are identical, reflecting the overall genome-wide sequence conservation of ∼99.5%. The segments shown correspond to O. sativa chromosome 1 position 23,814,561–23,815,081 and O. glaberrima chromosome 1 position 16,579,166–16,580,116.
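The figures quoted in this caption can be checked with a few lines of arithmetic. The sketch below is only an illustration, not the authors' calculation: it assumes that sequence identity is approximated as one minus mutations per aligned base pair (each InDel counted once, as the caption states) and that the genome-wide divergence is 0.5% (from the ∼99.5% conservation). All variable and function names are hypothetical.

```python
def mutation_density(mutations, length_bp):
    """Mutations per base pair over a region of the given length."""
    return mutations / length_bp

# Values taken from the Figure 1 caption.
upstream_bp = 211
downstream_bp = 120
mutations = 25
genome_identity = 0.995  # ~99.5% genome-wide conservation

# Assumption: identity is approximated as 1 - (mutations / flank length).
flank_bp = upstream_bp + downstream_bp
flank_density = mutation_density(mutations, flank_bp)
flank_identity = 1 - flank_density

# Fold increase relative to the genome-wide divergence (1 - identity).
fold_increase = flank_density / (1 - genome_identity)

print(f"flank length: {flank_bp} bp")
print(f"flank identity: {flank_identity:.1%}")
print(f"fold increase over genome average: {fold_increase:.1f}")
```

Under these assumptions the flanking identity falls below 93% and the fold increase is slightly above 15, consistent with the caption.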
Figure 2. Mutations in sequences flanking transposon insertion and excision sites.
(a) Frequencies of nucleotide substitutions relative to transposon insertion/excision sites in rice. For the plot, 438 sequence alignments carrying transposon insertions (blue line) and 206 alignments carrying excisions (red line) were compiled. As control, 340 alignments of randomly picked orthologous sequences from O. sativa and O. glaberrima were used (grey line, see Methods). Nucleotide substitution frequencies were calculated in a 400 bp sliding window with a 40 bp sliding step. (b) Insertion/Deletion (InDel) frequency calculated in a 1,000 bp sliding window with a 100 bp sliding step. (c) Proposed mechanism for error-prone DNA repair following the excision of DNA transposons. Step 1: after transposable element (TE) excision, 3′ overhangs are generated by exonuclease (blue). Step 2: the 3′ overhangs anneal using micro-homologies. To keep it simple, we only represent single-strand annealing (SSA; refs 26, 27) here. Alternatively, the strands could also be connected via synthesis-dependent strand annealing (SDSA; refs 26, 27, 28), where the two strands are connected by ‘filler' sequences (which were found in some cases, not shown). Step 3: new strands are synthesized by a replication complex that has deficiencies in DNA polymerase fidelity and mismatch repair. Step 4: the final repair product is rich in nucleotide substitutions and small insertions and deletions.
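The sliding-window calculation described in panels (a) and (b) can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: it assumes each alignment has already been reduced to a per-column list of 0/1 flags (1 where the two sequences differ), and it drops any incomplete window at the end. The caption does not say how gap columns or partial windows were handled, so both choices are assumptions, and the function name is hypothetical.

```python
def sliding_window_frequency(mismatch, window=400, step=40):
    """Return (window_start, mismatch_frequency) for each full window.

    mismatch: sequence of 0/1 flags, one per alignment column
              (assumed input format, not taken from the paper).
    window:   window size in bp (400 in panel a, 1,000 in panel b).
    step:     sliding step in bp (40 in panel a, 100 in panel b).
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    results = []
    # Only full windows are evaluated; a trailing partial window is skipped.
    for start in range(0, len(mismatch) - window + 1, step):
        count = sum(mismatch[start:start + window])
        results.append((start, count / window))
    return results

# Toy alignment of 10 columns, using a small window so the result is easy to read.
toy_flags = [1, 0, 0, 0, 1, 1, 0, 0, 0, 0]
toy_profile = sliding_window_frequency(toy_flags, window=4, step=2)
print(toy_profile)
```

Averaging such profiles over many alignments, each centred on the insertion or excision site, would give curves of the kind plotted in the figure; calling the function with `window=1000, step=100` on InDel flags corresponds to the settings in panel (b).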
Figure 3. Mutations in sequences flanking transposable element insertions.
(a) Frequencies of nucleotide substitutions relative to insertion sites of DNA transposons and Gypsy retrotransposons in rice. In both cases, nucleotide substitution frequency increases slightly towards the insertion point. This indicates that insertions are also associated with small numbers of mutations in their flanking sequences. Furthermore, this result is evidence that events classified as DNA transposon insertions probably do not contain many precise excisions. (b) Proposed mechanism for error-prone DNA repair following TE insertions (see also Supplementary Fig. 3). Step 1: the TE inserts into the genome by producing a staggered cut, resulting in a TE that is ligated to the genomic DNA via single-stranded segments. Step 2: 3′→5′ exonucleases expose large stretches of single-stranded DNA. Step 3: second strands are synthesized by a replication complex that has deficiencies in DNA polymerase fidelity and mismatch repair (the same as described in Fig. 2c). Step 4: the final TE insertion is flanked by segments rich in nucleotide substitutions and small insertions and deletions.
Figure 4. Sequence conservation along Oryza sativa and Oryza glaberrima chromosomes.
Data from all 12 chromosomes were compiled and chromosome sizes were normalized by dividing chromosome arms into three equally sized bins (x-axis). The y-axis depicts sequence identity of orthologous sequences. For each chromosome bin, promoter regions (the 2,000 bp upstream of the transcription start point, red box plots) are compared with intergenic sequences from the same bin (blue box plots). Promoters are on average 20–29% less conserved than intergenic sequences from the same chromosome bin. To calculate sequence conservation in intergenic regions, we isolated segments that are located in the middle of intergenic sequences which are at least 10 kb in size (that is, the distance between the end of one gene and the start of the next one is over 10 kb).
Figure 5. Substitution frequencies in synonymous sites showing that grass genes have higher mutation rates in their 5′ and 3′ regions.
To normalize the different CDS sizes, genes were divided into five equally sized bins and frequencies were normalized to nucleotide substitutions per kb for each bin. The bold line inside the box is the median value, while mean values are indicated with numbers. (a) Comparison of 442 closest homologues from O. sativa and O. glaberrima. (b) Comparison of 2,314 pairs of closest homologues from wheat and barley. (c) Comparison of 428 pairs of intragenomic closest homologues in maize that originate from a whole-genome duplication. (d) Comparison of 4,133 pairs of closest homologues from A. thaliana and A. lyrata.
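The bin normalization described in this caption can be illustrated with a short sketch. It assumes that substitutions are given as 0-based positions within a coding sequence and that the per-kb denominator is the full bin length; the caption does not state whether only synonymous sites enter the denominator, so that choice is an assumption. The function name and the example values are hypothetical.

```python
def substitutions_per_kb_by_bin(positions, cds_length, n_bins=5):
    """Count substitutions in equally sized CDS bins, normalized per kb.

    positions:  0-based substitution positions within the CDS (assumed input).
    cds_length: CDS length in bp.
    n_bins:     number of equally sized bins (five in Figure 5).
    """
    if cds_length <= 0 or n_bins <= 0:
        raise ValueError("cds_length and n_bins must be positive")
    counts = [0] * n_bins
    for pos in positions:
        if not 0 <= pos < cds_length:
            raise ValueError(f"position {pos} lies outside the CDS")
        # Map the position to its bin; integer arithmetic avoids rounding issues.
        counts[pos * n_bins // cds_length] += 1
    bin_width_bp = cds_length / n_bins
    # Assumption: the denominator is the whole bin length, converted to kb.
    return [count * 1000 / bin_width_bp for count in counts]

# Toy gene: 1,000 bp CDS with two substitutions near the 5' end and one near the 3' end.
example_rates = substitutions_per_kb_by_bin([0, 10, 999], cds_length=1000)
print(example_rates)
```

Because every gene is split into the same number of bins, rates from genes of different lengths can be pooled bin by bin, which is what allows the 5′ and 3′ regions to be compared across hundreds or thousands of gene pairs.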
