Proc Natl Acad Sci U S A. 2011 May 10;108(19):7884-9.
doi: 10.1073/pnas.1104208108. Epub 2011 Apr 25.

The catalytic domain of all eukaryotic cut-and-paste transposase superfamilies

Yao-Wu Yuan et al. Proc Natl Acad Sci U S A.

Abstract

Cut-and-paste DNA transposable elements are major components of eukaryotic genomes and are grouped into superfamilies (e.g., hAT, P) based on sequence similarity of the element-encoded transposase. The transposases from several superfamilies possess a protein domain containing an acidic amino acid triad (DDE or DDD) that catalyzes the "cut and paste" transposition reaction. However, it was unclear whether this domain was shared by the transposases from all superfamilies. Through multiple-alignment of transposase sequences from a diverse collection of previously identified and recently annotated elements from a wide range of organisms, we identified the putative DDE/D triad for all superfamilies. Furthermore, we identified additional highly conserved amino acid residues or motifs within the DDE/D domain that together form a "signature string" that is specific to each superfamily. These conserved residues or motifs were exploited as phylogenetic characters to infer evolutionary relationships among all superfamilies. The phylogenetic analysis revealed three major groups that were not previously discerned and led us to revise the classification of several currently recognized superfamilies. Taking the data together, this study suggests that all eukaryotic cut-and-paste transposable element superfamilies have a common evolutionary origin and establishes a phylogenetic framework for all future cut-and-paste transposase comparisons.
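The idea of treating conserved residues as presence/absence characters can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the superfamily labels and the 0/1 values are placeholders rather than data from the study, and a simple Hamming distance stands in for whatever tree-inference method the authors actually used.

```python
from itertools import combinations

# Hypothetical presence/absence characters (1 = conserved residue present,
# 0 = absent). Labels and values are placeholders, not data from the study.
characters = {
    "superfamily_A": [1, 1, 0, 1, 0],
    "superfamily_B": [1, 1, 0, 0, 0],
    "superfamily_C": [0, 0, 1, 1, 1],
}


def hamming(a, b):
    """Count the character positions at which two profiles differ."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same number of characters")
    return sum(x != y for x, y in zip(a, b))


def pairwise_distances(chars):
    """Return the Hamming distance for every unordered pair of taxa."""
    return {
        (p, q): hamming(chars[p], chars[q])
        for p, q in combinations(sorted(chars), 2)
    }


distances = pairwise_distances(characters)
for pair, d in distances.items():
    print(pair, d)
```

A distance table like this could feed a clustering step; the bootstrap-supported consensus tree reported in the paper would require a dedicated phylogenetics package and is not reproduced here.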

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1.
DDE domains of CACTA, Mirage, Chapaev, and Transib elements. (A) Alignment shown is after redundancy elimination. Distances between the conserved blocks are indicated in the number of amino acid residues. Conserved residues within each superfamily (and between superfamilies) are highlighted in gray. The DDE triad identified here is marked with letters above the alignment; the DDE triad for CACTA elements identified in ref. is marked with black dots. Three additional conserved motifs discussed in the text, C(2)C, [M/L]H, and H(3-4)H, are also noted. (B) Predicted secondary structure of the DDE domain of the SmTRC1 transposase (GenBank: AM268206). Asterisks indicate the DDE triad. α-Helices and β-strands are highlighted with pink and blue bars, respectively. Note that the position of “α2/3-β5” of the typical “β1-β2-β3-α1-β4-α2/3-β5-α4-α5/6” fold remains unclear. The inserted domain (highlighted in gray) between the second D and the E residue is rich in α-helices.
Fig. 2.
An unrooted consensus tree of the transposase superfamilies inferred from the presence or absence of the highly conserved residues in the signature strings. Bootstrap values are at the nodes. The arrows with labels indicate superfamily clusters merged in our revised classification. Shown on the right is a schematic representation of the DDE/D domain and the signature string for each superfamily. Conserved blocks are highlighted in blue, variable regions are in gray. White gaps are regions not drawn to scale. The DDE triads are highlighted in red. Alternative residues are marked by slashes; lowercase indicates that a residue occurs in <10% of the sequences in the alignment profile. The [C/D](2)H motif is highlighted in orange; the C(2)C, [M/L]H, and H(3-4)H motifs are highlighted in green.
Fig. 3.
Taxonomic distribution of the 17 superfamilies across the eukaryotic tree of life. Gray and white boxes indicate presence and absence, respectively. The illustrated tree was drawn according to refs. and and the Tree of Life webpage (http://tolweb.org/tree/). The five represented eukaryotic supergroups are highlighted in thickened lines. The asterisks after each terminal branch indicate the number of genomes representing that branch: *, 1 genome; **, 2 to 5 genomes; ***, 6 to 10 genomes; ****, over 10 genomes.
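The motif notation used in the captions for Figs. 1 and 2 can be read as spacing patterns, which map naturally onto regular expressions. The sketch below assumes that a number in parentheses is the count of arbitrary residues between two fixed residues and that a slash separates alternative residues; that reading, the `find_motifs` helper, and the toy sequence are illustrative assumptions, not material from the paper.

```python
import re

# Assumed translation of the caption notation into regular expressions:
#   X(n)Y   -> X, then n arbitrary residues, then Y
#   [A/B]   -> either residue A or residue B
MOTIF_PATTERNS = {
    "C(2)C": re.compile(r"C.{2}C"),
    "[M/L]H": re.compile(r"[ML]H"),
    "H(3-4)H": re.compile(r"H.{3,4}H"),
    "[C/D](2)H": re.compile(r"[CD].{2}H"),
}


def find_motifs(sequence):
    """Return 0-based start positions of non-overlapping matches per motif."""
    return {
        name: [m.start() for m in pattern.finditer(sequence)]
        for name, pattern in MOTIF_PATTERNS.items()
    }


# Toy amino acid string built only to exercise each pattern once.
toy_sequence = "AACGGCAAMHAAHGGGHAADKKH"
hits = find_motifs(toy_sequence)
for name, positions in hits.items():
    print(name, positions)
```

Because `finditer` reports non-overlapping matches only, a scan that needs overlapping hits would have to use a lookahead pattern instead.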

