Mob DNA. 2011 Oct 19;2(1):12.
doi: 10.1186/1759-8753-2-12.

Crypton transposons: identification of new diverse families and ancient domestication events

Kenji K Kojima et al. Mob DNA. 2011.

Abstract

Background: "Domestication" of transposable elements (TEs) led to evolutionary breakthroughs such as the origin of telomerase and the vertebrate adaptive immune system. These breakthroughs were accomplished by the adaptation of molecular functions essential for TEs, such as reverse transcription, DNA cutting and ligation or DNA binding. Cryptons represent a unique class of DNA transposons using tyrosine recombinase (YR) to cut and rejoin the recombining DNA molecules. Cryptons were originally identified in fungi and later in the sea anemone, sea urchin and insects.

Results: Herein we report new Cryptons from animals, fungi, oomycetes and a diatom, as well as widely conserved genes derived from ancient Crypton domestication events. Phylogenetic analysis based on the YR sequences supports four deep divisions of Crypton elements. We found that the domain of unknown function 3504 (DUF3504) in eukaryotes is derived from Crypton YR. DUF3504 is similar to YR but lacks most of the residues of the catalytic tetrad (R-H-R-Y). Genes containing the DUF3504 domain are potassium channel tetramerization domain containing 1 (KCTD1), KIAA1958, zinc finger MYM type 2 (ZMYM2), ZMYM3, ZMYM4, glutamine-rich protein 1 (QRICH1) and "without children" (WOC). The DUF3504 genes are highly conserved and are found in almost all jawed vertebrates. The sequence, domain structure, intron positions and synteny blocks support the view that ZMYM2, ZMYM3, ZMYM4, and possibly QRICH1, were derived from WOC through two rounds of genome duplication in early vertebrate evolution. WOC is observed widely among bilaterians. There could be four independent events of Crypton domestication, and one of them, generating WOC/ZMYM, predated the birth of bilaterian animals. This is the third-oldest domestication event known to date, following the domestication events that generated telomerase reverse transcriptase (TERT) and Prp8. Many Crypton-derived genes are transcriptional regulators with additional DNA-binding domains, and the acquisition of the DUF3504 domain could have added new regulatory pathways via protein-DNA or protein-protein interactions.

Conclusions: Cryptons have contributed to animal evolution through domestication of their YR sequences. The DUF3504 domains are domesticated YRs of animal Crypton elements.

Figures

Figure 1
Schematic structures of Cryptons. Crypton-Cn1 and MarCry-1_FO belong to the CryptonF group. YR = tyrosine recombinase; GCR1_C = DNA-binding domain; DDE = DDE-transposase; C48 = C48 peptidase; HTH = helix-turn-helix motif.
Figure 2
Phylogeny of Cryptons, DUF3504 genes and other eukaryotic tyrosine recombinases. The numbers at nodes are bootstrap values over 40. Open circles indicate the clusters of Cryptons, and filled circles show the clusters of DUF3504 genes. YR = tyrosine recombinase. Prefixes of names are as follows. Cry = Crypton; 1958 = KIAA1958. Accession numbers of DUF3504 genes are shown in Additional file 5. Sequences of the transposable elements are deposited in Repbase http://www.girinst.org/repbase/. Other abbreviations and accession numbers are as follows. FLP = FLP recombinase of the 2-micron plasmid in Saccharomyces cerevisiae (NP_040488); FLP_Klac = FLP recombinase of the plasmid pKD1 in Kluyveromyces lactis (YP_355327); CRE = Cre recombinase of the enterobacteria phage P1 (YP_006472); Vlf1_AcNPV = very late expression factor 1 from the Autographa californica nucleopolyhedrovirus (NP_054107); Tn916 = Tn916 transposase from Enterococcus faecalis (NP_0687929); XerD = XerD from Escherichia coli (NP_417370); Lambda = lambda phage recombinase (NP_040609); At_Ti = recombinase from the Agrobacterium tumefaciens Ti plasmid (NP_059767); SpPat1 from Strongylocentrotus purpuratus (obtained at http://biocadmin.otago.ac.nz/fmi/xsl/retrobase/home.xsl). Suffixes for species names are as follows. 
Animals: Hs = human, Homo sapiens; Oa = platypus, Ornithorhynchus anatinus; Gg = chicken, Gallus gallus; Tg = zebra finch, Taeniopygia guttata; Ac/ACa = lizard, Anolis carolinensis; Xt/XT = frog, Xenopus tropicalis; Dr/DR = zebrafish, Danio rerio; OL = medaka, Oryzias latipes; Cm = chimaera, Callorhinchus milii; SP = sea urchin, Strongylocentrotus purpuratus; SK = acorn worm, Saccoglossus kowalevskii; Dm = fruit fly, Drosophila melanogaster; Tc/TC/TCa = beetle, Tribolium castaneum; NVi = parasitic wasp, Nasonia vitripennis; CQ = southern house mosquito, Culex quinquefasciatus; AA = yellow fever mosquito, Aedes aegypti; DPu = water flea, Daphnia pulex; Acal = sea hare, Aplysia californica; Sm = bloodfluke, Schistosoma mansoni; NV = sea anemone, Nematostella vectensis. Fungi: RO = Rhizopus oryzae; CGlo = Chaetomium globosum; TS = Talaromyces stipitatus; CI = Coccidioides immitis; FO = Fusarium oxysporum. Stramenopiles: PI = Phytophthora infestans; PS = Phytophthora sojae; PU = Pythium ultimum; HAra = Hyaloperonospora arabidopsidis; ALai = Albugo laibachii; PTri = Phaeodactylum tricornutum. Plants: CR = Chlamydomonas reinhardtii.
Figure 3
Distribution and schematic structures of Crypton-derived genes in Saccharomycetaceae fungi. (A) Schematic protein structures encoded by Crypton-derived genes and Cryptons. (B) Distribution of Crypton-derived genes. Each gene identified in the haploid genome is represented by a plus symbol. (C) The phylogeny of Crypton-derived genes and Cryptons using the GCR1_C domain sequences. The numbers at nodes are bootstrap values over 50. Accession numbers of genes are shown in Additional file 2. "Cry" stands for Crypton. Suffixes for species names are as follows. Sc = Saccharomyces cerevisiae; Cg = Candida glabrata; Vp = Vanderwaltozyma polyspora; Zr = Zygosaccharomyces rouxii; Lt = Lachancea thermotolerans; Kl = Kluyveromyces lactis; Ag = Ashbya gossypii; Ct = Candida tropicalis; Ca = Candida albicans; Ps = Pichia stipitis; Pg = Pichia guilliermondii.
Figure 4
Crypton-derived sequence in an intron of the ATF7IP gene. (A) Alignment of proteins coded by deuterostome Cryptons and Crypton-derived sequences. Catalytically essential residues are shown below the alignment. (B) Illustration of the conservation of ATF7IP loci. The position of the YR sequence is indicated by the open box. Black boxes represent exons of the chicken ATF7IP gene. Gray boxes indicate conserved blocks between chicken and respective species based on the Net Tracks of the UCSC Genome Browser (http://genome.ucsc.edu/). Lines between gray boxes indicate that boxes are connected by unalignable sequences. (C) Alignment of nucleotide sequences of Crypton-derived sequences.
Figure 5
Schematic structures of DUF3504 proteins. The KIAA1958 gene has two isoforms, each of which encodes a DUF3504 domain. The structures of KCTD1, KIAA1958, QRICH1, ZMYM2, ZMYM3 and ZMYM4 are from humans. The structure of WOC is from Drosophila melanogaster.
Figure 6
Distribution of Cryptons and Crypton-derived genes. Each gene identified in the haploid genome is represented by a plus symbol. Minus symbols indicate the absence of Cryptons or Crypton-derived genes. Asterisks indicate the presence of their disrupted fragments. The branch ages are based on TimeTree [30]. The unit of time is indicated. Crypton-derived genes listed at nodes of the tree indicate the times of their domestication based on their distribution in different species. KIAA1958L, QRICH1, ZMYM2, ZMYM3 and ZMYM4 are not shown, because they were likely derived by gene duplications.
Figure 7
Paralogous relationships of WOC/ZMYM/QRICH1 genes. (A) Two conserved intron positions among WOC, ZMYM2, ZMYM3, ZMYM4 and QRICH1. Introns are printed in lowercase letters and shaded. Protein sequences are shown below nucleotide sequences. The upper and lower intron positions correspond to the 20th and 22nd introns of human ZMYM2, respectively. (B) The synteny blocks of ZMYM2, ZMYM3 and ZMYM4. Ohnologous relationships reported by Makino and McLysaght [33] are indicated by dotted lines. GJB = gap junction protein β; GJA = gap junction protein α; DLGAP3 = discs large homolog-associated protein 3; C1orf212 = chromosome 1 open reading frame 212. Other gene names are described in the text.

References

    1. Kapitonov VV, Jurka J. A universal classification of eukaryotic transposable elements implemented in Repbase. Nat Rev Genet. 2008;9:411–414.
    2. Miller WJ, Hagemann S, Reiter E, Pinsker W. P-element homologous sequences are tandemly repeated in the genome of Drosophila guanche. Proc Natl Acad Sci USA. 1992;89:4018–4022. doi: 10.1073/pnas.89.9.4018.
    3. Greider CW, Blackburn EH. The telomere terminal transferase of Tetrahymena is a ribonucleoprotein enzyme with two kinds of primer specificity. Cell. 1987;51:887–898. doi: 10.1016/0092-8674(87)90576-9.
    4. Gladyshev EA, Arkhipova IR. Telomere-associated endonuclease-deficient Penelope-like retroelements in diverse eukaryotes. Proc Natl Acad Sci USA. 2007;104:9352–9357. doi: 10.1073/pnas.0702741104.
    5. Kapitonov VV, Jurka J. RAG1 core and V(D)J recombination signal sequences were derived from Transib transposons. PLoS Biol. 2005;3:e181. doi: 10.1371/journal.pbio.0030181.