Proc Natl Acad Sci U S A. 2006 May 23;103(21):8101-6.
doi: 10.1073/pnas.0601161103. Epub 2006 May 3.

Birth of a chimeric primate gene by capture of the transposase gene from a mobile element

Richard Cordaux et al. Proc Natl Acad Sci U S A. 2006.

Abstract

The emergence of new genes and functions is of central importance to the evolution of species. The contribution of various types of duplications to genetic innovation has been extensively investigated. Less understood is the creation of new genes by recycling of coding material from selfish mobile genetic elements. To investigate this process, we reconstructed the evolutionary history of SETMAR, a new primate chimeric gene resulting from fusion of a SET histone methyltransferase gene to the transposase gene of a mobile element. We show that the transposase gene was recruited as part of SETMAR 40-58 million years ago, after the insertion of an Hsmar1 transposon downstream of a preexisting SET gene, followed by the de novo exonization of previously noncoding sequence and the creation of a new intron. The original structure of the fusion gene is conserved in all anthropoid lineages, but only the N-terminal half of the transposase is evolving under strong purifying selection. In vitro assays show that this region contains a DNA-binding domain that has preserved its ancestral binding specificity for a 19-bp motif located within the terminal-inverted repeats of Hsmar1 transposons and their derivatives. The presence of these transposons in the human genome constitutes a potential reservoir of approximately 1,500 perfect or nearly perfect SETMAR-binding sites. Our results not only provide insight into the conditions required for a successful gene fusion, but they also suggest a mechanism by which the circuitry underlying complex regulatory networks may be rapidly established.

Conflict of interest statement

No conflicts declared.

Figures

Fig. 1.
Milestones leading to the birth of SETMAR. The structure of the SETMAR locus (Right) and a simplified chronology of the divergence time of the species examined relative to hominoid primates (Left) are shown. Pink boxes represent the two SET exons, which are separated by a single intron (interrupted black line) and form a “SET-only” gene whose structure is conserved in all nonanthropoid species examined and terminated with a stop codon (∗) located at a homologous position (except in cow; see Fig. 2a). The Hsmar1 transposon (event 1) was inserted in the primate lineage, after the split between tarsier and anthropoids, but before the divergence of extant anthropoid lineages. The transposon is shown here with its TIRs (black triangles) and transposase coding sequence (red box). The secondary AluSx insertion within the TIR of Hsmar1 (event 2) is represented as a blue diamond. The position of the deletion removing the stop codon of the “SET-only” gene (event 3) is indicated as a lightning bolt. The de novo conversion from noncoding to exonic sequence is shown in green, the creation of the second intron is represented as a dashed blue line (event 4), and the splice sites are shown as thick blue lines.
Fig. 2.
Molecular events leading to the birth of SETMAR. (a) Schematic phylogeny (Left) and multiple alignment of the 3′ end of SET exon 2 (Right) in 10 primates and 5 nonprimate mammals [OWM, Old World monkeys (Green, African green monkey; Rhes, Rhesus macaque); NWM, New World monkey (owl monkey)]. Dots indicate the identity with the top sequence, and hyphens denote sequence gaps. The asterisk and the box indicate the position of the ancestral SET stop codon (TAG) that is conserved in all mammals (except cow) but was removed by a deletion in anthropoid primates. In cow, the current stop codon is located five codons downstream of the original stop codon (data not shown), and in mouse, a 2-bp insertion resulted in a premature stop codon (underlined). (b) Multiple alignment of the 5′ donor splice site of SETMAR intron 2. The human consensus splice motif (19) is used as a reference (top line). The GT dinucleotide after the last SET exon 2 codon (GAG) in anthropoids and delimiting the start of SETMAR intron 2 (in phase 0) is underlined. (c) Multiple alignment of the 5′ end of the MAR coding region in anthropoids. The Hsmar1 transposon family consensus (14) is used as a reference (top line). The two putative lariat branch points (LBP) and the 3′ acceptor splice site (ASS) are boxed. The human consensus LBP and ASS motifs (19) are shown in bold below the boxes. The AG dinucleotide delimiting the end of SETMAR intron 2 is underlined. The MAR exon of SETMAR starts with codon ACT, located immediately before and in frame with the putative start codon (S) of the ancestral Hsmar1 transposase.
Fig. 3.
In vitro DNA-binding activity and specificity of the MAR domain of SETMAR. (a) Schematic representation of the SETMAR protein and its predicted features: pre-SET (p-S), helix–turn–helix motif (HTH), and DDN triad (positions of the original catalytic amino acid triad of the MAR region). The protein multiple alignment on the right shows that the triad is DD34N (∗) in all of the SETMAR protein sequences examined (naming convention as in Fig. 2) instead of the typical DD34D motif of the Hsmar1 and Hsmar2 consensus transposase sequences (14, 22) and all known active mariner transposases, such as mos1 from Drosophila melanogaster (Mos1-Dm). Dots indicate identity with the top sequence, and numbers indicate the number of amino acids between the sequence portions shown. (b) In vitro DNA-binding activity and specificity of the purified MAR protein domain. EMSA of various TIR double-stranded oligonucleotides mixed with a purified recombinant peptide corresponding to the MBP domain alone (top lane) or to the entire MAR region fused to an N-terminal MBP domain (all other lanes). The TIR oligonucleotides were designed by using the consensus Hsmar1 or Hsmar2 sequences (14, 22) and their characteristic flanking TA target site duplication. Base substitutions relative to the Hsmar1 TIR are in bold and underlined. The EMSA autoradiography shows shifted DNA (bound) on the right side of the gel, whereas input DNA (unbound) is on the left side. MARx7/8 corresponds to a mixture of two oligonucleotides, neither of which is bound by the purified protein. (c) Mapping of the MAR region involved in DNA binding. EMSA of either the Hsmar1 or Hsmar2 TIR oligonucleotides with four recombinant purified peptides corresponding to the entire MAR peptide (lane 1), the first 126 (lane 2) or 92 (lane 3) aa of the MAR peptide fused to an N-terminal MBP tag, or the MBP alone (lane 4). Two shifted bands can be seen when the Hsmar1 TIR oligonucleotide is mixed with either peptide 1 or peptide 2. Based on previous in vitro studies of mariner DNA-binding activities (–26), we interpret complex (Cplx) 3 as a single oligonucleotide with a protein dimer, whereas the upper bands may correspond to tetramers of protein bound to single (Cplx 2) or paired (Cplx 1) oligonucleotides.

Comment in

  • Evolutionary tinkering with transposable elements.
    Jordan IK. Proc Natl Acad Sci U S A. 2006 May 23;103(21):7941-2. doi: 10.1073/pnas.0602656103. Epub 2006 May 16. PMID: 16705033. Free PMC article. No abstract available.

References

    1. Long M., Betran E., Thornton K., Wang W. Nat. Rev. Genet. 2003;4:865–875.
    2. Johnson M. E., Viggiano L., Bailey J. A., Abdul-Rauf M., Goodwin G., Rocchi M., Eichler E. E. Nature. 2001;413:514–519.
    3. Marques A. C., Dupanloup I., Vinckenbosch N., Reymond A., Kaessmann H. PLoS Biol. 2005;3:e357.
    4. Miller W. J., Hagemann S., Reiter E., Pinsker W. Proc. Natl. Acad. Sci. USA. 1992;89:4018–4022.
    5. Kidwell M. G., Lisch D. R. Evolution Int. J. Org. Evolution. 2001;55:1–24.
