Eukaryot Cell. 2009 Oct;8(10):1532-42.
doi: 10.1128/EC.00183-09. Epub 2009 Aug 7.

Trypanosomatid genomes contain several subfamilies of ingi-related retroposons

Frédéric Bringaud et al. Eukaryot Cell. 2009 Oct.

Abstract

Retroposons are ubiquitous transposable elements found in the genomes of most eukaryotes, including trypanosomatids. The African and American trypanosomes (Trypanosoma brucei and Trypanosoma cruzi) contain long autonomous retroposons of the ingi clade (Tbingi and L1Tc, respectively) and short nonautonomous truncated versions (TbRIME and NARTc, respectively), as well as degenerate ingi-related retroposons devoid of coding capacity (DIREs). In contrast, Leishmania major contains only remnants of extinct retroposons (LmDIREs) and of short nonautonomous heterogeneous elements (LmSIDERs). We extend this comparative and evolutionary analysis of retroposons to the genomes of two other African trypanosomes (Trypanosoma congolense and Trypanosoma vivax) and another Leishmania sp. (Leishmania braziliensis). Three new potentially functional retroposons of the ingi clade have been identified: Tvingi in T. vivax and Tcoingi and L1Tco in T. congolense. T. congolense is the first trypanosomatid containing two classes of potentially active retroposons of the ingi clade. We analyzed sequences located upstream of these new long autonomous ingi-related elements, which code for the recognition site of the retroposon-encoded endonuclease. The closely related Tcoingi and Tvingi elements show the same conserved pattern, indicating that the Tcoingi- and Tvingi-encoded endonucleases share site specificity. Similarly, the conserved pattern previously identified upstream of L1Tc has also been detected at the same relative position upstream of L1Tco elements. A phylogenetic analysis of all ingi-related retroposons identified so far, including DIREs, clearly shows that several distinct subfamilies have emerged and coexisted, though in the course of trypanosomatid evolution, only a few have been maintained as active elements in modern trypanosomatid (sub)species.

Figures

FIG. 1.
Divergence between members of potentially active retroposons belonging to the ingi clade. Bases covered by the T. brucei (Tbingi), T. cruzi (L1Tc), T. congolense (Tcoingi and L1Tco), and T. vivax (Tvingi) retroposons were sorted by their divergence from their consensus sequences. The consensus sequences were determined from the alignment of full-length retroposons (Tbingi, 63 copies; L1Tc, 48 copies; Tcoingi, 27 copies) and all of the annotated elements larger than 300 bp (L1Tco, 8 copies) or 3.5 kb (Tvingi, 46 copies). The number of retroposons per fraction of 2% divergence is expressed as a fraction of the highest value, to which an arbitrary value of 1 was assigned. The percentage of divergence was calculated using the matching regions of the consensus sequences. The circled numbers at the top indicate the median value of each graph, and the closed circles represent the number of potentially active elements in each column.
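The normalization described in this caption (counts per 2% divergence bin, scaled so that the fullest bin equals 1, with the median reported separately) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name, the input format (one percent-divergence value per element), and the sample values are all hypothetical.

```python
from statistics import median


def normalized_divergence_histogram(divergences, bin_width=2.0):
    """Count values per bin of `bin_width` percent and scale the fullest bin to 1.

    `divergences` is a hypothetical list of percent-divergence values,
    one per element, each measured against the family consensus.
    Returns a dict mapping the lower edge of each non-empty bin to its
    scaled count.
    """
    if not divergences:
        return {}
    counts = {}
    for value in divergences:
        # Lower edge of the bin that contains this value (0, 2, 4, ...).
        start = int(value // bin_width) * bin_width
        counts[start] = counts.get(start, 0) + 1
    peak = max(counts.values())
    return {start: n / peak for start, n in sorted(counts.items())}


# Made-up values for illustration only.
sample = [0.5, 1.0, 1.9, 2.5, 3.0, 7.2]
histogram = normalized_divergence_histogram(sample)
sample_median = median(sample)
```

Scaling each family to its own peak makes families with very different copy numbers comparable on one axis, which is presumably why the caption assigns an arbitrary value of 1 to the highest bin.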
FIG. 2.
Schematic representation and comparison of potentially active autonomous and nonautonomous families of retroposons belonging to the ingi clade. The schematic maps are based on the consensus sequences generated from alignments of the Tbingi and TbRIME (T. brucei), Tcoingi and L1Tco (T. congolense), Tvingi and TvRIME (T. vivax), and L1Tc and NARTc (T. cruzi) retroposon sequences. The autonomous Tbingi, Tcoingi, Tvingi, and L1Tc consensus sequences contain a single long open reading frame coding for a multifunctional protein containing the endonuclease, RT, RNase H, and leucine zipper domains (the number of amino acids [aa] composing the encoded protein is indicated in parentheses). For L1Tco, four frameshifts were introduced into the consensus sequence to restore the putative coding sequence. The positions of the start and stop codons are indicated. Matches between autonomous and nonautonomous (TbRIME, TvRIME, and NARTc) retroposons are shown by light-gray boxes. The black boxes at the ends of both maps represent the poly(dA) terminal sequence. The N-terminal conserved motif, representing the trypanosomatid retroposon signature, is indicated by a dark-gray box.
FIG. 3.
Phylogenetic analysis of the RT domain. The phylogeny is based on approximately 450 aligned amino acid residues (see Fig. S1 in the supplemental material for the alignment) corresponding to the entire RT domain of nontrypanosomatid retroposons, Tbingi, Tcoingi, Tvingi, L1Tc, L1Tco, and DIREs, which all belong to the ingi clade. The names of retroposons from T. brucei, T. congolense, T. vivax, T. cruzi, L. major, and L. braziliensis are in red, dark-blue, light-blue, magenta, green, and brown letters, respectively. The 10 nontrypanosomatid retroposons are representatives of the Jockey, Tad1, R1, CR1, and LOA retroposon clades (13). The consensus tree was generated by the neighbor-joining method and rooted on the RT sequences of nontrypanosomatid retroposons. The number next to each node indicates the bootstrap value as a percentage of 100 replicates corresponding to the tree generated by the neighbor-joining method. The black dots indicate potential functional autonomous retroposons. The names in italics on the right correspond to groups or subclades identified in a previous analysis (7), while the ingi1 to -6 subclades were defined from this phylogenetic analysis.
FIG. 4.
Divergence between members of short nonautonomous retroposons. Bases covered by the T. cruzi (NARTc), T. brucei (TbRIME), and T. vivax (TvRIME) retroposons were sorted by their divergence from their consensus sequences. The consensus sequences, determined from the alignment of full-length retroposons (NARTc, 115 copies; TbRIME, 70 copies; TvRIME, 26 copies), approximate each element's original sequence at the time of insertion. The number of retroposons per fraction of 2% divergence is expressed as a fraction of the highest value, to which an arbitrary value of 1 was assigned. The percentage of divergence was calculated using the matching regions of the consensus sequences. The circled numbers at the top indicate the median value of each graph.
FIG. 5.
Organization of TvRIME clusters. Contig 9379 is composed of tandemly arranged TvRIME retroposons separated by the 12-bp TSD. This contig contains eight full-length TvRIME sequences (gray boxes) plus two truncated elements (white boxes) located at the extremities of the contig. The vertical double lines highlight both ends of the contig. The 12-bp TSDs located at the junctions between the TvRIME retroposons are shown, with the underlined residues corresponding to the consensus TSD sequence located between 41 TvRIMEs.
FIG. 6.
Comparison of the 5′ and 3′ adjacent sequences flanking the Tvingi and Tcoingi retroposons. (A) Only full-length Tcoingi and Tvingi elements flanked by a TSD with at most two mismatches are considered. (B) The 5′ adjacent regions of L1Tco elements, which contain residues of the conserved motif located at the same relative position upstream of the T. cruzi L1Tc retroposon (4). Conserved residues are indicated by white letters on a black background. On the right, the names of the elements are indicated. The Tcoingi or L1Tco numbers with six characters correspond to the scaffold number (0 or 1) followed by the position (without the last three numbers) of the retroposon in the T. congolense scaffold. For the T. vivax retroposons (Tvingi), the numbers correspond to the contig numbers. The alignment of all the selected sequences was based on the retroposon sequences (gray column with “Tcoingi,” “Tvingi,” or “L1Tco”), from which only the first and last 6 bp, separated by the type of retroposon, are shown. The potentially functional Tcoingi elements that code for a full-length protein (1,751 amino acids) are indicated by white characters on a black background. The TSD flanking the retroposons is indicated by boldface letters, with underlined capital letters for the conserved residues; the lowercase letters in the TSD correspond to nonconserved residues. A gray background in the 5′ adjacent sequences (5′) means that the retroposon is preceded by another retroposon. For truncated retroposons, the position of the end of the element is given.
FIG. 7.
Base frequencies at different positions (Pos) upstream of the Tvingi (A) and Tcoingi (B) retroposons. The frequencies were analyzed in the 30 bp upstream of the retroposon sequence (positions −30 to −01), including the duplicated 12-bp residues (TSD), of 110 Tvingi and 30 Tcoingi sequences. When upstream sequences showed more than 95% identity, only one sequence was retained for this analysis. The values in columns T, C, A, and G represent the percentages of the T, C, A, and G residues, respectively, at individual positions. Values greater than 45% are indicated as follows: 45 to 59%, underlined and boldface; 60 to 74%, gray shaded; 75 to 84%, underlined, boldface, and gray shaded; and 85 to 100%, white numbers on a black background. The last column (cons) shows the conserved residues.
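A per-position frequency table of this kind, together with the highlighting classes listed in the caption, could be computed along the following lines. This is a sketch under stated assumptions: the function names and the toy sequences are hypothetical, percentages are rounded to whole numbers before classification (the caption gives integer ranges), and the filtering of upstream sequences with more than 95% identity is not reproduced.

```python
from collections import Counter

BASES = "TCAG"  # column order used in the caption


def base_frequencies(sequences):
    """Return one dict per alignment column with the percentage of T, C, A and G.

    `sequences` is a hypothetical list of equal-length upstream sequences
    (for the figure, 30 bp covering positions -30 to -01).
    """
    if not sequences:
        raise ValueError("at least one sequence is required")
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all sequences must have the same length")
    table = []
    for column in zip(*sequences):
        counts = Counter(base.upper() for base in column)
        table.append({b: 100.0 * counts[b] / len(sequences) for b in BASES})
    return table


def conservation_class(percent):
    """Map a percentage to the caption's highlighting range, or None below 45%."""
    value = round(percent)
    if value >= 85:
        return "85-100"
    if value >= 75:
        return "75-84"
    if value >= 60:
        return "60-74"
    if value >= 45:
        return "45-59"
    return None


# Toy 4-bp sequences, for illustration only.
toy_table = base_frequencies(["TTAG", "TTCG", "TAAG", "TCAG"])
toy_classes = [conservation_class(max(column.values())) for column in toy_table]
```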
FIG. 8.
Comparison of consensus sequences flanking ingi/L1Tc retroposons. The consensus sequence located upstream of the T. vivax (Tvingi), T. congolense (Tcoingi), T. brucei (Tbingi), and T. cruzi (L1Tc) retroelements is shown using the “sequence logo” graphic representation (11, 32). The analysis of the T. cruzi and T. brucei sequences was performed previously (4, 5). For T. vivax and T. congolense retroposons, the analysis was performed on the same data set used in Fig. 7. The logo representation displays the frequencies of bases at each position as the relative heights of letters, along with the degree of sequence conservation as the total height of a stack of letters, measured in bits of information. The vertical scale is in bits, with a maximum of 2 bits possible at each position. For the numbering, position +1 corresponds to the first residue of the retroelement. The first 10 residues of the retroelement sequence and the TSD are indicated.
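The quantities plotted in a sequence logo can be illustrated with a short calculation. For four bases, the information content of a position is 2 bits minus the Shannon entropy of its base frequencies, and each letter's height is its frequency multiplied by that value. The sketch below is a simplified version that omits the small-sample correction used in the original logo method; the function name and the example columns are hypothetical.

```python
import math


def column_information(column, alphabet="ACGT"):
    """Return (information in bits, per-base letter heights) for one alignment column.

    `column` is a string holding the base observed at this position in each
    sequence. Characters outside `alphabet` are ignored.
    """
    counts = {base: 0 for base in alphabet}
    for base in column.upper():
        if base in counts:
            counts[base] += 1
    total = sum(counts.values())
    if total == 0:
        return 0.0, {base: 0.0 for base in alphabet}
    freqs = {base: n / total for base, n in counts.items()}
    # Shannon entropy of the column, in bits.
    entropy = -sum(f * math.log2(f) for f in freqs.values() if f > 0)
    # Maximum is log2(4) = 2 bits for a fully conserved DNA position.
    info = math.log2(len(alphabet)) - entropy
    heights = {base: f * info for base, f in freqs.items()}
    return info, heights


conserved_info, conserved_heights = column_information("AAAA")
mixed_info, mixed_heights = column_information("AATT")
random_info, _ = column_information("ACGT")
```

A fully conserved position reaches the 2-bit maximum mentioned in the caption, a position split evenly between two bases carries 1 bit, and a position with all four bases at equal frequency carries none.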
FIG. 9.
Evolution of ingi retroposons in the trypanosomatid genomes. On the left is shown the phylogenetic relationship between trypanosomatid species and subspecies analyzed here. The tree was drawn according to data from Stevens et al. (36) and Gibson et al. (16). The arrows indicate the deduced times of appearance of the ingi1 to -6 subclades. At the top is a schematic representation of the retroposon phylogenetic tree shown in Fig. 3, and the central table shows the retroposons belonging to the ingi1 to -6 subclades identified in the genomes of these trypanosomatids. The names of potential active ingi-related retroposons are in boldface. Also indicated are vestiges of ingi-like retroposons (DIRE) and, in parentheses, the short nonautonomous retroposons (TbRIME, TvRIME, and NARTc) corresponding to active truncated versions of the long and autonomous elements mentioned above (Tbingi, Tvingi, and L1Tc, respectively). The slashes indicate loss of the active retroposon and/or the corresponding vestigial sequences.

References

    1. Berriman, M., E. Ghedin, C. Hertz-Fowler, G. Blandin, H. Renauld, D. C. Bartholomeu, N. J. Lennard, E. Caler, N. E. Hamlin, B. Haas, U. Bohme, L. Hannick, M. A. Aslett, J. Shallom, L. Marcello, L. Hou, B. Wickstead, U. C. Alsmark, C. Arrowsmith, R. J. Atkin, A. J. Barron, F. Bringaud, K. Brooks, M. Carrington, I. Cherevach, T. J. Chillingworth, C. Churcher, L. N. Clark, C. H. Corton, A. Cronin, R. M. Davies, J. Doggett, A. Djikeng, T. Feldblyum, M. C. Field, A. Fraser, I. Goodhead, Z. Hance, D. Harper, B. R. Harris, H. Hauser, J. Hostetler, A. Ivens, K. Jagels, D. Johnson, J. Johnson, K. Jones, A. X. Kerhornou, H. Koo, N. Larke, S. Landfear, C. Larkin, V. Leech, A. Line, A. Lord, A. Macleod, P. J. Mooney, S. Moule, D. M. Martin, G. W. Morgan, K. Mungall, H. Norbertczak, D. Ormond, G. Pai, C. S. Peacock, J. Peterson, M. A. Quail, E. Rabbinowitsch, M. A. Rajandream, C. Reitter, S. L. Salzberg, M. Sanders, S. Schobel, S. Sharp, M. Simmonds, A. J. Simpson, L. Tallon, C. M. Turner, A. Tait, A. R. Tivey, S. Van Aken, D. Walker, D. Wanless, S. Wang, B. White, O. White, S. Whitehead, J. Woodward, J. Wortman, M. D. Adams, T. M. Embley, K. Gull, E. Ullu, J. D. Barry, A. H. Fairlamb, F. Opperdoes, B. G. Barrell, J. E. Donelson, N. Hall, C. M. Fraser, S. E. Melville, and N. M. El-Sayed. 2005. The genome of the African trypanosome Trypanosoma brucei. Science 309:416-422.
    2. Bohne, A., F. Brunet, D. Galiana-Arnoux, C. Schultheis, and J. N. Volff. 2008. Transposable elements as drivers of genomic and biological diversity in vertebrates. Chromosome Res. 16:203-215.
    3. Boucher, N., Y. Wu, C. Dumas, M. Dube, D. Sereno, M. Breton, and B. Papadopoulou. 2002. A common mechanism of stage-regulated gene expression in Leishmania mediated by a conserved 3′-untranslated region element. J. Biol. Chem. 277:19511-19520.
    4. Bringaud, F., D. C. Bartholomeu, G. Blandin, A. Delcher, T. Baltz, N. M. El-Sayed, and E. Ghedin. 2006. The Trypanosoma cruzi L1Tc and NARTc non-LTR retrotransposons show relative site-specificity for insertion. Mol. Biol. Evol. 23:411-420.
    5. Bringaud, F., N. Biteau, E. Zuiderwijk, M. Berriman, N. M. El-Sayed, E. Ghedin, S. E. Melville, N. Hall, and T. Baltz. 2004. The ingi and RIME non-LTR retrotransposons are not randomly distributed in the genome of Trypanosoma brucei. Mol. Biol. Evol. 21:520-528.
