2016 Aug 11;17(1):617.
doi: 10.1186/s12864-016-2927-4.

Orthologs, turn-over, and remolding of tRNAs in primates and fruit flies

Cristian A Velandia-Huerto et al. BMC Genomics.

Abstract

Background: Transfer RNAs (tRNAs) are ubiquitous in all living organisms. They implement the genetic code, so that most genomes contain distinct tRNAs for almost all 61 codons. They behave similarly to mobile elements and proliferate in genomes, spawning both local and non-local copies. Most tRNA families are therefore typically present as multicopy genes. The members of the individual tRNA families evolve under concerted or rapid birth-death evolution, so that paralogous copies maintain almost identical sequences over long evolutionary time-scales. To a good approximation these are functionally equivalent. Individual tRNA copies are thus evolutionarily unstable and easily turn into pseudogenes and disappear. This leads to a rapid turnover of tRNAs and often large differences in the tRNA complements of closely related species. Since tRNA paralogs are not distinguished by sequence, common methods cannot be used to establish orthology between tRNA genes.

Results: In this contribution, we introduce a general framework to distinguish orthologs and paralogs in gene families that are subject to concerted evolution. It is based on the use of uniquely aligned adjacent sequence elements as anchors to establish syntenic conservation of sequence intervals. In practice, anchors and intervals can be extracted from genome-wide multiple sequence alignments. Syntenic clusters of concertedly evolving genes of different families can then be subdivided by list alignments, usually leading to small clusters of candidate co-orthologs. On the basis of recent advances in phylogenetic combinatorics, these candidate clusters can be further processed by cograph editing to recover their duplication histories. We developed a workflow that can be conceptualized as stepwise refinement of a graph of homologous genes. We apply this analysis strategy with different types of synteny anchors to investigate the evolution of tRNAs in primates and fruit flies. We identified a large number of tRNA remolding events concentrated at the tips of the phylogeny. With one notable exception, the phylogenetically old tRNA remoldings do not change the isoacceptor class.
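
The anchor step can be illustrated with a minimal sketch. It assumes that anchors are given as pairs of coordinates (position in species a, position in species b) on a single collinear region, and that the nearest anchor on each side of a tRNA delimits the interval in which orthologs may lie. The function names `tight_interval` and `candidate_orthologs`, the data layout, and all coordinates are hypothetical and are not taken from the authors' workflow.

```python
from bisect import bisect_left, bisect_right
from typing import List, Optional, Tuple

# Assumed layout: (position in species a, position in species b).
Anchor = Tuple[int, int]
Interval = Tuple[Optional[int], Optional[int]]


def tight_interval(anchors: List[Anchor], t_pos: int) -> Interval:
    """Return the interval in species b bounded by the anchors closest to t_pos.

    Assumes the anchors are collinear (same order in both species).
    None marks a side on which no anchor is available.
    """
    ordered = sorted(anchors)
    a_positions = [a for a, _ in ordered]
    left = bisect_left(a_positions, t_pos) - 1   # last anchor strictly left of t
    right = bisect_right(a_positions, t_pos)     # first anchor strictly right of t
    lo = ordered[left][1] if left >= 0 else None
    hi = ordered[right][1] if right < len(ordered) else None
    return lo, hi


def candidate_orthologs(loci_b: List[int], interval: Interval) -> List[int]:
    """Keep the loci of species b that lie strictly inside the syntenic interval."""
    lo, hi = interval
    return [p for p in loci_b
            if (lo is None or p > lo) and (hi is None or p < hi)]


# Toy data: three anchors and five tRNA loci in species b (invented numbers).
anchors = [(100, 1100), (500, 1500), (900, 1900)]
interval = tight_interval(anchors, t_pos=620)
candidates = candidate_orthologs([1200, 1550, 1700, 1850, 2000], interval)
print(interval, candidates)
```

In this toy example, the tRNA at position 620 in species a falls between the second and third anchor, so only the three loci of species b between the matching anchor positions remain as candidates. Real data would additionally require handling of strand, rearrangements, and anchors that are missing in one species.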

Conclusions: Gene families evolving under concerted evolution are not amenable to classical phylogenetic analyses since paralogs maintain identical, species-specific sequences, precluding the estimation of correct gene trees from sequence differences. This leaves conservation of syntenic arrangements with respect to "anchor elements" that are not subject to concerted evolution as the only viable source of phylogenetic information. We have demonstrated here that a purely synteny-based analysis of tRNA gene histories is indeed feasible. Although the choice of synteny anchors influences the resolution, in particular when tight gene clusters are present, and the quality of sequence alignments and genome assemblies, as well as genome rearrangements, limit the scope of the analysis, largely coherent results can be obtained for tRNAs. In particular, we conclude that a large fraction of the tRNAs are recent copies. This proliferation is compensated by rapid pseudogenization, as exemplified by many very recent alloacceptor remoldings.

Keywords: Concerted evolution; Orthology; Synteny; tRNA remolding.

Figures

Fig. 1
Tight anchors for t into species b. Possible anchors are indicated by the grey boxes. The tight anchors are the anchors closest to t (marked in lighter grey) that connect species a and b. By synteny, the only possible orthologs of t are the three loci indicated by white circles
Fig. 2
Stepwise refinement of the candidate graph Γ_c. a The graph Γ_c represents the possible orthology assignments among tRNA loci derived from the synteny anchors. Only genes from different species can be orthologs, hence no edges connect loci in the same species. b Based on sequence similarity, edges are removed between tRNAs from different isoacceptor families. c A modified Needleman-Wunsch alignment algorithm is used to identify order-preserving subgroups. This step admits local tandem duplications but not duplications of larger subclusters
Fig. 3
Determining orthologs by linear coordinate transformations
Fig. 4
Scheme of step-wise orthology identification. Top: genomic organization of tRNAs (colored symbols). “Secure anchors” such as known orthologous proteins are shown as gray ovals. Sequence-unique alignment blocks are indicated as thick dark-gray lines. These anchors subdivide the genome into syntenic clusters forming the connected components of the graph of candidates Γ_c, here shown for blocks of a genome-wide alignment as delimiters. Each cluster forms a connected component of Γ_c. Pairwise generalized list alignments lead to an estimate of the co-orthology relation for each group of homologous tRNAs. Each of these estimated graphs is then corrected to the nearest cograph
Fig. 5
a Gain, loss, and duplications of tRNAs in primates computed from the most fine-grained synteny definition based on individual MSA blocks and b by joining adjacent blocks as described in the text. Gain and duplication events were assigned to the edge leading to the last common ancestor of all observed co-orthologs, except for groups that contained only a macaque and a human or a chimpanzee tRNA; in these cases we assigned two lineage-specific gains. Green numbers refer to the total number of tRNAs detected by tRNAscan-SE; green numbers in parentheses count the pseudogenes found in the set of all tRNAs. Blue numbers refer to the total gain, i.e., the sum of events seeding new connected components and duplication events within a connected component. The number of identified local duplication events is given in parentheses in blue. The red numbers indicate the loss events on the corresponding branch. Species abbreviations: human, Homo sapiens: Hsa; chimpanzee, Pan troglodytes: Ptr; gorilla, Gorilla gorilla: Ggo; orangutan, Pongo abelii: Pab; gibbon, Nomascus leucogenys: Nle; rhesus macaque, Macaca mulatta: Mmu
Fig. 6
A more complex tRNA cluster in primates (see Additional file 5 for coordinates). Panel a summarizes the situation as list alignment. For simplicity, tRNAs from both strands are included. Except in rhesus and orangutan the first part of cluster the has been cluster. Panel b shows a more detailed, strand-specific genomic map. It highlights the reversal of the orientation of a tRNA Arg (R) in rhesus and the two copies of tRNA Lys (K) on opposite strands. Panel c shows the graph corresponding to the cluster. Edges indicate that the tRNA sequences are sufficiently similar to be possible orthologs. Different species are distinguished by colors. The tRNA isoacceptor classes are indicated by their 1-letter codes: Phe (F), Lys (K), Leu (L), Val (V), Arg (R)
Fig. 7
Gain, loss and duplications of tRNAs in primates computed based on protein-anchored clusters and the linear interpolation method. N. leucogenys was not included in this part of the analysis
Fig. 8
Gains (a) and losses (b) of tRNAs in drosophilids. See caption of Fig. 5 for details. Drosophila simulans: Dsim; Drosophila sechellia: Dsec; Drosophila melanogaster: Dmel; Drosophila yakuba: Dyak; Drosophila erecta: Dere; Drosophila ananassae: Dana; Drosophila pseudoobscura: Dpse; Drosophila persimilis: Dper; Drosophila willistoni: Dwil; Drosophila mojavensis: Dmoj; Drosophila virilis: Dvir; Drosophila grimshawi: Dgri
Fig. 9
a Remolding events in primates (summary statistics only) and b drosophilids (affected isoacceptor classes). Isoacceptor remoldings are shown in dark blue, alloacceptor remoldings in red. Details of all anticodon changes are given in Additional file 2
Fig. 10
Alignment and secondary structure of tRNAs deriving from the Cys(GCA) → Tyr(GTA) remolding event predating the last common ancestor of human and rhesus. Descendants of both tRNAs have survived in all investigated genomes except Nomascus. The secondary structure is the standard tRNA structure
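
The caption of Fig. 4 states that each estimated co-orthology graph is corrected to the nearest cograph. A graph is a cograph exactly when it contains no induced path on four vertices (P4). The sketch below only tests this property by brute force, which is adequate for the small candidate clusters described here; it does not perform cograph editing and is not the authors' implementation. The node labels (species abbreviation plus a running number) and the example edges are invented for illustration.

```python
from itertools import combinations, permutations
from typing import Dict, Iterable, Optional, Set, Tuple

# Undirected graph as adjacency sets; node labels are hypothetical tRNA loci.
Graph = Dict[str, Set[str]]


def make_graph(edges: Iterable[Tuple[str, str]]) -> Graph:
    """Build symmetric adjacency sets from a list of undirected edges."""
    graph: Graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def find_induced_p4(graph: Graph) -> Optional[Tuple[str, str, str, str]]:
    """Return an induced path a-b-c-d if one exists, otherwise None."""
    for quad in combinations(sorted(graph), 4):
        for a, b, c, d in permutations(quad):
            has_path = b in graph[a] and c in graph[b] and d in graph[c]
            has_chord = c in graph[a] or d in graph[a] or d in graph[b]
            if has_path and not has_chord:
                return a, b, c, d
    return None


def is_cograph(graph: Graph) -> bool:
    """A graph is a cograph if and only if it has no induced P4."""
    return find_induced_p4(graph) is None


# A path on four loci is the smallest graph that is not a cograph.
path = make_graph([("Hsa1", "Ptr1"), ("Ptr1", "Ggo1"), ("Ggo1", "Pab1")])

# Inserting one edge removes the induced P4 in this toy example.
repaired = make_graph([("Hsa1", "Ptr1"), ("Ptr1", "Ggo1"),
                       ("Ggo1", "Pab1"), ("Hsa1", "Ggo1")])

print(is_cograph(path), is_cograph(repaired))
```

An editing procedure would search for a smallest set of edge insertions and deletions that removes every induced P4. That search is computationally hard in general, so practical tools rely on heuristics or exact solvers for small components.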
