BMC Biol. 2017 Feb 13;15(1):8.
doi: 10.1186/s12915-017-0353-y.

Nuclear genetic codes with a different meaning of the UAG and the UAA codon

Tomáš Pánek et al. BMC Biol. 2017.

Abstract

Background: Departures from the standard genetic code in eukaryotic nuclear genomes are known for only a handful of lineages, and only a few genetic code variants seem to exist outside the ciliates, the most creative group in this regard. The most frequent code modifications entail reassignment of the UAG and UAA codons, with evidence for at least 13 independent cases of a coordinated change in the meaning of both codons. However, no change affecting each of the two codons separately has been documented, suggesting the existence of underlying evolutionary or mechanistic constraints.

Results: Here, we present the discovery of two new variants of the nuclear genetic code, in which UAG is translated as an amino acid while UAA is kept as a termination codon (along with UGA). The first variant occurs in an organism detected in a (meta)transcriptome from the heteropteran Lygus hesperus and demonstrated to be a novel insect-dwelling member of Rhizaria (specifically Sainouroidea). This first documented case of a rhizarian with a non-canonical genetic code employs UAG to encode leucine and represents an unprecedented change among nuclear codon reassignments. The second code variant was found in the recently described anaerobic flagellate Iotanema spirale (Metamonada: Fornicata). Analyses of transcriptomic data revealed that I. spirale uses UAG to encode glutamine, similar to the most common variant of a non-canonical code known from several unrelated eukaryotic groups, including hexamitin diplomonads (also a lineage of fornicates). However, in these organisms, UAA also encodes glutamine, whereas it is the primary termination codon in I. spirale. Along with phylogenetic evidence for a distant relationship between I. spirale and hexamitins, this indicates two independent genetic code changes in fornicates.

Conclusions: Our study documents, for the first time, that evolutionary changes in the meaning of the UAG and UAA codons in nuclear genomes can be decoupled and that the interpretation of the two codons by the cytoplasmic translation apparatus is mechanistically separable. The latter conclusion has interesting implications for the possibilities of genetic code engineering in eukaryotes. We also present a newly developed, generally applicable, phylogeny-informed method for inferring the meaning of reassigned codons.

Keywords: Codon reassignment; Evolution; Evolutionary constraint; Fornicata; Genetic code; Iotanema spirale; Lygus hesperus; Protists; Rhizaria; Transcriptome.
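
The two code variants described in the abstract can be illustrated with a minimal Python sketch. It assumes the standard code for every codon except UAG, which is set to leucine (L) for the rhizarian exLh and to glutamine (Q) for I. spirale, while UAA and UGA remain stops. The names STANDARD, variant_code, EXLH_CODE, IOTANEMA_CODE, and translate are hypothetical, and the test sequence is made up for illustration; this is not the authors' software.

```python
from itertools import product

# Standard genetic code in the conventional UCAG ordering ("*" marks a stop).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def variant_code(uag_meaning):
    """Return a copy of the standard code with only UAG reassigned."""
    code = dict(STANDARD)
    code["UAG"] = uag_meaning
    return code


# Assumed from the abstract: UAG = Leu in exLh, UAG = Gln in I. spirale.
EXLH_CODE = variant_code("L")
IOTANEMA_CODE = variant_code("Q")


def translate(rna, code):
    """Translate an in-frame RNA string, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        aa = code[rna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Made-up sequence: AUG UAG GCA UAA GGG
example = "AUGUAGGCAUAAGGG"
print(translate(example, STANDARD))       # UAG terminates translation
print(translate(example, EXLH_CODE))      # UAG read as leucine, UAA terminates
print(translate(example, IOTANEMA_CODE))  # UAG read as glutamine, UAA terminates
```

In both variants the in-frame UAG is read through while the downstream UAA still terminates translation, which is the decoupling of the two codons that the study reports.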

Figures

Fig. 1
Phylogenetic position of the organisms studied. a Phylogeny of eukaryotes including the rhizarian exLh based on 18S rDNA sequences. The maximum likelihood (ML) tree was inferred with RAxML using the GTRGAMMAI substitution model. The values at branches represent RAxML BS values followed by PhyloBayes posterior probabilities (GTRCAT model). b Phylogeny of Fornicata including I. spirale based on a concatenated data set of 18S rDNA and EF-1α, EF2, HSP70, and HSP90 protein sequences. The ML tree was inferred with RAxML using the substitution models GTRGAMMA (for 18S rDNA) and PROTGAMMALG4X (for the protein sequences). The values at branches represent RAxML BS values followed by PhyloBayes posterior probabilities (CAT Poisson model). Maximal support (100/1) is indicated with black dots. Asterisks indicate support values lower than 50% or 0.5, respectively; dashes mark branches in the ML tree that are absent from the PhyloBayes tree.
Fig. 2
In-frame UAG codons in protein-coding genes of the rhizarian exLh and I. spirale. a An example of a rhizarian exLh gene with several in-frame UAG codons: multiple sequence alignment of orthologs of the Bat1 protein (spliceosome RNA helicase). b Relative frequency of hyperconserved positions (at least 90% amino acid identity across orthologs from 250 representatives of main eukaryotic groups in the alignment) corresponding to UAG-containing sites in the rhizarian exLh transcripts. c An example of an I. spirale gene with several in-frame UAG codons: multiple sequence alignment of orthologs of the Polr2a protein (also known as RNA polymerase II subunit RPB1). d Relative frequency of hyperconserved positions (at least 90% amino acid identity across orthologs from 54 representatives of main eukaryotic groups in the alignment) corresponding to UAG-containing sites in the I. spirale transcripts. e, f Dominant amino acid identity at conserved alignment positions (defined using 90% and 50% thresholds) in a broad-scale comparison of I. spirale sequences with eukaryotic homologs. e Positions corresponding to in-frame UAG codons in I. spirale sequences. f Positions corresponding to canonical glutamine codons (CAG, CAA) in I. spirale sequences. In Fig. 2a and c, only selected segments of the full alignments (separated by double slashes) are shown for simplicity. Asterisks indicate positions with in-frame UAG codons in the underlying coding sequences. In Fig. 2b and d, the hyperconserved positions are sorted according to the respective hyperconserved amino acid residue (only the four most frequent position classes are shown). Source tables for Fig. 2b and d including data from read mapping are available in Additional file 1: Table S1D and S2C
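
The tally of hyperconserved positions behind Fig. 2b and d can be sketched roughly as follows. This is a simplified illustration, not the phylogeny-informed method the authors developed: it only asks, for each alignment column that corresponds to an in-frame UAG, whether one amino acid reaches the identity threshold among the orthologs. The function names, the handling of gaps and unknown residues (ignored here), and the example columns are all assumptions.

```python
from collections import Counter


def dominant_residue(column, threshold=0.9):
    """Return the amino acid reaching `threshold` identity in a column, else None.

    `column` holds the ortholog residues aligned to one UAG site.
    Gaps ("-") and unknown residues ("X") are ignored (an assumption).
    """
    residues = [r for r in column if r not in "-X"]
    if not residues:
        return None
    aa, count = Counter(residues).most_common(1)[0]
    return aa if count / len(residues) >= threshold else None


def tally_uag_sites(columns, threshold=0.9):
    """Count, per amino acid, the hyperconserved columns aligned to UAG sites."""
    tally = Counter()
    for column in columns:
        aa = dominant_residue(column, threshold)
        if aa is not None:
            tally[aa] += 1
    return tally


# Made-up ortholog columns, one string per in-frame UAG site.
columns = ["LLLLLLLLLL", "LLLLLLLLLI", "LIVMLIVMLI", "QQQQQQQQQQ", "LLLL-LLLLL"]
print(tally_uag_sites(columns))
```

If most hyperconserved UAG-aligned columns fall into a single amino acid class, that residue is a natural candidate for the meaning of the reassigned codon; lowering `threshold` to 0.5 mirrors the looser criterion mentioned for panels e and f.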
Fig. 3
Relative codon frequencies in the rhizarian exLh and I. spirale. a Relative codon frequencies in two different groups of genes (for ribosomal proteins and for subunits of the 26S proteasome; listed in Additional file 1: Tables S1A and S1B) in the rhizarian exLh. b Relative codon frequencies in a reference set of genes of I. spirale (listed in Additional file 3: Tables S2A and S2B). The relative codon frequencies are calculated as the percentage of the codon among all occurrences of codons with the same meaning (i.e., coding for the same amino acid or terminating translation)
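
The relative codon frequency defined in the Fig. 3 caption (the percentage of a codon among all occurrences of codons with the same meaning) can be computed as in the sketch below. The function name, the partial code table (leucine and stop codons only, with UAG assumed to mean leucine as in the rhizarian exLh), and the counts are illustrative assumptions, not data from the paper.

```python
from collections import defaultdict


def relative_codon_frequencies(codon_counts, code):
    """Percentage of each codon among all codons sharing its meaning.

    `codon_counts` maps codon -> observed count; `code` maps codon -> meaning
    (an amino acid letter, or "*" for termination).
    """
    totals = defaultdict(int)
    for codon, n in codon_counts.items():
        totals[code[codon]] += n
    return {
        codon: 100.0 * n / totals[code[codon]]
        for codon, n in codon_counts.items()
        if totals[code[codon]] > 0
    }


# Partial, hypothetical code table: UAG counted as a leucine codon.
code = {
    "UUA": "L", "UUG": "L", "CUU": "L", "CUC": "L", "CUA": "L", "CUG": "L",
    "UAG": "L", "UAA": "*", "UGA": "*",
}
# Made-up counts for illustration only.
counts = {
    "UUA": 10, "UUG": 20, "CUU": 30, "CUC": 10, "CUA": 5, "CUG": 5,
    "UAG": 20, "UAA": 30, "UGA": 10,
}
freqs = relative_codon_frequencies(counts, code)
print(freqs["UAG"], freqs["UAA"], freqs["UGA"])
```

Because UAG is grouped with the leucine codons rather than with the stops, its frequency is expressed relative to leucine usage, and the termination class is shared only between UAA and UGA.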
Fig. 4
Phylogenetic distribution of known non-canonical genetic codes in nuclear genes of eukaryotes. The schematic phylogenetic tree was drawn on the basis of phylogenetic and phylogenomic analyses for eukaryotes as a whole [60, 71, 72] (our own Fig. 1 and Additional file 2: Figure S1) and for the relevant subgroups with non-canonical codes [, , –77]. Multifurcations indicate uncertain or controversial branching order, dashed branches indicate different positions of Metamonada within eukaryotes suggested by different studies, branches drawn as double lines indicate paraphyletic groupings. The types and occurrences of the different non-canonical codes are based on this study (the rhizarian exLh and Iotanema) and the following previous reports: fungi [14, 15]; Amoeboaphelidium [13]; oxymonads [11]; Blastocrithidia [18]; ulvophytes [12]; ciliates [7, 9, 16, 17]. Note that, for simplicity, code variants with a context-dependent dual meaning of UAR or UGA codons as sense or termination ones (UAR in Blastocrithidia and Condylostoma, UGA in Parduczia and Condylostoma) are not distinguished from those with a “complete” reassignment. We also omitted some ciliate species with their putative non-canonical codes supported by little data that are specifically related to and possibly sharing the same code with better studied species. Changes in the genetic code are mapped onto the tree primarily (black circles) using Dollo parsimony (no reversions are allowed). An alternative maximum parsimony scenario with reversions weighted the same as other changes is indicated by the respective code numbers in white circles. An alternative branching order to the one indicated in the figure was supported by some studies for some of the ciliate lineages, but the alternative topology does not decrease the minimal number of codon reassignments required to explain the distribution of non-standard genetic codes
