BMC Genomics. 2014 May 29;15(1):411.

doi: 10.1186/1471-2164-15-411.

Predicting the fungal CUG codon translation with Bagheera

Stefanie Mühlhausen, Martin Kollmar¹

Affiliation

¹ Group Systems Biology of Motor Proteins, Department of NMR-based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Göttingen, Germany. mako@nmr.mpibpc.mpg.de.

PMID: 24885275
PMCID: PMC4050208
DOI: 10.1186/1471-2164-15-411

Stefanie Mühlhausen et al. BMC Genomics. 2014.

Abstract

Background: Many eukaryotes have been shown to use alternatives to the universal genetic code. While most Saccharomycetes, including Saccharomyces cerevisiae, use the standard genetic code and translate the CUG codon as leucine, some yeasts, including many but not all of the "Candida" species, translate the same codon as serine. It has been proposed that the change in codon identity was accomplished by an almost complete loss of the original CUG codons, making the CUG positions within extant species highly discriminative for one or the other translation scheme.

Results: To improve gene prediction in yeast species by providing the correct CUG decoding scheme, we implemented a web server, called Bagheera, that determines the most probable CUG codon translation for a given transcriptome or genome assembly based on extensive reference data. As reference data we use 2071 manually assembled and annotated sequences from 38 cytoskeletal and motor proteins belonging to 79 yeast species. The web service includes a pipeline that starts by predicting genes homologous to the reference data and aligning them to it. CUG codon positions within the predicted genes are analysed with respect to amino acid similarity and CUG codon conservation in related species. In addition, the tRNA(CAG) gene is predicted in genomic data and compared to known leu-tRNA(CAG) and ser-tRNA(CAG) genes. Bagheera can also be used to evaluate any mRNA and protein sequence data with the codon usage of the respective species. The use of the system was demonstrated by analysing six genomes not included in the reference data.
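The per-position analysis described above can be illustrated with a minimal sketch. It is not Bagheera's actual scoring: the helper names `cug_positions` and `vote`, the residue groupings, and the simple majority rule are all assumptions made for illustration. The sketch finds in-frame CUG codons in a coding sequence and then asks whether the residues observed at the matching column of a reference alignment look more leucine-like or more serine-like.

```python
from collections import Counter

def cug_positions(cds: str) -> list[int]:
    """Return 1-based protein positions whose in-frame codon is CUG (CTG in DNA)."""
    cds = cds.upper().replace("U", "T")
    return [i // 3 + 1 for i in range(0, len(cds) - 2, 3) if cds[i:i + 3] == "CTG"]

# Hypothetical residue groupings, chosen only for illustration.
LEU_LIKE = set("LIVMF")   # hydrophobic residues resembling leucine
SER_LIKE = set("STAGN")   # small or polar residues resembling serine

def vote(column: str) -> str:
    """Majority vote over one reference alignment column (gaps '-' ignored)."""
    counts = Counter(column.replace("-", ""))
    leu = sum(n for aa, n in counts.items() if aa in LEU_LIKE)
    ser = sum(n for aa, n in counts.items() if aa in SER_LIKE)
    if leu > ser:
        return "Leu"
    if ser > leu:
        return "Ser"
    return "undecided"

if __name__ == "__main__":
    print(cug_positions("ATGCTGAAACTG"))  # codons: ATG CTG AAA CTG
    print(vote("LLIV-S"))
```

A real pipeline would additionally weight each column by how often the reference species themselves carry a CUG codon there, which is the conservation signal the abstract refers to.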

Conclusions: Gene prediction and subsequent comparison with reference data from other Saccharomycetes are sufficient to predict the most probable decoding scheme for CUG codons. This approach has been implemented in Bagheera (http://www.motorprotein.de/bagheera).

Figures

**Figure 1**
**Workflow of the Bagheera web application.** A) After the yeast genome or transcriptome assembly data are uploaded, proteins homologous to the reference sequences are identified using TBLASTN and subsequently predicted by AUGUSTUS-PPX. The reference sequences used for the gene prediction are chosen according to the species selected as model organism for AUGUSTUS. The predicted proteins are aligned to the reference alignments (NW = Needleman-Wunsch, SW = Smith-Waterman, LCS = Longest Common Subsequence), and the codon usage is predicted based on the analysis of sequence similarity and CUG codon conservation at CUG codon positions. Optionally, a phylogenetic tree can be calculated based on a randomly selected and concatenated subset of the predicted proteins. B) A gene reconstruction of the uploaded protein sequence is performed to obtain the cDNA sequence. The species encoding the uploaded protein has to be specified. The cDNA sequence is then translated according to the translation scheme of the respective species.
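The final step of panel B, translating a cDNA under the scheme of the respective species, can be sketched as follows. This is a minimal illustration and not the web server's code: the names `STANDARD`, `ALT_YEAST`, and `translate` are assumptions. The two tables follow the standard genetic code and the alternative yeast nuclear code, which differ only in reading CUG (CTG in DNA) as serine instead of leucine.

```python
from itertools import product

# Codons enumerated in T, C, A, G order, matching the usual layout of
# genetic code tables; the amino acid string below follows that order.
BASES = "TCAG"
CODONS = ["".join(c) for c in product(BASES, repeat=3)]
STANDARD_AAS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)

STANDARD = dict(zip(CODONS, STANDARD_AAS))
# Alternative yeast nuclear code: identical except CTG -> serine.
ALT_YEAST = {**STANDARD, "CTG": "S"}

def translate(cdna: str, table: dict[str, str]) -> str:
    """Translate in frame 1; unknown codons become 'X', stops become '*'."""
    cdna = cdna.upper().replace("U", "T")
    return "".join(
        table.get(cdna[i:i + 3], "X") for i in range(0, len(cdna) - 2, 3)
    )

if __name__ == "__main__":
    seq = "ATGCTGTAA"  # made-up example: Met, CUG, stop
    print(translate(seq, STANDARD))
    print(translate(seq, ALT_YEAST))
```

Keeping the alternative table as a one-entry override of the standard one makes the single point of difference between the two schemes explicit.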

**Figure 2**
**Screenshot of the web interface.** The web interface is divided into three main parts: data upload and options section, results section, and phylogenetic tree section (not shown). A) Example data were uploaded and processed with default parameters. B) The results section is split into a summary and a section listing each reference protein and a detailed analysis of each predicted protein down to single CUG codons. For every reference protein, the predicted gene and, if applicable, the respective CUG positions are shown. For every predicted CUG position that could be mapped onto the reference data, the amino acid composition and CUG codon usage at the respective positions in the reference data are listed. The predicted actin related protein class 4 (Arp4) contains one CUG at position 163. This position corresponds to alignment position 291 in the reference alignment and is indicated here by a black box. All CUG codons are noted as leucine in the predicted sequence, regardless of the suggested codon usage.

**Figure 3**
**Number of CUG codons in the reference data.** The total number of CUG positions for every set of reference proteins is shown together with the numbers of CUG positions conserved in at least two and in at least five genes. To account for different protein lengths (e.g. 200 amino acids in dynactin3 p24 proteins compared to up to 4,000 amino acids in dynein heavy chain proteins), the total number of CUG positions per 1,000 amino acids is also plotted, showing that CUG codons are not particularly enriched in certain protein families. Values for all species using standard codon usage (left side) are contrasted with those for all species using alternative yeast codon usage (right side). Detailed numbers are available in Additional file 3.
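The length normalisation mentioned in the caption amounts to a simple rate. The sketch below assumes a hypothetical helper `cug_per_1000_aa`, and the counts in the usage example are invented for illustration, not taken from Additional file 3.

```python
def cug_per_1000_aa(n_cug_positions: int, total_residues: int) -> float:
    """CUG positions per 1,000 amino acids for one protein family."""
    if total_residues <= 0:
        raise ValueError("total_residues must be positive")
    return 1000.0 * n_cug_positions / total_residues

if __name__ == "__main__":
    # Invented counts: a short and a long protein family.
    print(cug_per_1000_aa(1, 200))    # short family, e.g. ~200 residues
    print(cug_per_1000_aa(12, 4000))  # long family, e.g. ~4,000 residues
```

Comparing rates instead of raw counts keeps long families such as dynein heavy chains from dominating the plot simply because they contain more residues.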

Cited by

Genomic and transcriptomic analysis of Candida intermedia reveals the genetic determinants for its xylose-converting capacity.
Geijer C, Faria-Oliveira F, Moreno AD, Stenberg S, Mazurkewich S, Olsson L. Biotechnol Biofuels. 2020 Mar 12;13:48. doi: 10.1186/s13068-020-1663-9. eCollection 2020. PMID: 32190113.
Endogenous Stochastic Decoding of the CUG Codon by Competing Ser- and Leu-tRNAs in Ascoidea asiatica.
Mühlhausen S, Schmitt HD, Pan KT, Plessmann U, Urlaub H, Hurst LD, Kollmar M. Curr Biol. 2018 Jul 9;28(13):2046-2057.e5. doi: 10.1016/j.cub.2018.04.085. Epub 2018 Jun 18. PMID: 29910077.
A computational screen for alternative genetic codes in over 250,000 genomes.
Shulgina Y, Eddy SR. Elife. 2021 Nov 9;10:e71402. doi: 10.7554/eLife.71402. PMID: 34751130.
Rapid Genetic Code Evolution in Green Algal Mitochondrial Genomes.
Noutahi E, Calderon V, Blanchette M, El-Mabrouk N, Lang BF. Mol Biol Evol. 2019 Apr 1;36(4):766-783. doi: 10.1093/molbev/msz016. PMID: 30698742.
Genetic basis of priority effects: insights from nectar yeast.
Dhami MK, Hartwig T, Fukami T. Proc Biol Sci. 2016 Oct 12;283(1840):20161455. doi: 10.1098/rspb.2016.1455. PMID: 27708148.
