Genome Biol Evol. 2014 Jul 22;6(12):3222-37.
doi: 10.1093/gbe/evu152.

Molecular phylogeny of sequenced Saccharomycetes reveals polyphyly of the alternative yeast codon usage

Stefanie Mühlhausen et al. Genome Biol Evol. 2014.

Abstract

The universal genetic code defines the translation of nucleotide triplets, called codons, into amino acids. In many Saccharomycetes a unique alteration of this code affects the translation of the CUG codon, which is normally translated as leucine. Most of the species that translate CUG as serine instead belong to the Candida genus and were grouped into a so-called CTG clade. However, the "Candida genus" is not a monophyletic group, and several Candida species are known to use the standard CUG translation. The codon identity could have changed in a single branch, the ancestor of the Candida, or in several branches independently, leading to a polyphyletic alternative yeast codon usage (AYCU). To resolve whether the AYCU is monophyletic or polyphyletic, we performed a phylogenomic analysis of 26 motor and cytoskeletal proteins from 60 sequenced yeast species. By examining sequence conservation at the alignment positions of the CUG codons, we were able to unambiguously assign the standard code or the AYCU to each species. Quantitative analysis of the highly conserved leucine and serine alignment positions showed that 61.1% and 17% of the CUG codons coding for leucine and serine, respectively, are at highly conserved positions, whereas only 0.6% and 2.3% of the CUG codons, respectively, are at positions conserved in the respective other amino acid. Plotting the codon usage onto the phylogenetic tree revealed the polyphyly of the AYCU, with Pachysolen tannophilus and the CTG clade branching independently within a time span of 30–100 Ma.
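The assignment idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes a simple "fraction of the most common residue" conservation measure instead of the published scoring, and the names `column_consensus`, `assign_cug_decoding`, and the 0.9 threshold are hypothetical choices made for illustration. Given the aligned sequences of reference species and the alignment columns at which a query species carries a CUG codon, it counts how many of those columns are highly conserved for leucine versus serine.

```python
from collections import Counter


def column_consensus(alignment, col):
    """Return (most common residue, its fraction) for one column, ignoring gaps."""
    residues = [seq[col] for seq in alignment.values() if seq[col] != "-"]
    if not residues:
        return None, 0.0
    residue, count = Counter(residues).most_common(1)[0]
    return residue, count / len(residues)


def assign_cug_decoding(alignment, cug_columns, threshold=0.9):
    """Guess the CUG decoding of a query species.

    alignment   -- dict: reference species -> aligned protein sequence
                   (the query species itself is assumed to be excluded)
    cug_columns -- alignment columns where the query species has a CUG codon
    threshold   -- assumed minimum residue fraction for "highly conserved"
    """
    leu = ser = 0
    for col in cug_columns:
        residue, fraction = column_consensus(alignment, col)
        if fraction < threshold:
            continue  # column not conserved enough to be informative
        if residue == "L":
            leu += 1
        elif residue == "S":
            ser += 1
    if leu > ser:
        return "Leu", leu, ser
    if ser > leu:
        return "Ser", leu, ser
    return "ambiguous", leu, ser


# Toy reference alignment (invented data, not taken from the study).
references = {"sp1": "MLSLA", "sp2": "MLSLG", "sp3": "MLSIA"}
call_standard = assign_cug_decoding(references, [1, 3])
call_aycu = assign_cug_decoding(references, [2])
call_unclear = assign_cug_decoding(references, [4])
```

In this toy case column 1 is a fully conserved leucine and column 2 a fully conserved serine, so a query with CUG at column 1 is called "Leu" and one with CUG at column 2 is called "Ser"; CUG codons at weakly conserved columns contribute nothing to the call.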

Figures

Fig. 1.—
Phylogenetic relationship between Saccharomycetes. The unrooted phylogenetic network was generated using the Neighbor-Net method as implemented in SplitsTree 4.1.3.1. The Schizosaccharomycetes species were included as outgroup. The network strongly supports the Saccharomycetaceae and the CTG clade (highlighted in orange). The grouping of Pachysolen tannophilus is not unambiguously resolved. Species of the genus Candida are highlighted in blue (teleomorph names) and green (if anamorphs of the species are called Candida), showing the paraphyly (or misassignment) of this genus. Orange and purple dots mark species for which alternative or standard codon usage has already been shown elsewhere.
Fig. 2.—
Sequence alignment of the yeast class I myosins highlighting leucines and serines encoded by CUG. The protein sequence alignment represents part of the class I myosin alignment (for the complete alignment, see supplementary fig. S2, Supplementary Material online). Numbers on the left denote the residue numbers of the first amino acids of the sequences in this section of the alignment. All CUG positions occurring in the aligned class I myosin genes are highlighted. We assigned the most probable translation scheme to each species and translated the CUG codons accordingly. Blue and green boxes indicate CUG codons coding for leucine and serine, respectively.
Fig. 3.—
Conservation of serine and leucine positions. (A) The charts show the amino acid conservation at all alignment positions of the Gblocks reduced concatenated alignment of the 26 cytoskeletal and motor proteins, at which at least one leucine (upper charts) or one serine (lower charts) is present. The sequence conservation has been determined based on the property-entropy divergence, as described in Capra and Singh (2007). With a window size of 0, each column is scored independently (left row), whereas the surrounding three columns are also taken into account with a window size of 3 (right row). Blue bars represent the number of alignment positions with a conservation score for leucine and serine residues, respectively, within the given half-bounded intervals. Red bars denote the number of alignment positions with respect to conservation, at which at least one CUG codon is present independent of its translation. Green bars give the total numbers of CUG codons at the respective alignment positions. (B) The weblogos (Crooks et al. 2004) show the sequence conservation of two kinesin subfamilies, kinesin-1 and kinesin-5, within the family-defining motor domain around the highly conserved switch II and α-helix α4 motifs. At the position within α4 marked by a grey bar, kinesin-1 sequences contain a highly conserved serine whereas kinesin-5 sequences contain a highly conserved leucine indicating the need to resolve subfamily relationships when determining CUG codon usage by sequence conservation.
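The effect of the window size mentioned in this caption can be sketched as follows. This is an illustrative assumption, not the exact published procedure: the sketch takes per-column conservation scores as given, treats `window` as the number of columns considered on each side, and mixes a column's own score with the mean of its neighbours using an assumed weight `lam` of 0.5. The function name `window_smooth` is hypothetical.

```python
def window_smooth(scores, window=3, lam=0.5):
    """Blend each column score with the mean score of its neighbouring columns.

    scores -- list of per-column conservation scores
    window -- assumed number of neighbouring columns on each side (0 = none)
    lam    -- assumed weight of the column's own score
    """
    smoothed = []
    n = len(scores)
    for i, own in enumerate(scores):
        lo, hi = max(0, i - window), min(n, i + window + 1)
        neighbours = [scores[j] for j in range(lo, hi) if j != i]
        if not neighbours:
            # window of 0 (or a single-column alignment): score stays independent
            smoothed.append(own)
        else:
            mean_neighbours = sum(neighbours) / len(neighbours)
            smoothed.append(lam * own + (1 - lam) * mean_neighbours)
    return smoothed


independent = window_smooth([1.0, 0.0, 1.0], window=0)
blended = window_smooth([1.0, 0.0, 1.0], window=1)
```

With a window of 0 the scores are returned unchanged, whereas with a window of 1 an isolated low-scoring column between two conserved ones is pulled upward and its neighbours are pulled downward.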
Fig. 4.—
CUG codon usage at conserved leucine and serine positions. The graph presents the CUG codon usage with respect to alignment position conservation. For each species we determined the percentage of CUG codons at alignment positions with conservation scores of ≥90%, ≥80%, and ≥50% and a window size of 3, respectively, in the concatenated alignment of the 26 motor and cytoskeletal proteins. Red and blue colors denote the percentages of CUG codons present at alignment positions enriched in serines and leucines, respectively. For comparison, we plotted the percentages of CUG codons at positions of the assigned codon translation to the left (% CUG codons at leucine positions for species using the standard code and % CUG codons at serine positions for species using the AYCU) and the percentages of CUG codons present at alignment positions enriched in the respective other amino acid to the right. When considering only highly conserved alignment positions (≥90% conserved) the CUG codon translation assignment is unambiguous. Species using the AYCU are highlighted in bold.
Fig. 5.—
CUG codon positions conserved between species. The heatmap represents the number of CUG codon positions shared between every two species. The upper triangle shows the number of shared CUG codon positions at those positions in the concatenated alignment of the full-length sequences, which have a conservation score of at least 50%, the lower triangle the number of shared CUG codons at all alignment positions. The diagonal represents the total number of CUG codons in the respective species. The number of CUG codons is colored on a logarithmic scale. Species encoding CUG as serine are typed in bold.
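The layout of such a pairwise matrix can be sketched in a few lines. The sketch assumes that the CUG codon positions of each species are already available as sets of alignment column indices and that the set of columns passing the conservation cutoff is known; the function name `shared_cug_matrix` and the toy data are hypothetical and not taken from the study.

```python
def shared_cug_matrix(cug_positions, conserved_columns):
    """Build a species-by-species matrix of shared CUG codon positions.

    cug_positions     -- dict: species -> set of alignment columns with a CUG codon
    conserved_columns -- set of columns passing the conservation cutoff
    Diagonal: total CUG count per species.
    Upper triangle: shared CUG positions restricted to conserved columns.
    Lower triangle: shared CUG positions at all columns.
    """
    species = sorted(cug_positions)
    n = len(species)
    matrix = [[0] * n for _ in range(n)]
    for i, a in enumerate(species):
        for j, b in enumerate(species):
            shared = cug_positions[a] & cug_positions[b]
            if i == j:
                matrix[i][j] = len(cug_positions[a])
            elif i < j:
                matrix[i][j] = len(shared & conserved_columns)
            else:
                matrix[i][j] = len(shared)
    return species, matrix


# Invented toy data: three species and one conserved column.
toy_positions = {"A": {1, 2, 3}, "B": {2, 3, 4}, "C": {3}}
names, counts = shared_cug_matrix(toy_positions, conserved_columns={3})
```

Here species A and B share two CUG positions overall (lower triangle) but only one at a conserved column (upper triangle), which mirrors how the two triangles of the heatmap carry different information.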
Fig. 6.—
Timeline of the CUG codon reassignment. The tree presents the ML topology generated under the Γ + WAGF model in RAxML showing branch lengths for the concatenated alignments of 26 cytoskeletal and motor proteins. Support for major branches of the RAxML (1,000 bootstrap replicates), MrBayes (posterior probabilities), and ClustalW trees (1,000 replicates) is indicated (for more details, see supplementary fig. S1, Supplementary Material online). Species using the AYCU are highlighted in bold. With the splits of Schizosaccharomyces pombe and Saccharomyces cerevisiae and of Sa. cerevisiae and Candida albicans set to 587 and 235 Ma, respectively, the divergence of the CTG clade was estimated at 190 Ma using treePL (Smith and O’Meara 2012). The scale bar denotes amino acid substitutions per site. The width and color of the branches to extant species represent the total number of CUG codons in the respective concatenated sequences.

Corrected and republished from

  • previous manuscript

References

    1. Berbee ML, Taylor JW. Rhynie chert: a window into a lost world of complex plant-fungus interactions. New Phytol. 2007;174:475–479.
    2. Bezerra AR, et al. Reversion of a fungal genetic code alteration links proteome instability with genomic and phenotypic diversification. Proc Natl Acad Sci U S A. 2013;110:11079–11084.
    3. Butler G, et al. Evolution of pathogenicity and sexual reproduction in eight Candida genomes. Nature. 2009;459:657–662.
    4. Capra JA, Singh M. Predicting functionally important residues from sequence conservation. Bioinformatics. 2007;23:1875–1882.
    5. Crooks GE, Hon G, Chandonia JM, Brenner SE. WebLogo: a sequence logo generator. Genome Res. 2004;14:1188–1190.
