PLoS One. 2018 Aug 9;13(8):e0201715.
doi: 10.1371/journal.pone.0201715. eCollection 2018.

Optimization of the standard genetic code according to three codon positions using an evolutionary algorithm

Paweł Błażej et al. PLoS One. 2018.

Erratum in

Abstract

Many biological systems are typically examined from the point of view of adaptation to certain conditions or requirements. One such system is the standard genetic code (SGC), which generally minimizes the cost of amino acid replacements resulting from mutations or mistranslations. However, no full consensus has been reached on the factors that caused the evolution of this feature. One of the hypotheses suggests that code optimality was directly selected as an advantage to preserve information about encoded proteins. An important feature that should be considered when studying the SGC is the different roles of the three codon positions. Therefore, we investigated the robustness of this code regarding the cost of amino acid replacements resulting from substitutions in these positions separately and the sum of these costs. We applied a modified evolutionary algorithm and included four models of the genetic code assuming various restrictions on its structure. The SGC was compared both with the codes that minimize the objective function and with those that maximize it. This approach allowed us to place the SGC in the global space of possible codes, which is a more appropriate and unbiased comparison than that with randomly generated codes, because random codes are characterized by relatively uniform amino acid assignments to codons. The SGC appeared to be well optimized at the global scale, but its individual positions were not fully optimized because there were codes that were optimized for only one codon position and simultaneously outperformed the SGC at the other positions. We also found that different code structures may lead to the same optimality and that random codes can show a tendency to minimize costs under some of the genetic code models. Our results suggest that the optimality of the SGC could be a by-product of other processes.
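To make the position-wise cost described in the abstract concrete, the following is a minimal sketch of one common way to score a genetic code: for each codon position, sum the squared difference in an amino acid property over all single-nucleotide substitutions at that position, skipping stop codons. This is an illustration under stated assumptions, not the authors' exact objective function. The names `position_costs` and `total_cost` are hypothetical, and the Kyte-Doolittle hydropathy scale is swapped in as a stand-in for the polarity values used in the paper.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code listed in TCAG order; '*' marks stop codons.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
SGC = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Assumption: Kyte-Doolittle hydropathy used as a stand-in property scale.
# The paper minimizes polarity costs; its scale is not reproduced here.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def position_costs(code, scale):
    """Return [cost_pos1, cost_pos2, cost_pos3] for a codon -> amino acid map.

    Each cost is the sum of squared property differences over every
    single-nucleotide substitution at that codon position. Substitutions
    to or from a stop codon are ignored (a simplifying assumption).
    """
    costs = [0.0, 0.0, 0.0]
    for codon, aa in code.items():
        if aa == "*":
            continue
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                mutant_aa = code[mutant]
                if mutant_aa == "*":
                    continue
                costs[pos] += (scale[aa] - scale[mutant_aa]) ** 2
    return costs


def total_cost(code, scale):
    """Sum of the three position-wise costs."""
    return sum(position_costs(code, scale))


if __name__ == "__main__":
    pos1, pos2, pos3 = position_costs(SGC, HYDROPATHY)
    print("position costs:", pos1, pos2, pos3)
    print("total cost:", total_cost(SGC, HYDROPATHY))
```

An optimization of the kind described in the abstract would treat one of the three position costs, or their sum, as the quantity to minimize or maximize while rearranging the codon assignments within the limits of a chosen model. Since most third-position substitutions in the SGC are synonymous, the third-position cost is the smallest of the three under this stand-in scale.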

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. The difference between the values of the objective function of the SGC ($F_{SGC}$) and the best code ($F_{best}$) calculated under four models with different restrictions on the genetic code structure when the polarity costs were minimized for the three codon positions individually or as the sum of costs over all these positions; GEN, the least restrictive model; NUM, the model preserving the number of codons per amino acid; BLO, the model preserving the codon block structure; DEG, the model preserving the degeneracy.
Fig 2. The GD measure calculated under four models with different restrictions on the genetic code structure when the polarity costs were minimized for three codon positions individually or as the sum of costs over all positions; GEN, the least restrictive model; NUM, the model preserving the number of codons per amino acid; BLO, the model preserving the codon block structure; DEG, the model preserving the degeneracy.
Fig 3. Graphical representation of the genetic codes in the three-dimensional space of objective function for three codon positions and the DEG model.
The individual plots correspond to the scenarios in which polarity costs were optimized individually for one of the three codon positions (A, B, C) or for all of them (D). SGC, the standard genetic code; start, starting codes; best, the codes that minimize the objective function; worst, the codes that maximize the objective function. See S2–S5 Figs for interactive plots.
Fig 4. Graphical representation of the genetic codes in the three-dimensional space of objective function for three codon positions and the BLO model.
See S6–S9 Figs for interactive plots. Other explanations are the same as those in Fig 3.
Fig 5. Distribution of genetic codes in the three-dimensional space of objective function for three codon positions and the NUM model.
See S10–S13 Figs for interactive plots. Other explanations are the same as those in Fig 3.
Fig 6. Graphical representation of the genetic codes in the three-dimensional space of objective function for three codon positions and the GEN model.
See S14–S17 Figs for interactive plots. Other explanations are the same as those in Fig 3.
Fig 7. The difference $ED_{worst} - ED_{best}$, i.e., between the closest Euclidean distance from the SGC to the worst optimized codes ($ED_{worst}$) and the closest Euclidean distance from the SGC to the best optimized codes ($ED_{best}$), calculated under four models with different restrictions on the genetic code structure when the polarity costs were minimized for the three codon positions individually or as the sum of costs over all positions; GEN, the least restrictive model; NUM, the model preserving the number of codons per amino acid; BLO, the model preserving the codon block structure; DEG, the model preserving the degeneracy.
Fig 8. The GD measure calculated under four models of the genetic code (DEG, BLO, NUM, and GEN) when the polarity costs were minimized for three codon positions individually (A, B, and C) or as the sum of costs over all positions (D).
Fig 9. The difference $\overline{ED_{worst}} - \overline{ED_{best}}$, i.e., between the mean of the closest Euclidean distances of the randomized codes to the worst ($\overline{ED_{worst}}$) and the best optimized codes ($\overline{ED_{best}}$), calculated under four models with different restrictions on the genetic code structure when the polarity costs were minimized for the three codon positions individually or as the sum of costs over all positions; GEN, the least restrictive model; NUM, the model preserving the number of codons per amino acid; BLO, the model preserving the codon block structure; DEG, the model preserving the degeneracy.
The range of box plots corresponds to two standard deviations and the thick horizontal line marks the mean.
Fig 10. The global distance for the randomized codes, $GD_{rand}$, calculated under four models with different restrictions on the genetic code structure when the polarity costs were minimized for the three codon positions individually or as the sum of costs over all positions; GEN, the least restrictive model; NUM, the model preserving the number of codons per amino acid; BLO, the model preserving the codon block structure; DEG, the model preserving the degeneracy.
The range of box plots corresponds to two standard deviations and the thick horizontal line marks the mean.
Fig 11. The plots of correspondence analysis comparing the structures of the genetic codes for the first and second components (the left-hand panel) and the first and third components (the right-hand panel). The prefixes pos_1, pos_2, pos_3 and sum indicate that the given code was optimized to minimize (best) or maximize (worst) the objective function according to the amino acid replacement costs in the first, second, and third codon positions as well as the total costs for the three codon positions, respectively. The plots are shown separately for four models with different restrictions on the genetic code structure: GEN, the least restrictive model; NUM, the model preserving the number of codons per amino acid; BLO, the model preserving the codon block structure; DEG, the model preserving the degeneracy. See S18–S21 Figs for interactive plots.
