Sci Rep. 2018 Feb 23;8(1):3532.
doi: 10.1038/s41598-018-21973-y.

Horizontal transfer of code fragments between protocells can explain the origins of the genetic code without vertical descent

Tom Froese et al. Sci Rep. 2018.

Abstract

Theories of the origin of the genetic code typically appeal to natural selection and/or mutation of hereditable traits to explain its regularities and error robustness, yet the present translation system presupposes high-fidelity replication. Woese's solution to this bootstrapping problem was to assume that code optimization had played a key role in reducing the effect of errors caused by the early translation system. He further conjectured that initially evolution was dominated by horizontal exchange of cellular components among loosely organized protocells ("progenotes"), rather than by vertical transmission of genes. Here we simulated such communal evolution based on horizontal transfer of code fragments, possibly involving pairs of tRNAs and their cognate aminoacyl tRNA synthetases or a precursor tRNA ribozyme capable of catalysing its own aminoacylation, by using an iterated learning model. This is the first model to confirm Woese's conjecture that regularity, optimality, and (near) universality could have emerged via horizontal interactions alone.

Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1
Assignments of the 64 codons of the genetic code. The bases of the codon table are arranged according to their specific error robustness: least (top), middle (left), and most robust (right). An amino acid’s slot is coloured according to its polar requirement to illustrate chemical similarity. Its aminoacyl-tRNA synthetase class is I or II. (a) The highly ordered standard genetic code. Stop codon slots are coloured white. (b) A highly robust artificial code emerging from the iterated learning model. Stop codons were not included in the model.
Figure 2
‘Black box’ of a protocell’s primitive translation system. For simplicity, and following previous work on the iterated learning model, we used a fully interconnected feed-forward multi-layer perceptron network to model the translational mapping from a codon to its corresponding amino acid. There are three input nodes, one for each of a codon’s bases. The order of base positions is arbitrary and interchangeable (no third base ‘wobble’). There are six hidden nodes. Output is an 11-dimensional vector that specifies an amino acid in terms of properties by which it can be uniquely distinguished in chemical space.
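The network shape described in this caption (three inputs, six hidden nodes, eleven outputs) can be sketched as a plain forward pass. This is a minimal illustration, not the authors' implementation: the numeric base encoding, the sigmoid activation, the random weight range, and the names `encode_codon`, `init_network`, `forward`, and `translate` are all assumptions, since the caption specifies only the layer sizes.

```python
import math
import random

BASES = "UCAG"  # assumed ordering; the caption does not give an encoding


def encode_codon(codon):
    """Map each of the three bases to a number in [0, 1] (assumed encoding)."""
    return [BASES.index(base) / (len(BASES) - 1) for base in codon]


def init_network(n_in=3, n_hidden=6, n_out=11, rng=None):
    """Create random weights for a fully connected 3-6-11 network."""
    rng = rng or random.Random(0)

    def layer(n_from, n_to):
        # One row per target node; the last entry of each row is the bias.
        return [[rng.uniform(-1, 1) for _ in range(n_from + 1)]
                for _ in range(n_to)]

    return {"hidden": layer(n_in, n_hidden), "output": layer(n_hidden, n_out)}


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(net, inputs):
    """Propagate inputs through the hidden layer, then the output layer."""
    hidden = [sigmoid(sum(w * x for w, x in zip(row[:-1], inputs)) + row[-1])
              for row in net["hidden"]]
    return [sigmoid(sum(w * h for w, h in zip(row[:-1], hidden)) + row[-1])
            for row in net["output"]]


def translate(net, codon):
    """Return the 11-dimensional property vector the network assigns to a codon."""
    return forward(net, encode_codon(codon))


net = init_network()
vector = translate(net, "AUG")
print(len(vector))  # number of output properties
```

In a fuller model the output vector would be matched to the nearest amino acid in the 11-dimensional property space; that lookup is omitted here because the caption does not list the properties.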
Figure 3
A model of communal evolution of the genetic code. (a) A small group of protocells is initialized such that their ‘black box’ primitive translation systems encode random genetic codes consisting of few amino acids. Then the ‘iterative learning’ cycle begins. (b) Two protocells are randomly selected for horizontal transfer of a fragment of the donor’s genetic code to the recipient. (c) A small subset of codon assignments is randomly chosen and transferred; occasionally, codon assignment inaccuracies can occur in the transferred components. (d) The recipient adjusts its genetic code to be more like the donor’s code according to the received assignments. (e) The process of horizontal transfer is completed. Then the cycle starts again by going back to (b).
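Steps (a) to (e) can be sketched as a simple loop. The version below is a deliberately simplified stand-in: each protocell's code is a lookup table rather than a neural network, and the recipient copies the received assignments directly instead of being trained towards them. The parameter names and default values (`n_cells`, `n_initial`, `fragment_size`, `error_rate`, `n_transfers`) are hypothetical and are not taken from the paper.

```python
import itertools
import random

CODONS = ["".join(c) for c in itertools.product("UCAG", repeat=3)]
AMINO_ACIDS = list(range(20))  # placeholder labels for 20 amino acids


def random_code(rng, n_initial=3):
    """(a) A random code that uses only a few amino acids."""
    pool = rng.sample(AMINO_ACIDS, n_initial)
    return {codon: rng.choice(pool) for codon in CODONS}


def transfer(donor, recipient, rng, fragment_size=4, error_rate=0.05):
    """(c)-(e) Copy a random fragment of the donor's code, with occasional errors."""
    for codon in rng.sample(CODONS, fragment_size):
        amino_acid = donor[codon]
        if rng.random() < error_rate:
            # Assumed error model: a random amino acid replaces the assignment.
            amino_acid = rng.choice(AMINO_ACIDS)
        # Simplification: direct overwrite instead of a learning update.
        recipient[codon] = amino_acid


def simulate(n_cells=5, n_transfers=1000, seed=0):
    rng = random.Random(seed)
    cells = [random_code(rng) for _ in range(n_cells)]
    for _ in range(n_transfers):
        # (b) Pick two distinct protocells as donor and recipient.
        donor, recipient = rng.sample(cells, 2)
        transfer(donor, recipient, rng)
    return cells


cells = simulate()
print(len(cells), len(cells[0]))  # group size and codons per code
```

Because the recipient here overwrites rather than learns, this sketch shows only the bookkeeping of the cycle; the generalization that a trained network would contribute is not reproduced.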
Figure 4
Emergence of artificial genetic codes. Results are averaged from 50 runs and plotted in intervals of 500 transfers. Expressivity counts the receiver’s encoded amino acids after a transfer (range [1, 20]), plotted as a box plot where the dark green bar represents the overall mean, the lighter green bar represents lower and upper quartiles, dotted lines represent minimum and maximum non-outliers, and circles represent outliers. Δcode represents optimality as the code’s robustness to single nucleotide changes (red box plot). The standard genetic code (SGC) has an expressivity of 20, the number of amino acids encoded in the code. The Δcode of SGC’s codons (excluding stop codons) is 5.24 (red line). The most robust artificial code with the same expressivity as the SGC has a Δcode of 4.17 (see Fig. 1b for details). Universality is measured as the average distance between all codes in a group of protocells, where distance is calculated as the number of different codon assignments (range [0, 64]). We plot the overall mean distance and its standard deviation, with the final average of 16.86 different assignments being the smallest overall average encountered for the duration of these runs.
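The three measures in this caption can be illustrated on lookup-table codes. Expressivity and the distance behind the universality measure follow the caption's wording directly. Δcode is described only in words, so `delta_code` below assumes one common formulation (the mean squared difference of a single scalar property over all single-base substitutions); the paper's exact definition may differ. All function names are hypothetical.

```python
import itertools

BASES = "UCAG"
CODONS = ["".join(c) for c in itertools.product(BASES, repeat=3)]


def expressivity(code):
    """Number of distinct amino acids encoded by a code."""
    return len(set(code.values()))


def code_distance(code_a, code_b):
    """Number of codons assigned differently in the two codes (0 to 64)."""
    return sum(1 for codon in code_a if code_a[codon] != code_b[codon])


def mean_pairwise_distance(codes):
    """Average distance over all pairs of codes in a group (universality)."""
    pairs = list(itertools.combinations(codes, 2))
    return sum(code_distance(a, b) for a, b in pairs) / len(pairs)


def delta_code(code, prop):
    """Assumed robustness measure: mean squared property change per substitution."""
    total, count = 0.0, 0
    for codon in CODONS:
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                neighbour = codon[:pos] + base + codon[pos + 1:]
                diff = prop[code[codon]] - prop[code[neighbour]]
                total += diff * diff
                count += 1
    return total / count


# Toy example: the amino acid depends only on the first base of the codon.
toy = {codon: BASES.index(codon[0]) for codon in CODONS}
uniform = {codon: 0 for codon in CODONS}
prop = {0: 0.0, 1: 1.0, 2: 2.0, 3: 3.0}  # made-up property values
print(expressivity(toy), code_distance(toy, uniform), delta_code(toy, prop))
```

Under this formulation a lower Δcode means that single-base changes tend to swap in chemically similar amino acids, which matches the caption's reading of 4.17 as more robust than 5.24.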
Figure 5
Regularities of the artificial genetic codes. We analysed the average properties of the 50 most optimal artificial genetic codes, one from each of the 50 independent runs. (a) Like the standard genetic code, the class of simple amino acids has more assignments than the complex and sulfur classes (red). This may partly result from the fact that the simple class is more frequent among the 20 encoded amino acids, but this tendency remains even if we correct for the unequal distribution of classes (blue). (b) Like the standard genetic code, there is a positive correlation between an amino acid’s frequency in proteins, modelled in terms of probability of amino acid transfer, and its number of assignments (black). There is also a negative correlation between its molecular weight and its number of assignments (purple). Again, this may partly result from the fact that lighter amino acids are more frequent among the 20 encoded amino acids.
