Mol Biol Evol. 2021 Sep 27;38(10):4301-4309.
doi: 10.1093/molbev/msab164.

On the Origin of Frameshift-Robustness of the Standard Genetic Code

Haiqing Xu et al. Mol Biol Evol.

Abstract

The standard genetic code (SGC) has been extensively analyzed for the biological ramifications of its nonrandom structure. For instance, mismatch errors due to point mutation or mistranslation have an overall smaller effect on the amino acid polar requirement under the SGC than under random genetic codes (RGCs). A similar observation was recently made for frameshift errors, prompting the assertion that the SGC has been shaped by natural selection for frameshift-robustness, that is, conservation of certain amino acid properties upon a frameshift mutation or translational frameshift. However, frameshift-robustness confers no benefit because frameshifts usually create premature stop codons that cause nonsense-mediated mRNA decay or production of nonfunctional truncated proteins. We here propose that the frameshift-robustness of the SGC is a byproduct of its mismatch-robustness. Of 564 amino acid properties considered, the SGC exhibits mismatch-robustness in 93-133 properties and frameshift-robustness in 55 properties, and the latter are largely a subset of the former. For each of the 564 real and 564 randomly constructed fake properties of amino acids, there is a positive correlation between mismatch-robustness and frameshift-robustness across one million RGCs; this correlation arises because most amino acid changes resulting from a frameshift are also achievable by a mismatch error. Importantly, the SGC does not show significantly higher frameshift-robustness in any of the 55 properties than RGCs of comparable mismatch-robustness. These findings support the view that the frameshift-robustness of the SGC need not originate through direct selection and can instead be a side effect of its mismatch-robustness.

Keywords: amino acid property; byproduct; mismatch-robustness; mistranslation; mutation; natural selection.

Figures

Fig. 1.
Mismatch- and frameshift-robustness of the SGC in multiple amino acid properties. (A–C) The frequency distribution of MS1 (A), MS2 (B), or MS3 (C) in polar requirement among one million RGCs. In each panel, the dashed line indicates the corresponding value for the SGC and the P value is the fraction of RGCs with MS values smaller than that of the SGC. (D) Frequency distribution of the (nominal) P value that measures the significance of mismatch- or frameshift-robustness of the SGC across 564 amino acid properties. The difference between the P-value distributions for MS1 (yellow) and MS3 (red) is significant (P < 10^-11, Kolmogorov–Smirnov test), as is the difference between the P-value distributions for MS2 (blue) and MS3 (red) (P < 10^-15). (E) The SGC’s P values for MS1 and MS3 in 564 amino acid properties. (F) The SGC’s P values for MS2 and MS3 in 564 amino acid properties. In (E) and (F), each dot represents one amino acid property. Dots above and below the dashed diagonal line are colored in red and blue, respectively, with their respective numbers indicated. The P value shows the outcome of a binomial test of the equality of the numbers of red and blue dots.
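The P value in panels (A–C) is described as the fraction of RGCs whose MS value is smaller than that of the SGC. A minimal Python sketch of that empirical fraction follows. The function name `empirical_p_value` and the toy scores are hypothetical and do not come from the paper, and the strict "smaller than" comparison is taken literally from the caption.

```python
def empirical_p_value(sgc_score, rgc_scores):
    """Fraction of random-code scores strictly smaller than the SGC score."""
    if not rgc_scores:
        raise ValueError("need at least one random genetic code score")
    smaller = sum(1 for score in rgc_scores if score < sgc_score)
    return smaller / len(rgc_scores)


# Toy numbers for illustration only (not data from the paper).
toy_rgc_scores = [4.0, 5.5, 6.1, 7.3, 8.0, 9.2, 9.9, 10.4, 11.0, 12.5]
toy_p = empirical_p_value(5.0, toy_rgc_scores)
print(toy_p)
```

With one million RGCs, as in the caption, the smallest nonzero P value this estimator can report is 10^-6.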
Fig. 2.
Correlation between mismatch- and frameshift-robustness and its underlying mechanism. (A and B) Pearson’s correlation (r) between MS1 and MS3 (A) or that between MS2 and MS3 (B) in polar requirement among one million RGCs. Each blue dot represents one RGC, and only 1,000 RGCs are shown for better viewing. The red dot represents the SGC. (C) Frequency distribution of the across-RGC correlation between frameshift-robustness (MS3) and mismatch-robustness (MS1 or MS2) among 564 amino acid properties. (D) Classification of frameshift events into two types. As examples, +1 frameshifts of codon UUC and codon GAC and their corresponding amino acids are shown. A solid line indicates a frameshift error resulting in an amino acid change that is achievable by a mismatch error, whereas a dashed line indicates a frameshift error resulting in an amino acid change that is unachievable by a mismatch error. Orange and green areas in the pie chart indicate fractions of mismatch-like and mismatch-unlike frameshifts, respectively. The number of types of frameshift events belonging to each group is given in the parentheses. (E and F) Distribution of the across-RGC correlation between frameshift-robustness (MS3) and mismatch-robustness measured by MS1 (E) or MS2 (F) among the 564 amino acid properties. MS3 is calculated based on all frameshift events (blue), mismatch-like frameshift events (orange), or mismatch-unlike frameshift events (green). The correlation between mismatch-robustness (MS1 or MS2) and mismatch-like frameshift-robustness (MS3) differs significantly from that between mismatch-robustness and mismatch-unlike frameshift-robustness (P < 10^-93, Wilcoxon signed-rank test) in both (E) and (F).
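The classification in panel (D) can be illustrated with a short Python sketch under the standard genetic code. It assumes a +1 frameshift reads the last two bases of a codon plus the first base of the following codon, and it calls an amino acid change "mismatch-like" when some single-nucleotide substitution produces the same change. The helper names are hypothetical, and how the paper treats stop codons and synonymous changes is not stated in the caption, so the sketch simply treats them like any other outcome.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order (first, second, third base); "*" is stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def mismatch_pairs(code):
    """All (from, to) amino acid pairs reachable by one nucleotide substitution."""
    pairs = set()
    for codon, aa in code.items():
        for pos in range(3):
            for base in BASES:
                if base != codon[pos]:
                    mutant = codon[:pos] + base + codon[pos + 1:]
                    pairs.add((aa, code[mutant]))
    return pairs


MISMATCH = mismatch_pairs(CODE)


def plus_one_frameshift(codon, next_base):
    """Codon read after a +1 shift: bases 2 and 3 plus the next codon's first base."""
    return codon[1:] + next_base


def is_mismatch_like(codon, next_base):
    """True if the frameshift's amino acid change is also achievable by a mismatch."""
    old_aa = CODE[codon]
    new_aa = CODE[plus_one_frameshift(codon, next_base)]
    return (old_aa, new_aa) in MISMATCH


# The two codons named in the caption, each followed by an arbitrary base.
print(CODE["UUC"], "->", CODE[plus_one_frameshift("UUC", "A")], is_mismatch_like("UUC", "A"))
print(CODE["GAC"], "->", CODE[plus_one_frameshift("GAC", "U")], is_mismatch_like("GAC", "U"))
```

In this sketch the UUC frameshift (Phe to Ser) is mismatch-like because UUC to UCC is a single substitution, whereas the GAC frameshift (Asp to Thr) is not, since no Asp codon is one substitution away from a Thr codon.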
Fig. 3.
Correlation between mismatch- and frameshift-robustness in fake amino acid properties simulated using a uniform distribution. Data shown are based on 10,000 RGCs. (A) Frequency distribution of the across-RGC correlation between frameshift-robustness (MS3) and mismatch-robustness (MS1 or MS2) among 564 fake amino acid properties. (B and C) Frequency distribution of the across-RGC correlation between frameshift-robustness (MS3) and mismatch-robustness measured by MS1 (B) or MS2 (C) among the 564 fake amino acid properties. In (B) and (C), MS3 is calculated based on all frameshift events (blue), mismatch-like frameshift events (orange), or mismatch-unlike frameshift events (green). The correlation between mismatch-robustness (MS1 or MS2) and mismatch-like frameshift-robustness (MS3) differs significantly from that between mismatch-robustness and mismatch-unlike frameshift-robustness (P < 10^-93, Wilcoxon signed-rank test) in both (B) and (C).
Fig. 4.
Frameshift-robustness arises as a byproduct of mismatch-robustness. (A) The extent of frameshift-robustness of RGCs after controlling the extent of MS1-based mismatch-robustness. The top plot shows the frequency distribution of the number (N1) of amino acid properties with significant MS1-based mismatch-robustness among one million RGCs. The two black dashed lines mark 80% and 120% of the SGC's N1. The RGCs between the two dashed lines are control RGCs. The lower plot shows the frequency distribution of the number (N3) of amino acid properties with significant frameshift-robustness among the control RGCs, whereas the green dashed line indicates the N3 of the SGC. The P value indicates the proportion of control RGCs that are to the right of the dashed line. (B) Same as (A) except that MS1 is replaced with MS2. (C) Frameshift-robustness of RGCs in polar requirement after controlling MS1-based mismatch-robustness in the same property. The top plot shows the frequency distribution of MS1 among one million RGCs. The black dashed line indicates the corresponding MS1 of the SGC. The RGCs to the left of the dashed line are control RGCs. The lower plot shows the frequency distribution of MS3 among the control RGCs, whereas the green dashed line indicates the MS3 of the SGC. The P value indicates the proportion of control RGCs that are to the left of the dashed line. (D) Same as (C) except that MS1 is replaced with MS2.
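The control procedure described for panel (A) can be sketched in Python as follows. The function name `control_p_value`, the inclusive 80% to 120% band, the strict "greater than" comparison for "to the right of the dashed line", and all numbers are assumptions made for illustration; none of them are taken from the paper's code or data.

```python
def control_p_value(sgc_n1, sgc_n3, rgc_counts, lower=0.8, upper=1.2):
    """Proportion of control RGCs whose N3 exceeds the SGC's N3.

    rgc_counts is a list of (n1, n3) pairs, one per random genetic code.
    Control RGCs are those whose N1 lies within [lower, upper] times the SGC's N1.
    """
    controls = [n3 for n1, n3 in rgc_counts
                if lower * sgc_n1 <= n1 <= upper * sgc_n1]
    if not controls:
        raise ValueError("no random code falls inside the control band")
    return sum(1 for n3 in controls if n3 > sgc_n3) / len(controls)


# Toy (N1, N3) pairs for six hypothetical random codes.
toy_counts = [(50, 20), (85, 40), (95, 60), (110, 70), (118, 50), (130, 90)]
toy_control_p = control_p_value(sgc_n1=100, sgc_n3=55, rgc_counts=toy_counts)
print(toy_control_p)
```

Here four of the six toy codes fall inside the band and two of those have a larger N3 than the hypothetical SGC value, so a large P value of this kind would indicate that the SGC is no more frameshift-robust than codes of comparable mismatch-robustness.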
