Genomics Proteomics Bioinformatics. 2012 Aug;10(4):175-80.
doi: 10.1016/j.gpb.2012.08.002. Epub 2012 Aug 11.

The pendulum model for genome compositional dynamics: from the four nucleotides to the twenty amino acids

Zhang Zhang et al. Genomics Proteomics Bioinformatics. 2012 Aug.

Abstract

The genetic code is one of the natural links between life's two conceptual frameworks, the informational and operational tracks, bridging the nucleotide sequences of DNA and RNA to the amino acid sequence of a protein and thus to its structure and function. On the informational track, DNA and its four building blocks have four basic variables: order, length, GC content and purine content; the latter two exhibit unique characteristics in prokaryotic genomes, where protein-coding sequences dominate. Bridging the two tracks, tRNAs and their aminoacyl-tRNA synthetases, which interpret each codon (a nucleotide triplet), form together with ribosomes a complex machinery that translates the genetic information encoded in messenger RNAs into proteins. On the operational track, proteins are constantly selected in the context of cellular and organismal functions. The principle of such functional selection is to minimize the damage caused by seemingly random sequence alterations at the nucleotide level and their function-altering consequences at the protein level. The principle also suggests that, in addition to immediate or short-term elimination and long-term selection, there must be sophisticated mechanisms that protect the molecular interactions and cellular processes of cells and organisms from such damage. Two centuries of studying selection at the species and population levels have led the way to understanding the rules of inheritance and evolution at the molecular level along the informational track, while ribogenomics, epigenomics and other operationally defined omics (such as the metabolite-centric metabolomics) have been ushering biologists into the new millennium along the operational track.
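
Of the four basic variables named in the abstract, length, GC content and purine content can be read directly off a sequence. The following is a minimal sketch, not the authors' method: the function name `composition` and the choice to ignore ambiguous symbols such as N are assumptions made for illustration.

```python
def composition(seq: str) -> dict:
    """Return length, GC content and purine content of a DNA sequence.

    Hypothetical helper for illustration; only A, C, G and T are counted,
    so ambiguous symbols (for example N) do not contribute to the totals.
    """
    seq = seq.upper()
    counts = {base: seq.count(base) for base in "ACGT"}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return {
        "length": total,
        # GC content: fraction of G and C among the counted bases
        "gc": (counts["G"] + counts["C"]) / total,
        # purine content: fraction of A and G among the counted bases
        "purine": (counts["A"] + counts["G"]) / total,
    }


# Toy 12-nucleotide sequence, not taken from the paper
example = composition("ATGGCGAAATAG")
```

Both contents are fractions of the same total, so they can be compared directly with the 20–80% and 40–60% ranges quoted in the Figure 3 caption.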


Figures

Figure 1
The four nucleotides and their variables in DNA sequence. A. Life’s informational track has only four “cards” (the four nucleotides A, T, G, and C) to “play” but a highly variable “deck” size. For instance, the human genome is a deck of 3 billion “cards”. Although modified nucleotides such as 5-methylcytosine (5-mC) and 5-hydroxymethylcytosine (5-hmC) do exist in genomes, their functional roles are often operational. B. A “deck of cards” for all life forms has a limited number of basic variables.
Figure 2
Dinucleotide and codon contents of prokaryotic genomes. We used 300 genomes, 100 from each of the three dnaE-based groups: the dnaE1 (dnaE1–dnaE1) group (A), the dnaE2 (dnaE1–dnaE1 and dnaE2) group (B) and the dnaE3 (dnaE3–polC) group (C). Dinucleotide contents (left panels) are sorted by increasing GC content (left to right; scale bars). Codons (right panels) are also sorted by GC content (left to right; scale bars). The six-fold degenerate codon families are separated into their corresponding two-codon and four-codon sets. Note that the frequencies of dinucleotides are essentially equivalent to those of codons and that GC-rich and GC-poor codons are over-utilized in bacteria of the dnaE2 and dnaE3 groups.
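
The dinucleotide and codon contents plotted in Figure 2 are relative frequencies. How the authors computed them is not stated here, so the sketch below is only one plausible reading: dinucleotides are counted in overlapping windows over the whole sequence, and codons are counted in frame 0 of a coding sequence. The function names are hypothetical.

```python
from collections import Counter

VALID = set("ACGT")


def dinucleotide_freqs(seq: str) -> dict:
    """Relative frequencies of overlapping dinucleotides (assumed counting scheme)."""
    seq = seq.upper()
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    # Skip windows containing ambiguous symbols such as N
    counts = Counter(p for p in pairs if set(p) <= VALID)
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}


def codon_freqs(cds: str) -> dict:
    """Relative codon frequencies of a coding sequence read in frame 0."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # drop an incomplete trailing codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(c for c in codons if set(c) <= VALID)
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()}


# Toy sequence, not taken from the paper
di = dinucleotide_freqs("ATGGCGAAATAG")
co = codon_freqs("ATGGCGAAATAG")
```

Computing both tables for the same genome makes it possible to check the caption's remark that dinucleotide frequencies closely track codon frequencies.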
Figure 3
The Pendulum Model. Pendulum models for GC and purine contents are drawn in the same figure. On the horizontal scale, GC content variation is shown; the equilibrium position (dashed yellow massless rod and massive bob) points at 50%, although many genomes have GC contents that deviate significantly from this position. The amplitude of GC variation is rather broad, spanning a 60% difference (from 20% to 80%; horizontal double-arrowed blue line). The dashed blue curve indicates the bob’s trajectory. Other dashed massless rods and massive bobs (red and blue indicate GC-rich and GC-poor, respectively) and their connected arrowed dials link different GC contents to amino acids; the arrowed dials (dashed arrowed lines indicate transient positions) are aligned linearly with the bobs. On the vertical scale, variation of purine content is shown; its amplitude (40–60%; green massive bob and massless rod) is a third of that of GC content variation. The vertical double-arrowed green line indicates the amplitude and the green dashed curve shows the bob’s trajectory. The equilibrium position is indicated with a yellow dashed massless rod and massive bob pointing at 50%, which is roughly the average purine content of most genomes and genes. In this part of the model the connected arrowed dials are perpendicular to the pendulums; this connection has no particular meaning other than to demonstrate the link between nucleotide and amino acid sequences. The face of the “pendulum clock” has two components: the 64-codon genetic code and the 20-amino-acid set. The frictionless pivot (dark grey toothed button) is fixed in the genetic code to indicate that the information flow is translated into protein sequences through the code, and that the code has both evolved step-wise to fix the coding capacity in the operational track and been selected to minimize the damage in the operational track when DNA sequence variation changes the amino acid sequence. Among prokaryotic genomes, GC content variation is the major force dominating compositional dynamics, while purine content variation only becomes pronounced when GC content is relatively low. Lower GC content forces genomes to select more G for protein-coding diversity and to place more genes on the leading strand to achieve transcription efficiency and transcript stability (see the main text for details).
Figure 4
A detailed illustration of the Pendulum Model. Three pendulums are positioned such that the equilibrium position is shown in color and the other two positions are shown in grey to indicate their transient nature. The bob’s trajectory is indicated with a dashed blue line. When the pendulum moves toward either GC increase (red) or AT increase (blue), the GC or AT quarters expand (grey pendulums), as indicated by three schematic circular representations of the clock face at the three positions. We only filled in the lower half of the clock’s face here, since half of the codon table is not GC-content sensitive. The concentric circles (from the center) are the first (A and U in blue; G and C in red), second (A and U in blue; G and C in red) and third (only R or purine and Y or pyrimidine are indicated in the AU quarter; N in the GC quarter is omitted) codon positions. The outermost circle displays the corresponding amino acids in single-letter code. The model demonstrates that alterations in GC content lead to a reshuffling of the codon composition of proteins through the organization of the genetic code (the codon table). Although the GC-sensitive quarters of the genetic code are directly affected, the other codons do not all stand still, since the six-codon amino acids, Arg (R), Leu (L) and Ser (S), balance both the purine-sensitive and purine-insensitive halves and the GC-sensitive and GC-insensitive quarters.
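
One way to read the clock face in Figure 4 is to partition the standard genetic code by the first two codon positions. This is an interpretive sketch, not the authors' procedure: it assumes the "GC quarter" holds codons with G or C at both positions, the "AU quarter" holds codons with A or U at both, and the remaining half is the part described as not GC-content sensitive. The labels and the function name `quarter` are hypothetical.

```python
from itertools import product

# Standard genetic code in the conventional U, C, A, G ordering; '*' marks stop codons
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def quarter(codon: str) -> str:
    """Classify a codon by the G/C count at its first two positions (assumed reading)."""
    gc = sum(base in "GC" for base in codon[:2])
    if gc == 2:
        return "GC quarter"
    if gc == 0:
        return "AU quarter"
    return "mixed half"


# Group amino acids (single-letter code) by the class of their codons
groups = {}
for codon, aa in CODON_TABLE.items():
    groups.setdefault(quarter(codon), set()).add(aa)

for name in ("GC quarter", "AU quarter", "mixed half"):
    print(name, "".join(sorted(groups[name])))
```

Under this reading, Arg (R) has codons in both the GC quarter and the mixed half, and Leu (L) in both the AU quarter and the mixed half, which fits the caption's remark that the six-codon amino acids straddle the partitions.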

References

    1. Bao Q., Tian Y., Li W., Xu Z., Xuan Z., Hu S. A complete sequence of the T. tengcongensis genome. Genome Res. 2002;12:689–700.
    2. Zhao X., Hu J., Yu J. Comparative analysis of eubacterial DNA polymerase III alpha subunits. Genomics Proteomics Bioinformatics. 2006;4:203–211.
    3. Zhao X., Zhang Z., Yan J., Yu J. GC content variability of eubacteria is governed by the pol III alpha subunit. Biochem Biophys Res Commun. 2007;356:20–25.
    4. Hu J., Zhao X., Zhang Z., Yu J. Compositional dynamics of guanine and cytosine content in prokaryotic genomes. Res Microbiol. 2007;158:363–370.
    5. Hu J., Zhao X., Yu J. Replication-associated purine asymmetry (PAS) may contribute to strand-biased gene distribution (SGD). Genomics. 2007;90:186–194.
