Mol Biol Evol. 2014 Nov;31(11):2905-12.
doi: 10.1093/molbev/msu228. Epub 2014 Aug 1.

Amino acid metabolism conflicts with protein diversity

Teresa Krick et al. Mol Biol Evol. 2014 Nov.

Abstract

The 20 protein-coding amino acids are found in proteomes with different relative abundances. The most abundant amino acid, leucine, is nearly an order of magnitude more prevalent than the least abundant amino acid, cysteine. Amino acid metabolic costs differ similarly, constraining their incorporation into proteins. On the other hand, a diverse set of protein sequences is necessary to build functional proteomes. Here, we present a simple model for a cost-diversity trade-off postulating that natural proteomes minimize amino acid metabolic flux while maximizing sequence entropy. The model explains the relative abundances of amino acids across a diverse set of proteomes. We found that the data are remarkably well explained when the cost function accounts for amino acid chemical decay. More than 100 organisms reach comparable solutions to the trade-off by different combinations of proteome cost and sequence diversity. Quantifying the interplay between proteome size and entropy shows that proteomes can get optimally large and diverse.

Keywords: amino acid decay; amino acid metabolism; information theory; maximum entropy; proteomics.
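
The cost-diversity trade-off described in the abstract can be illustrated with a minimal sketch. It assumes, as one common way to formalize such a trade-off and not necessarily the authors' exact formulation, a target function f = h - m * c, where h is the Shannon entropy of the amino acid abundances, c is the average metabolic cost per amino acid, and m is a weight on cost. Under that assumption the optimal abundances are proportional to exp(-m * cost), so the logarithm of abundance falls linearly with cost. The cost values, the weight `m`, and all function names below are hypothetical and chosen only for illustration.

```python
import math

def boltzmann_abundances(costs, m):
    """Abundances that maximize h - m * c (assumed target function):
    p_i is proportional to exp(-m * cost_i)."""
    weights = [math.exp(-m * c) for c in costs]
    z = sum(weights)
    return [w / z for w in weights]

def entropy(p):
    """Shannon entropy in nats."""
    return -sum(x * math.log(x) for x in p if x > 0)

def mean_cost(p, costs):
    """Average metabolic cost per amino acid."""
    return sum(x * c for x, c in zip(p, costs))

def target(p, costs, m):
    """Assumed target function f = h - m * c."""
    return entropy(p) - m * mean_cost(p, costs)

# Hypothetical costs (ATP per amino acid) for a toy four-letter alphabet.
costs = [10.0, 20.0, 30.0, 75.0]
m = 0.05  # hypothetical weight on cost

p = boltzmann_abundances(costs, m)
uniform = [1.0 / len(costs)] * len(costs)

# The optimum scores at least as well as a uniform composition.
print(target(p, costs, m) >= target(uniform, costs, m))
```

In this sketch a larger `m` shifts the composition toward cheap amino acids and lowers the entropy, while `m = 0` recovers the uniform, maximally diverse composition.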

Figures

Fig. 1.
Correlation of the logarithm of amino acid relative abundances in proteomes with metabolic cost in units of ATP molecules per amino acid molecule (A and D), with metabolic cost in units of ATP molecules per amino acid molecule corrected by amino acid decay (B and E), and with the genetic code model (C and F). (A)–(C) correspond to data set DS1; (D)–(F) correspond to data set DS2. Data points for the amino acid cysteine are shown as empty symbols; the rest of the amino acids are shown as black symbols. The lines are RMA regressions to all data points.
Fig. 2.
Correlation of amino acid relative abundances in proteomes with metabolic cost in units of ATP molecules per amino acid molecule (black line: plain abundances; blue line: logarithm of the abundances), with metabolic cost in units of ATP molecules per amino acid molecule corrected by amino acid decay (red line: plain abundances; green line: logarithm of the abundances), and with the genetic code model (dashed line). (A) corresponds to data set DS1; (B) corresponds to data set DS2. The data are shown as a function of genomic GC content on the x axis.
Fig. 3.
Trade-off between amino acid metabolic cost and proteome sequence diversity. (A) Genomic GC content dependence of the average metabolic cost per amino acid. (B) Genomic GC content dependence of the proteome entropy. (C) Genomic GC content dependence of the target function f. (D) Trade-off between amino acid metabolic cost (x axis) and proteome sequence diversity measured as entropy (y axis). The contour lines indicate the value for the target function, and the triangles correspond to the trade-off model using the values of m for DS1 and DS2 from figure 1B and E. All panels display the 107 organisms in data set DS1 (white symbols), the 17 organisms in data set DS2 (black symbols), and the genetic code model (red symbols). (D) includes genomic GC contents between 0.15 (lower right corner) and 0.75 (lower left corner). The y axis legend to the right of (B) and (D) illustrates the number of probable peptide chains of length 100 given by e^(100h), where h is the entropy (Shannon 1948; Shannon and Weaver 1949).
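
The relation between entropy and the number of probable peptide chains of length 100 mentioned in the caption of Fig. 3 (the exponential of 100 times the entropy h) can be checked with a short sketch. It assumes h is the Shannon entropy per residue in nats, computed from amino acid abundances alone; the function names are hypothetical. Because the count is very large, the sketch reports its base-10 logarithm.

```python
import math

def shannon_entropy(p):
    """Shannon entropy per residue, in nats."""
    return -sum(x * math.log(x) for x in p if x > 0)

def log10_probable_chains(h, length=100):
    """Base-10 logarithm of exp(length * h), the number of probable chains."""
    return length * h / math.log(10)

# Upper bound: all 20 amino acids equally abundant, so h = ln(20).
uniform = [1.0 / 20] * 20
h_max = shannon_entropy(uniform)

print(round(log10_probable_chains(h_max), 1))  # about 130.1, i.e. 20**100 chains
```

Any uneven composition has lower entropy than this uniform case and therefore fewer probable chains of the same length.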
