PLoS One. 2013;8(1):e53785.
doi: 10.1371/journal.pone.0053785. Epub 2013 Jan 25.

Proline: the distribution, frequency, positioning, and common functional roles of proline and polyproline sequences in the human proteome

Alexander A Morgan et al. PLoS One. 2013.

Abstract

Proline is an anomalous amino acid. Its nitrogen atom is covalently locked within a ring; thus it is the only proteinogenic amino acid with a constrained phi angle. Sequences of three consecutive prolines can fold into polyproline helices, structures that join alpha helices and beta pleats as architectural motifs in protein configuration. Triproline helices are participants in protein-protein signaling interactions. Longer spans of repeat prolines also occur, containing as many as 27 consecutive proline residues. Little is known about the frequency, positioning, and functional significance of these proline sequences. Therefore we have undertaken a systematic bioinformatics study of proline residues in proteins. We analyzed the distribution and frequency of 687,434 proline residues among 18,666 human proteins, identifying single residues, dimers, trimers, and longer repeats. Proline accounts for 6.3% of the 10,882,808 protein amino acids. Of all proline residues, 4.4% are in trimers or longer spans. We detected patterns that influence function based on proline location, spacing, and concentration. We propose a classification based on proline-rich, polyproline-rich, and proline-poor status. Whereas singlet proline residues are often found in proteins that display recurring architectural patterns, trimers or longer proline sequences tend to be associated with the absence of repetitive structural motifs. Spans of 6 or more are associated with DNA/RNA processing, actin, and developmental processes. We also suggest a role for proline in Kruppel-type zinc finger protein control of DNA expression, and in the nucleation and translocation of actin by the formin complex.
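The tallying of single residues, dimers, trimers, and longer repeats described in the abstract can be illustrated with a short sketch. This is not the authors' pipeline: it assumes protein sequences are plain strings of one-letter amino acid codes, that a "span" means a maximal run of consecutive P characters, and the function names `proline_run_lengths` and `fraction_in_long_runs` are hypothetical.

```python
import re
from collections import Counter


def proline_run_lengths(sequence):
    """Map run length -> number of maximal runs of consecutive 'P' residues.

    Assumption: the sequence uses one-letter amino acid codes.
    """
    return Counter(len(m.group()) for m in re.finditer(r"P+", sequence.upper()))


def fraction_in_long_runs(sequences, min_len=3):
    """Fraction of all proline residues that sit in runs of at least min_len."""
    total_prolines = 0
    prolines_in_long_runs = 0
    for seq in sequences:
        for length, n_runs in proline_run_lengths(seq).items():
            total_prolines += length * n_runs
            if length >= min_len:
                prolines_in_long_runs += length * n_runs
    return prolines_in_long_runs / total_prolines if total_prolines else 0.0


# Toy sequences for illustration only; these are not real human proteins.
toy_sequences = ["MPAPPGPPPLK", "PPPPPPAAP"]
run_table = proline_run_lengths(toy_sequences[0])  # one singlet, one dimer, one trimer
long_fraction = fraction_in_long_runs(toy_sequences)
```

Applied to a full proteome, `fraction_in_long_runs` with `min_len=3` would give the quantity the abstract reports as 4.4%, provided the same definition of a span is used.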

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1. Structure of Proline and Related Homologues.
This figure shows the structures of proline (a), L-azetidine-2-carboxylic acid (Aze) (b), β-lactam (c), nicotianamine (d), and mugineic acid (e). Aze is the lower homologue of proline. Its ring has only four members instead of five. Plants synthesize Aze as an essential constituent of the two metal-chelating molecules, nicotianamine and mugineic acid. These compounds trap metal ions in the soil and transport them to various plant parts. Aze occurs at particularly high concentrations in the bulbous roots of many plants, making its way into the human and livestock food supply. Aze exerts its toxic effects by eluding the proof-reading function of the prolyl tRNA synthetases, allowing it to be misincorporated into nascent peptides or proteins in which it replaces proline. When Aze replaces proline, it can change protein structure, function, and antigenicity. This molecular mimicry is analogous to that of the other four-membered nitrogen-containing ring, β-lactam, which exerts its bactericidal effects by mimicking the D-Ala-D-Ala sequence of a transpeptidase substrate, irreversibly blocking the enzyme's role in bacterial cell wall synthesis. The role of Aze in human health is yet to be established.
Figure 2. Distribution of Proline Across Proteome.
(a) Counts of amino acids binned by their proportion of proline. (b) Counts of amino acids binned by their proportion of polyproline. (c) Counts of polyproline spans binned by their length.
Figure 3. Distribution of Amino Acids by Protein Length Across Human Proteome.
The counts of each amino acid as a function of relative length are shown, with each letter corresponding to the appropriate amino acid. Each protein was divided into 100 segments and the total count of each amino acid in each segment was summed across the proteome. For example, a 200-amino-acid protein with a serine in position 3 and a lysine in position 4 would add a count of one S and one K in the 2% bin along the horizontal axis. Proline, P, peaks in prevalence at the 2% length position, with 8,215 prolyl residues of a total of 108,671 amino acids (7.6% proline). The vertical axis was normalized to reflect a percent frequency.
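The binning behind Figure 3 can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: positions are taken as 1-based, and a residue at position `pos` in a protein of `length` residues is assigned to bin `ceil(100 * pos / length)`, a convention chosen because it matches the caption's worked example (positions 3 and 4 of a 200-residue protein both land in the 2% bin). The function names are hypothetical.

```python
from collections import Counter, defaultdict


def bin_by_relative_position(sequences, n_bins=100):
    """Sum residue counts per relative-length bin across all sequences.

    Returns a mapping: bin index (1..n_bins) -> Counter of residue letters.
    Assumption: bin = ceil(n_bins * pos / length) with 1-based positions.
    """
    counts = defaultdict(Counter)
    for seq in sequences:
        length = len(seq)
        for pos, residue in enumerate(seq, start=1):
            # Integer ceiling division avoids floating-point rounding issues.
            bin_index = (n_bins * pos + length - 1) // length
            counts[bin_index][residue] += 1
    return counts


def residue_percent(counts, residue, bin_index):
    """Percent of all residues in the given bin that are the given residue."""
    total = sum(counts[bin_index].values())
    return 100.0 * counts[bin_index][residue] / total if total else 0.0


# Toy 200-residue sequence: serine at position 3, lysine (K) at position 4.
toy_protein = "AA" + "SK" + "A" * 196
toy_counts = bin_by_relative_position([toy_protein])
serine_share = residue_percent(toy_counts, "S", 2)
```

Under this convention, proteins shorter than 100 residues leave some bins empty; how the original analysis handled that case is not stated in the caption.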
Figure 4. Repeated TWEAZR Motif in ZNF729.
(a) The 28-amino-acid motif (TWEAZR) repeated 32 times in ZNF729. (b) Ribbon cartoon showing the likely structure of the conserved zinc finger motif based on homology with similar zinc finger structures. The cysteine residues are marked in green, the histidine in blue, and the proline in red. We propose that cis-trans isomerization of the proline can move the downstream portion of the zinc finger domain and alter the contact of some residues with specific nucleic acids. (c) The logo for the TWEAZR motif in ZNF729.
