Protein Sci. 2002 Jan;11(1):18-26.
doi: 10.1110/ps.ps.31002.

Composites of local structure propensities: evidence for local encoding of long-range structure

David Shortle. Protein Sci. 2002 Jan.

Abstract

To estimate how extensively the ensemble of denatured-state conformations is constrained by local side-chain-backbone interactions, propensities of each of the 20 amino acids to occur in mono- and dipeptides mapped to discrete regions of the Ramachandran map are computed from proteins of known structure. In addition, propensities are computed for the trans, gauche-, and gauche+ rotamers, with or without consideration of the values of phi and psi. These propensities are used in scoring functions for fragment threading, which estimates the energetic favorability of fragments of protein sequence to adopt the native conformation as opposed to hundreds of thousands of incorrect conformations. As finer subdivisions of the Ramachandran plot, neighboring residue phi/psi angles, and rotamers are incorporated, scoring functions become better at ranking the native conformation as the most favorable. With the best composite propensity function, the native structure can be distinguished from 300,000 incorrect structures for 71% of the 2130 arbitrary protein segments of length 40, 48% of 2247 segments of length 30, and 20% of 2368 segments of length 20. A majority of fragments of length 30-40 are estimated to be folded into the native conformation a substantial fraction of the time. These data suggest that the variations observed in amino acid frequencies in different phi/psi/chi1 environments in folded proteins reflect energetically important local side-chain-backbone interactions, interactions that may severely restrict the ensemble of conformations populated in the denatured state to a relatively small subset with nativelike structure.
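The scoring idea in the abstract can be illustrated with a minimal sketch. It assumes a propensity is the ratio P(amino acid | region) / P(amino acid), estimated from counts in known structures, and that a fragment's score is the sum of log propensities over its residues, as the Fig. 2 caption indicates for single-peptide propensities. The function names, the region labels, and the pseudocount are hypothetical choices made for this illustration; the abstract does not specify the exact normalization, and the composite functions with dipeptide and rotamer terms are not reproduced here.

```python
import math
from collections import Counter


def propensities(observations, pseudocount=1.0):
    """Estimate propensity = P(aa | region) / P(aa) from (aa, region) pairs.

    `observations` is an iterable of (amino_acid, region) tuples taken from
    proteins of known structure. The pseudocount (an assumption of this
    sketch) keeps unseen pairs from producing log(0) during scoring.
    """
    obs = list(observations)
    pair_counts = Counter(obs)
    aa_counts = Counter(aa for aa, _ in obs)
    region_counts = Counter(region for _, region in obs)
    total = len(obs)
    n_aa = len(aa_counts)

    table = {}
    for aa in aa_counts:
        p_aa = aa_counts[aa] / total
        for region in region_counts:
            p_aa_given_region = (pair_counts[(aa, region)] + pseudocount) / (
                region_counts[region] + pseudocount * n_aa
            )
            table[(aa, region)] = p_aa_given_region / p_aa
    return table


def score(sequence, conformation, table):
    """Sum of log propensities for a sequence threaded onto one conformation."""
    return sum(math.log(table[(aa, region)])
               for aa, region in zip(sequence, conformation))


def native_rank(sequence, native, decoys, table):
    """Rank of the native conformation among decoys (1 = best score)."""
    native_score = score(sequence, native, table)
    better = sum(score(sequence, decoy, table) > native_score for decoy in decoys)
    return 1 + better
```

In this sketch a conformation is a list of Ramachandran region labels, one per residue, and a threading test amounts to calling `native_rank` with the native label sequence and a list of decoy label sequences of the same length. Finer subdivisions of the map correspond to a larger label alphabet.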

Figures

Fig. 1.
The major regions of the Ramachandran plot and the subdivisions used in this work. The p4 propensities used subdivisions alpha, beta, L-helix, and φ combined with other. The p6 propensities were B0, P0, alpha, L-helix, φ, and other. The p9 propensities were B1, B2, P1, P2, A1, A2, L-helix, φ, and other. The p12 propensities were β1, β2, β3, p1, p2, p3, α1, α2, α3, L-helix, φ, and other. The p15 propensities were L1, L2, L3, m1, m2, m3, r1, r2, r3, α1, α2, α3, L-helix, φ, and other.
Fig. 2.
Fraction of 2140 sequence fragments of length 40 for which the native conformation ranked (A) No. 1 (i.e., with the best score) or (B) in the top 0.1% of conformations (300 per 299,779). The score consisted of the sum of the logarithm of the single-peptide propensities, pN, where N = 4, 6, 9, 12, or 15, or the Bryant and Lawrence (1993) empirical pair potential score (energy).
Fig. 3.
Fraction of 2140 sequence fragments of length 40 for which the native conformation had the best score (i.e., ranked No. 1). (A) As a function of the number of N2 states used to represent residues i−1 and i+1 in dipeptide propensities. (B) As a function of rotamer probability used in combination with single peptide propensities. (C) As a function of rotamer probability used in combination with dipeptide propensities.
Fig. 4.
Histograms of estimated probabilities that sequence fragments occupy the native conformation, as opposed to 300,000 incorrect conformations, as a function of length and composite propensity function. Fragment length and propensity function used in threading are given in the panel. The ranges of probabilities are listed below each column of the histograms.
Fig. 5.
Composite propensity function scores (p15x6-r15) per 100 residues for 23 homologous pairs of proteins, one from a thermophilic organism and the other from a mesophilic organism. Their Protein Data Bank designations are as follows: 1:1thl/1npc, 2:1ldn/1ldm, 3:3pfk/2pfk, 4:1ril/2rn2, 5:1bmd/4mdh, 6:2prd/1ino, 7:1php/3pgk, 8:1thm/1st3, 9:1ebd/1lvl, 10:1btm/1tim, 11:2fxb/1dur, 12:1yna/1xyn, 13:1xyz/2exo, 14:1caa/6rxn, 15:1gd1/1gdp, 16:1tib/1lgy, 17:1zip/1ak2, 18:2prd/1obw, 19:1ais/1vol, 20:1ffh/1fts, 21:1pcz/1vok, 22:1obr/2ctc, and 23:1phn/1cpc. This set of proteins is from Kannan and Vishveshwara (2000).

References

    1. Bahar, I., Kaplan, M., and Jernigan, R.L. 1997. Short-range conformational energies, secondary structure propensities, and recognition of correct sequence-structure matches. Proteins 29: 292–308.
    2. Blaber, M., Zhang, X.J., and Matthews, B.W. 1993. Structural basis of amino acid alpha helix propensity. Science 260: 1637–1640.
    3. Bonneau, R. and Baker, D. 2001. Ab initio protein structure prediction: Progress and prospects. Ann. Rev. Biophys. Biomol. Struct. 30: 173–189.
    4. Bryant, S.H. and Lawrence, C.E. 1993. An empirical energy function for threading protein sequence through the folding motif. Proteins 16: 92–112.
    5. Chou, P.Y. and Fasman, G.D. 1974. Conformational parameters for amino acids in helical, β-sheet, and random coil regions calculated from proteins. Biochemistry 13: 211–222.
