Review

J R Soc Interface. 2014 Apr 16;11(95):20131147.

doi: 10.1098/rsif.2013.1147. Print 2014 Jun 6.

From local structure to a global framework: recognition of protein folds

Agnel Praveen Joseph¹, Alexandre G de Brevern

Affiliation

¹ Science and Technology Facilities Council, Rutherford Appleton Laboratory, Harwell Oxford, Didcot OX11 0QX, UK.

PMID: 24740960
PMCID: PMC4006237
DOI: 10.1098/rsif.2013.1147

Abstract

Protein folding has been a major area of research for many years. Nonetheless, the mechanisms leading to the formation of an active biological fold are still not fully understood. The huge amount of available sequence and structural information provides hints for identifying the putative fold of a given sequence. Indeed, protein structures favour a limited number of local backbone conformations, some of which are characterized by preferences for certain amino acids. These preferences largely depend on the local structural environment. The prediction of local backbone conformations has become an important factor in correctly identifying the global protein fold. Here, we review the developments in the field of local structure prediction and especially their implications for protein fold recognition.

Keywords: bioinformatics; local structure prediction; protein fold recognition; sequence annotation; structural alphabets; threading.


Figures

**Figure 1.**
‘Homology’ detection with variation in sequence identity. (a) Schematic demonstrating the use of sequence comparison for detecting structural homology. The sequence alignments are indicated with ‘X’ representing any amino acid. Identical amino acids at equivalent positions are highlighted in red, similar ones in green. (i) At sequence identity levels above 30%, simple sequence alignments are largely sufficient to detect similar folds. Below this identity threshold, the alignments are less accurate and thus less efficient in detecting genuine relationships. (ii) Between 20 and 30%, the correct fold is not often detected as the top hit. (iii) At very low sequence identities, simple sequence alignments are not very useful. (b) Variation of structural similarity (quantified in terms of GDT_TS score [31]) with change in sequence identity. Even at low sequence identities (less than 30%, highlighted with a grey background), significant structural similarity can be observed. (Online version in colour.)
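The identity regimes described in this caption can be illustrated with a minimal sketch. It assumes two sequences that have already been aligned (gaps written as `-`), computes identity over the columns where neither sequence has a gap, and maps the result to regimes (i) to (iii). The function names are hypothetical, the choice of denominator is only one of several conventions in use, and the 20% lower bound for regime (iii) is inferred from the caption rather than stated in it.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumption: gapped columns are excluded from the denominator. Other
    conventions (e.g. dividing by the shorter sequence length) also exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


def identity_regime(identity: float) -> str:
    """Map a percent identity to the regimes (i)-(iii) of the caption.

    Assumption: "very low" identity is taken to mean below 20%.
    """
    if identity > 30.0:
        return "(i) simple sequence alignment is largely sufficient"
    if identity >= 20.0:
        return "(ii) correct fold is not often the top hit"
    return "(iii) simple sequence alignment is not very useful"


if __name__ == "__main__":
    # Toy aligned sequences, made up for illustration only.
    pid = percent_identity("ACDE-F", "ACDQGF")
    print(f"{pid:.1f}% identity: {identity_regime(pid)}")
```

As panel (b) of the figure indicates, a low identity value does not rule out structural similarity, so such a classification only suggests which detection strategy is likely to be needed.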

**Figure 2.**
Different strategies for protein fold recognition. The fold space is highlighted by the blue background and the lengths of the black arrows joining the target sequence (space) and fold space give an idea of the distance of relationship. (a) Close relationships are often detected by simple sequence alignment techniques. (b) Addition of evolutionary information using sequence profiles derived from multiple sequence alignments (MSAs) helps in detecting more distantly related folds. When the sequence-based alignments are not informative, sequence–structure matching needs to be carried out. (c) The target sequence can be threaded on to the known folds to check the compatibility. The compatibility is usually quantified based on the global interaction or energy potential. Obtaining an optimal alignment between the sequence and a fold is, however, difficult and computationally quite expensive. (d) The alternative is to predict different structural features, such as local backbone conformation, solvent accessibility or contact order, and then match the predicted features with those found in the known folds. (Online version in colour.)

**Figure 3.**
Comparison of secondary structure prediction methods. For a recently solved structure of methylglyoxal synthase [PDB ID 2X8W] [127], the secondary structure assigned by (a) DSSP [128] and the predicted ones are shown. The α-helices are shown in red, β-strands in yellow and coils in green. Different secondary structure prediction methods are shown: (b) PSIPRED [115], (c) PROF [129], (d) SSPRO [130] and (e) YASPIN [131]. The predictions are also shown as a sequence alignment in (f). Helices, strands and coils are indicated by H, E and C, respectively. (Online version in colour.)
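A comparison like the one in panel (f) is commonly summarized by the fraction of residues whose predicted state matches the assigned one, often called Q3. The sketch below is an illustration under stated assumptions, not the evaluation used in the review: it assumes both strings are already reduced to the three states H, E and C and have the same length, and the function name and example strings are hypothetical (they are not taken from 2X8W).

```python
def q3_accuracy(assigned: str, predicted: str) -> float:
    """Percentage of positions where the predicted state equals the assigned one.

    Assumption: both inputs use only the three states H (helix), E (strand)
    and C (coil), and are of equal, non-zero length.
    """
    if len(assigned) != len(predicted):
        raise ValueError("assigned and predicted strings must have equal length")
    if not assigned:
        raise ValueError("empty secondary structure string")
    allowed = set("HEC")
    if set(assigned) - allowed or set(predicted) - allowed:
        raise ValueError("only the states H, E and C are supported")
    matches = sum(a == p for a, p in zip(assigned, predicted))
    return 100.0 * matches / len(assigned)


if __name__ == "__main__":
    # Toy strings, made up for illustration only.
    assigned = "HHHHCCEEEE"
    predictions = {
        "method_1": "HHHCCCEEEC",
        "method_2": "HHHHCCEEEE",
    }
    for name, predicted in predictions.items():
        print(name, f"{q3_accuracy(assigned, predicted):.1f}")
```

A per-residue score of this kind says nothing about which segments are mispredicted, so it complements rather than replaces the aligned view in panel (f).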

**Figure 4.**
Going beyond three-state secondary structure. The structures related to methylglyoxal synthase (a) [PDB ID 2X8W] are identified purely based on the secondary structure, using the SSEA server [142]. A different fold (b) (Response Regulator, PDB ID 1M5T) [143] was obtained as the top hit based on the secondary structure content. The secondary structure alignment (c) shows that the structures are close based on the secondary structure; however, the folds are different. The equivalent helices and strands are highlighted in the same colour in the two structures (a,b). A more precise description of the backbone conformation was obtained using protein blocks (PB) (d). The assignment of PB instead of the three-state secondary structures highlights many differences between the two structures, which were otherwise masked by the secondary structure definition. The segments assigned as coils (indicated by ‘C’) are highlighted in red in the PB alignment. Other differences in the regular secondary structures are in blue. (Online version in colour.)

**Figure 5.**
Prediction of local conformational rigidity and flexibility. (a) Structure of glutamate mutase [PDB ID 1CCW, chain B] [218], highlighting the predicted secondary structures [219]: helices (red) and strands (yellow), flexible regions [217] (pink) and a discordant helix with high strand propensity (brown, highlighted within the dotted circle) [215]. (b) The flexible regions predicted (probability > 0.5 and confidence > 10) by the PredyFlexy method [216] are shown in pink and the predicted disordered region [220] in blue (highlighted within the dotted circles). (Online version in colour.)


