Protein Sci. 2012 Jan;21(1):107-21.
doi: 10.1002/pro.767. Epub 2011 Dec 5.

Modeling large regions in proteins: applications to loops, termini, and folding

Aashish N Adhikari et al. Protein Sci. 2012 Jan.

Abstract

Template-based methods for predicting protein structure provide models for a significant portion of the protein, but these models often contain insertions or chain ends (InsEnds) of indeterminate conformation. The local structure prediction "problem" entails modeling the InsEnds onto the rest of the protein. A well-known limitation is that loop prediction is generally restricted to loops of ≤12 residues in crystal structures. However, InsEnds may contain up to ~50 amino acids, and the template-based model of the protein itself may be imperfect. To address these challenges, we present a free modeling method for predicting the local structure of loops and large InsEnds in both crystal structures and template-based models. The approach uses single amino acid torsional angle "pivot" moves of the protein backbone with a Cβ-level representation. Despite this reduced representation, our accuracy for loops is comparable to that of existing methods. We also apply a more stringent test, the blind structure prediction and refinement categories of the CASP9 tournament, where we improve the quality of several homology-based models by modeling InsEnds as long as 45 amino acids, sizes generally inaccessible to existing loop prediction methods. Our approach ranks as one of the best in the CASP9 refinement category, which involves improving template-based models so that they can function as molecular replacement models to solve the phase problem for crystallographic structure determination.
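The "pivot" move mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a toy chain with one point per residue (the paper uses backbone torsions with a Cβ-level representation), and the function names `rotate_about_axis` and `pivot_move` are hypothetical. The idea shown is only that changing one torsion rotates everything downstream of the pivot rigidly about a bond axis, leaving bond lengths and the upstream chain unchanged.

```python
import math

def rotate_about_axis(point, origin, axis, angle):
    """Rotate `point` by `angle` radians about the line through `origin`
    along `axis`, using Rodrigues' rotation formula."""
    norm = math.sqrt(sum(a * a for a in axis))
    k = [a / norm for a in axis]                      # unit rotation axis
    v = [p - o for p, o in zip(point, origin)]        # point relative to origin
    cos_t, sin_t = math.cos(angle), math.sin(angle)
    k_cross_v = [
        k[1] * v[2] - k[2] * v[1],
        k[2] * v[0] - k[0] * v[2],
        k[0] * v[1] - k[1] * v[0],
    ]
    k_dot_v = sum(a * b for a, b in zip(k, v))
    rotated = [
        v[i] * cos_t + k_cross_v[i] * sin_t + k[i] * k_dot_v * (1.0 - cos_t)
        for i in range(3)
    ]
    return [r + o for r, o in zip(rotated, origin)]

def pivot_move(coords, pivot, angle):
    """Toy pivot move (assumption: one point per residue).
    Points up to and including `pivot` stay fixed; every later point is
    rotated about the bond running from coords[pivot - 1] to coords[pivot]."""
    if pivot < 1 or pivot >= len(coords):
        raise ValueError("pivot must have a preceding point to define the axis")
    origin = coords[pivot]
    axis = [b - a for a, b in zip(coords[pivot - 1], coords[pivot])]
    fixed = [list(p) for p in coords[: pivot + 1]]
    moved = [rotate_about_axis(p, origin, axis, angle) for p in coords[pivot + 1:]]
    return fixed + moved
```

In a Monte Carlo search, a move like this would be proposed at a randomly chosen residue and accepted or rejected according to an energy function; that outer loop is omitted here.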

Figures

Figure 1
The InsEnds modeling problem. A multiple sequence alignment of a target sequence to template sequences can contain insertions at the same location. [Color figure can be viewed in the online issue, which is available at wileyonlinelibrary.com.]
Figure 2
Local structure prediction algorithm. [Color figure can be viewed in the online issue, which is available at wileyonlinelibrary.com.]
Figure 3
The ligation terms close the loop. Constraints are placed on the distance between the two ends of the loop and the distance between the free end of the loop and the anchor residue. [Color figure can be viewed in the online issue, which is available at wileyonlinelibrary.com.]
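The two distance constraints named in this caption can be sketched as a penalty function. This is an illustration under stated assumptions, not the paper's energy term: the harmonic form, the weights, and the names `harmonic_penalty` and `ligation_penalty` are all hypothetical, and the default `target_gap` of 3.8 Å (a typical distance between consecutive Cα atoms) is chosen only for the example.

```python
import math

def harmonic_penalty(distance, target, weight=1.0):
    """Quadratic penalty that is zero when `distance` equals `target`.
    The harmonic form is an assumption made for this sketch."""
    return weight * (distance - target) ** 2

def ligation_penalty(loop_start, loop_free_end, anchor,
                     target_span, target_gap=3.8, weight=1.0):
    """Sum of two restraints, mirroring the two constraints in the caption:
    1. distance between the two ends of the loop (`target_span`), and
    2. distance between the loop's free end and the anchor residue
       (`target_gap`, an assumed value in angstroms)."""
    span = math.dist(loop_start, loop_free_end)
    gap = math.dist(loop_free_end, anchor)
    return (harmonic_penalty(span, target_span, weight)
            + harmonic_penalty(gap, target_gap, weight))
```

Adding such a term to the sampling energy biases the free end of the growing loop toward the anchor, so that the loop closes as sampling proceeds.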
Figure 4
Selection of the top 5 loop predictions for 1xnb. After clustering, the largest 5 clusters are ranked based on Z-scores with respect to cluster tightness, size, and average DOPE-PW energy. Once ranked, a selection is made from each of the 5 clusters using DOPE-PW + SASA. [Color figure can be viewed in the online issue, which is available at wileyonlinelibrary.com.]
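The cluster ranking step described in this caption can be sketched as follows. The caption says only that clusters are ranked by Z-scores with respect to tightness, size, and average DOPE-PW energy; how the three Z-scores are combined is not stated, so the equal-weight sum below is an assumption, as are the field names (`spread`, `size`, `energy`) and the function names. The numbers in the usage example are made up for illustration and are not results from the paper.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Standardize values to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

def rank_clusters(clusters):
    """Rank clusters best-first by a combined Z-score.
    Assumed convention: larger size is better, while smaller spread
    (a tighter cluster) and lower average energy are better, so those
    two Z-scores enter with a negative sign. Equal weighting is an assumption."""
    z_spread = z_scores([c["spread"] for c in clusters])
    z_size = z_scores([c["size"] for c in clusters])
    z_energy = z_scores([c["energy"] for c in clusters])
    combined = [zs - zp - ze for zs, zp, ze in zip(z_size, z_spread, z_energy)]
    order = sorted(range(len(clusters)), key=lambda i: combined[i], reverse=True)
    return [clusters[i]["name"] for i in order]

# Illustrative, made-up cluster statistics.
example = [
    {"name": "B", "spread": 2.0, "size": 30, "energy": -80.0},
    {"name": "C", "spread": 3.0, "size": 10, "energy": -60.0},
    {"name": "A", "spread": 1.0, "size": 50, "energy": -100.0},
]
ranking = rank_clusters(example)
```

The final step in the caption, picking one representative per cluster using DOPE-PW + SASA, would then be applied within each ranked cluster and is not shown.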
Figure 5
CASP9 Ins&Ends blind predictions. Numbers indicate the improvement from the MODELLER model (red) to our model (blue), each compared with the native structure (green), after modeling the regions enclosed by the boxes. RMSD changes are for the whole structure.
Figure 6
InsEnds predictions for CASP9 refinement targets. Difference, along the sequence, between the Cα-Cα distances of the initial (starting) model to the native structure and those of the final model (refined using the InsEnds method) to the native structure, after superposition using the sequence-dependent LGA protocol. Data are from the official CASP9 website (http://predictioncenter.org/casp9/). For each target, the arrows indicate the regions where InsEnds modeling was performed. A blue-to-green color change marks regions where InsEnds modeling improves upon the given starting model, based on LGA superposition to the native structure.
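The per-residue quantity plotted in this figure can be sketched in a few lines. The sketch assumes that all three coordinate sets have already been superposed (the LGA protocol itself is not implemented here) and that each list holds one Cα position per residue in the same order. The function name and the sign convention (positive means the refined model is closer to native) are assumptions for illustration.

```python
import math

def per_residue_improvement(initial, refined, native):
    """For each residue, return d(initial, native) - d(refined, native).
    Positive values mean the refined model moved that residue closer to
    the native position. Inputs are assumed to be pre-superposed."""
    if not (len(initial) == len(refined) == len(native)):
        raise ValueError("all three models must have the same number of residues")
    return [
        math.dist(i, n) - math.dist(r, n)
        for i, r, n in zip(initial, refined, native)
    ]
```

Plotting this list against residue index gives a profile in which the modeled InsEnds regions stand out wherever refinement helped or hurt.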
Figure 7
Global versus local RMSD. The RMSD of non InsEnds region is plotted against the global InsEnds RMSD (red) and local InsEnds RMSD (blue) for six CASP8 targets. The global InsEnds RMSD is affected severely by the quality of the homology model. [Color figure can be viewed in the online issue, which is available at wileyonlinelibrary.com.]
Figure 8
InsEnds algorithm applied to protein folding pathways. A: The β5 and 3₁₀ helix in ubiquitin are the last structures to form in the pathway. Their structures are (B) depicted as disordered in the model of the folding intermediate and (C) predicted using the InsEnds algorithm. D: Barnase native structure highlighting the two hairpin loops that are part of two different cores. E and F: Predictions of the two loops, respectively, using the InsEnds algorithm. [Color figure can be viewed in the online issue, which is available at wileyonlinelibrary.com.]
