Comput Struct Biotechnol J. 2017 Feb 8;15:243-254.
doi: 10.1016/j.csbj.2017.01.011. eCollection 2017.

Protein Structure Classification and Loop Modeling Using Multiple Ramachandran Distributions

Seyed Morteza Najibi et al. Comput Struct Biotechnol J. 2017.

Abstract

Recently, the study of protein structures using angular representations has attracted much attention among structural biologists. The main challenge is how to efficiently model the continuous conformational space of protein structures based on the differences and similarities between Ramachandran plots. Although statistical methods for modeling angular data of proteins exist, there is still a substantial need for more sophisticated and faster statistical tools to model large-scale circular datasets. To address this need, we have developed a nonparametric method for collective estimation of multiple bivariate density functions for a collection of populations of protein backbone angles. The proposed method takes into account the circular nature of the angular data using trigonometric splines, which is more efficient than existing methods. This collective density estimation approach is widely applicable whenever multiple density functions must be estimated from different populations with common features. Moreover, the coefficients of the adaptive basis expansion for the fitted densities provide a low-dimensional representation that is useful for visualization, clustering, and classification of the densities. The proposed method offers a novel perspective on two important and challenging problems in protein structure research: structure-based protein classification and angular-sampling-based protein loop structure prediction.

Keywords: Bivariate splines; Log-spline density estimation; Protein classification; Protein structure; Ramachandran distribution; Roughness penalty; SCOP; Trigonometric B-spline.
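
To illustrate why the circular nature of backbone angles matters for density estimation, the following minimal sketch estimates a bivariate density on the (φ, ψ) torus with a product of von Mises kernels. This is a simpler stand-in technique chosen for illustration, not the trigonometric-spline estimator described in the abstract; the function names `bessel_i0` and `torus_kde` and the concentration value `kappa=20.0` are assumptions made here.

```python
import math


def bessel_i0(kappa, terms=50):
    """Modified Bessel function I0(kappa), summed from its power series."""
    total = 0.0
    term = 1.0  # m = 0 term of sum_m (kappa/2)^(2m) / (m!)^2
    for m in range(terms):
        if m > 0:
            term *= (kappa / 2.0) ** 2 / (m * m)
        total += term
    return total


def torus_kde(phi, psi, samples, kappa=20.0):
    """Kernel density estimate at (phi, psi), angles in radians.

    Each sample (phi_i, psi_i) contributes a product of two von Mises
    kernels, so the estimate is 2*pi-periodic in both angles.
    """
    norm = (2.0 * math.pi * bessel_i0(kappa)) ** 2
    acc = 0.0
    for p, s in samples:
        acc += math.exp(kappa * (math.cos(phi - p) + math.cos(psi - s)))
    return acc / (norm * len(samples))


# Hypothetical angle pairs that straddle the +/-pi boundary in phi.
samples = [(math.pi - 0.05, 1.0), (-math.pi + 0.05, 1.1)]
density_at_boundary = torus_kde(math.pi, 1.05, samples)
```

Because the kernel depends on the angles only through cosines of differences, two observations at φ = π − 0.05 and φ = −π + 0.05 are treated as neighbors, which a density estimate on a flat square would miss.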

Figures

Fig. 1
A classification task with 33 domains from four species of the same protein class, separated at the bottom of the SCOP hierarchy, using the PSCDE approach [36]. (A) The scree plot with numbers showing the percentage of variability explained by the leading components; (B) the AIC plot; (C) the scatter plot of coefficients 1 vs 2; and (D) the scatter plot of coefficients 3 vs 4.
Fig. 2
A classification task with 33 domains from four species of the same protein class, separated at the bottom of the SCOP hierarchy, using the PSCDE(T) approach. (A) The scree plot with numbers showing the percentage of variability explained by the leading components; (B) the trace of the penalized log-likelihood function; (C) the scatter plot of coefficients 1 vs 2; and (D) the scatter plot of coefficients 2 vs 3.
Fig. 3
Dendrograms from hierarchical clustering for the SCOP.4 task.
