Protein bioinformatics and mixtures of bivariate von Mises distributions for angular data
- PMID: 17688502
- DOI: 10.1111/j.1541-0420.2006.00682.x
Abstract
A fundamental problem in bioinformatics is to characterize the secondary structure of a protein, which has traditionally been carried out by examining a scatterplot (Ramachandran plot) of the conformational angles. We examine two natural bivariate von Mises distributions, referred to as the Sine and Cosine models, which have five parameters and, for concentrated data, tend to a bivariate normal distribution. These are analyzed and their main properties derived. Conditions on the parameters are established which result in bimodal behavior for the joint density and the marginal distributions, and we note an interesting situation in which the joint density is bimodal but the marginal distributions are unimodal. We carry out comparisons of the two models, and it is seen that the Cosine model may be preferred. Mixture distributions of the Cosine model are fitted to two representative protein datasets using the expectation maximization algorithm, which results in an objective partition of the scatterplot into a number of components. Our results are consistent with empirical observations; new insights are discussed.
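The abstract names the Sine and Cosine models without writing them out. As a rough illustration only, the sketch below evaluates the unnormalized log-densities in the parameterizations commonly given in the literature; the abstract itself does not state them, so the function names, argument names, and exact forms here are assumptions. Each model has five parameters: two mean angles (`mu`, `nu`), two concentrations (`k1`, `k2`), and one dependence parameter (`lam` for Sine, `k3` for Cosine). Normalizing constants are omitted.

```python
import math


def sine_log_kernel(phi, psi, mu, nu, k1, k2, lam):
    """Unnormalized log-density of the Sine model (assumed standard form).

    k1*cos(phi - mu) + k2*cos(psi - nu) + lam*sin(phi - mu)*sin(psi - nu)
    """
    return (
        k1 * math.cos(phi - mu)
        + k2 * math.cos(psi - nu)
        + lam * math.sin(phi - mu) * math.sin(psi - nu)
    )


def cosine_log_kernel(phi, psi, mu, nu, k1, k2, k3):
    """Unnormalized log-density of the Cosine model (assumed standard form).

    k1*cos(phi - mu) + k2*cos(psi - nu) - k3*cos((phi - mu) - (psi - nu))
    """
    return (
        k1 * math.cos(phi - mu)
        + k2 * math.cos(psi - nu)
        - k3 * math.cos((phi - mu) - (psi - nu))
    )


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only to exercise the functions.
    mu, nu = -1.1, -0.7
    print(sine_log_kernel(-1.0, -0.8, mu, nu, k1=5.0, k2=4.0, lam=1.0))
    print(cosine_log_kernel(-1.0, -0.8, mu, nu, k1=5.0, k2=4.0, k3=1.0))
```

Both kernels are periodic in each angle with period 2π. For small deviations from (`mu`, `nu`), replacing cos x by 1 − x²/2 and sin x by x turns each exponent into a quadratic form, which is the sense in which concentrated data approach a bivariate normal distribution.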