Mocapy++ - a toolkit for inference and learning in dynamic Bayesian networks

Martin Paluszewski et al. BMC Bioinformatics. 2010 Mar 12;11:126. doi: 10.1186/1471-2105-11-126.

Abstract

Background: Mocapy++ is a toolkit for parameter learning and inference in dynamic Bayesian networks (DBNs). It supports a wide range of DBN architectures and probability distributions, including distributions from directional statistics (the statistics of angles, directions and orientations).

Results: The program package is freely available under the GNU General Public Licence (GPL) from SourceForge http://sourceforge.net/projects/mocapy. The package contains the source for building the Mocapy++ library, several usage examples and the user manual.

Conclusions: Mocapy++ is especially suitable for constructing probabilistic models of biomolecular structure, due to its support for directional statistics. In particular, it supports the Kent distribution on the sphere and the bivariate von Mises distribution on the torus. These distributions have proven useful for formulating probabilistic models of protein and RNA structure in atomic detail.
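
As a quick illustration of why directional statistics matter here (a minimal numpy sketch, not Mocapy++'s own API): angles live on a circle, so near the ±180° wraparound the ordinary arithmetic mean can be badly wrong, while the circular mean that von Mises-type models are built on handles it correctly.

```python
import numpy as np

# Dihedral angles clustered around the +/-180 degree wraparound (in radians).
angles = np.deg2rad([175.0, -178.0, 179.0, -174.0])

# Naive arithmetic mean lands near 0 degrees -- the wrong side of the circle.
naive_mean = np.rad2deg(np.mean(angles))

# Circular mean: average the unit vectors exp(i*theta), then take the angle.
# This is the mean direction that von Mises-style models estimate.
circular_mean = np.rad2deg(np.angle(np.mean(np.exp(1j * angles))))

print(f"arithmetic mean: {naive_mean:7.1f} deg")    # ~   0.5 deg (wrong)
print(f"circular mean:   {circular_mean:7.1f} deg") # ~ -179.5 deg == 180.5 deg (correct)
```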


Figures

Figure 1
BARNACLE: a probabilistic model of RNA structure. A DBN with nine slices is shown, of which one slice is boxed. Nodes D and H are discrete nodes, while node A is a univariate von Mises node. The dihedral angles within one nucleotide i are labelled αi to ζi. BARNACLE is a probabilistic model of the dihedral angles in a stretch of RNA [12].
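
As a toy analogue of the H → A edge in this DBN (all transition and emission parameters below are invented for illustration; this is not BARNACLE's parameterization), a discrete hidden chain can select the mean and concentration of a univariate von Mises angular emission:

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented 2-state hidden chain standing in for Figure 1's H node; the state
# picks the mean/concentration of the univariate von Mises emission (node A).
trans = np.array([[0.9, 0.1],
                  [0.2, 0.8]])       # P(h_t | h_{t-1})
mu    = np.array([-2.0, 1.0])        # state-dependent mean angle (radians)
kappa = np.array([8.0, 20.0])        # state-dependent concentration

def sample_angles(n_slices, h=0):
    angles = np.empty(n_slices)
    for t in range(n_slices):
        h = rng.choice(2, p=trans[h])              # hidden transition
        angles[t] = rng.vonmises(mu[h], kappa[h])  # angular emission in (-pi, pi]
    return angles

print(sample_angles(9))  # one 9-slice sequence, matching the figure's length
```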
Figure 2
Samples from three Kent distributions on the sphere. The red points were sampled from a distribution with high concentration and high correlation (κ = 1000, β = 499), the green points were sampled from a distribution with low concentration and no correlation (κ = 10, β = 0), and the blue points were sampled from a distribution with medium concentration and medium correlation (κ = 200, β = 50). The distributions underlying the red and green points have the same mean direction and axes and illustrate the effect of κ and β. For each distribution, 5000 points were sampled.
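
The Kent (FB5) density behind these samples is compact enough to state in code. The sketch below evaluates the unnormalized log-density given an orthonormal frame (γ1 = mean direction, γ2/γ3 = major/minor axes); the frame and test points are arbitrary choices, and the normalizing constant is omitted:

```python
import numpy as np

def kent_logpdf_unnorm(x, gamma1, gamma2, gamma3, kappa, beta):
    """Unnormalized log-density of the Kent (FB5) distribution on the sphere.

    x is a unit vector; kappa is the concentration and beta the ovalness,
    with 0 <= 2*beta < kappa for a unimodal density.
    """
    x = np.asarray(x)
    return (kappa * (x @ gamma1)
            + beta * ((x @ gamma2) ** 2 - (x @ gamma3) ** 2))

# Parameters of the red cloud (kappa=1000, beta=499) in an arbitrary frame.
g1, g2, g3 = np.eye(3)                         # rows of the identity matrix
at_mean  = np.array([1.0, 0.0, 0.0])           # the mean direction itself
off_mean = np.array([0.9, 0.3, np.sqrt(0.1)])  # a unit vector nearby
print(kent_logpdf_unnorm(at_mean,  g1, g2, g3, kappa=1000.0, beta=499.0))  # 1000.0
print(kent_logpdf_unnorm(off_mean, g1, g2, g3, kappa=1000.0, beta=499.0))  # ~895.0
```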
Figure 3
Samples from three bivariate von Mises distributions on the torus. The green points were sampled from a distribution with high concentration and no correlation (κ1 = κ2 = 100, κ3 = 0), the blue points were sampled from a distribution with high concentration and negative correlation (κ1 = κ2 = 100, κ3 = 49), and the red points were sampled from a distribution with low concentration and no correlation (κ1 = κ2 = 10, κ3 = 0). For each distribution, 10000 points were sampled.
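
Here is a sketch of the density behind these clouds, assuming the cosine variant of the bivariate von Mises distribution with parameters (μ, ν, κ1, κ2, κ3). The crude grid-based sampler is only meant to reproduce scatter plots like this figure's; a library would use a proper sampler (e.g. a Gibbs or rejection scheme):

```python
import numpy as np

rng = np.random.default_rng(1)

def bvm_cosine_logpdf_unnorm(phi, psi, mu, nu, k1, k2, k3):
    """Unnormalized log-density of the bivariate von Mises (cosine variant)."""
    return (k1 * np.cos(phi - mu) + k2 * np.cos(psi - nu)
            - k3 * np.cos((phi - mu) - (psi - nu)))

def sample_bvm_grid(n, mu=0.0, nu=0.0, k1=100.0, k2=100.0, k3=49.0, res=400):
    """Draw (phi, psi) pairs by discretizing the torus into res x res cells."""
    grid = np.linspace(-np.pi, np.pi, res, endpoint=False)
    phi, psi = np.meshgrid(grid, grid, indexing="ij")
    logp = bvm_cosine_logpdf_unnorm(phi, psi, mu, nu, k1, k2, k3)
    p = np.exp(logp - logp.max()).ravel()
    idx = rng.choice(res * res, size=n, p=p / p.sum())
    return grid[idx // res], grid[idx % res]

# Default parameters match the blue cloud (k1 = k2 = 100, k3 = 49).
phi, psi = sample_bvm_grid(10000)
```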
Figure 4
The model used in the third benchmark. Each slice contains two hidden nodes (H and I). They are parents to a multivariate four-dimensional Gaussian node (G) and a bivariate von Mises node (V), respectively. The sizes (numbers of hidden states) of H and I are five and three, respectively. The length of the DBN is 50 slices.
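
This factorial structure can be mimicked by ancestral sampling. In the sketch below every parameter is a random placeholder, and (as an explicit simplification) the bivariate von Mises emission is replaced by two independent univariate von Mises draws:

```python
import numpy as np

rng = np.random.default_rng(2)

def random_trans(k):
    """A random k-state transition matrix (rows sum to one)."""
    t = rng.random((k, k))
    return t / t.sum(axis=1, keepdims=True)

TH, TI = random_trans(5), random_trans(3)   # hidden chains H (size 5), I (size 3)
means  = rng.normal(size=(5, 4))            # Gaussian mean of G per H state
mus    = rng.uniform(-np.pi, np.pi, (3, 2)) # angle-pair means of V per I state

h = i = 0
slices = []
for t in range(50):                         # 50 slices, as in the benchmark
    h, i = rng.choice(5, p=TH[h]), rng.choice(3, p=TI[i])
    g = rng.multivariate_normal(means[h], np.eye(4))   # 4-D Gaussian node G
    v = rng.vonmises(mus[i], 20.0)                     # two angles for node V
    slices.append((h, i, g, v))
```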
Figure 5
Log-likelihood evolution during S-EM training. Each column shows the evolution of the log-likelihood for one of the three benchmarks described in the results section. The training procedure was started from two different random seeds (indicated by a solid and a dashed line). The log-likelihood values, log P(D | Hn, θn), used in the upper figures are conditional on the states of the sampled hidden nodes (θn are the parameter values at iteration n, Hn are the hidden node values at iteration n, and D is the observed data). The log-likelihood values in the lower figures, log P(D | θn), are computed by summing over all hidden node sequences using the forward algorithm [5]. Note that the forward algorithm can only be used on HMMs and is therefore not applied to the complex benchmark.
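
The lower-panel quantity, log P(D | θn), comes from the standard HMM forward recursion. A minimal log-space sketch (per-slice emission log-probabilities are assumed precomputed; the usage lines use random stand-in parameters):

```python
import numpy as np
from scipy.special import logsumexp

def forward_loglik(log_init, log_trans, log_emit):
    """log P(D | theta) for an HMM via the forward algorithm in log space.

    log_init: (K,) log initial state distribution; log_trans: (K, K) log
    transition matrix (row = from-state); log_emit: (T, K) log-probability
    of slice t's observation under each hidden state.
    """
    alpha = log_init + log_emit[0]
    for t in range(1, len(log_emit)):
        # alpha_t[j] = logsum_i(alpha_{t-1}[i] + log_trans[i, j]) + log_emit[t, j]
        alpha = logsumexp(alpha[:, None] + log_trans, axis=0) + log_emit[t]
    return logsumexp(alpha)   # marginalize the final hidden state

# Stand-in parameters: 3 hidden states, 50 slices.
rng = np.random.default_rng(3)
K, T = 3, 50
lt = np.log(rng.dirichlet(np.ones(K), size=K))  # random transition rows
le = np.log(rng.dirichlet(np.ones(K), size=T))  # stand-in emission terms
print(forward_loglik(np.full(K, -np.log(K)), lt, le))
```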


References

    1. Bishop CM. Pattern recognition and machine learning. Springer; 2006.
    2. Pearl J. Probabilistic reasoning in intelligent systems: networks of plausible inference. Morgan Kaufmann; 1997.
    3. Ghahramani Z. Learning dynamic Bayesian networks. Lect Notes Comp Sci. 1998;1387:168–197.
    4. Rabiner LR. A tutorial on hidden Markov models and selected applications in speech recognition. Proc IEEE. 1989;77(2):257–286. doi: 10.1109/5.18626.
    5. Durbin R, Eddy SR, Krogh A, Mitchison G. Biological sequence analysis. Cambridge University Press; 1999.
