BMC Bioinformatics. 2010 Oct 26;11 Suppl 8(Suppl 8):S2.
doi: 10.1186/1471-2105-11-S8-S2.

Graphical models for inferring single molecule dynamics

Jonathan E Bronson et al. BMC Bioinformatics. 2010.

Abstract

Background: The recent explosion of experimental techniques in single molecule biophysics has generated a variety of novel time series data requiring equally novel computational tools for analysis and inference. This article describes in general terms how graphical modeling may be used to learn from biophysical time series data using the variational Bayesian expectation maximization algorithm (VBEM). The discussion is illustrated by the example of single-molecule fluorescence resonance energy transfer (smFRET) versus time data, where the smFRET time series is modeled as a hidden Markov model (HMM) with Gaussian observables. A detailed description of smFRET is provided as well.

Results: The VBEM algorithm returns the model's evidence and an approximating posterior parameter distribution given the data. The former provides a metric for model selection via maximum evidence (ME), and the latter a description of the model's parameters learned from the data. ME/VBEM provide several advantages over the more commonly used approach of maximum likelihood (ML) optimized by the expectation maximization (EM) algorithm, the most important being a natural form of model selection and a well-posed (non-divergent) optimization problem.
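The non-divergence point can be illustrated with a small sketch. It assumes a two-component 1D Gaussian mixture rather than the article's HMM, and the data values and the function name `mixture_log_likelihood` are hypothetical. When one component is centered on a single data point and its standard deviation shrinks, the log-likelihood keeps growing, so unregularized ML has no finite optimum for this kind of model.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a 1D Gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def mixture_log_likelihood(data, means, stds, weights):
    """Log-likelihood of data under a 1D Gaussian mixture."""
    total = 0.0
    for x in data:
        p = sum(w * gaussian_pdf(x, m, s)
                for m, s, w in zip(means, stds, weights))
        total += math.log(p)
    return total

# Hypothetical observations, for illustration only.
data = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]

# Component 0 sits exactly on data[0]; component 1 is broad and fixed.
lls = []
for std in (1e-2, 1e-3, 1e-4, 1e-5):
    ll = mixture_log_likelihood(
        data, means=[data[0], 0.5], stds=[std, 0.3], weights=[0.5, 0.5]
    )
    lls.append(ll)
    print(f"std={std:g}  log-likelihood={ll:.3f}")
```

Each tenfold reduction in the standard deviation raises the log-likelihood, which is the divergence that a prior over the parameters removes.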

Conclusions: The results demonstrate the utility of graphical modeling for inference of dynamic processes in single molecule biophysics.

Figures

Figure 1
Examples of types of commonly encountered biophysical time series data. (A) A time series for a molecule transitioning between a series of locally stable conformations. Such data often arise, for example, when studying protein domain movements or the dynamics of polymers tethered to a surface. (B) A time series for a molecule undergoing a stepping process. Such data often arise, for example, when studying proteins with unidirectional movements, e.g., helicases and motor proteins.
Figure 2
A GM for the problem of learning genders of boys and girls from a table of their heights and weights. The gender of the nth child is denoted zn. The 2-dimensional vector of the child’s height and weight is denoted dn. The mean height and weight for each gender, variances of height and weight for each gender, and probability of belonging to each gender are denoted by formula image, formula image, and formula image, respectively. Hidden variables are represented by open circles, observed variables are represented by filled circles, and fixed parameters are represented by dots. To avoid drawing nodes for all N hidden and observed variables, the variables are shown once and placed inside a plate which denotes the number of repetitions in the lower right corner. This GM specifies the conditional factorization of formula image shown in Eq. 1.
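As a hedged illustration of what such a conditional factorization looks like, the sketch below writes it out for a generic mixture model of this shape. Eq. 1 is not reproduced in this section, and the symbols \mu, \sigma, and \pi for the means, variances, and gender probabilities are assumptions chosen to match the description in the caption.

```latex
% Assumed notation: d_n = observed (height, weight), z_n = hidden gender,
% \mu, \sigma = per-gender means and variances, \pi = gender probabilities.
p(d_1,\dots,d_N,\, z_1,\dots,z_N \mid \mu, \sigma, \pi)
  = \prod_{n=1}^{N} p(z_n \mid \pi)\; p(d_n \mid z_n, \mu, \sigma)
```

Each child contributes one factor for the hidden label and one for the observation given that label, which is what the plate over N repetitions expresses.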
Figure 3
(A) Cartoon of a smFRET experiment studying the zipping/unzipping of a DNA hairpin. A FRET donor chromophore (green balloon) and acceptor chromophore (red balloon) are attached to the DNA. When the DNA is zipped (left), exciting the donor with green light causes the majority of energy to be transferred to the acceptor. The donor will fluoresce dimly and the acceptor will fluoresce brightly. When the DNA is unzipped, the probes are too far apart for efficient FRET. Exciting the donor in this conformation causes it to fluoresce brightly and the acceptor to fluoresce dimly. (B) The two-channel (donor, acceptor) time series generated by the DNA as it transitions between zipped (bright red, dim green) and unzipped (dim red, bright green). (C) The 1D FRET transformation of the time series from B, calculated with Eq. 4. The closer the probes, the more intense the signal. Time series of this summary statistic are commonly used for analysis.
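The two-channel to 1D reduction described in panel C can be sketched as follows. Eq. 4 is not reproduced in this section, so the sketch assumes the commonly used intensity ratio I_A / (I_A + I_D); the function name `fret_ratio` and the intensity values are hypothetical.

```python
def fret_ratio(donor, acceptor):
    """Return I_A / (I_A + I_D) for each time point (assumed form of Eq. 4)."""
    if len(donor) != len(acceptor):
        raise ValueError("donor and acceptor traces must have equal length")
    ratios = []
    for i_d, i_a in zip(donor, acceptor):
        total = i_d + i_a
        # Guard against frames with no signal in either channel.
        ratios.append(i_a / total if total != 0 else float("nan"))
    return ratios

# Hypothetical intensities: two zipped frames, then two unzipped frames.
donor = [100.0, 120.0, 800.0, 780.0]
acceptor = [900.0, 880.0, 200.0, 220.0]
fret = fret_ratio(donor, acceptor)
print(fret)
```

Under this assumed form, zipped frames (dim donor, bright acceptor) map to values near 1 and unzipped frames to values near 0, consistent with the caption's remark that closer probes give a more intense signal.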
Figure 4
(A) The HMM as a GM. At each time step, t, the system occupies a hidden state, zt, and produces an observable emission, dt, drawn from p(dt|zt). In turn, zt is drawn from p(zt|zt−1). (B) Complete GM for the HMM used to describe smFRET data in this work. Following the Bayesian treatment of probability, all unknown parameters are treated as hidden variables, and represented as open circles. Emissions are assumed to be Gaussian, with mean formula image and precision formula image. Transition rates are multinomial, with probabilities given by A. The probability of initially occupying each hidden state is multinomial as well, with probabilities given by formula image. Equations for these distributions are described in the text below Eq. 5. This GM specifies the conditional factorization of formula image shown in Eq. 6.
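The generative process in panel A can be sketched as a short sampler. This is a minimal illustration under stated assumptions, not the article's implementation: the parameters are held fixed instead of being given priors as in panel B, and the names `sample_hmm`, `pi`, `means`, and `precisions`, as well as all numeric values, are hypothetical.

```python
import random

def sample_hmm(pi, A, means, precisions, T, seed=0):
    """Draw hidden states z and Gaussian emissions d from a K-state HMM.

    pi[k]         probability of starting in state k
    A[j][k]       probability of moving from state j to state k
    means[k]      emission mean of state k
    precisions[k] emission precision (1 / variance) of state k
    """
    rng = random.Random(seed)
    states = list(range(len(pi)))
    z, d = [], []
    for t in range(T):
        # z_1 is drawn from pi; later z_t are drawn from p(z_t | z_{t-1}).
        weights = pi if t == 0 else A[z[-1]]
        z_t = rng.choices(states, weights=weights)[0]
        # d_t is drawn from p(d_t | z_t), a Gaussian with std = precision^(-1/2).
        std = precisions[z_t] ** -0.5
        d.append(rng.gauss(means[z_t], std))
        z.append(z_t)
    return z, d

# Hypothetical two-state example with low and high FRET levels.
z, d = sample_hmm(
    pi=[0.5, 0.5],
    A=[[0.95, 0.05], [0.10, 0.90]],
    means=[0.2, 0.8],
    precisions=[400.0, 400.0],
    T=200,
)
print(z[:10], [round(x, 2) for x in d[:10]])
```

Inference runs this process in reverse: given only d, it recovers a distribution over z and over the parameters.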
Figure 5
(A) Model selection using ME. Inference using 1 ≤ K ≤ 7 hidden states was performed for each trace. The results with the highest L(q) are shown in red. (B) The posterior parameter distribution for the lowest-valued smFRET state inferred in each time series. The width of the posterior increases with the noise of the smFRET states, indicating lower confidence in the parameters learned from inference on noisier time series. (C) The idealized trajectories (red) inferred for each time series (blue) using the most probable parameters of the inference with the highest L(q).
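The selection step in panel A reduces to picking the number of states with the largest L(q). A minimal sketch, assuming the lower bounds have already been computed for each K; the function name `select_by_max_evidence` and the numeric values are hypothetical placeholders, not results from the article.

```python
def select_by_max_evidence(lower_bounds):
    """Return the K whose lower bound L(q) is largest."""
    if not lower_bounds:
        raise ValueError("no candidate models supplied")
    return max(lower_bounds, key=lower_bounds.get)

# Placeholder L(q) values for K = 1..7 on one trace (illustration only).
example_bounds = {
    1: -310.2, 2: -255.7, 3: -241.9, 4: -246.3,
    5: -251.0, 6: -257.8, 7: -263.4,
}
best_k = select_by_max_evidence(example_bounds)
print(best_k)
```

Because L(q) approximates the log evidence, extra states that do not improve the fit lower the score, so the maximum need not sit at the largest K.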
