Bioinformatics. 2013 Jun 15;29(12):1511-8.
doi: 10.1093/bioinformatics/btt180. Epub 2013 Apr 18.

An HMM-based algorithm for evaluating rates of receptor-ligand binding kinetics from thermal fluctuation data

Lining Ju et al. Bioinformatics.

Abstract

Motivation: Abrupt reduction/resumption of thermal fluctuations of a force probe has been used to identify association/dissociation events of protein-ligand bonds. We show that the off-rate of molecular dissociation can be estimated by analyzing bond lifetimes, while the on-rate of molecular association can be estimated by analyzing the waiting times between two neighboring bond events. However, this analysis relies heavily on subjective judgments and is time-consuming. To automate the process of mapping out bond events from thermal fluctuation data, we develop a hidden Markov model (HMM)-based method.

Results: The HMM method represents the bond state by a hidden variable with two values: bound and unbound. Bond association and dissociation events are visualized and pinpointed. We apply the method to analyze a key receptor-ligand interaction in the early stage of hemostasis and thrombosis: the binding of von Willebrand factor (VWF) to platelet glycoprotein Ibα (GPIbα). The numbers of bond lifetime and waiting time events identified by the HMM are much larger than those identified by a descriptive statistical method from the same set of raw data. The kinetic parameters estimated by the HMM are in excellent agreement with those from the descriptive statistical analysis, but have much smaller errors for both the wild-type and two mutant VWF-A1 domains. Thus, the computerized analysis speeds up the analysis and improves the quality of estimates of receptor-ligand binding kinetics.
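
Once each time point has been labelled bound or unbound, bond lifetimes and waiting times are simply the lengths of the runs in that label sequence. The following is a minimal sketch of that bookkeeping, not the authors' code. The function name `dwell_times`, the coding 1 = bound and 0 = unbound, and the choice to discard the first and last runs (which are cut off by the recording window) are all assumptions made for illustration.

```python
import numpy as np

def dwell_times(states, dt):
    """Return (bond_lifetimes, waiting_times) in seconds from a 0/1 state sequence.

    states: per-sample labels, 1 = bound, 0 = unbound (assumed coding).
    dt: sampling interval in seconds.
    Only runs with a transition on both sides are kept; the first and last
    runs are truncated by the recording window and are discarded.
    """
    states = np.asarray(states, dtype=int)
    # Positions where the label changes; run k spans [edges[k], edges[k + 1])
    change = np.flatnonzero(np.diff(states)) + 1
    edges = np.concatenate(([0], change, [states.size]))
    run_states = states[edges[:-1]]
    run_lengths = np.diff(edges) * dt
    # Drop the truncated first and last runs
    inner_states = run_states[1:-1]
    inner_lengths = run_lengths[1:-1]
    lifetimes = inner_lengths[inner_states == 1]  # association -> dissociation
    waits = inner_lengths[inner_states == 0]      # dissociation -> next association
    return lifetimes, waits
```

For example, with `states = [0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 1, 0]` and `dt = 0.5`, the two interior bound runs and the one interior unbound run are returned, and the leading and trailing unbound runs are ignored.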

Figures

Fig. 1.
Thermal fluctuation assay. (A) BFP photomicrograph. A micropipette-aspirated RBC with a bead (left, termed ‘probe’) attached to the apex was aligned with a bead (right, termed ‘target’) aspirated by another micropipette. (B) BFP functionalization. VWF-A1 and streptavidin were covalently coupled to the probe bead. GC was covalently coupled to the target bead. The schematic is not to scale, as the sizes of the molecules have been enlarged relative to the sizes of the beads. (C) Thermal fluctuation data. Plot of the instantaneous horizontal position x of the probe versus time t collected from one test cycle of the thermal fluctuation assay. During the experiment, the target bead was driven to approach the probe bead (black), contact it for 0.1 s (green), retract (purple) and be held stationary at a preset position (blue and red). Blue and red traces annotate, respectively, bound and unbound states detected by the descriptive statistical method. Manual annotation of one trace took five minutes on average. (D) Plot of σ_90 (the sliding standard deviation of 90 consecutive x positions from the data in C around t) versus t. The same color coding is used as in C
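
The sliding standard deviation in panel D can be sketched as follows. This is an illustrative implementation only: the function name `sliding_std`, the use of the sample (ddof = 1) estimator, and the alignment of each window with its time point are assumptions, since the caption does not specify them.

```python
import numpy as np

def sliding_std(x, window=90):
    """Standard deviation of every run of `window` consecutive samples of x.

    Returns an array of length len(x) - window + 1. Entry i covers
    x[i : i + window]; how each entry is aligned with time t (for example
    at the window centre) is left to the caller.
    """
    x = np.asarray(x, dtype=float)
    # View of shape (len(x) - window + 1, window), created without copying the data
    windows = np.lib.stride_tricks.sliding_window_view(x, window)
    return windows.std(axis=1, ddof=1)  # ddof=1 is an assumed choice
```

A drop in this quantity marks a candidate bound interval, because bond formation reduces the thermal fluctuations of the probe.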
Fig. 2.
Data preparation flowchart. Step 1, prescreening; Step 2, drift removal; the first two steps were applied to both descriptive and HMM methods; Step 3, HMM parameter estimation; Step 4, identification of states by HMM; Step 5, evaluation of on- and off-rates by analysis of waiting time and bond lifetime distributions, respectively
Fig. 3.
Developing an HMM method for thermal fluctuation data. (A) Bound and unbound status annotation by the HMM analysis of the same data as in Figure 1C. The algorithm takes 30 s on average to annotate one trace. (B) Illustration of the HMM. At time t, let x_t be the observed horizontal position of the probe and z_t be the unobserved binding state. Each observation x_t is assigned to one of two states: z_t = 0 (blue) or z_t = 1 (red). The z_t follow a Markov chain, and the x_t are independent and normally distributed given z_t. (C) Plot of σ_HMM (the standard deviation predicted by the HMM analysis in A) versus t. Each segment in C corresponds to the standard deviation estimated by the HMM analysis for a bound or unbound period of A, in red or blue
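
The model in panel B, a two-state Markov chain with Gaussian observations, can be decoded with the standard Viterbi algorithm once its parameters are known. The sketch below assumes the parameters are supplied by the caller; it does not reproduce the parameter estimation step of the paper, and the function name `viterbi_two_state` and its argument layout are hypothetical.

```python
import numpy as np

def viterbi_two_state(x, means, stds, trans, init):
    """Most likely state path for a two-state HMM with Gaussian emissions.

    x: observed positions, shape (T,).
    means, stds: per-state emission mean and standard deviation, shape (2,).
    trans: 2x2 transition matrix, trans[i, j] = P(z_t = j | z_{t-1} = i).
    init: initial state probabilities, shape (2,).
    All parameters are assumed given; which state means "bound" is up to the caller.
    """
    x = np.asarray(x, dtype=float)
    means = np.asarray(means, dtype=float)
    stds = np.asarray(stds, dtype=float)
    log_trans = np.log(np.asarray(trans, dtype=float))
    # Gaussian log-density of every observation under each state, shape (T, 2)
    log_emit = (-0.5 * np.log(2 * np.pi * stds**2)
                - 0.5 * ((x[:, None] - means) / stds) ** 2)
    T = x.size
    score = np.empty((T, 2))
    back = np.zeros((T, 2), dtype=int)
    score[0] = np.log(np.asarray(init, dtype=float)) + log_emit[0]
    for t in range(1, T):
        # cand[i, j]: best log-probability of being in state j at t via state i at t-1
        cand = score[t - 1][:, None] + log_trans
        back[t] = cand.argmax(axis=0)
        score[t] = cand.max(axis=0) + log_emit[t]
    # Trace the best path backwards from the final time point
    z = np.empty(T, dtype=int)
    z[-1] = score[-1].argmax()
    for t in range(T - 1, 0, -1):
        z[t - 1] = back[t, z[t]]
    return z
```

Working in log-probabilities avoids numerical underflow on long traces. The two states may share the same mean and differ only in standard deviation, which matches the idea that binding changes the size of the fluctuations rather than the average position.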
Fig. 4.
Comparison of effective on-rates derived from analysis of waiting times collected by the descriptive statistical and HMM methods. (A) Exponential waiting time distributions for the interaction of WT A1 and GC. An ensemble of ∼40 waiting times, defined as the intervals from the moment of a bond dissociation to the moment of the next bond association, was measured by the descriptive statistical method and pooled (blue squares). Another ensemble of ∼200 waiting times was measured by the HMM method from the same raw data and pooled (red squares). For each method, the natural log of the survival frequency with waiting times > t_w was plotted against t_w and fitted by a straight line (solid line). The negative slopes of the best fits represent the cellular on-rate, m_r m_l A_c k_on, estimated by the two methods. The variations in these values are shown by the 95% confidence interval of the best fit (dotted lines). The red dotted lines are obscured because they overlap with the red solid line. (B) Comparison of the effective on-rate A_c k_on estimated by the descriptive statistical and HMM methods for WT, G1324S (Type 2M) and R1450E (Type 2B) A1s versus GC. A_c k_on was calculated by dividing the cellular on-rate by the product of the protein densities on the probe (m_l for A1) and target (m_r for GC) beads, i.e. m_r m_l = 1.96, 2.8 and 0.19 × 10⁵ µm⁻⁴ determined by flow cytometry for the respective conditions. The error bars indicate the 95% confidence interval for each method
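
The straight-line fit described in panel A can be sketched as below. It is a minimal illustration under stated assumptions: the function name `rate_from_durations` is hypothetical, the survival frequency is taken as the fraction of events longer than each sorted duration, durations are assumed distinct, and an unweighted least-squares line is used. The confidence intervals shown in the figure are not computed here.

```python
import numpy as np

def rate_from_durations(durations):
    """Estimate an exponential rate from pooled durations (waiting times or lifetimes).

    Fits ln(survival frequency) against duration with a straight line and
    returns the negative slope.
    """
    t = np.sort(np.asarray(durations, dtype=float))
    n = t.size
    # Fraction of events with duration greater than each sorted value
    survival = 1.0 - np.arange(1, n + 1) / n
    keep = survival > 0  # the longest event has survival 0, whose log is undefined
    slope, _intercept = np.polyfit(t[keep], np.log(survival[keep]), 1)
    return -slope
```

Applied to waiting times, the returned value corresponds to the cellular on-rate; dividing it by the density product m_r m_l gives the effective on-rate A_c k_on. Applied to bond lifetimes, the same fit yields an off-rate estimate.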
Fig. 5.
Comparison of off-rates derived from analysis of bond lifetimes collected by the descriptive statistical and HMM methods. (A) Exponential bond lifetime distributions for the interaction of WT A1 and GC. An ensemble of ∼50 bond lifetimes, defined as the time span from association to dissociation of one bond, was measured by the descriptive statistical method and pooled (blue squares). Another ensemble of ∼200 bond lifetimes was measured by the HMM method from the same raw data and pooled (red squares). For data obtained by each method, the natural log of the survival frequency with bond lifetimes > t_b was plotted against t_b and fitted by a straight line. The negative slopes of the best fits represent the off-rate k_off. (B) Comparison of off-rates estimated by the descriptive statistical and HMM methods for WT, G1324S (Type 2M) and R1450E (Type 2B) A1s versus GC. The error bars show the 95% confidence interval for each method
Fig. 6.
Performance comparison of the descriptive statistical and HMM methods. (A and B) Errors (measured as the 95% confidence interval, CI) of the estimated cellular on-rates (A) and off-rates k_off (B) for 2D binding kinetics of the GPIbα–VWF-A1 interaction under the following biological conditions: the WT VWF-A1 (circles), the LOF VWF-A1 mutant G1324S (squares) and the GOF VWF-A1 mutant R1450E (triangles). The errors were plotted for both the descriptive statistical method (blue) and the HMM method (red). (C and D) The numbers of waiting times (C) and bond lifetimes (D) that the descriptive statistical method (blue) and the HMM method (red), respectively, are capable of measuring from the same set of raw data
Fig. 7.
Tuning parameter selection by half-sampling cross validation. (A) Half-sampling cross validation. The relative error of the off-rate from the odd sequence versus the off-rate from the even sequence was plotted against P0. (B) The relative error of the off-rate versus P0, comparing the HMM with the descriptive statistical method on the same data, using the whole sequence
Fig. 8.
Learning curve comparison between the descriptive statistical method and the HMM. (A) Comparison of the times spent by a new student to learn the descriptive statistical method (blue) and the HMM (red). Two students who were new to both methods were surveyed. The times for them to finish analysis of one dataset were plotted versus different time checkpoints. Each curve represents a surveyed student. (B) Comparison of the times spent by the experienced students to analyze the same set of raw data using the descriptive statistical method (blue) and the HMM (red). Two students were surveyed
