Bioinformatics. 2013 Jun 15;29(12):1511-8.

doi: 10.1093/bioinformatics/btt180. Epub 2013 Apr 18.

An HMM-based algorithm for evaluating rates of receptor-ligand binding kinetics from thermal fluctuation data

Lining Ju¹, Yijie Dylan Wang, Ying Hung, Chien-Fu Jeff Wu, Cheng Zhu

PMID: 23599504
PMCID: PMC3673216
DOI: 10.1093/bioinformatics/btt180

Lining Ju et al. Bioinformatics. 2013.

Affiliation

¹ Coulter Department of Biomedical Engineering, Georgia Institute of Technology, Atlanta 30318, USA.

Abstract

Motivation: Abrupt reduction/resumption of thermal fluctuations of a force probe has been used to identify association/dissociation events of protein-ligand bonds. We show that the off-rate of molecular dissociation can be estimated by analyzing bond lifetimes, while the on-rate of molecular association can be estimated by analyzing the waiting times between neighboring bond events. However, this analysis relies heavily on subjective judgments and is time-consuming. To automate the process of mapping out bond events from thermal fluctuation data, we develop a hidden Markov model (HMM)-based method.

Results: The HMM method represents the bond state by a hidden variable with two values: bound and unbound. Bond association and dissociation events are visualized and pinpointed. We apply the method to analyze a key receptor-ligand interaction in the early stage of hemostasis and thrombosis: the binding of von Willebrand factor (VWF) to platelet glycoprotein Ibα (GPIbα). The numbers of bond lifetime and waiting time events identified by the HMM are much larger than those identified by a descriptive statistical method from the same set of raw data. The kinetic parameters estimated by the HMM are in excellent agreement with those obtained by the descriptive statistical analysis, but have much smaller errors for the wild-type and two mutant VWF-A1 domains. Thus, the computerized analysis speeds up the analysis and improves the quality of estimates of receptor-ligand binding kinetics.

Figures

**Fig. 1.**
Thermal fluctuation assay. (A) BFP photomicrograph. A micropipette-aspirated RBC with a bead (left, termed ‘probe’) attached to the apex was aligned with a bead (right, termed ‘target’) aspirated by another micropipette. (B) BFP functionalization. VWF-A1 and streptavidin were covalently coupled to the probe bead. GC was covalently coupled to the target bead. The schematic is not to scale, as the sizes of the molecules have been enlarged relative to the sizes of the beads. (C) Thermal fluctuation data. Plot of the instantaneous horizontal position x of the probe versus time t collected from one test cycle of the thermal fluctuation assay. During the experiment, the target bead was driven to approach the probe bead (black), contact it for 0.1 s (green), retract (purple) and be held stationary at a preset position (blue and red). Blue and red traces annotate, respectively, bound and unbound states detected by the descriptive statistical method. Manual annotation of one trace took 5 min on average. (D) Plot of σ₉₀ (the sliding standard deviation of 90 consecutive x positions from the data in C around t) versus t. The same color coding is used as in C.
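The σ₉₀ trace in panel D can be reproduced in outline with a sliding-window standard deviation. The following is a minimal sketch, not the authors' code: the function name `sliding_std` is hypothetical, the population form of the standard deviation is an assumption (the caption does not say which form was used), and each output value is indexed by the start of its window rather than centered on t.

```python
import statistics


def sliding_std(x, window=90):
    """Standard deviation of each run of `window` consecutive positions.

    Returns one value per full window, so the result has
    len(x) - window + 1 entries (empty if x is shorter than the window).
    The population standard deviation is an assumption of this sketch.
    """
    return [statistics.pstdev(x[i:i + window])
            for i in range(len(x) - window + 1)]
```

Because a bond suppresses the probe's thermal fluctuations, bound intervals would appear as stretches of low σ₉₀ in the output.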

**Fig. 2.**
Data preparation flowchart. *Step 1*, prescreening; *Step 2*, drift removal; the first two steps were applied to both descriptive and HMM methods; *Step 3*, HMM parameter estimation; *Step 4*, identification of states by HMM; *Step 5*, evaluation of on- and off-rates by analysis of waiting time and bond lifetime distributions, respectively
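Step 5 needs bond lifetimes and waiting times, which can be read off the state sequence produced in Step 4 as run lengths. The sketch below is illustrative only: `dwell_times`, the sampling interval `dt`, and the label `bound=1` are hypothetical names and conventions, and dropping the first and last runs (whose true durations are cut off by the recording window) is a design choice of this sketch, not something stated in the source.

```python
from itertools import groupby


def dwell_times(states, dt, bound=1):
    """Split a two-valued state sequence into bond lifetimes and waiting times.

    states: per-sample state labels; `bound` marks the bound state (assumed label).
    dt: sampling interval in seconds (hypothetical parameter).
    Runs touching either end of the trace are dropped because their
    durations are truncated by the observation window.
    """
    runs = [(s, sum(1 for _ in g)) for s, g in groupby(states)]
    interior = runs[1:-1]
    lifetimes = [n * dt for s, n in interior if s == bound]
    waiting = [n * dt for s, n in interior if s != bound]
    return lifetimes, waiting
```

An interior unbound run spans the interval from one dissociation to the next association, which matches the waiting-time definition used in the Fig. 4 caption.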

**Fig. 3.**
Developing an HMM method for thermal fluctuation data. (A) Bound and unbound status annotation by the HMM analysis of the same data as in Figure 1C. The algorithm takes 30 s on average to annotate one trace. (B) Illustration of the HMM. At time t, let *x_t* be the observed horizontal position of the probe and *z_t* be the unobserved binding state. Each observation *x_t* can be classified into one of two states: *z_t* = 0 (blue) or *z_t* = 1 (red). The sequence *z_t* follows a Markov chain, and the *x_t* are independent and normally distributed given *z_t*. (C) Plot of σ_HMM (the predicted standard deviation from the HMM analysis of A) versus t. Each segment of C corresponds to the estimated standard deviation of a bound or unbound period of A in red or blue by the HMM analysis
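The model in panel B (a two-state Markov chain *z_t* with normally distributed *x_t* given *z_t*) can be decoded with the standard Viterbi algorithm once its parameters are known. The sketch below is an illustration under stated assumptions: this excerpt does not say which decoding algorithm the authors used, so Viterbi is swapped in here, the parameters are taken as given rather than estimated, and all names (`viterbi_two_state`, `means`, `sds`, `stay_prob`, `init`) are hypothetical.

```python
import math


def log_normal_pdf(x, mean, sd):
    """Log density of a normal distribution at x."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)


def viterbi_two_state(x, means, sds, stay_prob, init=(0.5, 0.5)):
    """Most likely hidden state path for a two-state Gaussian HMM.

    means[k], sds[k]: emission parameters of state k (assumed known here).
    stay_prob[k]: probability of remaining in state k at the next sample.
    """
    if not x:
        return []
    log_a = [[math.log(stay_prob[0]), math.log(1 - stay_prob[0])],
             [math.log(1 - stay_prob[1]), math.log(stay_prob[1])]]
    score = [math.log(init[k]) + log_normal_pdf(x[0], means[k], sds[k])
             for k in (0, 1)]
    back = []  # back[t][k]: best predecessor of state k at step t + 1
    for xt in x[1:]:
        new_score, ptr = [], []
        for k in (0, 1):
            best = max((0, 1), key=lambda j: score[j] + log_a[j][k])
            ptr.append(best)
            new_score.append(score[best] + log_a[best][k]
                             + log_normal_pdf(xt, means[k], sds[k]))
        score = new_score
        back.append(ptr)
    # Trace the best path backwards from the best final state.
    state = max((0, 1), key=lambda k: score[k])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path
```

In this sketch the two states share a mean and differ only in standard deviation when called with equal `means`, which mirrors the idea that binding shows up as a drop in fluctuation amplitude rather than a shift in position.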

**Fig. 4.**
Comparison of effective on-rates derived from analysis of waiting times collected by the descriptive statistical and HMM methods. (A) Exponential waiting time distributions for the interaction of WT A1 and GC. An ensemble of ∼40 waiting times, defined as the intervals from the moment of a bond dissociation to the moment of the next bond association, was measured by the descriptive statistical method and pooled (blue squares). Another ensemble of ∼200 waiting times was measured by the HMM method from the same raw data and pooled (red squares). For each method, the natural log of the survival frequency with waiting times > t_w was plotted against t_w and fitted by a straight line (solid line). The negative slopes of the best fits represent the cellular on-rate, m_r m_l A_c k_on, estimated by the two methods. The variations in these values are shown by the 95% confidence interval of the best fit (dotted lines). The red dotted lines are obscured because they overlap with the red solid line. (B) Comparison of the effective on-rate A_c k_on estimated by the descriptive statistical and HMM methods for WT, G1324S (Type 2M) and R1450E (Type 2B) A1s versus GC. A_c k_on was calculated by dividing the cellular on-rate by the product of the protein densities on the probe (m_l for A1) and target (m_r for GC) beads, i.e. m_r m_l = 1.96, 2.8 and 0.19 × 10⁵ µm⁻⁴, determined by flow cytometry for the respective conditions. The error bars indicate the 95% confidence interval for each method
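The fit described in panel A can be sketched as an ordinary least-squares line through ln(survival frequency) versus duration, with the rate taken as the negative slope. This is a minimal illustration, not the authors' procedure: `rate_from_durations` and `effective_rate` are hypothetical names, and the survival frequency is computed with "at least this long" rather than "longer than" (an assumed convention that keeps the logarithm finite at the longest event). Confidence intervals are not computed.

```python
import math


def rate_from_durations(durations):
    """Negative least-squares slope of ln(survival frequency) vs. duration.

    The survival frequency at the i-th shortest duration is the fraction
    of events lasting at least that long (assumed convention).
    """
    t = sorted(durations)
    n = len(t)
    if n < 2:
        raise ValueError("need at least two durations")
    y = [math.log((n - i) / n) for i in range(n)]
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    if sxx == 0:
        raise ValueError("durations must not all be equal")
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    return -sxy / sxx


def effective_rate(cellular_on_rate, density_product):
    """Divide the cellular on-rate by m_r * m_l to obtain A_c k_on."""
    return cellular_on_rate / density_product
```

Applied to pooled waiting times, the first function yields the cellular on-rate; dividing by the measured density product m_r m_l then gives the effective on-rate reported in panel B.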

**Fig. 5.**
Comparison of off-rates derived from analysis of bond lifetimes collected by the descriptive statistical and HMM methods. (A) Exponential bond lifetime distributions for the interaction of WT A1 and GC. An ensemble of ∼50 bond lifetimes, defined as the time span from association to dissociation of one bond, was measured by the descriptive statistical method and pooled (blue squares). Another ensemble of ∼200 bond lifetimes was measured by the HMM method from the same raw data and pooled (red squares). For data obtained by each method, the natural log of the survival frequency with bond lifetimes > t_b was plotted against t_b and fitted by a straight line. The negative slopes of the best fits represent the off-rate k_off. (B) Comparison of off-rates estimated by the descriptive statistical and HMM methods for WT, G1324S (Type 2M) and R1450E (Type 2B) A1s versus GC. The error bars show the 95% confidence interval for each method
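The straight-line fit in panel A follows from the exponential lifetime distribution named in the caption. As a short worked step, assuming single-step first-order dissociation with a constant off-rate (an assumption made here for illustration), the survival probability of a bond and its logarithm are:

```latex
P(t_b > t) = e^{-k_{\mathrm{off}}\, t}
\quad\Longrightarrow\quad
\ln P(t_b > t) = -k_{\mathrm{off}}\, t ,
\qquad
\langle t_b \rangle = \frac{1}{k_{\mathrm{off}}} .
```

On a semi-log plot the survival frequency therefore falls on a line through the origin whose negative slope is k_off, and the mean bond lifetime is its reciprocal.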

**Fig. 6.**
Performance comparison of the descriptive statistical and HMM methods. (A and B) Errors (measured as 95% confidence interval, CI) of the estimated cellular on-rates (A) and off-rates k_off (B) for 2D binding kinetics of the GPIbα–VWF-A1 interaction under the following biological conditions: the WT VWF-A1 (circles), the LOF VWF-A1 mutant G1324S (squares) and the GOF VWF-A1 mutant R1450E (triangles). The errors were plotted for both the descriptive statistical method (blue) and the HMM method (red). (C and D) The numbers of waiting times (C) and bond lifetimes (D) that the descriptive statistical method (blue) and the HMM method (red), respectively, are capable of measuring from the same set of raw data

**Fig. 7.**
Tuning parameter selection by half-sampling cross validation. (A) Half-sampling cross validation. The relative error of the off-rate from the odd sequence versus the off-rate from the even sequence was plotted against P₀. (B) The relative error of the off-rate versus P₀, obtained by comparing the HMM with the descriptive statistical method on the whole sequence

**Fig. 8.**
Learning curve comparison between the descriptive statistical method and the HMM. (A) Comparison of the times spent by a new student to learn the descriptive statistical method (blue) and the HMM (red). Two students who were new to both methods were surveyed. The times for them to finish analysis of one dataset were plotted versus different time checkpoints. Each curve represents a surveyed student. (B) Comparison of the times spent by the experienced students to analyze the same set of raw data using the descriptive statistical method (blue) and the HMM (red). Two students were surveyed
