Comput Struct Biotechnol J. 2023 Mar 28;21:2524-2535.
doi: 10.1016/j.csbj.2023.03.033. eCollection 2023.

Statistical learning of protein elastic network from positional covariance matrix

Chieh Cheng Yu et al. Comput Struct Biotechnol J.

Abstract

Positional fluctuation and covariance during protein dynamics are key observables for understanding the molecular origin of biological functions. A frequently employed potential energy function for describing protein structural variation at the coarse-grained level is the elastic network model (ENM). A long-standing issue in biomolecular simulation is thus the parametrization of ENM spring constants from the components of the positional covariance matrix (PCM). Based on sensitivity analysis of the PCM, the direct-coupling statistics of each spring, which is a specific combination of position fluctuation and covariance, is found to exhibit a prominent signal of parameter dependence. This finding provides the basis for devising the objective function and the scheme of running through the effective one-dimensional optimization of every spring by self-consistent iteration. Formal derivation of the positional covariance statistical learning (PCSL) method also motivates the data regularization necessary for stable calculations. Robust convergence of PCSL is achieved when taking an all-atom molecular dynamics trajectory or an ensemble of homologous structures as input data. The PCSL framework can also be generalized with mixed objective functions to capture specific properties such as the residue flexibility profile. Such physical chemistry-based statistical learning thus provides a useful platform for integrating the mechanical information encoded in various experimental or computational data.

Keywords: All-atom molecular dynamics simulation; Elastic network model; Homologous structure; Positional covariance matrix; Serine protease; Statistical learning.
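
The link between spring constants and the positional covariance matrix described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' PCSL implementation. It assumes an anisotropic-network-style ENM in which each spring contributes k_ij b b^T to the Hessian, reduced units with kT = 1, random coordinates standing in for Cα positions, and a distance variance taken along the spring direction as a stand-in for the direct-coupling statistic. All function names are hypothetical. The 10 Å cutoff and the uniform spring constants follow the zeroth-order model in the Fig. 2 caption.

```python
import numpy as np

def build_springs(coords, cutoff=10.0):
    """List node pairs (i, j), i < j, closer than the cutoff distance."""
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    i_idx, j_idx = np.where(np.triu(dist < cutoff, k=1))
    return list(zip(i_idx.tolist(), j_idx.tolist()))

def spring_vector(coords, i, j):
    """3N vector with +u on node j and -u on node i (u: unit vector i -> j)."""
    u = coords[j] - coords[i]
    u = u / np.linalg.norm(u)
    b = np.zeros(coords.size)
    b[3 * j:3 * j + 3] = u
    b[3 * i:3 * i + 3] = -u
    return b

def enm_hessian(coords, pairs, k):
    """Assumed ENM Hessian: sum over springs of k_ij * b b^T."""
    H = np.zeros((coords.size, coords.size))
    for (i, j), kij in zip(pairs, k):
        b = spring_vector(coords, i, j)
        H += kij * np.outer(b, b)
    return H

def covariance(H, kT=1.0):
    """Positional covariance as kT times the pseudo-inverse of H."""
    w, v = np.linalg.eigh(H)
    keep = w > 1e-8 * w.max()          # drop rigid-body (zero) modes
    return kT * (v[:, keep] / w[keep]) @ v[:, keep].T

def distance_variance(C, coords, i, j):
    """Variance of the i-j separation along the spring direction."""
    b = spring_vector(coords, i, j)
    return b @ C @ b

# Toy system: random points stand in for C-alpha coordinates (assumption).
rng = np.random.default_rng(0)
coords = rng.uniform(0.0, 15.0, size=(30, 3))
pairs = build_springs(coords, cutoff=10.0)
k = np.ones(len(pairs))                # every spring takes the same constant
C = covariance(enm_hessian(coords, pairs, k))

s = 0                                  # index of the spring being probed
i, j = pairs[s]
c_hat = distance_variance(C, coords, i, j)

def c_hat_of(k_s):
    """Distance variance of spring s after setting its constant to k_s."""
    k_mod = k.copy()
    k_mod[s] = k_s
    C_mod = covariance(enm_hessian(coords, pairs, k_mod))
    return distance_variance(C_mod, coords, i, j)

dk = 1e-4
slope = (c_hat_of(1.0 + dk) - c_hat_of(1.0 - dk)) / (2 * dk)
print("distance variance:", c_hat)
print("finite-difference sensitivity:", slope)
```

In this harmonic toy model the finite-difference slope should agree with the closed form -ĉ²/kT, which follows from the rank-one dependence of the Hessian on a single spring constant. This is a property of the sketch, not a result quoted from the paper. It does show why a spring's own distance variance responds strongly and monotonically to that spring's constant, which is what makes a spring-by-spring one-dimensional update plausible.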

Conflict of interest statement

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Figures

Graphical abstract
Fig. 1
The scheme of positional covariance statistical learning (PCSL) for computing the spring constant list $[k_{ij}]$ of an elastic network model (ENM) from an ensemble of protein structures, $\{r_{i}^{I}\}$.
Fig. 2
Sensitivities of positional fluctuations and covariances in ENM to a spring constant. The rat trypsin (RT) structure with the Protein Data Bank ID 3TGI is used to construct the elastic network model of Cα carbons. The $l_c$ cutoff for including an atom pair in the ENM is 10 Å, and every spring takes the same elastic constant value. For this zeroth-order ENM, magnitudes of positional fluctuations of node p and covariances of nodes p and q are $C_{pp}^{0}$ and $C_{pq}^{0}$, respectively. Variance of the inter-node distance is $\hat{c}_{pq}^{0}$. The derivatives with respect to the $k_{ij}$ between i = 172 and j = 226 are shown as an example. The center of geometry of the spring is $r_{ij}^{0}$ and that of the node pair is $r_{pq}^{0}$. The distance between the two centers is $l(r_{ij}^{0}, r_{pq}^{0})$. (a) $\partial C_{pq}^{0}/\partial k_{ij}$ versus $l(r_{ij}^{0}, r_{pq}^{0})$. (b) $\partial \hat{c}_{pq}^{0}/\partial k_{ij}$ versus $l(r_{ij}^{0}, r_{pq}^{0})$.
Fig. 3
Structure of the rat trypsin (RT) model system and the $C_{ii}^{*}$, $\hat{c}_{ij}^{*}$, and $\delta l_{ij}^{2*}$ of Cα atoms in Å² calculated from the reference all-atom MD trajectory. (a) RT is an S1A serine protease family member and the 3TGI structure is shown in a ribbon representation. The color bar indicates the $C_{ii}^{*}$ values. (b) For residue pairs of ∣j − i∣ > 1, the comparison of $\hat{c}_{ij}^{*}$ with $\delta l_{ij}^{2*}$. Inset: the comparison for neighboring residues of ∣j − i∣ = 1 with the same x- and y-axes.
Fig. 4
Calibration of PCSL in terms of the calculated spring constants and the predicted positional covariance. (a) The root-mean-square (RMS) difference between $k_{ij}([\hat{c}_{ij}^{*}])$, the spring constants learned from $[\hat{c}_{ij}^{*}]$, and $k_{ij}([\delta l_{ij}^{2*}])$, those learned from $[\delta l_{ij}^{2*}]$, as a function of the data-regularization parameter ϵ. The RMS difference $\Delta k([\hat{c}_{ij}^{*}],[\delta l_{ij}^{2*}])$ is defined in Eq. (12). The left y-axis reports the value for ∣j − i∣ = 1 springs while the right y-axis is for the elastic parameters of ∣j − i∣ > 1. The inset is the scatter plot of $k_{ij}([\hat{c}_{ij}^{*}])$ versus $k_{ij}([\delta l_{ij}^{2*}])$ for the ∣j − i∣ > 1 springs. (b) The scatter plot of the $C_{ij}$ value given by the ENM with the statistically learned spring constants versus $C_{ij}^{*}$, the targeted value. The color bar indicates the $k_{ij}$ values for $l_{ij}^{0} < l_c$.
Fig. 5
Comparison of PCSL with or without applying the additional scoring function for restraining the positional fluctuation profile by Eq. (8) and Eq. (9). Here, the input data is the all-atom MD trajectory. (a) The $C_{ii}$ profiles predicted by the ENM of $k_{ij}([\delta l_{ij}^{2*}])$ and of $k_{ij}([\hat{c}_{ij}^{*}])$ do not involve the additional scoring function. The $C_{ii}$ profile of the ENM of $k_{ij}([\hat{c}_{ij}^{*}, C_{ii}^{*}])$ contains the scoring function for restraining the positional fluctuation profile. (b) The scatter plot of $k_{ij}([\hat{c}_{ij}^{*}],[C_{ii}^{*}])$ versus $k_{ij}([\hat{c}_{ij}^{*}])$ for ∣j − i∣ > 1 springs. The insets show the change in spring constant $\Delta k_{ij}$ due to the addition of $C_{ii}^{*}$ restraints in PCSL for the springs of Y39 and K175.
Fig. 6
PCSL using an ensemble of homologous RT structures as input, homo-ENM, and the comparison with taking an all-atom MD trajectory as the targeted data, MD-ENM. In both cases, data regularization with ϵ = 0.6 (cf. Methods) and restraint in residue flexibility profile with β = 0.7 (cf. Eq. (9)) are used. (a) The $C_{ii}$ calculated from the homo-ENM and MD-ENM. The linked red boxes are examples of strongly coupled residue clusters. (b) The $k_{ij}$ heat map of ∣j − i∣ > 1 springs for homo-ENM and MD-ENM. The red boxes mark the same residue clusters as those in (a). The RT secondary structures are highlighted on top. N-β barrel and C-β barrel are the N-terminal and C-terminal β barrels, respectively. Numbers below are the sequential indices of β strands.
Fig. 7
PCSL for homo-ENM and MD-ENM shows specific mechanical hotspots. (a) The number of times each residue appears in the exceptionally strong springs in homo-ENM and in MD-ENM. The count coming from the 42 common springs is in brown. Stacked on top of the common count is the count due to the exceptionally strong springs in homo-ENM but not in MD-ENM (pink), and that due to those in MD-ENM but not in homo-ENM (cyan). Residues are labelled only if the total count is higher than four in either model. The secondary structures can be referenced to Fig. 6b. (b) Illustration of the exceptionally strong springs and the mechanical hotspot residues in (a) on the protein structure in a ribbon representation. The coloring scheme follows that in (a). The residues and springs from the common category are made semi-transparent in the left and middle panels for clarity.
