Bull Math Biol. 2020 Aug 8;82(8):108.
doi: 10.1007/s11538-020-00783-2.

The de Rham-Hodge Analysis and Modeling of Biomolecules

Rundong Zhao et al. Bull Math Biol. 2020.

Abstract

Biological macromolecules have intricate structures that underpin their biological functions. Understanding their structure-function relationships remains a challenge due to their structural complexity and functional variability. Although de Rham-Hodge theory, a landmark of twentieth-century mathematics, has had a tremendous impact on mathematics and physics, it has not been devised for macromolecular modeling and analysis. In this work, we introduce de Rham-Hodge theory as a unified paradigm for analyzing the geometry, topology, flexibility, and Hodge mode analysis of biological macromolecules. Geometric characteristics and topological invariants are obtained either from the Helmholtz-Hodge decomposition of the scalar, vector, and/or tensor fields of a macromolecule or from the spectral analysis of various Laplace-de Rham operators defined on the molecular manifolds. We propose Laplace-de Rham spectral-based models for predicting macromolecular flexibility. We further construct a Laplace-de Rham-Helfrich operator for revealing cryo-EM natural frequencies. Extensive experiments are carried out to demonstrate that the proposed de Rham-Hodge paradigm is one of the most versatile tools for the multiscale modeling and analysis of biological macromolecules and subcellular organelles. Accurate, reliable, and topological structure-preserving algorithms for implementing discrete exterior calculus (DEC) have been developed to facilitate the aforementioned modeling and analysis of biological macromolecules. The proposed de Rham-Hodge paradigm has potential applications to subcellular organelles and the structure construction from medium- or low-resolution cryo-EM maps, and functional predictions from massive biomolecular datasets.

Keywords: Algebraic topology; Cryo-EM analysis; De Rham–Hodge theory; Differential geometry; Macromolecular Hodge mode analysis; Macromolecular flexibility.

Figures

Fig. 1
Illustration of tangential spectra of a cryo-EM map EMD 7972. Topologically, EMD 7972 (Baradaran et al. 2018) has six handles and two cavities. The left column is the original shape and its anatomy showing the topological complexity. On the right-hand side of the parenthesis, the first row shows tangential harmonic eigenfields, the second row shows tangential gradient eigenfields, and the third row shows tangential curl eigenfields. The credit for the leftmost picture belongs to Hayam Mohamed Abdelrahman
Fig. 2
Illustration of the normal spectra of protein and DNA complex 6D6V. Topologically, the crystal structure of 6D6V (Jiang et al. 2018) has 1 handle. The left column shows the secondary structure and the solvent-excluded surface (SES). On the right-hand side, the first two rows show normal gradient eigenfields, and the last two rows show normal curl eigenfields
Fig. 3
Illustration of Hodge Laplacian spectra. This figure shows the properties of three spectral groups, namely tangential gradient eigenfields (T), normal gradient eigenfields (N), and curl eigenfields (C), for EMD 8962 (Singh et al. 2018). a The original input surface and three distinct spectral groups. b The cross-section of a typical tangential gradient eigenfield and the distribution of eigenvalues for group T. c The cross-section of a typical normal gradient eigenfield and the distribution of eigenvalues for group N. d A typical curl eigenfield and the distribution of eigenvalues for group C. e The left chart shows the convergence of spectra in the same spectral group due to the increase in the mesh size, i.e., the DoFs from 1000 (1K) to 6000 (6K). Obviously, low-order eigenvalues converge fast (middle chart) and high-order eigenvalues converge slowly (right chart)
Fig. 4
Illustration of topological analysis. a Eigenfields in the null space of tangential Laplace–de Rham operators correspond to handles. b Eigenfields in the null space of normal Laplace–de Rham operators correspond to cavities
Fig. 5
Illustration of geometric analysis. The geometry of different molecules (PDB IDs: 2Z5H (a), 6HU5 (b), and 5HY9 (c)) can be captured by three groups of different Hodge Laplacian spectra with clear separations shown in d. Note that the color of each line plot corresponds to the color of the molecule. The solid lines show the tangential gradient (T) spectrum, the dashed lines show the normal gradient (N) spectrum, and the dotted lines show the curl (C) spectrum. Although certain spectral sets may be close to each other (see group T of proteins 6HU5 and 5HY9), the other two groups of spectra (see groups N and C of proteins 6HU5 and 5HY9) show a clear difference. In addition, the topological features provide a definite difference. For example, protein 6HU5 has trivial topology (ball), but protein 5HY9 has a handle
Fig. 6
Illustration of the procedure for flexibility analysis. We use protein 3VZ9 (Nishino et al. 2013) as an example to demonstrate our procedure from a to f. a The input protein crystal structure. b Only C-alpha atoms (yellow spheres) are considered in this case. We assign a Gaussian kernel to each C-alpha atom and extract the level set surface (transparent surface) as our computation domain. c A standard tetrahedral mesh is generated within the domain. (Boundary faces are gray; inner faces are indigo.) We use a standard matrix diagonalization procedure to obtain eigenvalues and eigenvectors. The B-factor at each mesh vertex is computed as shown in Eq. (22). d The B-factor at the position of a C-alpha atom is obtained by linear regression over the nearby region. (For the red C-alpha atom, the linear regression region, which lies within the cutoff radius, is colored purple.) e The predicted B-factors on the surface. f The predicted B-factors at C-alpha atoms (orange), compared with the experimental B-factors in the PDB file (blue). Our prediction for 3VZ9 has a Pearson correlation coefficient of 0.8081
Fig. 7
Statistics of the average Pearson correlation coefficient (PCC) with various parameters on the test set of 364 proteins. Each plot corresponds to one cutoff radius, which varies across plots from 1.0 Å to 6.0 Å in steps of 1.0 Å. In each plot, the level set value varies from 0.2 to 0.8 in steps of 0.2, shown by different lines; the grid spacing varies from 1.6 Å to 4.0 Å in steps of 0.4 Å along the horizontal axis
Fig. 8
Illustration of B-factor prediction. We use proteins 1V70 (Hanawa-Suetsugu et al. 2004), 3F2Z, and 3VZ9 as examples to compare our predictions with experiment. The red lines with triangles are the ground truth from experimental data. The blue lines with circles are predictions from our method (EDH). The green lines with cubes are predictions from the Gaussian network model (GNM)
Fig. 9
Hodge modes of EMD 1258. The 0th, 4th, 8th, and 12th Hodge modes are shown
Fig. 10
Biological flow decomposition. Illustration of a synthetic vector field in EMD 1590 that is decomposed into several mutually orthogonal components based on different boundary conditions
Fig. 11
The PB implicit solvent model. Γ is the molecular surface separating space into the solute region Ω1 and the solvent region Ω2
Fig. 12
a The force field of two positive charges; b the first eigenvector; c the force field of one negative and one positive charge; d the second eigenvector
Fig. 13
The first row shows the first five eigenmodes. The second row shows vector fields under corresponding charge combinations
Fig. 14
Illustration of orientation. The pre-assigned orientation is colored in red. The orientation induced by ∂ is colored in green. The vertices are assumed to have a positive pre-assigned orientation. Therefore, the orientation induced from an edge orientation is +1 at the head and −1 at the tail. For a triangle facet, +1 is assigned whenever the pre-assigned orientation conforms with the induced orientation, and −1 otherwise. A similar rule applies to tets, which obey a right-hand orientation with the normal pointing outward. Non-adjacent vertices give 0
Fig. 15
Illustration of the primal and dual elements of the tetrahedral mesh. All the red vertices are mesh primal vertices. All the indigo vertices are dual vertices at the circumcenter of each tet. All the gray edges are primal edges. All the pink edges are dual edges connecting adjacent dual vertices. The first chart shows the dual cell of a primal vertex. The second chart shows the dual facet of the primal edge. The third chart shows the dual edge of the primal facet. The last chart shows the dual vertex of the primal cell (tet)
Fig. 16
Illustration of cohomology. This figure illustrates the relations defined by the exterior derivative and Hodge star operators. The Laplacian operator L_k is assembled by starting from the primal k-forms and multiplying the matrices along the circular direction
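
The orientation rule of Fig. 14, the Laplacian assembly of Fig. 16, and the null-space reading of Fig. 4 can be sketched together on a toy 2D complex. This is a minimal illustration, not the authors' implementation: the mesh, the function names, and the tolerance are hypothetical, and every Hodge star is taken to be the identity, so the result is the combinatorial Hodge Laplacian rather than the geometric operator built from the primal and dual volumes of Fig. 15. The tangential and normal boundary conditions used in the paper are not modeled.

```python
import numpy as np

# Hypothetical toy complex: 4 vertices, 5 oriented edges, 1 filled triangle.
# The cycle 0-2-3 is left unfilled, so the complex contains one loop.
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (0, 3)]  # each edge points tail -> head
triangles = [(0, 1, 2)]                           # vertex order fixes the orientation
n_vertices = 4


def build_d0(n_vertices, edges):
    """Discrete exterior derivative on 0-forms: -1 at the tail, +1 at the head."""
    d0 = np.zeros((len(edges), n_vertices))
    for row, (tail, head) in enumerate(edges):
        d0[row, tail] = -1.0
        d0[row, head] = 1.0
    return d0


def build_d1(edges, triangles):
    """Discrete exterior derivative on 1-forms: +1 where the stored edge
    orientation agrees with the orientation induced by the triangle, -1 otherwise."""
    edge_index = {edge: i for i, edge in enumerate(edges)}
    d1 = np.zeros((len(triangles), len(edges)))
    for row, (i, j, k) in enumerate(triangles):
        for a, b in ((i, j), (j, k), (k, i)):
            if (a, b) in edge_index:
                d1[row, edge_index[(a, b)]] = 1.0
            else:
                d1[row, edge_index[(b, a)]] = -1.0
    return d1


def null_space_dim(matrix, tol=1e-9):
    """Count the (numerically) zero eigenvalues of a symmetric matrix."""
    eigenvalues = np.linalg.eigvalsh(matrix)
    return int(np.sum(np.abs(eigenvalues) < tol))


d0 = build_d0(n_vertices, edges)
d1 = build_d1(edges, triangles)

# Assumption: identity Hodge stars, so each Laplacian reduces to products of
# the incidence matrices and their transposes.
L0 = d0.T @ d0               # acts on 0-forms (vertex values)
L1 = d0 @ d0.T + d1.T @ d1   # acts on 1-forms (edge values)

b0 = null_space_dim(L0)      # number of connected components
b1 = null_space_dim(L1)      # number of independent loops
```

In this sketch `d1 @ d0` is the zero matrix, which is the discrete form of "the boundary of a boundary is empty", and the null-space dimensions of `L0` and `L1` both come out as 1: one connected component and one unfilled loop. The same counting idea, applied to the paper's tangential and normal operators on tetrahedral meshes, is what ties harmonic eigenfields to handles and cavities.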
