2021 Nov 2;118(44):e2113533118.
doi: 10.1073/pnas.2113533118.

Deep learning the slow modes for rare events sampling

Luigi Bonati et al. Proc Natl Acad Sci U S A.

Abstract

The development of enhanced sampling methods has greatly extended the scope of atomistic simulations, allowing long-time phenomena to be studied with accessible computational resources. Many such methods rely on the identification of an appropriate set of collective variables. These are meant to describe the system's modes that most slowly approach equilibrium under the action of the sampling algorithm. Once identified, the equilibration of these modes is accelerated by the enhanced sampling method of choice. An attractive way of determining the collective variables is to relate them to the eigenfunctions and eigenvalues of the transfer operator. Unfortunately, this requires knowing the long-term dynamics of the system beforehand, information that is generally not available. However, we have recently shown that it is indeed possible to determine efficient collective variables starting from biased simulations. In this paper, we bring the power of machine learning and the efficiency of the recently developed on-the-fly probability-enhanced sampling (OPES) method to bear on this approach. The result is a powerful and robust algorithm that, given an initial enhanced sampling simulation performed with trial collective variables or generalized ensembles, extracts transfer operator eigenfunctions using a neural network ansatz and then accelerates them to promote sampling of rare events. To illustrate the generality of this approach, we apply it to several systems, ranging from the conformational transition of a small molecule to the folding of a miniprotein and the study of materials crystallization.

Keywords: collective variables; enhanced sampling; machine learning; molecular dynamics.
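
The idea of reading slow modes off the transfer operator can be illustrated with a minimal sketch of linear time-lagged independent component analysis (TICA), the standard linear variational approximation to the transfer operator eigenfunctions. This is not the Deep-TICA method of the paper: it assumes an unbiased trajectory (no reweighting), uses a linear ansatz instead of a neural network, and runs on a made-up toy signal. The function name `tica`, the lag time, and the toy data are all assumptions chosen for illustration.

```python
import numpy as np

def tica(X, lag):
    """Linear TICA sketch: solve C(lag) v = lambda C(0) v for unbiased data."""
    X = np.asarray(X, dtype=float)
    x0, xt = X[:-lag], X[lag:]                  # time-lagged pairs
    mean = 0.5 * (x0.mean(axis=0) + xt.mean(axis=0))
    a, b = x0 - mean, xt - mean
    n = len(a)
    c0 = (a.T @ a + b.T @ b) / (2 * n)          # symmetrized covariance
    ct = (a.T @ b + b.T @ a) / (2 * n)          # symmetrized time-lagged covariance
    # Whiten with C(0), then diagonalize the whitened time-lagged covariance.
    s, u = np.linalg.eigh(c0)
    keep = s > 1e-10 * s.max()
    whiten = u[:, keep] / np.sqrt(s[keep])
    evals, evecs = np.linalg.eigh(whiten.T @ ct @ whiten)
    order = np.argsort(evals)[::-1]             # slowest mode first
    return evals[order], whiten @ evecs[:, order], mean

# Toy data (hypothetical): one slow AR(1) mode hidden in two noisy descriptors.
rng = np.random.default_rng(0)
n_steps = 20000
slow = np.zeros(n_steps)
for t in range(1, n_steps):
    slow[t] = 0.99 * slow[t - 1] + rng.normal()
fast = rng.normal(size=n_steps)                 # uncorrelated fast noise
X = np.column_stack([slow + 2.0 * fast, fast])  # descriptors mix slow and fast

evals, evecs, mean = tica(X, lag=10)
proj = (X - mean) @ evecs[:, 0]                 # leading TICA coordinate
print("leading eigenvalues:", evals)
print("|corr(proj, slow)|:", abs(np.corrcoef(proj, slow)[0, 1]))
```

In this toy case the leading eigenvalue is close to the lagged autocorrelation of the slow mode, and the leading coordinate recovers that mode from the mixed descriptors. According to the abstract, the paper's approach replaces the linear map with a neural network and starts from biased simulations, neither of which this sketch handles.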

Conflict of interest statement

The authors declare no competing interest.

Figures

Fig. 1
(Upper) The Deep-TICA protocol used in this paper. On the left, the miniprotein chignolin is shown, with lines denoting pairwise distances used as descriptors. (Lower) NN architecture and optimization details of Deep-TICA CVs.
Fig. 2
The Deep-TICA procedure applied to a multithermal simulation of alanine dipeptide. (A) Time evolution of the ϕ-angle in the exploratory OPES multithermal simulation, colored according to the potential energy. (B) Time evolution of the same angle in the simulation in which a bias on Deep-TICA 1 is also added, colored according to the value of that variable. The system immediately reaches diffusive behavior. (C) Ramachandran plot of the configurations explored in the Deep-TICA simulation, colored according to the average value of Deep-TICA 1. Gray lines denote the isolines of the FES, spaced every 2 kBT. Note that the sampling is focused on the minima and the transition regions that connect them.
Fig. 3
(A) Time evolution of the ψ-angle in the exploratory simulation driven by ψ. The points are colored with the values of the ϕ-angle. (B) Time evolution of the ϕ-angle in the final Deep-TICA simulation, colored with the value of Deep-TICA 1. This results in a diffusive simulation similar to the previous example, which is even more impressive here given the poor quality of the exploratory sampling.
Fig. 4
The Deep-TICA procedure applied to chignolin folding. (A) Time evolution of the Cα rmsd for one replica during the initial multithermal run. The points are colored according to their potential energy value. Low energy values reflect the fact that configurations relevant at lower temperatures are sampled. (B) Scatterplot of the two leading Deep-TICA CVs in the exploratory simulation. Points are colored according to the average Cα rmsd values. A weighted k-means clustering identifies four clusters whose centers are denoted by a white ×. The pale background colors reflect how space is partitioned by the clustering algorithm. Snapshots of chignolin in the folded (high values of Deep-TICA 1) and unfolded (low values of Deep-TICA 1) states are also shown, rendered with the Visual Molecular Dynamics (VMD) software (69). (C) Time evolution of the Cα rmsd for a replica in the multithermal simulation that also biases Deep-TICA 1, colored according to the value of that variable. The time evolution for the other replicas is reported in SI Appendix, Fig. S7.
Fig. 5
FES of chignolin at T = 340 K as a function of the two leading Deep-TICA CVs. Upper and Lower Right show the projections of the FES along the corresponding axes (solid lines), compared with the reference values obtained from a long unbiased MD trajectory at 340 K (50) (dotted lines). Note that the projection along Deep-TICA 2 is obtained by integrating only over the region of space with Deep-TICA 1 > 0.65 (marked by a dotted line in Lower Left) to highlight the barriers between the folded metastable states.
Fig. 6
Comparison between (A and C) a Deep-LDA–driven simulation and (B and D) the one based on the present Deep-TICA approach. In A and B, we report the time evolution of (A) the Deep-LDA CV in the initial simulation and (B) the Deep-TICA CV in the improved one. In A and B, the points are colored according to the fraction of diamond-like atoms in the system, computed as in ref. . Gray shaded lines indicate the values of the two CVs in unbiased simulations of the liquid (bottom lines) and solid (top lines). C and D report the correlation between the two data-driven CVs and the fraction of diamond-like atoms. White circles denote the mean values of the two CVs in the liquid and solid states, while the dotted gray line interpolates between them. In D, we report also a few snapshots of the crystallization process made with the Open Visualization Tool (OVITO) software (71).

References

    1. Peters B., Reaction Rate Theory and Rare Events (Elsevier, 2017).
    2. Valsson O., Tiwary P., Parrinello M., Enhancing important fluctuations: Rare events and metadynamics from a conceptual viewpoint. Annu. Rev. Phys. Chem. 67, 159–184 (2016).
    3. Torrie G. M., Valleau J. P., Nonphysical sampling distributions in Monte Carlo free-energy estimation: Umbrella sampling. J. Comput. Phys. 23, 187–199 (1977).
    4. Mezei M., Adaptive umbrella sampling: Self-consistent determination of the non-Boltzmann bias. J. Comput. Phys. 68, 237–248 (1987).
    5. Voter A. F., Accelerated molecular dynamics of infrequent events. Phys. Rev. Lett. 78, 3908–3911 (1997).