Review
Curr Opin Struct Biol. 2023 Apr:79:102533.
doi: 10.1016/j.sbi.2023.102533. Epub 2023 Jan 31.

Machine learned coarse-grained protein force-fields: Are we there yet?

Aleksander E P Durumeric et al. Curr Opin Struct Biol. 2023 Apr.

Abstract

The successful recent application of machine learning methods to scientific problems includes the learning of flexible and accurate atomic-level force-fields for materials and biomolecules from quantum chemical data. In parallel, the machine learning of force-fields at coarser resolutions is rapidly gaining relevance as an efficient way to represent the higher-body interactions needed in coarse-grained force-fields to compensate for the omitted degrees of freedom. Coarse-grained models are important for the study of systems at time and length scales exceeding those of atomistic simulations. However, the development of transferable coarse-grained models via machine learning still presents significant challenges. Here, we discuss recent developments in this field and current efforts to address the remaining challenges.

Conflict of interest statement

Declaration of competing interest: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Figures

Figure 1:
Sequential reduction in resolution of a variant of the miniprotein Chignolin (CLN025) from a solvated all-atom representation containing many thousands of atoms, to an implicit solvent representation, to a heavy-backbone representation with Cβ beads, and finally to a Cα CG representation containing 10 beads.
Figure 2:
A pipeline for creating and using ML CG models from atomistic simulation data and experimental measurements. A chosen CG mapping can reduce reference information into a CG dataset that can be used to train ML CG models. This training can rely on both simulation and experimental observables in order to reduce the complexity of the learning task and respect physical constraints. A trained ML CG model can then be validated through CG MD and used for general property predictions.
Figure 3:
State-of-the-art performance for a Cα CG ML model on the benchmark protein CLN025. A) Comparison of the CG free energy landscape of CLN025 (produced using MD) for a learned CG ML model with the corresponding free energy for the reference all-atom dataset projected onto slow degrees of freedom (TICA) [74]. B) Ensembles of structures sampled from the CG ML model MD simulation (in red) are superimposed onto all-atom reference structure counterparts (in blue). Basin 1 represents the unfolded state, basin 2 the misfolded state, and basin 3 the folded state.
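
The CG mapping step named in Figures 1 and 2 can be illustrated with a minimal sketch. It assumes the simplest possible mapping, in which each CG bead sits exactly on one Cα atom, and it is not taken from the reviewed paper. The function names `calpha_mapping` and `coarse_grain`, the toy atom list, and the array shapes are all hypothetical choices made for illustration.

```python
import numpy as np

def calpha_mapping(atom_names):
    """Build a (n_beads, n_atoms) one-hot matrix with one row per CA atom.

    Assumption: atoms are identified by plain name strings and every atom
    named "CA" becomes one CG bead.
    """
    ca_idx = [i for i, name in enumerate(atom_names) if name == "CA"]
    mapping = np.zeros((len(ca_idx), len(atom_names)))
    mapping[np.arange(len(ca_idx)), ca_idx] = 1.0
    return mapping

def coarse_grain(coords, mapping):
    """Map atomistic frames (n_frames, n_atoms, 3) to CG frames (n_frames, n_beads, 3)."""
    return np.einsum("ba,fax->fbx", mapping, coords)

# Toy system (hypothetical): 2 residues with 3 backbone atoms each, 2 frames.
atom_names = ["N", "CA", "C", "N", "CA", "C"]
coords = np.arange(2 * 6 * 3, dtype=float).reshape(2, 6, 3)

mapping = calpha_mapping(atom_names)
cg_coords = coarse_grain(coords, mapping)
```

Writing the mapping as a matrix rather than an index list keeps the sketch general: replacing the one-hot rows with normalized weights would give center-of-mass beads without changing `coarse_grain`.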

References

    1. Levitt M, Warshel A, Computer simulation of protein folding, Nature 253 (5494) (1975) 694–698.
    2. Clementi C, Coarse-grained models of protein folding: toy models or predictive tools?, Curr. Opin. Struct. Biol 18 (1) (2008) 10–15.
    3. Bryngelson JD, Wolynes PG, Spin glasses and the statistical mechanics of protein folding, Proc. Natl. Acad. Sci. USA 84 (21) (1987) 7524–7528.
    4. Onuchic JN, Luthey-Schulten Z, Wolynes PG, Theory of Protein Folding: The energy landscape perspective, Annu. Rev. Phys. Chem 48 (1) (1997) 545–600.
    5. Dill KA, Bromberg S, Yue K, Chan HS, Fiebig KM, Yee DP, Thomas PD, Principles of protein folding — a perspective from simple exact models, Protein Science 4 (4) (1995) 561–602. doi:10.1002/pro.5560040401.
