JACS Au. 2021 Nov 25;1(12):2377-2384.
doi: 10.1021/jacsau.1c00449. eCollection 2021 Dec 27.

Accurate Machine Learning Prediction of Protein Circular Dichroism Spectra with Embedded Density Descriptors

Luyuan Zhao et al. JACS Au.

Abstract

A data-driven approach to simulating circular dichroism (CD) spectra is appealing for fast protein secondary structure determination, yet the difficulty of predicting electric and magnetic transition dipole moments poses a substantial barrier to this goal. To address this problem, we designed a new machine learning (ML) protocol in which ordinary, purely geometry-based descriptors are replaced with embedded density descriptors, so that electric and magnetic transition dipole moments are predicted with an accuracy comparable to first-principles calculations. The ML model not only simulates protein CD spectra nearly 4 orders of magnitude faster than conventional first-principles simulation but also yields CD spectra in good agreement with experiments. Finally, we predicted a series of CD spectra of the Trp-cage protein associated with continuous changes of protein configuration along its folding path, showing the potential of our ML model for supporting real-time CD spectroscopy studies of protein dynamics.
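To illustrate why both transition dipole moments are needed, here is a minimal sketch of how a CD spectrum can be assembled once they are known. It uses the textbook Rosenfeld relation, in which the rotational strength of a transition is the imaginary part of the dot product of its electric and magnetic transition dipole moments, and then broadens each transition with a Gaussian. This is not the authors' pipeline: the function names, the Gaussian line shape, the width `sigma`, and all numerical values are assumptions chosen for illustration, and physical prefactors and units are omitted.

```python
import math


def rotational_strength(mu, m):
    """Rosenfeld rotational strength R = Im(mu . m) for one transition.

    mu: electric transition dipole moment (3 components, real or complex)
    m:  magnetic transition dipole moment (3 components, usually imaginary)
    """
    return sum(a * b for a, b in zip(mu, m)).imag


def cd_spectrum(energies, strengths, grid, sigma=0.2):
    """Sum of normalized Gaussians, one per transition, weighted by R.

    energies:  transition energies (same unit as grid, e.g. eV)
    strengths: rotational strength of each transition
    grid:      energies at which the spectrum is evaluated
    sigma:     Gaussian width (hypothetical default)
    """
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return [
        sum(
            r * norm * math.exp(-0.5 * ((e - e0) / sigma) ** 2)
            for e0, r in zip(energies, strengths)
        )
        for e in grid
    ]


if __name__ == "__main__":
    # Hypothetical dipoles for two transitions (arbitrary units).
    mu_list = [(0.1, 0.0, 0.0), (1.0, 0.2, 0.0)]
    m_list = [(-0.5j, 0.0, 0.0), (0.3j, 0.0, 0.1j)]
    energies = [5.6, 6.5]  # hypothetical transition energies in eV

    strengths = [rotational_strength(mu, m) for mu, m in zip(mu_list, m_list)]
    grid = [5.0 + 0.05 * k for k in range(41)]
    for e, value in zip(grid, cd_spectrum(energies, strengths, grid)):
        print(f"{e:.2f} {value:+.4f}")
```

The sign of each band comes entirely from the rotational strength, which is why errors in either predicted dipole moment propagate directly into the shape of the simulated spectrum.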

Conflict of interest statement

The authors declare no competing financial interest.

Figures

Figure 1
(a) NMA structure and protein structure. (b) Valence molecular orbitals and the two electronic transitions of the peptide bond, n → π* and π → π*. (c) Machine learning protocol for predicting protein CD spectra.
Figure 2
ML prediction of the electric and magnetic transition dipole moments of peptide bonds. (a) Correlation plots of the TDDFT and ML predicted electric transition dipole moments of the n → π* and π → π* transitions using CM with GBR. (b) Correlation plots of the TDDFT and ML predicted magnetic transition dipole moments of the n → π* and π → π* transitions using CM with GBR. (c) Same as (a) but using EANN. (d) Same as (b) but using EANN.
Figure 3
Experimental (black curves) and ML predicted (red curves) CD spectra of different types of proteins. Intensity is scaled to have the same maximum intensity for each panel.
Figure 4
(a) Experimental (black curves) and ML predicted (red curves) CD spectra. The ML predictions are based on 1000 MD configurations. (b) The ML predicted CD spectra of the Trp-cage protein along its folding path (S1 → S100). S1: the original unfolded structure; S25: slightly folded, with decreasing coil content; S50: faster folding, with helical elements appearing; S75: a cage forms as α-helix content increases rapidly; S100: the final, stably folded structure. All spectra are averaged over 100 MD conformations for each state.
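
As an illustration of the "pure geometry-based descriptors" that the abstract contrasts with embedded density descriptors, the sketch below builds a Coulomb matrix from nuclear charges and Cartesian coordinates. It assumes that "CM" in the Figure 2 caption stands for Coulomb matrix, which the caption itself does not spell out, and it uses the standard textbook definition rather than anything confirmed to be the authors' implementation. The function name and the example geometry are hypothetical.

```python
import math


def coulomb_matrix(charges, coords):
    """Standard Coulomb matrix descriptor.

    Diagonal:     0.5 * Z_i ** 2.4
    Off-diagonal: Z_i * Z_j / |R_i - R_j|

    charges: nuclear charges Z_i
    coords:  Cartesian coordinates R_i, one 3-tuple per atom
    """
    n = len(charges)
    cm = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                cm[i][j] = 0.5 * charges[i] ** 2.4
            else:
                cm[i][j] = charges[i] * charges[j] / math.dist(coords[i], coords[j])
    return cm


if __name__ == "__main__":
    # Hypothetical C=O fragment; coordinates are illustrative only.
    charges = [6, 8]
    coords = [(0.0, 0.0, 0.0), (0.0, 0.0, 1.23)]
    for row in coulomb_matrix(charges, coords):
        print(row)
```

Because the matrix depends only on atom types and interatomic distances, it carries no explicit electronic information, which is consistent with the abstract's motivation for moving to density-based descriptors.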
