PLoS Comput Biol. 2021 Feb 12;17(2):e1008308.

doi: 10.1371/journal.pcbi.1008308. eCollection 2021 Feb.

OpenAWSEM with Open3SPN2: A fast, flexible, and accessible framework for large-scale coarse-grained biomolecular simulations

Wei Lu^{1,2}, Carlos Bueno^{1,3}, Nicholas P Schafer^{1,3,4}, Joshua Moller^{5,6}, Shikai Jin^{1,7}, Xun Chen^{1,3}, Mingchen Chen¹, Xinyu Gu^{1,3}, Aram Davtyan¹, Juan J de Pablo^{5,6}, Peter G Wolynes^{1,3,2,7}

Affiliations

¹ Center for Theoretical Biological Physics, Rice University, Houston, Texas, United States of America.
² Department of Physics, Rice University, Houston, Texas, United States of America.
³ Department of Chemistry, Rice University, Houston, Texas, United States of America.
⁴ Schafer Science, LLC, Houston, Texas, United States of America.
⁵ Pritzker School of Molecular Engineering, University of Chicago, Chicago, Illinois, United States of America.
⁶ Argonne National Laboratory, Lemont, Illinois, United States of America.
⁷ Department of Biosciences, Rice University, Houston, Texas, United States of America.

PMID: 33577557
PMCID: PMC7906472
DOI: 10.1371/journal.pcbi.1008308

Wei Lu et al. PLoS Comput Biol. 2021.

Abstract

We present OpenAWSEM and Open3SPN2, new cross-compatible implementations of coarse-grained models for protein (AWSEM) and DNA (3SPN2) molecular dynamics simulations within the OpenMM framework. These new implementations retain the chemical accuracy and intrinsic efficiency of the original models while adding GPU acceleration and the ease of forcefield modification provided by OpenMM's Custom Forces software framework. By utilizing GPUs, we achieve around a 30-fold speedup in protein and protein-DNA simulations over the existing LAMMPS-based implementations running on a single CPU core. We showcase the benefits of OpenMM's Custom Forces framework by devising and implementing two new potentials that allow us to address important aspects of protein folding and structure prediction and by testing the ability of the combined OpenAWSEM and Open3SPN2 to model protein-DNA binding. The first potential describes the changes in effective interactions that occur as a protein becomes partially buried in a membrane. The second describes proteins with multiple disulfide bonds. Using simple pairwise disulfide bonding terms results in unphysical clustering of cysteine residues, which poses a problem when simulating the folding of proteins with many cysteines. By using OpenMM's Custom Forces framework to introduce a multi-body disulfide bonding term that prevents this unphysical clustering, we can now computationally reproduce Anfinsen's early Nobel Prize-winning experiments. Our protein-DNA simulations show that the binding landscape is funneled towards structures that are quite similar to those found experimentally. In summary, this paper provides a simulation tool for the molecular biophysics community that is both easy to use and sufficiently efficient to simulate large proteins and large protein-DNA systems that are central to many cellular processes. These codes should facilitate the interplay between molecular simulations and cellular studies, which has been hampered by the large mismatch between the time and length scales accessible to molecular simulations and those relevant to cell biology.
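
The clustering problem described in the abstract can be illustrated with a toy calculation. The sketch below is not the functional form used in OpenAWSEM; it assumes a hard distance cutoff for a cysteine contact and a hypothetical saturable term in which each cysteine is credited for at most one partner. All names, the cutoff, and the coordinates are illustrative assumptions.

```python
import itertools
import math


def contact(a, b, cutoff=1.0):
    """Return 1.0 if two points lie within the cutoff distance, else 0.0."""
    return 1.0 if math.dist(a, b) < cutoff else 0.0


def pairwise_energy(coords, k=1.0, cutoff=1.0):
    """Additive model: every cysteine pair in contact lowers the energy by k."""
    pairs = itertools.combinations(coords, 2)
    return -k * sum(contact(a, b, cutoff) for a, b in pairs)


def saturable_energy(coords, k=1.0, cutoff=1.0):
    """Toy multi-body model: each cysteine is rewarded for at most one partner."""
    total = 0.0
    for i, a in enumerate(coords):
        # Number of other cysteines in contact with cysteine i.
        n_i = sum(contact(a, b, cutoff) for j, b in enumerate(coords) if j != i)
        total += min(n_i, 1.0)
    return -k * total / 2.0  # each bond is shared by two cysteines


# Four cysteines packed into one cluster (all six pairs are in contact).
cluster = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (0.0, 0.5, 0.0), (0.0, 0.0, 0.5)]
# The same four cysteines arranged as two well-separated bonded pairs.
two_pairs = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (10.0, 0.0, 0.0), (10.5, 0.0, 0.0)]

for name, coords in (("cluster", cluster), ("two pairs", two_pairs)):
    print(name, pairwise_energy(coords), saturable_energy(coords))
```

In this toy, the additive term strongly favors the cluster over two proper pairs, whereas the saturable term scores both arrangements equally, so clustering no longer gains energy.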

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

**Fig 1. Benchmark timing results for AWSEM simulations with the LAMMPS and the OpenMM implementations on a linear scale (left) and on a log scale (right).**
The x-axis is the number of residues in the proteins that are being simulated. The y-axis shows the number of computer hours needed to run a 1 million-step simulation. Each protein was simulated 5 times using each implementation. The lines are quadratic fits. The simulation protein set was chosen to have a wide range of protein sequence lengths ranging from 164 residues to 3724 residues.

**Fig 2. Benchmark timing results for 3SPN2 simulations with the LAMMPS implementation of 3SPN2 and the OpenMM implementation of 3SPN2 on a linear scale (left) and on a log scale (right).**
The x-axis is the number of nucleotides in the DNA that is being simulated. The y-axis shows the number of computer hours that are needed to run a 1 million-timestep simulation. Each DNA length was simulated 5 times using each implementation. The lines are quadratic fits. The DNA lengths range from 110 nucleotides to 1580 nucleotides.

**Fig 3. Benchmark results for AWSEM-3SPN2 simulations of protein-DNA complexes using the LAMMPS and the OpenMM implementations of both forcefields on a linear scale (left) and on a log scale (right).**
The x-axis shows the PDB ID. The y-axis shows the computer hours needed to simulate for 1 million steps. Each complex was simulated 5 times using each implementation. The protein length ranges from 52 to 2050 amino acids, while the DNA length ranges from 2 to 40 nucleotides.

**Fig 4. A scatter plot of the interaction energy between the DNA and the protein versus the fraction of the symmetrized native contacts formed at each time frame during the last 7.5 million steps of simulations from 10 runs.**
The average energy as a function of the number of symmetrized native contacts is indicated with a blue line. A simulation snapshot shows the overlap of the crystal structure (colored in red) and the predicted structure with the lowest interface energy (colored in cyan). There is a high correlation between the protein-DNA interface energy and the number of symmetrized contacts, indicating that the binding process is funneled to the correct interface. The overlap figure was created by aligning only the protein parts of the crystal structure and the predicted structure. The DNA in both structures nevertheless aligns quite well, showing good structural agreement between the lowest-energy simulated structure and the experimental structure.

**Fig 5. A schematic figure for the Z-dependent contact potential.**
The residues outside of the membrane, where the membrane boundary is indicated by the two colored lines, interact using the globular parameters. The residues inside the membrane interact using the membrane-optimized parameters. If one residue is inside, while another one is outside, the pair interacts as if they both were in water. In the heat maps on the left side of the figure, red color indicates a favorable interaction between the pair of residues indicated on the horizontal and vertical axes, whereas blue color indicates an unfavorable interaction. Separate heat maps are shown for the direct, low-density, and high-density interaction matrices in the water (globular) and membrane environments.
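
The selection rule in this caption can be sketched in a few lines. This is a minimal illustration only: it assumes a membrane centered at z = 0 with a hypothetical half-thickness, uses a hard inside/outside switch for brevity, and covers a single interaction matrix with placeholder values rather than the fitted direct, low-density, and high-density matrices.

```python
def in_membrane(z, half_thickness=15.0):
    """True if a residue at height z lies between the two membrane boundaries.

    The membrane is assumed to be centered at z = 0; the half-thickness
    (in angstroms) is a hypothetical value chosen for illustration.
    """
    return abs(z) < half_thickness


def contact_gamma(res_i, res_j, z_i, z_j, gamma_water, gamma_membrane,
                  half_thickness=15.0):
    """Pick the interaction strength for a residue pair from its environment.

    Membrane parameters apply only when both residues are inside the
    membrane; every other case falls back to the water (globular) table.
    """
    both_inside = (in_membrane(z_i, half_thickness)
                   and in_membrane(z_j, half_thickness))
    table = gamma_membrane if both_inside else gamma_water
    return table[frozenset((res_i, res_j))]


# Placeholder tables keyed by unordered residue-type pairs (not fitted values).
gamma_water = {frozenset(("LEU", "LEU")): -0.5, frozenset(("LEU", "LYS")): 0.2}
gamma_membrane = {frozenset(("LEU", "LEU")): -1.0, frozenset(("LEU", "LYS")): 0.6}

print(contact_gamma("LEU", "LYS", 0.0, 5.0, gamma_water, gamma_membrane))   # both inside
print(contact_gamma("LEU", "LYS", 0.0, 30.0, gamma_water, gamma_membrane))  # mixed pair
```

Keying the tables by `frozenset` makes the lookup independent of residue order, which matches the symmetric heat maps described above.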

**Fig 6. Structure prediction results using the three contact potential schemes evaluated using Q_water (left) and Q_mem (right).**
Q_water measures the structural similarity to the native structure using only the residues that are outside of the membrane, whereas Q_mem measures the structural similarity using only the residues found inside the membrane. The closer the similarity score is to 1.0, the more native-like the prediction. The hybrid potential in general performs better than either the pure globular protein model or the pure membrane model.
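
A subset-restricted similarity score of this kind could be computed as sketched below. The caption does not give the formula, so the sketch assumes the Gaussian-weighted Q commonly used with AWSEM, with a sequence-separation-dependent width of (1 + |i - j|)^0.15 and pairs closer than three residues in sequence excluded. The function name, the normalization by the number of pairs counted, and the toy coordinates are assumptions.

```python
import math


def q_value(coords, native, residues, min_separation=3):
    """Gaussian-weighted similarity to the native structure over a residue subset.

    coords and native are lists of (x, y, z) positions indexed by residue;
    residues selects which indices contribute (e.g. only membrane residues).
    """
    idx = sorted(residues)
    total, n_pairs = 0.0, 0
    for a in range(len(idx)):
        for b in range(a + 1, len(idx)):
            i, j = idx[a], idx[b]
            if j - i < min_separation:
                continue  # skip pairs that are close in sequence
            r = math.dist(coords[i], coords[j])
            r_native = math.dist(native[i], native[j])
            sigma = (1 + (j - i)) ** 0.15  # assumed width, in the same length unit
            total += math.exp(-((r - r_native) ** 2) / (2.0 * sigma ** 2))
            n_pairs += 1
    return total / n_pairs if n_pairs else 0.0


# Toy 8-residue chain; residues 0-3 play the role of the water-exposed part
# and residues 4-7 the membrane-buried part (hypothetical assignment).
native = [(float(i), 0.0, 0.0) for i in range(8)]
model = list(native)
model[7] = (7.0, 4.0, 0.0)  # distort one membrane residue

q_water = q_value(model, native, range(0, 4))
q_mem = q_value(model, native, range(4, 8))
print(q_water, q_mem)
```

Because the distortion is confined to the membrane subset, only the Q_mem analogue drops below 1 in this toy example.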

**Fig 7. Overlay of the native structures and the best Q_water and Q_mem structures using the membrane burial depth dependent contact potential.**
For each protein, the upper panel shows the part of the protein that is buried in the membrane and the lower panel shows the globular domain.

**Fig 8. The fraction of correct location assignments of the residues relative to the membrane using a purely sequence-based method (PureseqTM) and that yielded by running OpenAWSEM simulations (AWSEM).**

**Fig 9. Structure prediction results for six disulfide-rich proteins using various strengths of the saturable disulfide bond interaction.**
We plot the best Q from 20 simulated annealing runs that started from different random velocity seeds for each different value of the disulfide interaction strength. As the strength of the disulfide interactions increases, the best Q increases. 1tcg, 1lmm, 1bpi and 1ppb all have 3 disulfide bonds. 1fs3 has 4 disulfide bonds, and 1hn4 has 7 disulfide bonds. The relatively modest best Q for thrombin (1ppb) probably comes from the fact that we have only modeled the main chain of the molecule, but thrombin also has a short chain that has been experimentally shown to be important for function [42].

**Fig 10. The fractions of correct disulfide bonds in the predictions of several disulfide-rich proteins.**
These fractions are shown for several different strengths of the saturable interaction. At full strength, nearly all the pairs form correctly.
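
One plausible way to compute such a fraction is sketched here. It assumes the fraction is defined as the number of native disulfide pairs present in the prediction divided by the total number of native pairs; the caption does not state the definition, and the residue indices are purely illustrative.

```python
def fraction_correct(predicted_pairs, native_pairs):
    """Fraction of native disulfide pairs that also appear in the prediction.

    Pairs are treated as unordered, so (1, 8) and (8, 1) count as the same bond.
    """
    predicted = {frozenset(p) for p in predicted_pairs}
    native = {frozenset(p) for p in native_pairs}
    return len(predicted & native) / len(native)


# Hypothetical cysteine pairings for a protein with three disulfide bonds.
native_pairs = [(1, 8), (2, 7), (3, 6)]
predicted_pairs = [(8, 1), (2, 6), (3, 7)]  # only the first bond is native

print(fraction_correct(predicted_pairs, native_pairs))
```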

**Fig 11. The formation of disulfide bonds in a single annealing trajectory with k = 5.**
Following the trajectory in time, disulfide pairs are darkened in when they are formed. Red indicates that a native disulfide bond has been formed. Blue indicates that a non-native disulfide bond has formed. The alignment of the best Q structure from this trajectory with the crystal structure is shown in SI. Its Q value is 0.77.

**Fig 12. The average formation of disulfide bonds as a function of time over the 20 annealing runs, with the patterns from the standard AWSEM shown on the left and patterns from the nonadditive disulfide potential runs with k = 5 shown on the right.**
Red indicates that a native disulfide bond has formed. Blue indicates the formation of a non-native disulfide bond. The darker the color, the larger the fraction of the trajectories that form this disulfide bond during this time frame. Even with the full-strength saturable interactions, non-native disulfides occasionally persist after the rapid annealings.

Cited by

OpenABC enables flexible, simplified, and efficient GPU accelerated simulations of biomolecular condensates.
Liu S, Wang C, Latham AP, Ding X, Zhang B. PLoS Comput Biol. 2023 Sep 11;19(9):e1011442. doi: 10.1371/journal.pcbi.1011442. eCollection 2023 Sep. PMID: 37695778.
UNRES-GPU for physics-based coarse-grained simulations of protein systems at biological time- and size-scales.
Ocetkiewicz KM, Czaplewski C, Krawczyk H, Lipska AG, Liwo A, Proficz J, Sieradzan AK, Czarnul P. Bioinformatics. 2023 Jun 1;39(6):btad391. doi: 10.1093/bioinformatics/btad391. PMID: 37338530.
Machines on Genes through the Computational Microscope.
Sinha S, Pindi C, Ahsan M, Arantes PR, Palermo G. J Chem Theory Comput. 2023 Apr 11;19(7):1945-1964. doi: 10.1021/acs.jctc.2c01313. Epub 2023 Mar 22. PMID: 36947696. Review.
A structural dynamics model for how CPEB3 binding to SUMO2 can regulate translational control in dendritic spines.
Gu X, Schafer NP, Bueno C, Lu W, Wolynes PG. PLoS Comput Biol. 2022 Nov 8;18(11):e1010657. doi: 10.1371/journal.pcbi.1010657. eCollection 2022 Nov. PMID: 36346822.
Exploring the folding energy landscapes of heme proteins using a hybrid AWSEM-heme model.
Chen X, Lu W, Tsai MY, Jin S, Wolynes PG. J Biol Phys. 2022 Mar;48(1):37-53. doi: 10.1007/s10867-021-09596-3. Epub 2022 Jan 9. PMID: 35000062.

References

1. Lindorff-Larsen K, Piana S, Dror RO, Shaw DE. How fast-folding proteins fold. Science. 2011;334(6055):517–520. doi: 10.1126/science.1208351
2. Suomivuori CM, Latorraca NR, Wingler LM, Eismann S, King MC, Kleinhenz AL, et al. Molecular mechanism of biased signaling in a prototypical G protein–coupled receptor. Science. 2020;367(6480):881–887. doi: 10.1126/science.aaz0326
3. Kauzmann W. Some factors in the interpretation of protein denaturation. In: Advances in protein chemistry. vol. 14. Elsevier; 1959. p. 1–63.
4. Papoian GA, Ulander J, Eastwood MP, Luthey-Schulten Z, Wolynes PG. Water in protein structure prediction. Proceedings of the National Academy of Sciences. 2004;101(10):3352–3357. doi: 10.1073/pnas.0307851100
5. Papoian GA, Ulander J, Wolynes PG. Role of water mediated interactions in protein-protein recognition landscapes. Journal of the American Chemical Society. 2003;125(30):9170–9178. doi: 10.1021/ja034729u
