Structure. 2010 Mar 14;18(4):423-35.

doi: 10.1016/j.str.2010.01.012.

Dynameomics: a comprehensive database of protein dynamics

Marc W van der Kamp¹, R Dustin Schaeffer, Amanda L Jonsson, Alexander D Scouras, Andrew M Simms, Rudesh D Toofanny, Noah C Benson, Peter C Anderson, Eric D Merkley, Steven Rysavy, Dennis Bromley, David A C Beck, Valerie Daggett

PMID: 20399180
PMCID: PMC2892689
DOI: 10.1016/j.str.2010.01.012

Marc W van der Kamp et al. Structure. 2010.

Affiliation

¹ Department of Bioengineering, University of Washington, Seattle, WA 98195-5013, USA.

Abstract

The dynamic behavior of proteins is important for an understanding of their function and folding. We have performed molecular dynamics simulations of the native state and unfolding pathways of over 2000 protein/peptide systems (approximately 11,000 independent simulations) representing the majority of folds in globular proteins. These data are stored and organized using an innovative database approach, which can be mined to obtain both general and specific information about the dynamics and folding/unfolding of proteins, relevant subsets thereof, and individual proteins. Here we describe the project in general terms and the type of information contained in the database. Then we provide examples of mining the database for information relevant to protein folding, structure building, the effect of single-nucleotide polymorphisms, and drug design. The native state simulation data and corresponding analyses for the 100 most populated metafolds, together with related resources, are publicly accessible through http://www.dynameomics.org.

Figures

**Figure 1**
Consensus domain dictionary generation and target selection. (A) Domain dictionaries partition structures from the Protein Data Bank into domains and folds. The current versions of the SCOP, CATH and DALI domain dictionaries include about 30,000 structures. We find consensus between the domains within these separate domain dictionaries to generate consensus domains. These consensus domains are filtered by sequence to generate a non-redundant consensus domain dictionary. These non-redundant domains are then clustered into metafolds and ranked by population. A single fold representative is selected from each metafold. This representative is either suitable for simulation and becomes part of our release set, or is judged unsuitable for simulation and is rejected. Simulation and analysis data of fold representatives from the Top 100 most populated folds in our release set are publicly available on our website. (B) Three examples of fold representatives (in red) that were rejected because they are not truly autonomous domains. Left, cathepsin D from PDB 1LYA; Middle, chain 4 in the human poliovirus 1, from PDB 1AL2; Right, delta crystallin I from PDB 1I0A.
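
The ranking step in panel A (metafolds ordered by population, with the most populated ones forming the release set) can be sketched as below. This is only an illustration under stated assumptions, not the authors' pipeline: the function name `rank_metafolds`, the domain and metafold identifiers, and the alphabetical tie-break are all hypothetical, and the clustering of domains into metafolds is assumed to have been done already.

```python
from collections import Counter

def rank_metafolds(domain_to_metafold):
    """Return (metafold, population) pairs, most populated first.

    `domain_to_metafold` maps a non-redundant consensus domain id to the
    metafold it was clustered into (hypothetical input format).
    """
    counts = Counter(domain_to_metafold.values())
    # Sort by descending population; break ties by name so the order is stable.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

# Toy assignment of six domains to three metafolds (made-up identifiers).
domains = {
    "d1": "foldA", "d2": "foldB", "d3": "foldA",
    "d4": "foldC", "d5": "foldA", "d6": "foldB",
}
ranked = rank_metafolds(domains)
top2 = [fold for fold, _ in ranked[:2]]  # analogous to taking the "Top 100"
print(ranked, top2)
```

Selecting one representative per metafold and rejecting non-autonomous domains (panel B) involves manual judgment in the source and is not modelled here.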

**Figure 2**
Overview of the generation and use of our simulation data, organized through the Dynameomics database. After selection of protein targets to cover the majority of known protein folds (Figure 1) as well as to examine the influence of single residue mutations (Figure 5), structures obtained from the PDB are prepared for simulation by addition of hydrogens, minimization and solvation, according to our standard protocols (see text). Simulations are then run to capture both native state and unfolding dynamics. The simulation trajectories are stored in the database, together with a set of standard analyses (which can be viewed online). The simulations of native state dynamics are used to define global residue properties and build a fragment library (Figure 3). Specialized mining of the simulations in the database is also performed in-house, to examine further questions on protein dynamics and unfolding (Figure 4), including the use of new techniques such as wavelet and flexibility analysis. We also offer direct access to the native state simulations of the Top 100 fold representatives in our database to external users, who can query these data using SQL Server.

**Figure 3**
Residue and fragment properties obtained from our simulation database. (A) GGAGG simulated at 298K for 100 ns, shown at 1 ns snapshots, aligned on the central Ala. Below are plots of solvent accessible surface area (SASA) and a Ramachandran map of the central Ala. (B) Top (left) and side (right) views of 100 representative sample rotamers from native state simulations for Leu. Below are plots of dihedral angle distributions and rotamer transition frequencies. (C) A 7-residue fragment as stored in the fragment library. Distances between heavy atoms of the terminal residues are used to characterize the fragment (only 4 are shown for clarity). Below, 20 fragments from our simulations matching our initial fragment (in black).
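
The caption for panel C says that a fragment is characterized by distances between heavy atoms of its terminal residues. A minimal sketch of such a descriptor follows. The caption does not say which atom pairs are used, so taking all pairs is an assumption, and the function name `terminal_distance_descriptor`, the input format, and the coordinates are hypothetical.

```python
import math
from itertools import product

def terminal_distance_descriptor(first_res, last_res):
    """Distances between every heavy atom of the first and last residue.

    Each residue is a dict mapping an atom name to an (x, y, z) tuple in
    angstroms (hypothetical input format). Returns a dict keyed by
    (atom_in_first, atom_in_last).
    """
    return {
        (a, b): math.dist(first_res[a], last_res[b])
        for a, b in product(sorted(first_res), sorted(last_res))
    }

# Made-up coordinates, chosen only so the distances are easy to check.
first = {"N": (0.0, 0.0, 0.0), "CA": (1.0, 0.0, 0.0)}
last = {"C": (3.0, 4.0, 0.0)}
desc = terminal_distance_descriptor(first, last)
print(desc)
```

Two fragments could then be compared by the differences between corresponding entries of their descriptors, which is one plausible way to retrieve matches like those shown in the panel.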

**Figure 4**
Mining of the collected native state and unfolding simulations. (A) Flexibility and early unfolding trajectory of the DNA-binding domain of human telomeric protein, HTRF1. Flexibility vectors are shown as arrows scaled to four times the standard deviation of the atom’s motion along its principal axis for better visualization. The minimized crystal structure is shown with arrows indicating the direction of the trend in flexibility of three of the most flexible regions of the protein. (B) Projections of the unfolding of the protein Im7 in property space: A typical two-property reaction coordinate using radius of gyration and fraction of native contacts, Q (1st plot); Projections of 4 global protein properties in three-dimensional space (radius of gyration, number of native contacts, main chain SASA and % Helix) (2nd plot); Two-dimensional projections from PCA on 15 properties from the unfolding of Im7. Native (N), intermediate (I) and denatured (D) state ensembles are indicated (3rd plot); One-dimensional reaction coordinate for the unfolding of Im7 derived from a histogram of the mean distances to the reference ensemble for every unfolding structure (4th plot); Representative structures for each ensemble are shown (5th plot). (C) Structural mining of the states along the unfolding pathway. Left, Ramachandran distributions for all of the residues in the native, transition and denatured states. (Φ,Ψ) space is divided into 72 bins of 5°, colored by fractional population on a nonlinear scale. Middle, the percentage of time spent in α-helix, β-sheet and P_II conformations and right, the fraction of contacts present over 183 proteins are shown for the native (blue), transition (green) and denatured (red) states. The fraction of contacts is obtained by normalizing the average number of contacts in each ensemble by the maximum number of contacts of each type (nat, nonnat, and total). Nonnat. stands for nonnative (i.e., not present in the starting structure).
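
The Ramachandran binning in panel C can be illustrated with a short sketch. It assumes that "72 bins of 5°" means 72 bins along each of the Φ and Ψ axes (360° / 5°), which is my reading of the caption. The function name `ramachandran_histogram` and the sample angles are hypothetical, and the nonlinear colour scale is not reproduced.

```python
def ramachandran_histogram(phi_psi, bin_width=5.0):
    """Fractional population of (phi, psi) pairs on a square grid.

    Angles are in degrees in [-180, 180]. With bin_width=5 the grid has
    72 x 72 cells. Returns a list of rows indexed as hist[phi_bin][psi_bin].
    """
    n_bins = int(360 / bin_width)
    counts = [[0] * n_bins for _ in range(n_bins)]
    for phi, psi in phi_psi:
        # Shift to [0, 360) so that +180 wraps onto the same bin as -180.
        i = int(((phi + 180.0) % 360.0) // bin_width)
        j = int(((psi + 180.0) % 360.0) // bin_width)
        counts[i][j] += 1
    total = len(phi_psi)
    if total == 0:
        return counts
    return [[c / total for c in row] for row in counts]

# Two helix-like samples and one sheet-like sample (made-up values).
samples = [(-60.0, -45.0), (-60.0, -45.0), (-120.0, 130.0)]
hist = ramachandran_histogram(samples)
print(len(hist), len(hist[0]), hist[24][27], hist[12][62])
```

Running the same function separately on native, transition and denatured ensembles would give three grids that can be compared cell by cell.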

**Figure 5**
SNPs and their structural effects on proteins. (A) Structures of human catechol O-methyltransferase. Crystal structures of the wild-type (left) and the V108M polymorph (center) show little structural difference. MD simulations, however, reveal major structural distortion in the V108M polymorph (right). Residue 108 is represented by orange spheres, and residues that bind to S-adenosylmethionine and substrate are represented by green and red spheres, respectively. Note the movement of α6 and α7 and the associated disruptions to the active site. (B) Surface cleft (boxed) created by the V143A polymorph (t = 30 ns) in p53. The cavity volume is ~800 Å³ and may serve as a good binding site for molecules to rescue p53 function. A ligand (colored magenta) with K_d = 160 nM as predicted by docking (using AutoDock, Morris et al., 1998) is shown bound in the cleft (right). (C) Interrupted DNA contacts induced by the V143A p53 polymorph (t = 30 ns). Side chains with completely interrupted DNA contacts are labeled. The ligand depicted in part B is colored magenta (right).
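
To put the predicted K_d of 160 nM in panel B on an energy scale, the standard relation ΔG = RT ln(K_d) can be applied. The sketch below is a worked conversion only; the temperature of 298.15 K and the 1 M standard state are assumptions not stated in the caption, and the function name `binding_free_energy` is hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def binding_free_energy(kd_molar, temperature_k=298.15):
    """Binding free energy in kcal/mol from a dissociation constant.

    Uses Delta G = R * T * ln(Kd), with Kd expressed in mol/L relative to a
    1 M standard state. The default temperature is an assumption.
    """
    return R_KCAL * temperature_k * math.log(kd_molar)

dg = binding_free_energy(160e-9)  # 160 nM expressed in mol/L
print(f"{dg:.2f} kcal/mol")
```

Under these assumptions the value comes out at roughly -9.3 kcal/mol. It remains a docking prediction, not a measured affinity.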

Comment in

Making membrane proteins for structures: a trillion tiny tweaks.
Baker M. Nat Methods. 2010 Jun;7(6):429-34. doi: 10.1038/nmeth0610-429. PMID: 20508636.
A database of dynamics.
Doerr A. Nat Methods. 2010 Jun;7(6):426. doi: 10.1038/nmeth0610-426. PMID: 20524218.

Cited by

Tumorigenic p53 mutants undergo common structural disruptions including conversion to α-sheet structure.
Bromley D, Daggett V. Protein Sci. 2020 Sep;29(9):1983-1999. doi: 10.1002/pro.3921. Epub 2020 Aug 17. PMID: 32715544.
Shared unfolding pathways of unrelated immunoglobulin-like β-sandwich proteins.
Toofanny RD, Calhoun S, Jonsson AL, Daggett V. Protein Eng Des Sel. 2019 Dec 31;32(7):331-345. doi: 10.1093/protein/gzz040. PMID: 31868211.
Remote thioredoxin recognition using evolutionary conservation and structural dynamics.
Tang GW, Altman RB. Structure. 2011 Apr 13;19(4):461-70. doi: 10.1016/j.str.2011.02.007. PMID: 21481770.
The role of α-sheet structure in amyloidogenesis: characterization and implications.
Prosswimmer T, Daggett V. Open Biol. 2022 Nov;12(11):220261. doi: 10.1098/rsob.220261. Epub 2022 Nov 23. PMID: 36416010. Review.
Designed α-sheet peptides inhibit amyloid formation by targeting toxic oligomers.
Hopping G, Kellock J, Barnwal RP, Law P, Bryers J, Varani G, Caughey B, Daggett V. Elife. 2014 Jul 15;3:e01681. doi: 10.7554/eLife.01681. PMID: 25027691.

