Structure. 2010 Mar 14;18(4):423-35.
doi: 10.1016/j.str.2010.01.012.

Dynameomics: a comprehensive database of protein dynamics

Marc W van der Kamp et al. Structure.

Abstract

The dynamic behavior of proteins is important for an understanding of their function and folding. We have performed molecular dynamics simulations of the native state and unfolding pathways of over 2000 protein/peptide systems (approximately 11,000 independent simulations) representing the majority of folds in globular proteins. These data are stored and organized using an innovative database approach, which can be mined to obtain both general and specific information about the dynamics and folding/unfolding of proteins, relevant subsets thereof, and individual proteins. Here we describe the project in general terms and the type of information contained in the database. Then we provide examples of mining the database for information relevant to protein folding, structure building, the effect of single-nucleotide polymorphisms, and drug design. The native state simulation data and corresponding analyses for the 100 most populated metafolds, together with related resources, are publicly accessible through http://www.dynameomics.org.

Figures

Figure 1
Consensus domain dictionary generation and target selection. (A) Domain dictionaries partition structures from the Protein Data Bank into domains and folds. The current versions of the SCOP, CATH and DALI domain dictionaries include about 30,000 structures. We find consensus between the domains within these separate domain dictionaries to generate consensus domains. These consensus domains are filtered by sequence to generate a non-redundant consensus domain dictionary. These non-redundant domains are then clustered into metafolds and ranked by population. A single fold representative is selected from each metafold. This representative is either suitable for simulation and becomes part of our release set, or is judged unsuitable for simulation and is rejected. Simulation and analysis data of fold representatives from the Top 100 most populated folds in our release set are publicly available on our website. (B) Three examples of fold representatives (in red) that were rejected because they are not truly autonomous domains. Left, cathepsin D from PDB 1LYA; Middle, chain 4 in the human poliovirus 1, from PDB 1AL2; Right, delta crystallin I from PDB 1I0A.
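The ranking and selection steps in panel A can be sketched as follows. This is a minimal illustration only: the function names, the (domain_id, metafold_id) input layout, and the `is_suitable` callback are hypothetical, and because the caption does not say how a fold representative is chosen, the sketch simply takes the first listed member of each metafold.

```python
from collections import defaultdict


def rank_metafolds(domains):
    """Group domains by metafold and rank metafolds by population.

    `domains` is an iterable of (domain_id, metafold_id) pairs. Both
    identifiers are hypothetical placeholders, not database fields.
    """
    members = defaultdict(list)
    for domain_id, metafold_id in domains:
        members[metafold_id].append(domain_id)
    # Largest metafold first; ties are broken by metafold id for determinism.
    return sorted(members.items(), key=lambda item: (-len(item[1]), item[0]))


def select_release_set(ranked, is_suitable):
    """Choose one representative per metafold, then accept or reject it.

    Assumption: the first listed member stands in as the representative,
    because the caption does not state the actual selection criterion.
    """
    release, rejected = [], []
    for metafold_id, domain_ids in ranked:
        representative = domain_ids[0]
        target = release if is_suitable(representative) else rejected
        target.append((metafold_id, representative))
    return release, rejected
```

In this sketch a rejected representative is not replaced by another member of the same metafold; the caption leaves that point open.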
Figure 2
Overview of the generation and use of our simulation data, organized through the Dynameomics database. After selection of protein targets to cover the majority of known protein folds (Figure 1) as well as to examine the influence of single residue mutations (Figure 5), structures obtained from the PDB are prepared for simulation by addition of hydrogens, minimization and solvation, according to our standard protocols (see text). Simulations are then run to capture both native state and unfolding dynamics. The simulation trajectories are stored in the database, together with a set of standard analyses (which can be viewed online). The simulations of native state dynamics are used to define global residue properties and build a fragment library (Figure 3). Specialized mining of the simulations in the database is also performed in-house, to examine further questions on protein dynamics and unfolding (Figure 4), including the use of new techniques such as wavelet and flexibility analysis. We also offer external users direct access to the native state simulations of the Top 100 fold representatives in our database, which they can query using SQL Server.
Figure 3
Residue and fragment properties obtained from our simulation database. (A) GGAGG simulated at 298 K for 100 ns, shown as snapshots at 1 ns intervals, aligned on the central Ala. Below are plots of solvent accessible surface area (SASA) and a Ramachandran map of the central Ala. (B) Top (left) and side (right) views of 100 representative sample rotamers from native state simulations for Leu. Below are plots of dihedral angle distributions and rotamer transition frequencies. (C) A 7-residue fragment as stored in the fragment library. Distances between heavy atoms of the terminal residues are used to characterize the fragment (only 4 are shown for clarity). Below, 20 fragments from our simulations matching our initial fragment (in black).
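The fragment descriptor in panel C can be illustrated with a short sketch. It assumes each terminal residue is given as a list of heavy-atom (x, y, z) coordinates in Å. The function names, the tolerance value, and the matching rule are assumptions made for illustration; the caption does not describe how matching fragments are actually identified.

```python
import math


def terminal_distances(first_residue, last_residue):
    """Distances between every heavy atom of the first and last residue.

    Each residue is a list of (x, y, z) coordinates in angstroms.
    """
    return [math.dist(a, b) for a in first_residue for b in last_residue]


def fragments_match(desc_a, desc_b, tolerance=1.0):
    """Assumed criterion: equal length and every distance within `tolerance`.

    The 1.0 angstrom default is an arbitrary illustrative value.
    """
    if len(desc_a) != len(desc_b):
        return False
    return all(abs(x - y) <= tolerance for x, y in zip(desc_a, desc_b))
```

Describing a fragment by the distances between its end residues makes the descriptor independent of the fragment's orientation in space, so fragments can be compared without first superimposing them.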
Figure 4
Mining of the collected native state and unfolding simulations. (A) Flexibility and early unfolding trajectory of the DNA-binding domain of human telomeric protein, HTRF1. Flexibility vectors are shown as arrows scaled to four times the standard deviation of the atom’s motion along its principal axis for better visualization. The minimized crystal structure is shown with arrows indicating the direction of the trend in flexibility of three of the most flexible regions of the protein. (B) Projections of the unfolding of the protein Im7 in property space: a typical two-property reaction coordinate using radius of gyration and fraction of native contacts, Q (1st plot); projections of 4 global protein properties in three-dimensional space (radius of gyration, number of native contacts, main chain SASA and % Helix) (2nd plot); two-dimensional projections from PCA on 15 properties from the unfolding of Im7, with native (N), intermediate (I) and denatured (D) state ensembles indicated (3rd plot); one-dimensional reaction coordinate for the unfolding of Im7 derived from a histogram of the mean distances to the reference ensemble for every unfolding structure (4th plot); representative structures for each ensemble (5th plot). (C) Structural mining of the states along the unfolding pathway. Left, Ramachandran distributions for all of the residues in the native, transition and denatured states. (Φ,Ψ) space is divided into 72 bins of 5° per axis, colored by fractional population on a nonlinear scale. Middle, the percentage of time spent in α-helix, β-sheet and PII conformations and right, the fraction of contacts present over 183 proteins are shown for the native (blue), transition (green) and denatured (red) states. The fraction of contacts is obtained by normalizing the average number of contacts in each ensemble by the maximum number of contacts of each type (nat, nonnat, and total). Nonnat. stands for nonnative (i.e., not present in the starting structure).
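The two calculations described for panel C can be sketched as below. This is an illustration under stated assumptions, not the authors' analysis code: angles are taken to lie in [-180°, 180°], the function names and input layouts are hypothetical, and the maximum used for normalization is assumed to be the largest ensemble average for each contact type, which the caption does not specify.

```python
def ramachandran_histogram(phi_psi, bin_width=5.0):
    """Fractional population of (phi, psi) bins over [-180, 180] degrees.

    With the default 5-degree width each axis has 360 / 5 = 72 bins.
    """
    n_bins = int(round(360.0 / bin_width))
    counts = [[0] * n_bins for _ in range(n_bins)]
    for phi, psi in phi_psi:
        # An angle of exactly +180 is clamped into the last bin.
        i = min(int((phi + 180.0) // bin_width), n_bins - 1)
        j = min(int((psi + 180.0) // bin_width), n_bins - 1)
        counts[i][j] += 1
    total = len(phi_psi)
    if total == 0:
        return counts
    return [[c / total for c in row] for row in counts]


def contact_fractions(avg_contacts):
    """Normalize average contact counts by the per-type maximum.

    `avg_contacts` maps ensemble name -> {contact type: average count}.
    Assumption: the maximum is taken across the ensembles supplied.
    """
    contact_types = {t for per_type in avg_contacts.values() for t in per_type}
    maxima = {
        t: max(per_type.get(t, 0.0) for per_type in avg_contacts.values())
        for t in contact_types
    }
    return {
        ensemble: {
            t: (value / maxima[t] if maxima[t] > 0 else 0.0)
            for t, value in per_type.items()
        }
        for ensemble, per_type in avg_contacts.items()
    }
```

Normalizing by the per-type maximum puts native, nonnative and total contacts on a common 0 to 1 scale, so the three ensembles can be compared in a single plot.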
Figure 5
SNPs and their structural effects on proteins. (A) Structures of human catechol O-methyltransferase. Crystal structures of the wild-type (left) and the V108M polymorph (center) show little structural difference. MD simulations, however, reveal major structural distortion in the V108M polymorph (right). Residue 108 is represented by orange spheres, and residues that bind to S-adenosylmethionine and substrate are represented by green and red spheres, respectively. Note the movement of α6 and α7 and the associated disruptions to the active site. (B) Surface cleft (boxed) created by the V143A polymorph (t = 30 ns) in p53. The cavity volume is ~800 Å³ and may serve as a good binding site for molecules to rescue p53 function. A ligand (colored magenta) with Kd = 160 nM as predicted by docking (using AutoDock, Morris et al., 1998) is shown bound in the cleft (right). (C) Interrupted DNA contacts induced by the V143A p53 polymorph (t = 30 ns). Side chains with completely interrupted DNA contacts are labeled. The ligand depicted in part B is colored magenta (right).
