Review
FEBS J. 2008 Jan;275(1):1-21.
doi: 10.1111/j.1742-4658.2007.06178.x. Epub 2007 Nov 23.

Protein crystallography for non-crystallographers, or how to get the best (but not more) from published macromolecular structures

Alexander Wlodawer et al. FEBS J. 2008 Jan.

Abstract

The number of macromolecular structures deposited in the Protein Data Bank now exceeds 45,000, with the vast majority determined using crystallographic methods. Thousands of studies describing such structures have been published in the scientific literature, and 14 Nobel prizes in chemistry or medicine have been awarded to protein crystallographers. As important as these structures are for understanding the processes that take place in living organisms and also for practical applications such as drug design, many non-crystallographers still have problems with critical evaluation of the structural literature data. This review attempts to provide a brief outline of technical aspects of crystallography and to explain the meaning of some parameters that should be evaluated by users of macromolecular structures in order to interpret, but not over-interpret, the information present in the coordinate files and in their description. The review also discusses how much information can be gleaned from the coordinates of structures solved at different resolutions, as well as the problems and pitfalls encountered in structure determination and interpretation.

Figures

Fig. 1
Crystal structure of the enzyme frankensteinase. (A) A stereoview showing a tracing of the protein chain in the common rainbow colors (slowly changing from blue N-terminus to red C-terminus). Active site residues are in ball-and-stick rendering, the Mg2+ ion is shown as a gray ball, and water molecules as red spheres. Frankensteinase was conceived and refined with COOT [22] and drawn with PYMOL [21]. (B) Detail of the Mg2+ binding site. Atoms are shown in ball-and-stick rendering, with carbon atoms colored green, oxygen red, and nitrogen blue. A few problems with this structure need to be emphasized. (a) No such protein has ever existed or is likely to exist in the future. (b) The coordinates were freely taken from several real proteins, but were assembled in a way that would satisfy only M. C. Escher. (c) An ‘active site’ consisting of the side chains of phenylalanine, leucine, and valine is rather unlikely to have catalytic properties. (d) Identification of a metal ion that is not properly coordinated by any part of the protein is rather doubtful. (e) The distances between the ion and the coordinating atoms are shown with four decimal digit precision, vastly exceeding their accuracy. Besides, the ‘bond’ distances are entirely unacceptable for magnesium. PDB accession code: For obvious reasons the model of frankensteinase was not deposited in the PDB. It can be obtained upon request from the corresponding author.
Fig. 2
Stereoviews of electron-density maps. The final atomic model of a fragment of the DraD invasin (PDB code 2axw) [79] is superimposed on the maps. (A) The 1.75 Å resolution map calculated with Fobs amplitudes and initially estimated phases, contoured at the 1.5σ level. This map was used to construct the first model of the protein molecule. (B) The 1.0 Å resolution map calculated with Fobs amplitudes and the phases obtained upon completion of the refinement, contoured at 1.7σ. The final map shows the complete fragment of the chain with considerably better detail, since it was calculated at much higher resolution (using over five times more reflections) and with very accurate phases.
Fig. 3
The appearance of electron density as a function of the resolution of the experimental data. The N-terminal fragment (Lys1–Val2–Phe3) of triclinic lysozyme (PDB code 2vb1) [80] with the (Fobs, φcalc) maps calculated with different resolution cut-offs. Whereas at the highest resolution of 0.65 Å there were 184 676 reflections used for map calculation, at 5 Å resolution only 415 reflections were included.
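The reflection counts quoted in the Fig. 3 caption can be checked with a rough estimate. The sketch below assumes that the number of unique reflections grows with the volume of the reciprocal-space sphere, i.e. roughly as 1/d³ for a resolution limit d; the function name `reflection_ratio` is hypothetical and introduced only for illustration.

```python
def reflection_ratio(d_high: float, d_low: float) -> float:
    """Approximate ratio of reflection counts at two resolution limits (in Å).

    Assumption: the count scales with the volume of the reciprocal-space
    sphere, so it is proportional to 1 / d**3.
    """
    return (d_low / d_high) ** 3


# Counts quoted in the caption for 0.65 Å and 5 Å resolution.
observed = 184_676 / 415
# Estimate from the 1/d**3 scaling assumption.
predicted = reflection_ratio(0.65, 5.0)

print(f"observed ratio:  {observed:.0f}")
print(f"predicted ratio: {predicted:.0f}")
```

The two ratios agree to within a few percent, which is about as well as such a simple estimate can be expected to perform.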
Fig. 4
Electron density for a region with static disorder. The model and the corresponding (Fobs, φcalc) map for ArgA63 in the structure of DraD invasin (PDB code 2axw) [79], with its side chain in two conformations. The map was calculated at 1.0 Å resolution and displayed at the 1.7σ contour level.
Fig. 5
Criteria for assessment of the quality of crystallographic models of macromolecular structures. For the resolution and R criteria, the more ‘green’ (i.e. lower) the value, the better. With Rfree − R and rmsd from ideality the situation is different because there is some optimal value and drastic departures in both directions also set a red flag, although for different reasons. When the difference between Rfree and R exceeds 7%, it indicates possible over-interpretation of the experimental data. But if it is very low (say below 2%), it strongly suggests that the test data set is not truly ‘free’, for example, because the structure is pseudosymmetric or, even worse, because the test reflections have been compromised in a round of refinement or were not properly transferred from one data set to another. When rmsd(bonds) is very high, it is an obvious signal of model errors. However, when it is very low (e.g. 0.004 Å), it indicates that through too tight restraints the model underwent geometry optimization, rather than refinement driven by the experimental diffraction data. There are different opinions about how rigorous the stereochemical restraints should be. However, because the ‘ideal’ bond lengths themselves suffer from errors on the order of 0.02 Å, it is reasonable to require the model to adhere to them also only at this level.
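The red flags described in the Fig. 5 caption can be written down as a small checklist. This is only a sketch: the 7% and 2% limits and the 0.004 Å example come from the caption, whereas the cut-off for a "very high" rmsd(bonds) is not stated there, so the value of `RMSD_BONDS_HIGH` is an assumption, as is the function name `assess_model`. R factors are passed as fractions (0.20 for 20%).

```python
GAP_HIGH = 0.07          # Rfree - R above 7% (from the caption)
GAP_LOW = 0.02           # Rfree - R below 2% (from the caption)
RMSD_BONDS_LOW = 0.004   # Å, the caption's example of a "very low" value
RMSD_BONDS_HIGH = 0.03   # Å, ASSUMED cut-off; the caption only says "very high"


def assess_model(r: float, r_free: float, rmsd_bonds: float) -> list[str]:
    """Return the warnings that apply to a model's R, Rfree and rmsd(bonds)."""
    flags = []
    gap = r_free - r
    if gap > GAP_HIGH:
        flags.append("Rfree - R above 7%: possible over-interpretation of the data")
    elif gap < GAP_LOW:
        flags.append("Rfree - R below 2%: test set may not be truly free")
    if rmsd_bonds <= RMSD_BONDS_LOW:
        flags.append("rmsd(bonds) very low: restraints may be too tight")
    elif rmsd_bonds > RMSD_BONDS_HIGH:
        flags.append("rmsd(bonds) very high: likely model errors")
    return flags


# Hypothetical values, for illustration only.
for warning in assess_model(r=0.15, r_free=0.25, rmsd_bonds=0.004):
    print(warning)
```

An empty list means only that none of these particular flags was raised, not that the model is free of problems.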
Fig. 6
Schematic representation of a fragment of the protein backbone chain with definition of torsion angles φ, ψ and ω for the ith residue. These angles have a reference value of 0° in the eclipsed conformation, but as presented in the figure they are all equal to 180°.
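A torsion angle such as φ, ψ or ω is defined by four consecutive atoms, and can be computed from their coordinates. The following is a minimal sketch in plain Python, using the common atan2 formulation, with 0° for the eclipsed arrangement and 180° for the fully extended one, as in the caption. The helper names and the test coordinates are hypothetical and are not taken from any deposited structure.

```python
import math

Vec = tuple[float, float, float]


def _sub(a: Vec, b: Vec) -> Vec:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a: Vec, b: Vec) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a: Vec, b: Vec) -> Vec:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def torsion_angle(p1: Vec, p2: Vec, p3: Vec, p4: Vec) -> float:
    """Torsion angle in degrees about the p2-p3 bond, in the range -180..180."""
    b1, b2, b3 = _sub(p2, p1), _sub(p3, p2), _sub(p4, p3)
    n1 = _cross(b1, b2)  # normal to the plane of p1, p2, p3
    n2 = _cross(b2, b3)  # normal to the plane of p2, p3, p4
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Hypothetical coordinates: the first three atoms are fixed, the fourth varies.
a, b, c = (1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)
print(torsion_angle(a, b, c, (1.0, 0.0, 1.0)))   # eclipsed arrangement
print(torsion_angle(a, b, c, (-1.0, 0.0, 1.0)))  # fully extended arrangement
```

For a real backbone, φ of residue i would use the atoms C(i−1), N(i), Cα(i) and C(i), ψ the atoms N(i), Cα(i), C(i) and N(i+1), and ω the atoms Cα(i), C(i), N(i+1) and Cα(i+1).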
Fig. 7
Two examples of a Ramachandran diagram. (A) Plot for Erwinia chrysanthemi L-asparaginase, one of the largest structures solved to date at atomic resolution (PDB code 1o7j). (B) Plot for the 2.26 Å structure of the C3b complement protein (PDB code 2hr0) characterized by a very large number of main-chain dihedral angles outside of the allowed region, a vast majority of them originating from a single polypeptide chain.
Fig. 8
Interpretation of the location of hydrogen atoms. (A) Assignment of hydrogen atoms based on the pattern of carboxylate C–O bond lengths of the residues in the active site of sedolisin refined at 1.0 Å resolution (PDB code 1ga6) [81]. The bond-length errors are ~ 0.02 Å, therefore the differences between the C–O bonds within the carboxylic groups are not decisive, but are strongly suggestive of the protonation state of the Glu and Asp residues, especially as they form an internally consistent pattern. (B) Hydrogen-omit map for the Thr51 residue in the model of triclinic lysozyme refined at 0.65 Å resolution (PDB code 2vb1) [80], contoured at the 3σ level. Hydrogen atoms are colored gray. Evidently, at this threonine the methyl and hydroxyl groups do not rotate freely, but adopt stable conformations, due to their interactions with neighboring residues in the crystal.

References

    1. Kendrew JC, Bodo G, Dintzis HM, Parrish RG, Wyckoff H, Phillips DC. A three-dimensional model of the myoglobin molecule obtained by X-ray analysis. Nature. 1958;181:662–666.
    2. Bernstein FC, Koetzle TF, Williams GJB, Meyer EF Jr, Brice MD, Rogers JR, Kennard O, Shimanouchi T, Tasumi M. The Protein Data Bank: a computer-based archival file for macromolecular structures. J Mol Biol. 1977;112:535–547.
    3. Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN, Bourne PE. The Protein Data Bank. Nucleic Acids Res. 2000;28:235–242.
    4. Levitt M. Growth of novel protein structural data. Proc Natl Acad Sci USA. 2007;104:3183–3188.
    5. Brown EN, Ramaswamy S. Quality of protein crystal structures. Acta Crystallogr D Biol Crystallogr. 2007;63:941–950.
