FEBS J. 2014 Sep;281(18):4046-60.
doi: 10.1111/febs.12922. Epub 2014 Sep 17.

The R-factor gap in macromolecular crystallography: an untapped potential for insights on accurate structures

Free PMC article

James M Holton et al. FEBS J. 2014 Sep.

Abstract

In macromolecular crystallography, the agreement between observed and predicted structure factors (Rcryst and Rfree) is seldom better than 20%. This is much larger than the estimate of experimental error (Rmerge). The difference between Rcryst and Rmerge is the R-factor gap. There is no such gap in small-molecule crystallography, for which calculated structure factors are generally considered more accurate than the experimental measurements. Perhaps the true noise level of macromolecular data is higher than expected? Or is the gap caused by inaccurate phases that trap refined models in local minima? By generating simulated diffraction patterns using the program MLFSOM, and including every conceivable source of experimental error, we show that neither is the case. Processing our simulated data yielded values that were indistinguishable from those of real data for all crystallographic statistics except the final Rcryst and Rfree. These values decreased to 3.8% and 5.5% for simulated data, suggesting that the reason for high R-factors in macromolecular crystallography is neither experimental error nor phase bias, but rather an underlying inadequacy in the models used to explain our observations. The present inability to accurately represent the entire macromolecule with both its flexibility and its protein-solvent interface may be improved by synergies between small-angle X-ray scattering, computational chemistry and crystallography. The exciting implication of our finding is that macromolecular data contain substantial hidden and untapped potential to resolve ambiguities in the true nature of the nanoscale, a task that the second century of crystallography promises to fulfill.
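
For readers unfamiliar with the statistics named above, the following is a minimal sketch of how Rcryst, Rfree and the R-factor gap might be computed. It is not taken from the paper. It uses the conventional definition R = Σ| |Fobs| − |Fcalc| | / Σ|Fobs|, and it assumes the calculated amplitudes are already on the same scale as the observed ones. The function names and the `free_flags` list (marking reflections withheld from refinement) are hypothetical.

```python
from typing import Sequence, Tuple


def r_factor(f_obs: Sequence[float], f_calc: Sequence[float]) -> float:
    """Conventional R-factor between observed and calculated amplitudes.

    Assumes f_calc has already been scaled to f_obs (no scale factor is fitted here).
    """
    if len(f_obs) != len(f_calc) or len(f_obs) == 0:
        raise ValueError("amplitude lists must be non-empty and of equal length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    if denominator == 0:
        raise ValueError("sum of observed amplitudes is zero")
    return numerator / denominator


def r_cryst_and_r_free(
    f_obs: Sequence[float],
    f_calc: Sequence[float],
    free_flags: Sequence[bool],
) -> Tuple[float, float]:
    """Split reflections into working and free sets and return (Rcryst, Rfree).

    free_flags[i] is True when reflection i was withheld from refinement
    (a hypothetical representation of the test-set flag).
    """
    work = [(o, c) for o, c, free in zip(f_obs, f_calc, free_flags) if not free]
    test = [(o, c) for o, c, free in zip(f_obs, f_calc, free_flags) if free]
    r_cryst = r_factor([o for o, _ in work], [c for _, c in work])
    r_free = r_factor([o for o, _ in test], [c for _, c in test])
    return r_cryst, r_free


def r_factor_gap(r_cryst: float, r_merge: float) -> float:
    """The gap as described in the abstract: Rcryst minus Rmerge."""
    return r_cryst - r_merge


# Toy amplitudes, invented purely for illustration (not data from the paper).
toy_f_obs = [100.0, 200.0, 300.0, 400.0]
toy_f_calc = [90.0, 210.0, 300.0, 380.0]
toy_free = [False, False, False, True]
toy_r_cryst, toy_r_free = r_cryst_and_r_free(toy_f_obs, toy_f_calc, toy_free)
```

Rmerge is a different statistic, computed from repeated intensity measurements during data reduction, so it is passed in here as a number rather than derived from the amplitudes.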

Database: Coordinates and structure factors for the real data have been submitted to the Protein Data Bank under accession 4tws.

Keywords: R-factor; R-value; crystallography; simulation; theoretical.


Figures

Fig 1
Example diffraction images for (A) real data collected from Gd-containing lysozyme at Advanced Light Source beamline 8.3.1, and (B) simulated diffraction generated using MLFSOM. (C,D) Magnifications of parts of (A) and (B), respectively.
Fig 2
Schematic representation of the workflow, showing the process used to generate both simulated and real diffraction images, and then process, solve and refine the data. The refinement statistics for the models are detailed in Table 2.
Fig 3
Reduction in R-factors as individual atoms are automatically built into difference maps. The starting coordinate model used to compute Fstart for the simulated data (blue and green) was a single-conformer model of 100% occupied atoms with isotropic B factors. In both cases, the processed data were provided to phenix.autobuild with anomalous differences, the sequence of lysozyme and no other sources of phase information. The resulting model was trimmed of all water atoms and subjected to a simple five-step rebuilding procedure using data truncated to 2.0 Å: (1) refine in REFMAC for 500 cycles, or until atoms and B factors stop moving, (2) add dummy atoms to the five highest peaks in the difference map, (3) assign new atoms the proper atom name if they are within 0.5 Å of an atom in the reference model, (4) remove atoms with B > 100 Å² or that fall on negative difference features < −6σ, and (5) repeat until convergence. Models built into the simulated data converge to very low Rfree values (Rcryst = 4.57% and Rfree = 7.27%), roughly the same magnitude as the Rmerge from the XDS/MOSFLM-processed data, whereas the Rfree for the real data never goes below 20%.
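
Steps (3) and (4) of the rebuilding procedure in the caption above can be illustrated with a short sketch. This is not the authors' script. The `Atom` class, its `diff_sigma` field (the difference-map value at the atom position, in units of σ) and both function names are hypothetical. Refinement and peak picking (steps 1 and 2) are assumed to happen elsewhere, for example in REFMAC.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Atom:
    name: str
    xyz: Tuple[float, float, float]  # Cartesian coordinates in Å
    b_factor: float                  # isotropic B factor in Å²
    diff_sigma: float = 0.0          # difference density at the atom, in σ (hypothetical field)


def assign_names(new_atoms: List[Atom], reference: List[Atom], max_dist: float = 0.5) -> None:
    """Step (3): give a dummy atom the name of the nearest reference atom
    if that atom lies within max_dist Å; otherwise leave the name unchanged."""
    for atom in new_atoms:
        if not reference:
            continue
        nearest = min(reference, key=lambda ref: math.dist(atom.xyz, ref.xyz))
        if math.dist(atom.xyz, nearest.xyz) <= max_dist:
            atom.name = nearest.name


def prune(atoms: List[Atom], max_b: float = 100.0, min_diff_sigma: float = -6.0) -> List[Atom]:
    """Step (4): drop atoms with B above max_b or sitting on a negative
    difference feature below min_diff_sigma."""
    return [a for a in atoms if a.b_factor <= max_b and a.diff_sigma >= min_diff_sigma]


# Toy coordinates, invented purely for illustration.
toy_reference = [Atom("OW", (0.0, 0.0, 0.0), 20.0)]
toy_new = [Atom("DUM", (0.3, 0.0, 0.0), 30.0), Atom("DUM", (2.0, 0.0, 0.0), 30.0)]
assign_names(toy_new, toy_reference)
toy_kept = prune(
    toy_new + [Atom("DUM", (5.0, 5.0, 5.0), 120.0), Atom("DUM", (6.0, 6.0, 6.0), 40.0, -7.0)]
)
```

In a full loop, these two functions would run after each refinement round, and the cycle would repeat until the model stops changing, as in step (5).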
Fig 4
Maximum residual peak height in the difference maps at each step in the autobuild procedure. The peak height of the next difference feature of the simulated data steadily increases with building cycle, behavior that is not seen with real macromolecular data.
