Mol Cell Proteomics. 2016 Jul;15(7):2491-500.
doi: 10.1074/mcp.M116.058560. Epub 2016 May 5.

The Importance of Non-accessible Crosslinks and Solvent Accessible Surface Distance in Modeling Proteins with Restraints From Crosslinking Mass Spectrometry

Joshua Matthew Allen Bullock et al. Mol Cell Proteomics. 2016 Jul.

Abstract

Crosslinking mass spectrometry (XL-MS) is becoming an increasingly popular technique for modeling protein monomers and complexes. The distance restraints garnered from these experiments can be used alone or as part of an integrative modeling approach, incorporating data from many sources. However, modeling practices are varied and the difference in their usefulness is not clear. Here, we develop a new scoring procedure for models based on crosslink data, the Matched and Nonaccessible Crosslink score (MNXL). We compare its performance with that of other commonly used scoring functions (Number of Violations and Sum of Violation Distances) on a benchmark of 14 protein domains, each with 300 corresponding models (at various levels of quality) and associated, previously published, experimental crosslinks (XLdb). The distances between crosslinked lysines are calculated either as Euclidean distances or as Solvent Accessible Surface Distances (SASD) using a newly developed method (Jwalk). MNXL takes into account whether a crosslink is nonaccessible, i.e., whether an experimentally observed crosslink has no corresponding SASD in a model because of buried lysines. This metric alone is shown to have a significant impact on modeling performance, and it is a concept that is not considered at present if only Euclidean distances are used. Additionally, a comparison between modeling with SASD and modeling with Euclidean distance shows that SASD is superior, even when factoring out the effect of the nonaccessible crosslinks. Our benchmarking also shows that MNXL outperforms the other tested scoring functions in terms of precision and correlation to Cα-RMSD from the crystal structure. Finally, we test MNXL at different levels of crosslink recovery (i.e., the percentage of crosslinks experimentally observed out of all theoretical ones) and set a target recovery of ∼20%, after which performance plateaus.
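The two baseline scoring functions named in the abstract can be illustrated with a minimal sketch. It assumes the usual reading of the names: Number of Violations (NoV) counts crosslinks whose modeled distance exceeds a cutoff, and Sum of Violation Distances (SoVD) adds up how far each violating distance exceeds that cutoff. The function names, the 33.0 Å cutoff and the example distances are illustrative assumptions, not values taken from the paper, and the MNXL formula itself is not given in the abstract, so it is not reproduced here. A nonaccessible crosslink (no SASD in the model) is represented as `None`.

```python
from typing import Optional, Sequence

Distance = Optional[float]  # None marks a nonaccessible crosslink (no SASD found)


def number_of_violations(distances: Sequence[Distance], cutoff: float) -> int:
    """Count crosslinks whose modeled distance exceeds the cutoff."""
    return sum(1 for d in distances if d is not None and d > cutoff)


def sum_of_violation_distances(distances: Sequence[Distance], cutoff: float) -> float:
    """Sum the excess over the cutoff for every violating crosslink."""
    return sum(d - cutoff for d in distances if d is not None and d > cutoff)


def count_nonaccessible(distances: Sequence[Distance]) -> int:
    """Count observed crosslinks that have no accessible path in the model."""
    return sum(1 for d in distances if d is None)


# Hypothetical distances (in angstroms) for four observed crosslinks in one model.
CUTOFF = 33.0  # assumed cutoff, for illustration only
model_distances = [12.0, 35.5, None, 40.0]

nov = number_of_violations(model_distances, CUTOFF)
sovd = sum_of_violation_distances(model_distances, CUTOFF)
nonaccessible = count_nonaccessible(model_distances)
print(nov, sovd, nonaccessible)
```

Note that a purely Euclidean calculation always returns a distance, so the `None` case never arises there; this is why the nonaccessible count only becomes available once SASD is used.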

Figures

Fig. 1.
Overview of the three main phases of the Jwalk algorithm. A, placement of the protein(s) onto a grid; B, calculation of the solvent accessible surface by expanding a sphere around each atom; and C, a Breadth-First Search of the grid to calculate the shortest possible SASD between all residues of interest. The search is initiated from one of the surface lysines (blue square), with potential paths (black lines) searching the whole grid. Paths between the starting lysine and target lysines (green squares) are retained and output as a .pdb file (yellow paths). (Location: Experimental Procedures - Jwalk.)
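Phase C of the caption above, a breadth-first search over a grid, can be sketched as follows. This is a simplified illustration and not the Jwalk implementation: it assumes a small voxel grid in which occupied (protein) voxels are given as a set of blocked coordinates, it moves only between the six face-sharing neighbours, and it reports path length as the step count times an assumed grid spacing. The function name `bfs_grid_distances` and all parameters are hypothetical.

```python
from collections import deque


def bfs_grid_distances(blocked, shape, start, targets, spacing=1.0):
    """Shortest grid-path length from start to each target, avoiding blocked voxels.

    Returns a dict mapping each target to a distance, or None if unreachable.
    """
    nx, ny, nz = shape
    steps = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    dist = {start: 0}              # voxel -> number of steps from start
    queue = deque([start])
    remaining = set(targets) - {start}
    while queue and remaining:     # stop early once every target is reached
        x, y, z = queue.popleft()
        for dx, dy, dz in steps:
            nxt = (x + dx, y + dy, z + dz)
            inside = 0 <= nxt[0] < nx and 0 <= nxt[1] < ny and 0 <= nxt[2] < nz
            if not inside or nxt in blocked or nxt in dist:
                continue
            dist[nxt] = dist[(x, y, z)] + 1
            remaining.discard(nxt)
            queue.append(nxt)
    return {t: (dist[t] * spacing if t in dist else None) for t in targets}


# A 5 x 5 x 1 slab with a wall at x == 2 that leaves a gap only at y == 4.
wall = {(2, y, 0) for y in range(4)}
result = bfs_grid_distances(
    blocked=wall,
    shape=(5, 5, 1),
    start=(0, 0, 0),
    targets=[(4, 0, 0), (2, 0, 0)],
)
print(result)
```

In this toy grid the straight-line separation between `(0, 0, 0)` and `(4, 0, 0)` is 4 grid units, but the path around the wall is longer, and the target that sits inside the wall has no path at all, which mirrors the idea of a nonaccessible crosslink.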
Fig. 2.
A, Plots showing the correlation between (i) MNXL, (ii) NoV and (iii) SoVD, and Cα-RMSD for three members of the benchmark, with PDBid: 1BLF (residues 5–333), 1JM7:A and 4FGF, and their corresponding comparative models. The ideal trend would stretch from the bottom left corner to the top right (see corresponding correlation values in Table I); B, The experimentally determined structures with SASDs mapped on (green) as well as additional highlighting (1JM7:A: the blue structure highlights the structural deviations that are possible while still satisfying the crosslinking restraints; 4FGF: the majority of the crosslinks come from the red loop spanning residues 110–125). (Location: Results - Filtering homology models using MNXL.)
Fig. 3.
The constituent elements of MNXL. Benchmark members with PDBid: 2D3I (residues 342–686) and 2HGD scored with (i) non-accessible crosslinks only, (ii) SASD NoV and non-accessible crosslinks, and (iii) MNXL. (Location: Results - Filtering comparative models using MNXL.)
Fig. 4.
The effect of modeling using Euclidean distances instead of SASD, highlighted with plots from two members of the benchmark (PDBid: 1HRC and 4F5S:A) scored with (i) SASD NoV and non-accessible crosslinks, (ii) Euclidean NoV and non-accessible crosslinks, and (iii) Euclidean NoV only. (Location: Results - Euclidean Distance versus SASD.)
Fig. 5.
The correlation and precision returned when using different levels of recovery, tested via bootstrapping analysis. The experimental cases from the benchmark are plotted on each graph at their experimental recovery (red circles). The major outliers (4FGF, 1JM7:A and 1HRC) are labeled. (Location: Results - Exploring crosslink coverage.)
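Crosslink recovery is defined in the abstract as the percentage of crosslinks experimentally observed out of all theoretical ones. The sketch below computes that percentage and draws a random subset of theoretical crosslinks at a chosen recovery level. The subsampling step is only an assumed stand-in for the bootstrapping analysis mentioned in the caption, whose exact procedure is not described here; the function names and the residue pairs are hypothetical.

```python
import random


def crosslink_recovery(observed, theoretical):
    """Percentage of theoretical crosslinks that were experimentally observed."""
    theoretical_set = set(theoretical)
    matched = set(observed) & theoretical_set
    return 100.0 * len(matched) / len(theoretical_set)


def subsample_crosslinks(theoretical, recovery_percent, rng):
    """Draw a random subset of theoretical crosslinks at the given recovery level."""
    pool = list(theoretical)
    k = round(len(pool) * recovery_percent / 100.0)
    return rng.sample(pool, k)  # sampling without replacement (an assumption)


# Hypothetical lysine pairs, identified by residue number.
theoretical_pairs = [(i, i + 10) for i in range(1, 11)]  # 10 possible crosslinks
observed_pairs = [(1, 11), (4, 14)]                      # 2 seen experimentally

recovery = crosslink_recovery(observed_pairs, theoretical_pairs)
subset = subsample_crosslinks(theoretical_pairs, 20, random.Random(0))
print(recovery, len(subset))
```

Repeating the subsampling with different random seeds and rescoring the models each time would give a spread of correlation and precision values at each recovery level.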
