Acta Crystallogr D Struct Biol. 2018 Jul 1;74(Pt 7):595-605.
doi: 10.1107/S2059798318005752. Epub 2018 Jun 8.

SIMBAD: a sequence-independent molecular-replacement pipeline

Adam J Simpkin et al. Acta Crystallogr D Struct Biol.

Abstract

The conventional approach to finding structurally similar search models for use in molecular replacement (MR) is to use the sequence of the target to search against those of a set of known structures. Sequence similarity often correlates with structure similarity. Given sufficient similarity, a known structure correctly positioned in the target cell by the MR process can provide an approximation to the unknown phases of the target. An alternative approach to identifying homologous structures suitable for MR is to exploit the measured data directly, comparing the lattice parameters or the experimentally derived structure-factor amplitudes with those of known structures. Here, SIMBAD, a new sequence-independent MR pipeline which implements these approaches, is presented. SIMBAD can identify cases of contaminant crystallization and other mishaps such as mistaken identity (swapped crystallization trays), as well as solving unsequenced targets and providing a brute-force approach where sequence-dependent search-model identification may be nontrivial, for example because of conformational diversity among identifiable homologues. The program implements a three-step pipeline to efficiently identify a suitable search model in a database of known structures. The first step performs a lattice-parameter search against the entire Protein Data Bank (PDB), rapidly determining whether or not a homologue exists in the same crystal form. The second step is designed to screen the target data for the presence of a crystallized contaminant, a not uncommon occurrence in macromolecular crystallography. Solving structures with MR in such cases can remain problematic for many years, since the search models, which are assumed to be similar to the structure of interest, are not necessarily related to the structures that have actually crystallized. To cater for this eventuality, SIMBAD rapidly screens the data against a database of known contaminant structures. 
Where the first two steps fail to yield a solution, a final step in SIMBAD can be invoked to perform a brute-force search of a nonredundant PDB database provided by the MoRDa MR software. Through early-access usage of SIMBAD, this approach has solved novel cases that have otherwise proved difficult to solve.

Keywords: SIMBAD; contaminant; lattice search; molecular replacement pipeline; structure solution.
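
The three-step decision flow described in the abstract could be sketched roughly as follows. This is an illustrative outline only, not SIMBAD's actual code or API: `run_pipeline`, the `Result` type and the three stub search functions are hypothetical stand-ins, and each stub simply returns a made-up model identifier or `None`.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple


@dataclass
class Result:
    """Hypothetical outcome: which stage succeeded and with which model."""
    stage: str
    model_id: str


# A stage takes the target data and returns a model id, or None on failure.
Stage = Callable[[dict], Optional[str]]


def run_pipeline(target: dict, stages: Sequence[Tuple[str, Stage]]) -> Optional[Result]:
    """Try the stages in order and stop at the first one that yields a solution."""
    for name, search in stages:
        model_id = search(target)
        if model_id is not None:
            return Result(stage=name, model_id=model_id)
    return None


# Stub stages for illustration (assumptions, not real searches).
def lattice_search(target: dict) -> Optional[str]:
    return None  # pretend no structure in the same crystal form was found


def contaminant_search(target: dict) -> Optional[str]:
    return "contaminant_A"  # pretend a known contaminant matched


def brute_force_search(target: dict) -> Optional[str]:
    return "model_X"  # only reached if the earlier stages fail


result = run_pipeline(
    {"name": "example target"},
    [
        ("lattice", lattice_search),
        ("contaminant", contaminant_search),
        ("brute_force", brute_force_search),
    ],
)
print(result)
```

Ordering the stages from cheapest to most expensive reflects the abstract's description: the brute-force search runs only when the lattice-parameter and contaminant searches have both failed.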

Figures

Figure 1
Flowchart detailing the decision processes in the SIMBAD pipeline. The Full MR step in each case refers to performing a complete MR procedure (rotation and translation search) using the best-ranked models from the initial search (lattice-parameter, contaminant or MoRDa DB).
Figure 2
Logistic regression results showing the likelihood that a penalty score would result in successful MR. The purple line describing the distribution was fitted using a sigmoid model. The coefficient and intercept were determined by the ‘LogisticRegression’ module in sklearn (http://www.scikit-learn.org). (a) The scatter points represent the 2009 raw data points, where the x value corresponds to the total penalty score and the y value is set to 1 or 0 to indicate success or failure in MR. (b) The histogram represents the proportion of success/failure for bin sizes of 1. The figure has been truncated to show the results up to a penalty score of 13; however, the sigmoid model was calculated from data sets with penalty scores of up to 26.
Figure 3
Structural alignment of the C-terminal DNA-binding domains of the apo D138L CAP mutant (PDB entry 3fwe) chain B (pink) and apo wild-type CAP (PDB entry 3hif) chain B (purple), highlighting the conformational change.
Figure 4
Cartoon representation of the E. coli DPS dodecamer, with protomers identified by colour.
Figure 5
Cartoon representation of the S. proteamaculans cyanase decamer, with protomers identified by colour.
Figure 6
(a) SDS–PAGE of the protein sample employed for crystallogenesis. Molecular-mass markers are labelled in kDa. (b) Cartoon representation of the E. coli catalase HPII tetramer, with protomers identified by colour.
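
The sigmoid model mentioned in the Figure 2 caption maps a total penalty score to an estimated probability of MR success. A minimal sketch of evaluating such a model is given below, assuming the standard logistic form used for binary logistic regression. The coefficient and intercept are placeholders chosen for illustration, not the fitted values from the paper, and the negative coefficient encodes the assumption that a larger penalty makes success less likely.

```python
import math


def success_probability(penalty: float, coef: float, intercept: float) -> float:
    """Logistic model: p = 1 / (1 + exp(-(coef * penalty + intercept)))."""
    return 1.0 / (1.0 + math.exp(-(coef * penalty + intercept)))


# Placeholder parameters (NOT taken from the paper).
COEF = -1.0
INTERCEPT = 3.0

for penalty in (0, 3, 6):
    p = success_probability(penalty, COEF, INTERCEPT)
    print(f"penalty={penalty} -> estimated P(success)={p:.3f}")
```

With these placeholder values the estimated probability crosses 0.5 at a penalty of 3, where `coef * penalty + intercept` is zero.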
