Acta Crystallogr D Struct Biol. 2022 Jun 1;78(Pt 6):752-769.
doi: 10.1107/S2059798322004399. Epub 2022 May 18.

xia2.multiplex: a multi-crystal data-analysis pipeline

Richard J Gildea et al. Acta Crystallogr D Struct Biol.

Abstract

In macromolecular crystallography, radiation damage limits the amount of data that can be collected from a single crystal. It is often necessary to merge data sets from multiple crystals; for example, small-wedge data collections from micro-crystals, in situ room-temperature data collections and data collection from membrane proteins in lipidic mesophases. Whilst the indexing and integration of individual data sets may be relatively straightforward with existing software, merging multiple data sets from small wedges presents new challenges. The identification of a consensus symmetry can be problematic, particularly in the presence of a potential indexing ambiguity. Furthermore, the presence of non-isomorphous or poor-quality data sets may reduce the overall quality of the final merged data set. To facilitate and help to optimize the scaling and merging of multiple data sets, a new program, xia2.multiplex, has been developed which takes data sets individually integrated with DIALS and performs symmetry analysis, scaling and merging of multi-crystal data sets. xia2.multiplex also performs analysis of various pathologies that typically affect multi-crystal data sets, including non-isomorphism, radiation damage and preferential orientation. After the description of a number of use cases, the benefit of xia2.multiplex is demonstrated within a wider autoprocessing framework in facilitating a multi-crystal experiment collected as part of in situ room-temperature fragment-screening experiments on the SARS-CoV-2 main protease.

Keywords: SARS-CoV-2; data analysis; data processing; multi-crystal data sets; partial data sets; xia2.multiplex.

Figures

Figure 1
Flowchart outlining the main sequence of steps taken by xia2.multiplex. Optional steps are indicated by dashed arrows. The command-line programs used at each step are indicated.
Figure 2
Experimental phasing and anomalous signal from multi-crystal room-temperature in situ experiments using lysozyme crystals soaked with various heavy-atom solutions. (a) SHELXC plot of 〈d′′/σ(I)〉. (b) CCall versus CCweak after substructure solution with HKL2MAP/SHELXD. (c) Anomalous difference map peaks identified by ANODE via DIMPLE for lysozyme Au soaks. Contours are drawn at 4σ. (d) Anomalous difference map peak heights identified by ANODE via DIMPLE with and without filtering of outlier regions of data sets.
Figure 3
dials.cosym plots for data from lysozyme Sm soaks as described in Section 4.1. (a) Histogram of (n × m)² pairwise Rij correlation coefficients and (b) the (n × m) vectors x determined by the minimization of equation (2) during symmetry determination with dials.cosym. The Rij correlation coefficients are clustered towards 1 and the majority of the vectors x form a single cluster, suggesting the absence of an indexing ambiguity, i.e. the Patterson group of the data set corresponds to the maximum lattice symmetry. (c, d) As above but after symmetry determination and scaling. The distribution of the n² Rij correlation coefficients is sharpened towards 1 as scaling improves the internal consistency of the data. There is also an effect from multiplicity when comparing with (a), as here the n² Rij values are calculated in the highest symmetry group for the lattice. All but one of the n vectors x form a tight cluster, with the vector lengths close to 1. Visualization of (e) the distribution of unit-cell parameters and (f) clustering on unit-cell parameters suggests the presence of an outlier data set.
Figure 4
Hierarchical clustering (a) on pairwise correlation coefficients and (b) on the cosines of the angles between vectors in Fig. 3(d) identifies the presence of an outlier data set.
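The quantity clustered in panel (b), the cosine of the angle between each pair of vectors, can be illustrated with a short sketch. This is not the xia2.multiplex implementation: the function names and the example coordinates are hypothetical, and a simple ranking by mean cosine similarity stands in for the hierarchical clustering used in the figure.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_matrix(vectors):
    """Symmetric matrix of pairwise cosines, one row per data set."""
    return [[cosine(u, v) for v in vectors] for u in vectors]


def least_similar(vectors):
    """Index of the vector with the lowest mean cosine to all the others.

    Assumption: a crude outlier indicator for illustration only, not the
    clustering procedure described in the paper.
    """
    matrix = cosine_matrix(vectors)
    n = len(vectors)
    means = [
        sum(matrix[i][j] for j in range(n) if j != i) / (n - 1)
        for i in range(n)
    ]
    return min(range(n), key=means.__getitem__)


# Hypothetical 2D coordinates: three data sets point the same way, one does not.
example_vectors = [(1.0, 0.05), (0.98, 0.0), (0.95, -0.04), (0.1, 0.9)]
outlier_index = least_similar(example_vectors)
```

A distance suitable for clustering could then be taken as one minus the cosine, so that vectors pointing in the same direction sit at distance zero.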
Figure 5
(a) A clear bimodal distribution of the histogram of pairwise Rij values is a strong indicator of the presence of an indexing ambiguity. (b) The vectors x determined by the minimization of equation (2) in dials.cosym. The separation of the vectors into two clusters indicates the presence of an indexing ambiguity. (c, d) Stereographic projections of crystal orientations for TehA crystals, representing the direction of hkl = 100 and hkl = 001 for each crystal, respectively, relative to the beam direction (z), which is shown as the central ‘+’ into the page. A point close to the centre of the circle indicates that the crystal axis is close to parallel to the beam, whereas a point close to the edge of the unit circle indicates that the crystal axis is close to perpendicular to the beam. Preferential orientation can lead to regions with systematically low multiplicity or missing reflections. (e) shows the reflection multiplicities in the 0kl plane, where white corresponds to missing reflections. (f) The bivariate distribution of multiplicities is also indicative of an uneven distribution of multiplicities.
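The geometry of the stereographic projection in panels (c, d) can be sketched as follows. This is a minimal illustration assuming the beam lies along z and that an axis direction and its opposite are treated as equivalent; the function name is hypothetical and the code is not taken from DIALS or xia2.multiplex.

```python
import math


def stereographic_projection(axis):
    """Project a crystal-axis direction onto the plane perpendicular to z.

    Returns (X, Y) inside the unit circle: (0, 0) for an axis parallel to
    the beam, radius 1 for an axis perpendicular to the beam.
    """
    x, y, z = axis
    norm = math.sqrt(x * x + y * y + z * z)
    if norm == 0:
        raise ValueError("axis must be a non-zero vector")
    x, y, z = x / norm, y / norm, z / norm
    # Assumption: fold the direction into the z >= 0 hemisphere so that
    # every axis maps to a point inside the unit circle.
    if z < 0:
        x, y, z = -x, -y, -z
    return x / (1 + z), y / (1 + z)


parallel = stereographic_projection((0, 0, 1))       # centre of the circle
perpendicular = stereographic_projection((1, 0, 0))  # edge of the unit circle
```

An axis tilted by an angle θ from the beam lands at radius tan(θ/2), which is why points crowd towards the centre only when many crystals share an axis close to the beam direction.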
Figure 6
Incremental processing with xia2.multiplex and DIMPLE of in situ data collections of SARS-CoV-2 Mpro ligand soak Z4439011520. (a, b) CC1/2 and Rp.i.m. data-processing statistics for ligand Z4439011520 with the inclusion of progressively more data sets in data-collection order from top left to bottom right. (c, d) Overall data completeness and gemmi (https://gemmi.readthedocs.io) blob search scores. (e, f, g) The ligand density in the autoprocessed DIMPLE maps for two, nine and 20 crystals, respectively. All contours are drawn at 3σ.
Figure 7
Outlier identification and removal for SARS-CoV-2 Mpro ligand soak Z4439011520. Visualization of (a) the distribution of unit-cell parameters and (b) clustering on unit-cell parameters may suggest possible outlier data sets. (c, d) ΔCC1/2 filtering with dials.scale can also remove data sets that strongly disagree with the majority of data sets. (e, f) Removing outlier data sets can improve the overall merging statistics.
Figure 8
Views of the active site of SARS-CoV-2 Mpro in complex with ABT-957 (a) under cryogenic conditions (Redhead et al., 2021) and (b) at room temperature. Contours for the ligand density are drawn at 3σ. (c, d) Two slightly displaced views of the active site of SARS-CoV-2 Mpro in complex with ABT-957 to show the conformational differences observed, particularly for the oxopyrrolidine and benzyl moieties of ABT-957 when bound to Mpro, at cryo temperature (cyan) and room temperature (green). The structures were superimposed using PyMOL (Schrödinger).
