IUCrJ. 2019 Jan 1;6(Pt 1):72-84.
doi: 10.1107/S2052252518014951.

SPIND: a reference-based auto-indexing algorithm for sparse serial crystallography data

Chufeng Li et al. IUCrJ.

Abstract

SPIND (sparse-pattern indexing) is an auto-indexing algorithm for sparse snapshot diffraction patterns ('stills') that requires the positions of only five Bragg peaks in a single pattern, when provided with unit-cell parameters. The capability of SPIND is demonstrated for the orientation determination of sparse diffraction patterns using simulated data from microcrystals of a small inorganic molecule containing three iodines, 5-amino-2,4,6-triiodoisophthalic acid monohydrate (I3C) [Beck & Sheldrick (2008), Acta Cryst. E64, o1286], which is challenging for commonly used indexing algorithms. SPIND, integrated with CrystFEL [White et al. (2012), J. Appl. Cryst. 45, 335-341], is then shown to improve the indexing rate and quality of merged serial femtosecond crystallography data from two membrane proteins, the human δ-opioid receptor in complex with a bi-functional peptide ligand DIPP-NH2 and the NTQ chloride-pumping rhodopsin (ClR). The study demonstrates the suitability of SPIND for indexing sparse inorganic crystal data with smaller unit cells, and for improving the quality of serial femtosecond protein crystallography data, significantly reducing the amount of sample and beam time required by making better use of limited data sets. SPIND is written in Python and is publicly available under the GNU General Public License from https://github.com/LiuLab-CSRC/SPIND.

Keywords: Bragg peaks; X-ray free-electron lasers; XFEL; auto-indexing algorithms; diffract-then-destroy; dynamical studies; electron diffraction; serial crystallography.

Figures

Figure 1
Flowchart of the SPIND indexing algorithm. The five best peaks in each pattern, selected according to user-chosen criteria such as SNR, are used for indexing. The blue boxes refer to prior knowledge, and the table is calculated once. The green boxes are steps carried out for each pattern. The (red) rejection module refers to steps (g) to (i).
Figure 2
Illustration of the SPIND auto-indexing algorithm. (a) A diffraction pattern recorded by the Cornell-SLAC hybrid Pixel Array Detector (CSPAD) (Herrmann et al., 2013; Hart et al., 2012), with a few exaggerated peaks for illustrative purposes. Five peaks are selected to form ten vector pairs. The vector lengths, the ratio of lengths and the angle between the vectors are then calculated for each of the ten pairs and matched against a reference based on a priori knowledge of the unit cell (within some mismatch tolerance). (b) Rejection module for eliminating spurious solution candidates, based on the constraint that all peak pairs share the same crystal orientation. The solution must lie in the intersection of the solution pools, provided that the peaks are from a single crystal.
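The pair construction described in this caption can be illustrated with a minimal Python sketch. It is not taken from the SPIND source code: the function names `pair_features` and `within_tolerance`, the tuple layout and the tolerance values are all hypothetical, and the peaks are assumed to be already mapped to three-dimensional reciprocal-space vectors.

```python
import itertools
import math


def pair_features(peaks):
    """Return (i, j, len_i, len_j, ratio, angle_deg) for every unordered peak pair.

    `peaks` is a list of 3D reciprocal-space vectors (x, y, z). Five peaks
    give C(5, 2) = 10 pairs, as in the Figure 2 caption.
    """
    features = []
    for (i, p), (j, q) in itertools.combinations(enumerate(peaks), 2):
        len_p = math.sqrt(sum(c * c for c in p))
        len_q = math.sqrt(sum(c * c for c in q))
        dot = sum(a * b for a, b in zip(p, q))
        # Clamp to [-1, 1] so rounding errors cannot push acos out of range.
        cos_angle = max(-1.0, min(1.0, dot / (len_p * len_q)))
        angle = math.degrees(math.acos(cos_angle))
        ratio = min(len_p, len_q) / max(len_p, len_q)
        features.append((i, j, len_p, len_q, ratio, angle))
    return features


def within_tolerance(observed, reference, len_tol=0.01, angle_tol=1.0):
    """Compare an observed pair with a reference pair.

    The tolerance values are illustrative assumptions, not SPIND defaults.
    """
    _, _, lo_1, lo_2, _, ang_o = observed
    _, _, lr_1, lr_2, _, ang_r = reference
    return (abs(lo_1 - lr_1) <= len_tol
            and abs(lo_2 - lr_2) <= len_tol
            and abs(ang_o - ang_r) <= angle_tol)
```

In this sketch, a reference table would hold the same kind of tuples computed once from the known unit cell, and each observed pair would be compared against it with `within_tolerance`; how SPIND actually stores and searches its reference table is not specified here.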
Figure 3
Simulated I3C patterns indexed by SPIND. (a) Simulated sparse diffraction pattern from an I3C crystal in the orientation specified by Euler angles −10.4676, 46.9022, 139.1443. (b) Poisson noise and random background noise added to (a), so only three Bragg peaks were identifiable (circled). (c) The indexing result by SPIND using only the three peaks in (b). The determined crystal orientation is at Euler angles of −10.5713, 46.8855, 139.2000. The peaks were predicted from the determined orientation and Miller indices were given for Bragg peaks.
Figure 4
The effect of inaccurate guiding unit cells on SPIND indexing rates and peak-prediction accuracy, demonstrated on simulated I3C snapshot diffraction patterns. (a) Number of indexed patterns as a function of the α angle of the guiding cell (α = 90° is nominal). (b) Distribution of the distance discrepancy in three-dimensional reciprocal space between found and predicted peaks for matched peak pairs, using guiding cells with different α angle values. The legend shows the α angle of the guiding cell. The center of the distribution shifts to larger values as α deviates further from the nominal value of 90°. The same trend was observed for values of α < 90° (omitted for clarity). The results demonstrate the robustness of the algorithm to lattice inhomogeneity, a wide tolerance range for the guiding-cell constants and a low false-positive indexing rate when the target lattice cell is clearly distinguishable from the guiding cell. The indexing rate can be used as an indicator of the accuracy of the reference unit cell.
Figure 5
Figures-of-merit as a function of resolution for DOR SFX data. (a) SNR, (b) CC*, (c) R_split and (d) Bragg reflection profile radii determined by indexamajig. See Fig. S2(a) for full range of reflection profile radii and Fig. S2(b) for reflection multiplicity in merged data sets in the Supporting information. The keywords ‘refine’ and ‘norefine’ represent the on and off status of the lattice-refinement option in indexamajig in the indexing process. ‘nolatt’ represents that the reference cell and lattice type were not used as input for indexing (but were used as constraints for the indexing solution).
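For readers unfamiliar with two of these figures of merit, the sketch below computes them from their standard definitions: CC* = sqrt(2 CC1/2 / (1 + CC1/2)) and R_split = 2^(-1/2) Σ|I_even − I_odd| / (½ Σ(I_even + I_odd)). The formulas are not stated on this page, and the function names and list-based inputs are assumptions made for illustration; in practice these values are reported by the merging software for each resolution shell.

```python
import math


def cc_star(cc_half):
    """Estimate CC* from the half-data-set correlation CC1/2.

    Assumes cc_half >= 0; negative values are not handled in this sketch.
    """
    return math.sqrt(2.0 * cc_half / (1.0 + cc_half))


def r_split(i_even, i_odd):
    """R_split between two half-data-set intensity lists of equal length.

    i_even[k] and i_odd[k] are the merged intensities of the same
    reflection in the two independently merged halves.
    """
    numerator = sum(abs(a - b) for a, b in zip(i_even, i_odd))
    denominator = 0.5 * sum(a + b for a, b in zip(i_even, i_odd))
    return numerator / (math.sqrt(2.0) * denominator)
```

Identical half data sets give R_split = 0 and CC1/2 = 1 (hence CC* = 1); lower R_split and higher CC* indicate better internal consistency of the merged data.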
Figure 6
Wilson plots for the merged DOR data sets from different indexing methods. The linearity in the 0.06 to ∼0.13 Å⁻² region and the consistency between all indexing methods confirm the quality of the merged data sets.
Figure 7
Statistics of the ClR data set. (a) Distribution of the number of peaks per pattern. Most patterns contained 10 to 30 peaks, and were not indexed using MOSFLM (gray bars, ∼100 000 patterns). Histograms from SPIND-refine and MOSFLM-refine fit within the yellow distribution and are omitted for clarity. (b) Comparison of indexing rates using MOSFLM and SPIND with the lattice-refinement option in indexamajig enabled and disabled. The lattice-refinement feature requires that more than ten found peaks match their predicted peak positions with a small excitation error (that increases smoothly with resolution) (White, Barty et al., 2016). Patterns are discarded (not indexed) if this criterion is not met. This contributes to the abrupt drop in indexing rate from SPIND-norefine to SPIND-refine, since a significant portion of the patterns in this data set have few peaks (fewer than five).
Figure 8
Representative indexed diffraction patterns from the ClR data set, recorded on the CSPAD. (a) indexable by both MOSFLM and SPIND, (b) indexable only by SPIND. Identified peaks are marked by red crosses, and peak positions predicted from the orientation matrix given by MOSFLM and SPIND are marked with cyan and green circles, respectively. The overlapping cyan and green circles in (a) correspond to the same Miller indices, thus confirming the consistency of the indexing results between SPIND and MOSFLM.
Figure 9
Figures-of-merit for the ClR data set indexed with various indexing algorithms. (a) CC*, (b) SNR, (c) reflection profile radii and (d) Wilson plots. The keywords ‘refine’ and ‘norefine’ represent the on and off status of the lattice-refinement option in indexamajig of CrystFEL in the indexing process. SPIND-refine has better figures of merit for this data set than the other methods. Small modal reflection profile radii indicate that the orientation determined by SPIND is often more accurate than that determined by MOSFLM, with or without orientation refinement.
Figure 10
Resolution histograms for the ClR data set. (a) Resolution distribution of found peaks for all crystal hits, and distributions of apparent diffraction resolution determined by indexamajig after indexing by (b) MOSFLM-refine, (c) MOSFLM-norefine, (d) SPIND-refine and (e) SPIND-norefine. The additional patterns indexed by SPIND are mostly in the lower-resolution region (around 1 nm⁻¹), which is consistent with the resolution distribution of the found peaks.
Figure 11
Schematic SFX data-analysis pipeline integrating SPIND into CrystFEL.

References

    1. Barty, A., Kirian, R. A., Maia, F. R. N. C., Hantke, M., Yoon, C. H., White, T. A. & Chapman, H. (2014). J. Appl. Cryst. 47, 1118–1131.
    2. Beck, T. & Sheldrick, G. M. (2008). Acta Cryst. E64, o1286.
    3. Beyerlein, K. R., White, T. A., Yefanov, O., Gati, C., Kazantsev, I. G., Nielsen, N. F.-G., Larsen, P. M., Chapman, H. N. & Schmidt, S. (2017). J. Appl. Cryst. 50, 1075–1083.
    4. Brehm, W. & Diederichs, K. (2014). Acta Cryst. D70, 101–109.
    5. Brewster, A. S., Sawaya, M. R., Rodriguez, J., Hattne, J., Echols, N., McFarlane, H. T., Cascio, D., Adams, P. D., Eisenberg, D. S. & Sauter, N. K. (2015). Acta Cryst. D71, 357–366.