J Appl Crystallogr. 2016 Jun 20;49(Pt 4):1320-1335.
doi: 10.1107/S1600576716008165. eCollection 2016 Aug 1.

Dragonfly: an implementation of the expand-maximize-compress algorithm for single-particle imaging

Kartik Ayyer et al. J Appl Crystallogr.

Abstract

Single-particle imaging (SPI) with X-ray free-electron lasers has the potential to change fundamentally how biomacromolecules are imaged. The structure would be derived from millions of diffraction patterns, each from a different copy of the macromolecule before it is torn apart by radiation damage. The challenges posed by the resultant data stream are staggering: millions of incomplete, noisy and un-oriented patterns have to be computationally assembled into a three-dimensional intensity map and then phase reconstructed. In this paper, the Dragonfly software package is described, based on a parallel implementation of the expand-maximize-compress reconstruction algorithm that is well suited for this task. Auxiliary modules to simulate SPI data streams are also included to assess the feasibility of proposed SPI experiments at the Linac Coherent Light Source, Stanford, California, USA.

Keywords: X-ray free-electron lasers; XFELs; expand–maximize–compress reconstruction algorithm; single-particle imaging.

Figures

Figure 1
(a) The experimental geometry of single-particle imaging adopted in the data-stream simulator. (b) This simulator implements a planar square detector comprising d × d square pixels of equal area. The detector is positioned at a distance z_D from the X-ray interaction region, where (c) the scatterer (depicted here as a sphere of radius R_p) is typically an electron-density map sampled from a Protein Data Bank file. From these, one can compute the maximum scattering angle captured by the detector, subtended by grey triangles in part (a) to either the edge or corner of the detector. Here, we take this maximum angle φ_max as the latter. Combined with the incident photon wavelength λ, this allows us to determine the half-period resolution, a, from the detector’s edge, which is equivalent to the length of the voxel (red) in the reconstructed electron-density map.
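The geometry in this caption can be illustrated with a short calculation. The sketch below is not taken from Dragonfly; it assumes the beam strikes the centre of the detector, and it assumes the common convention q = 2 sin(φ/2)/λ with half-period resolution a = 1/(2q). The function names and the `pixel_size` parameter are hypothetical.

```python
import math

def max_scattering_angles(d, pixel_size, z_D):
    """Scattering angles (radians) to the detector edge and corner.

    Assumes a square detector of d x d pixels centred on the beam axis,
    at distance z_D from the interaction region (same length units as
    pixel_size).
    """
    half_width = 0.5 * d * pixel_size
    phi_edge = math.atan2(half_width, z_D)
    phi_corner = math.atan2(math.sqrt(2.0) * half_width, z_D)
    return phi_edge, phi_corner

def half_period_resolution(wavelength, phi):
    """Half-period resolution a = 1 / (2 q), with q = 2 sin(phi / 2) / wavelength.

    This convention is an assumption; the caption does not spell it out.
    """
    q = 2.0 * math.sin(0.5 * phi) / wavelength
    return 1.0 / (2.0 * q)
```

Passing the edge angle to `half_period_resolution` gives the resolution at the detector's edge, and passing the corner angle gives the finer resolution reached only in the corners.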
Figure 2
Dragonfly flowchart to simulate a data set and perform a reconstruction starting from a sample PDB file and a configuration file, config.ini, with information about the experimental setup. Input and output are shown as text, and modules as blue boxes. The large white rectangle defines the data-stream simulator.
Figure 3
Dragonfly flowchart to process experimental data in sparse format. Information about the experimental parameters is placed in the configuration file config.ini and the detector geometry is in detector.dat. The formats of all three input files are described in §2.5. Notice that the difference between this workflow and that shown in Fig. 2 is in how the data are generated.
Figure 4
A typical configuration file, describing various parameters used to perform a basic simulation and reconstruction using the KLH1 (4BED.pdb) molecule on the AMO beamline. These parameters are to be compared with the numbers in Table 1.
Figure 5
Six blocks in the sparse binary data format for 50 patterns. The data are stored contiguously but shown here in row-major format (i.e. to be read from left to right, then down the rows). Each square represents a 32-bit integer. The two integers in the header block are the number of patterns, followed by the number of pixels in the detector. The colors in blocks three to six connect listings of the same pattern. Details given in §2.5.3.
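To make the idea of a six-block sparse layout concrete, here is a toy encoder and decoder. Only the header contents (number of patterns, then number of detector pixels) and the use of 32-bit integers come from the caption. The contents assigned to the other five blocks (per-pattern counts of one-photon and multi-photon pixels, then the pixel locations and multi-photon counts) are an assumption for illustration, and the byte layout is not guaranteed to match Dragonfly's on-disk format. All function names are hypothetical.

```python
import struct

def encode_sparse(patterns, num_pix):
    """Pack patterns (list of {pixel_index: photon_count}) into six blocks."""
    ones, multi = [], []
    place_ones, place_multi, count_multi = [], [], []
    for pat in patterns:
        one_pix = sorted(p for p, c in pat.items() if c == 1)
        multi_pix = sorted(p for p, c in pat.items() if c > 1)
        ones.append(len(one_pix))        # block 2: one-photon pixels per pattern
        multi.append(len(multi_pix))     # block 3: multi-photon pixels per pattern
        place_ones.extend(one_pix)       # block 4: one-photon pixel locations
        place_multi.extend(multi_pix)    # block 5: multi-photon pixel locations
        count_multi.extend(pat[p] for p in multi_pix)  # block 6: their counts
    header = [len(patterns), num_pix]    # block 1: header
    body = header + ones + multi + place_ones + place_multi + count_multi
    return struct.pack("<%di" % len(body), *body)  # little-endian 32-bit ints

def decode_sparse(blob):
    """Inverse of encode_sparse; returns (num_pix, patterns)."""
    vals = struct.unpack("<%di" % (len(blob) // 4), blob)
    num_data, num_pix = vals[0], vals[1]
    pos = 2
    ones = vals[pos:pos + num_data]; pos += num_data
    multi = vals[pos:pos + num_data]; pos += num_data
    n_ones, n_multi = sum(ones), sum(multi)
    place_ones = vals[pos:pos + n_ones]; pos += n_ones
    place_multi = vals[pos:pos + n_multi]; pos += n_multi
    count_multi = vals[pos:pos + n_multi]
    patterns, io, im = [], 0, 0
    for d in range(num_data):
        pat = {p: 1 for p in place_ones[io:io + ones[d]]}
        for k in range(im, im + multi[d]):
            pat[place_multi[k]] = count_multi[k]
        io += ones[d]
        im += multi[d]
        patterns.append(pat)
    return num_pix, patterns
```

Storing one-photon pixels by location alone saves space when most occupied pixels hold a single photon, which is the regime such a sparse format is meant for.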
Figure 6
Convergence of diffraction speckle features in a simulated AMO single-particle experiment (parameters listed in Table 1). In each row we render central slices of the three-dimensional diffraction intensities recovered from KLH1 during an EMC reconstruction, after one, ten, 20 and 50 iterations in ascending row order. (Bottom row) Additional diagnostics on the reconstructed three-dimensional diffraction model. (Left) The r.m.s. change in the three-dimensional model. (Middle) Mutual information and log-likelihood of the model. (Right) The most likely orientations of all the patterns.
Figure 7
Rotation-group refinement for a simulated reconstruction of TMV on the CXI endstation (see Table 1). Shown here are the central sections of the reconstructed three-dimensional diffraction volume of TMV after 55 iterations. With 90 Intel Xeon X7542 (2.67 GHz) cores, this full reconstruction took less than 6 h, taking 15 min for each of the slowest refinement iterations using 204 960 rotation-group samples. Red dashed lines in the r.m.s. model change mark when the refinement level of the rotation group was increased by one. In the bottom right-hand plot, rows are colored by each photon pattern’s most likely orientation number, which stabilizes after 20 iterations and thereafter quickly re-stabilizes when we increase the rotation-group refinement. The rows (pattern indices) are sorted according to the most likely orientation indices of the last iteration in each rotation-sampling block, which produces a smooth color spectrum along this final column. Since the number of quaternions (quat) increases with rotation refinement, blocks of higher refinement show a wider color spectrum. See §3.2.1 for details.
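The sample count quoted in this caption can be checked against a formula from the EMC literature: a quaternion sampling based on the 600-cell at refinement level n yields 10(5n³ + n) rotations. Whether Dragonfly uses exactly this sampling is an assumption here, but level n = 16 reproduces the 204 960 samples mentioned above. The function name is hypothetical.

```python
def num_rotation_samples(n):
    """Number of rotation-group samples at refinement level n.

    Assumes the 600-cell-based quaternion sampling, 10 * (5 n^3 + n).
    """
    if n < 1:
        raise ValueError("refinement level must be a positive integer")
    return 10 * (5 * n**3 + n)
```

Because the count grows roughly as n³, each increase in refinement level raises the per-iteration cost substantially, which is why the highest level is reserved for the final iterations.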
Figure 8
Deterministic annealing in a simulated reconstruction on the AMO endstation with high photon fluence (see Table 1). This reconstruction was performed by doubling the β parameter (§3.2.2) every ten iterations, starting from β = 0.001. Doublings occur at the dashed black lines in the diagnostic plots in the bottom row, where the ten-iteration interval was chosen to allow the intermediate reconstructions to stabilize. This stabilization can be judged by the asymptotic saturation of the average mutual information in every β block. After 80 iterations (β = 0.256), this increase was stopped as there did not seem to be much further improvement in the average mutual information. After this, the rotational sampling rate was increased from six to the target of nine. As in the CXI reconstruction (Fig. 7), this was done in order to save computational time by doing fewer iterations at the highest sampling.
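The annealing schedule described in this caption can be written as a small function. This is a sketch rather than Dragonfly's own code; it assumes iterations are numbered from 1 and that the eighth doubling takes effect once 80 iterations are complete, which is the reading consistent with 0.001 × 2⁸ = 0.256. The function and parameter names are hypothetical.

```python
def beta_schedule(iteration, beta_start=0.001, period=10, max_doublings=8):
    """Return beta for a 1-indexed iteration.

    Beta doubles after every `period` iterations and is held fixed once
    `max_doublings` doublings have been applied.
    """
    if iteration < 1:
        raise ValueError("iteration must be >= 1")
    doublings = min((iteration - 1) // period, max_doublings)
    return beta_start * 2**doublings
```

With the defaults, iterations 1 to 10 use β = 0.001, iterations 11 to 20 use β = 0.002, and every iteration from 81 onward uses β = 0.256.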
Figure 9
Setup for solid-angle correction. We compute the solid angle subtended by the square pixel (red) on the detector plane (grey). The scatterer (blue sphere) is set at the origin of this figure.
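The solid angle of a flat square pixel seen from the origin can be computed in closed form or approximated for small pixels. The sketch below shows both. It is not the expression used in the paper, which is not reproduced in this caption; it assumes the detector plane is perpendicular to the beam at distance z_D, and all names are hypothetical.

```python
import math

def _corner_term(x, y, z):
    # Solid angle of the rectangle spanning (0, 0) to (x, y) in the plane at height z.
    return math.atan(x * y / (z * math.sqrt(x * x + y * y + z * z)))

def pixel_solid_angle_exact(x, y, z_D, pixel_size):
    """Exact solid angle of a square pixel centred at (x, y) on the plane z = z_D."""
    h = 0.5 * pixel_size
    x1, x2, y1, y2 = x - h, x + h, y - h, y + h
    return (_corner_term(x2, y2, z_D) - _corner_term(x1, y2, z_D)
            - _corner_term(x2, y1, z_D) + _corner_term(x1, y1, z_D))

def pixel_solid_angle_approx(x, y, z_D, pixel_size):
    """Small-pixel approximation: area * cos(theta) / r^2 = area * z_D / r^3."""
    r2 = x * x + y * y + z_D * z_D
    return pixel_size**2 * z_D / r2**1.5
```

Both forms fall off towards the detector edges, so a pixel far from the beam axis collects fewer photons for the same scattered intensity; dividing by the solid angle removes that geometric bias.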
Figure 10
Low-intensity wedge-like volumes appear in the EMC-reconstructed volume with very high signal data frames (most frames contain more than 10^4 photons). The simulation parameters are listed in Table 1 as AMO (high). We reconstructed with rotation-group refinement n = 5 and with β = 1 (annealing turned off). In descending order down the rows, the panels show the central sections of the updated model after one, five, ten and 100 iterations, and the lower panels show the diagnostics for 100 iterations (plots described in the caption to Fig. 6).
