Biochem J. 2021 Dec 22;478(24):4169-4185.
doi: 10.1042/BCJ20210708.

New tools for automated cryo-EM single-particle analysis in RELION-4.0

Dari Kimanius et al. Biochem J.

Abstract

We describe new tools for the processing of electron cryo-microscopy (cryo-EM) images in the fourth major release of the RELION software. In particular, we introduce VDAM, a variable-metric gradient descent algorithm with adaptive moments estimation, for image refinement; a convolutional neural network for unsupervised selection of 2D classes; and a flexible framework for the design and execution of multiple jobs in pre-defined workflows. In addition, we present a stand-alone utility called MDCatch that links the execution of jobs within this framework with metadata gathering during microscope data acquisition. The new tools are aimed at providing fast and robust procedures for unsupervised cryo-EM structure determination, with potential applications for on-the-fly processing and the development of flexible, high-throughput structure determination pipelines. We illustrate their potential on 12 publicly available cryo-EM data sets.

Keywords: computational biochemistry; cryo-electron microscopy; imaging techniques; structural biology.

Figures

Figure 1. Class ranker neural network architecture and results.
(A) The overall architecture of the scoring network, which consists of three CNN blocks and a final feed-forward network that also incorporates the 18 features. (B) The CNN block architecture, which incorporates three convolutional layers. The initial convolutional layer maps the input channel count C1 to the intermediate channel count C2, and the final layer performs a downsampling of the box size through a strided convolution and doubles the number of channels. (C) The mean-square error loss during training, compared with and without features. (D) A confusion matrix showing labeled scores versus predicted scores, with bins of 0.1. (E) Examples of classes with their predicted scores.
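The binned confusion matrix described for panel (D) can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes that scores lie in the range [0, 1], and the function names `score_bin` and `binned_confusion_matrix` and the example scores are hypothetical.

```python
from typing import List, Sequence

N_BINS = 10  # bins of width 0.1, assuming scores lie in [0, 1]


def score_bin(score: float, n_bins: int = N_BINS) -> int:
    """Map a score in [0, 1] to a bin index; a score of 1.0 goes in the last bin."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score {score} is outside [0, 1]")
    return min(int(score * n_bins), n_bins - 1)


def binned_confusion_matrix(
    labeled: Sequence[float],
    predicted: Sequence[float],
    n_bins: int = N_BINS,
) -> List[List[int]]:
    """Count classes per (labeled bin, predicted bin) pair.

    Rows index the labeled score bin, columns the predicted score bin.
    """
    if len(labeled) != len(predicted):
        raise ValueError("labeled and predicted must have the same length")
    matrix = [[0] * n_bins for _ in range(n_bins)]
    for true_score, pred_score in zip(labeled, predicted):
        matrix[score_bin(true_score, n_bins)][score_bin(pred_score, n_bins)] += 1
    return matrix


# Hypothetical scores for four 2D classes (not taken from the paper).
labeled = [0.05, 0.95, 0.42, 1.0]
predicted = [0.02, 0.85, 0.47, 0.97]
matrix = binned_confusion_matrix(labeled, predicted)
```

Counts on the diagonal correspond to classes whose predicted score falls in the same 0.1-wide bin as the labeled score; off-diagonal counts show how far the prediction drifted.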
Figure 2.
Schematics of the prep and proc Schemes that form part of the relion_it.py approach for automated, on-the-fly processing. Scheme operators are shown with rounded boxes, RELION jobs with purple boxes, edges with arrows and forks with light purple diamond shapes. For forks, the BooleanVariable that controls the outcome is indicated in the center of the diamond. The WAIT operator waits for a defined time since it was last executed; the EXIT_maxtime operator terminates the Scheme after a defined time since the Scheme was started; the SET_has_ctffind operator sets BooleanVariable has_ctffind to true if the STAR file generated by the CtfFind job of the prep Scheme exists; the COUNT_mics operator sets the current number of micrographs to the number selected in the job above it; the SET_mics_incr operator sets BooleanVariable mics_incr to true if the current number of selected micrographs is larger than the previous number of micrographs (which is initialized to zero); the SET_prev_mics operator sets the previous number of micrographs to the current number of selected micrographs; the COUNT_parts operator sets the current number of particles to the number of selected particles in the job above it; and the SET_enough_parts operator sets BooleanVariable enough_parts to true if the current number of selected particles is larger than a user-specified minimum.
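The counting and comparison operators in the caption above can be mimicked with a small state object. The sketch below only illustrates the logic as worded in the caption; it is not the RELION Scheme implementation or its file format, and the `ProcState` class, the function names, and the numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ProcState:
    """Variables tracked across iterations (names modeled on the caption)."""
    current_mics: int = 0
    prev_mics: int = 0          # initialized to zero, as stated in the caption
    current_parts: int = 0
    mics_incr: bool = False
    enough_parts: bool = False


def count_mics(state: ProcState, n_selected_mics: int) -> None:
    # COUNT_mics: current number of micrographs = number selected upstream
    state.current_mics = n_selected_mics


def set_mics_incr(state: ProcState) -> None:
    # SET_mics_incr: true if more micrographs were selected than last time
    state.mics_incr = state.current_mics > state.prev_mics


def set_prev_mics(state: ProcState) -> None:
    # SET_prev_mics: remember the current count for the next iteration
    state.prev_mics = state.current_mics


def count_parts(state: ProcState, n_selected_parts: int) -> None:
    # COUNT_parts: current number of particles = number selected upstream
    state.current_parts = n_selected_parts


def set_enough_parts(state: ProcState, min_parts: int) -> None:
    # SET_enough_parts: true if the count exceeds the user-specified minimum
    state.enough_parts = state.current_parts > min_parts


# First pass: 40 micrographs arrive, so the fork on mics_incr is taken.
state = ProcState()
count_mics(state, 40)
set_mics_incr(state)
first_pass_incr = state.mics_incr
set_prev_mics(state)

# Second pass: no new micrographs, so mics_incr becomes false.
count_mics(state, 40)
set_mics_incr(state)
second_pass_incr = state.mics_incr

# Particle check against a hypothetical minimum of 10000 particles.
count_parts(state, 12000)
set_enough_parts(state, min_parts=10000)
```

Keeping each operator as a separate, single-purpose function mirrors the way the caption describes forks: a fork only reads a boolean that an earlier operator has set.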
Figure 3. GUI of the relion_it.py script for automated execution of the prep and the proc Schemes.
Figure 4. GUI of the MDCatch utility for automated fetching of microscope metadata and launching of on-the-fly image processing.
Figure 5. Automated structure determination for the test data sets.
For each data set (see Table 1), the reconstruction after refinement with the downsampled particles is shown, together with the number of auto-picked particles and the number of selected particles. Although not shown here, for the GDH, TRPV1 and aldolase data sets, initial model generation does sometimes get stuck in local minima (see section ‘Initial 3D model generation with the VDAM algorithm’ for more details). No map was reconstructed for apoF.
Figure 6.
All significant 2D class averages from four different classification runs. (A) and (B) show results for the GDH data set classified using the EM and VDAM algorithms, respectively. (C) and (D) show results for the CB1 data set classified using the EM and VDAM algorithms, respectively. Classes are sorted according to their score from the relion_class_ranker program, which is also shown for each class. Classes that were manually selected for subsequent 3D auto-refinement are highlighted in purple.
Figure 7.
Analysis of the automated 2D class selection. For the 12 test data sets, the charts on the left show the percentage of particles after manual selection (gray), after automated selection with a default threshold of 0.15 (orange), and after automated class selection with a supervised threshold (purple). The center two panels show the false positive rate, i.e. the number of particles selected by the class ranking procedure but not by manual selection, divided by the number of particles in the manual selection, and the false negative rate, i.e. the number of particles selected by manual selection but not by the class ranking procedure, divided by the number of particles in the manual selection, for the default (purple) and the supervised (orange) thresholds. The panel on the right shows the value of the supervised threshold (t = T).
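The two rates defined in the caption above share the same denominator, the number of particles in the manual selection. The following sketch spells that out under stated assumptions: a class is taken as selected when its score is at least the threshold (the caption does not say whether the comparison is strict), and the function names, class names, particle counts and scores are hypothetical.

```python
from typing import Dict, Set, Tuple

DEFAULT_THRESHOLD = 0.15  # default threshold quoted in the caption


def select_classes(scores: Dict[str, float], threshold: float = DEFAULT_THRESHOLD) -> Set[str]:
    """Classes whose score reaches the threshold (>= is an assumption)."""
    return {name for name, score in scores.items() if score >= threshold}


def selection_rates(
    particles_per_class: Dict[str, int],
    manual: Set[str],
    automated: Set[str],
) -> Tuple[float, float]:
    """Return (false positive rate, false negative rate).

    Both rates are divided by the number of particles in the manual
    selection, following the definitions in the caption.
    """
    n_manual = sum(particles_per_class[c] for c in manual)
    if n_manual == 0:
        raise ValueError("manual selection contains no particles")
    n_false_pos = sum(particles_per_class[c] for c in automated - manual)
    n_false_neg = sum(particles_per_class[c] for c in manual - automated)
    return n_false_pos / n_manual, n_false_neg / n_manual


# Hypothetical data: four classes with particle counts and ranker scores.
particles_per_class = {"a": 100, "b": 50, "c": 30, "d": 20}
scores = {"a": 0.90, "b": 0.20, "c": 0.10, "d": 0.05}
manual = {"a", "c"}

automated = select_classes(scores)
fp_rate, fn_rate = selection_rates(particles_per_class, manual, automated)
```

Because the false positive rate is normalized by the manual selection rather than by the automated one, it can exceed 1 when the automated procedure keeps many more particles than the manual selection did.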
Figure 8. Central slices of initial model reconstruction with three classes using the VDAM algorithm for five data sets.
