[Preprint]. 2025 Apr 28:2024.05.30.596729.
doi: 10.1101/2024.05.30.596729.

CryoDRGN-AI: Neural ab initio reconstruction of challenging cryo-EM and cryo-ET datasets

Axel Levy et al. bioRxiv.

Abstract

Proteins and other biomolecules form dynamic macromolecular machines that are tightly orchestrated to move, bind, and perform chemistry. Cryo-electron microscopy (cryo-EM) and cryo-electron tomography (cryo-ET) can access the intrinsic heterogeneity of these complexes and are therefore key tools for understanding their function. However, 3D reconstruction of the collected imaging data presents a challenging computational problem, especially without any starting information, a setting termed ab initio reconstruction. Here, we introduce cryoDRGN-AI, a method leveraging an expressive neural representation and combining an exhaustive search strategy with gradient-based optimization to process challenging heterogeneous datasets. Using cryoDRGN-AI, we reveal new conformational states in large datasets, reconstruct previously unresolved motions from unfiltered datasets, and demonstrate ab initio reconstruction of biomolecular complexes from in situ data. With this expressive and scalable model for structure determination, we hope to unlock the full potential of cryo-EM and cryo-ET as a high-throughput tool for structural biology and discovery.

Conflict of interest statement

Competing interests: The authors declare no competing interests.

Figures

Figure 1: The cryoDRGN-AI method for ab initio heterogeneous reconstruction.
a) CryoDRGN-AI can process single particle images or subtilts for subtomogram averaging. b) Architecture overview. Objects in blue are optimized to minimize the loss L. For each image, a projection is generated by a differentiable forward model. Poses ϕ_i are first estimated using hierarchical pose search (HPS) and then refined by stochastic gradient descent (SGD). Latent embeddings z_i are optimized by SGD. c) Mean out-of-plane angular error during training on a synthetic 80S ribosome dataset (100,000 particles, 128 × 128, 3.77 Å/pix., mean ± std. over 6 runs). d) Pose estimation switches to SGD once a set number of images have been processed by HPS. e) Gradient descent is 20× faster and leads to more accurate poses since it is not limited by the resolution of the search grid (mean ± std. over 6 replicates and 5 epochs for HPS, 100 epochs for SGD).
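The search-then-refine idea in panels b, d and e can be illustrated with a toy problem. The sketch below is not the cryoDRGN-AI implementation: it assumes a single in-plane angle and a made-up smooth loss, and all function names are hypothetical. It only shows why a grid search is limited by its grid spacing while gradient descent started from the grid optimum is not.

```python
import math

def loss(theta, theta_true):
    """Toy pose loss (an assumption): smooth, periodic, zero at theta_true."""
    return 1.0 - math.cos(theta - theta_true)

def grid_search(theta_true, n_grid):
    """Exhaustive search over n_grid evenly spaced angles in [0, 2*pi)."""
    grid = [2.0 * math.pi * k / n_grid for k in range(n_grid)]
    return min(grid, key=lambda t: loss(t, theta_true))

def refine_by_gradient_descent(theta, theta_true, lr=0.5, n_steps=50):
    """Refine an initial angle with plain gradient descent on the toy loss."""
    for _ in range(n_steps):
        grad = math.sin(theta - theta_true)  # derivative of 1 - cos(theta - theta_true)
        theta -= lr * grad
    return theta

def angular_error(theta, theta_true):
    """Absolute angular difference, wrapped to [0, pi]."""
    d = (theta - theta_true) % (2.0 * math.pi)
    return min(d, 2.0 * math.pi - d)

theta_true = 1.2345                                  # hypothetical ground-truth angle
coarse = grid_search(theta_true, n_grid=24)          # 15-degree grid
refined = refine_by_gradient_descent(coarse, theta_true)

coarse_err = angular_error(coarse, theta_true)       # bounded by half the grid spacing
refined_err = angular_error(refined, theta_true)     # not tied to the grid spacing
```

In this toy setting the coarse estimate can be off by up to half a grid step (π/24 here), whereas the refined estimate converges to the optimum of the loss. The real method works on 3D poses and noisy images, so the numbers here say nothing about the speed or accuracy figures in panel e.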
Figure 2: CryoDRGN-AI ab initio heterogeneous reconstruction of unfiltered benchmark datasets.
a) Latent embeddings visualized with PCA and reconstructed maps for the pre-catalytic spliceosome dataset (EMPIAR-10180 [17], 2.6 MDa, 128 × 128, 4.25 Å/pix., 327,490 particles). Dashed lines indicate outlines of the extended spliceosome. b) UMAP visualization of latent embeddings and reconstructed maps for the assembling bacterial large ribosomal sub-unit dataset (EMPIAR-10076 [16], approx. 2.1 to 3.3 MDa, 256 × 256, 1.64 Å/pix., 131,899 particles). Latent embeddings are colored by their published labels. Dashed lines indicate outlines of the fully mature 50S ribosome. c) UMAP visualization of latent embeddings and reconstructed maps for the SARS-CoV-2 spike protein dataset [18] (438 kDa, 128 × 128, 3.28 Å/pix., 369,429 particles). Reconstructed density maps of the closed and open states of the receptor binding domain with docked atomic models (PDB:6VXX, PDB:6VYB).
Figure 3: CryoDRGN-AI ab initio reconstruction of the DSL1/SNARE complex [19] dataset containing majority junk particles.
a) UMAP visualization of latent embeddings and reconstructed maps for the full dataset (214,511 particles, 128 × 128, 3.47 Å/pix.) with cryoDRGN-AI. The latent embeddings are clustered with k-means (k=3). Particles associated with the green cluster are selected. b) CryoDRGN-AI latent embeddings visualized with PCA and reconstructed maps for 75,854 selected particles from a. Additional density maps are shown in Supplementary Video 1. c) Density map from b with docked atomic model (PDB: 8EKI).
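The particle-filtering step in panel a (cluster the latent embeddings with k-means, then keep one cluster) can be sketched as follows. This is a minimal illustration on synthetic 2D points, not the authors' pipeline: the data, the farthest-point initialization, and the rule used to pick the kept cluster are all assumptions. In the paper the cluster is chosen by inspecting the reconstructed maps.

```python
import random

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centroids):
    """Index of the centroid closest to point."""
    return min(range(len(centroids)), key=lambda j: squared_distance(point, centroids[j]))

def kmeans(points, k, n_iter=20):
    """Plain Lloyd's k-means with deterministic farthest-point initialization."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next centroid: the point farthest from every centroid chosen so far.
        centroids.append(
            max(points, key=lambda p: min(squared_distance(p, c) for c in centroids))
        )
    for _ in range(n_iter):
        labels = [nearest(p, centroids) for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the previous centroid if a cluster is empty
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return [nearest(p, centroids) for p in points], centroids

# Synthetic stand-in for latent embeddings: three well-separated 2D blobs.
rng = random.Random(0)
blob_centers = [(0.0, 0.0), (5.0, 0.0), (0.0, 5.0)]
embeddings = [
    (cx + rng.gauss(0.0, 0.5), cy + rng.gauss(0.0, 0.5))
    for cx, cy in blob_centers
    for _ in range(100)
]

labels, centroids = kmeans(embeddings, k=3)

# Hypothetical selection rule: keep the cluster whose centroid is nearest to
# a location the user picked by hand (here, the second blob).
kept_label = nearest((5.0, 0.0), centroids)
selected_indices = [i for i, lab in enumerate(labels) if lab == kept_label]
```

The indices in `selected_indices` would then define the particle subset passed to a second round of reconstruction, as in panel b.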
Figure 4: CryoDRGN-AI ab initio reconstruction of the V-ATPase complex.
a) UMAP visualization of cryoDRGN-AI latent embeddings of the full dataset (EMPIAR-10874 [20], 267,216 particles, 128 × 128, 3.97 Å/pix.). Selected particles are in green. b, c) UMAP visualization of cryoDRGN-AI latent embeddings on the filtered dataset (177,481 particles) with three rotary states (b) or mEAK-7 binding location (c). Gaussian kernel density estimates are overlaid along with visually classified k-means centroids (k=100) as small circles, and points corresponding to the maps in panel (e) as large circles. d) Sampled density map showing a broken complex from the unfiltered dataset. e) Sampled density maps of the three rotary states and mEAK-7 binding.
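For readers unfamiliar with the Gaussian kernel density estimates overlaid in panels b and c, the following sketch shows the basic computation on a handful of made-up 2D points. The bandwidth, the isotropic kernel, and the point coordinates are assumptions chosen for illustration, and the caption does not state which KDE implementation or bandwidth was used.

```python
import math

def gaussian_kde_2d(points, bandwidth):
    """Return a density function built from isotropic 2D Gaussian kernels."""
    norm = 1.0 / (len(points) * 2.0 * math.pi * bandwidth ** 2)

    def density(x, y):
        total = 0.0
        for px, py in points:
            sq_dist = (x - px) ** 2 + (y - py) ** 2
            total += math.exp(-sq_dist / (2.0 * bandwidth ** 2))
        return norm * total

    return density

# Hypothetical 2D embedding coordinates forming two small groups.
embeddings = [(-1.0, -1.0), (-0.8, -1.1), (-1.1, -0.9), (1.0, 1.0), (0.9, 1.2)]
density = gaussian_kde_2d(embeddings, bandwidth=0.5)

density_in_group = density(-1.0, -1.0)   # inside the larger group
density_far_away = density(4.0, -4.0)    # far from every point
```

Evaluating `density` on a regular grid and drawing contours over the scatterplot gives an overlay of the kind described in the caption.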
Figure 5: CryoDRGN-AI ab initio reconstruction of the human erythrocyte ankyrin-1 complex and identification of a new “supercomplex” state.
a) CryoDRGN-AI latent embeddings visualized with PCA and reconstructed maps for the full dataset (EMPIAR-11043 [21], 710,437 particles, 128 × 128, 2.92 Å/pix.). Reconstructed density maps correspond to the six published classes [21] (1–6), which are distinguished by micelle composition. CryoDRGN-AI also reveals the presence of a new “supercomplex” state (7), which simultaneously contains the rhesus heterotrimer (Rh), the aquaporin (AQP1), the band 3-I (B3-I), the band 3-II (B3-II), the band 3-III (B3-III) dimers and an unknown protein Y in the micelle. Additional density maps shown in Supplementary Video 2. b) Linear trajectory in the populated region between 2a and 2b reveals a continuous rotation of the ankyrin (ANK1) relative to the micelle. Plane of view shown as dotted line in 2a in panel a. c) Validation of the supercomplex structure. A homogeneous reconstruction in cryoSPARC [15] on 20k particles with poses from cryoDRGN-AI. Map-to-map and half-map FSC curves.
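The linear trajectory in panel b amounts to sampling evenly spaced points on the segment between two latent coordinates and decoding each one into a density map. A minimal sketch of the sampling step is given below. The latent coordinates are hypothetical, and the decoding step (which requires the trained model) is not shown.

```python
def linear_trajectory(z_start, z_end, n_points):
    """Evenly spaced latent points on the segment from z_start to z_end, inclusive."""
    if n_points < 2:
        raise ValueError("n_points must be at least 2")
    return [
        tuple(a + (b - a) * i / (n_points - 1) for a, b in zip(z_start, z_end))
        for i in range(n_points)
    ]

# Hypothetical latent coordinates standing in for states 2a and 2b.
z_2a = (-1.0, 0.5, 0.0)
z_2b = (1.0, -0.5, 2.0)

path = linear_trajectory(z_2a, z_2b, n_points=5)
# Each entry of `path` would be passed to the trained decoder to render one
# frame of the motion.
```

A straight line is only meaningful where the latent space is populated, which is why the caption specifies a trajectory through the populated region between the two states.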
Figure 6: Single shot heterogeneous ab initio subtomogram averaging with cryoDRGN-AI.
a) UMAP visualization of cryoDRGN-AI latent embeddings of a dataset of the M. pneumoniae 70S ribosome (EMPIAR-10499 [22]). The 100 maps obtained by k-means clustering are classified by visual inspection according to the content of the tRNA sites, between the small sub-unit (SSU) and the large sub-unit (LSU). On top of a 2D scatterplot of the latent embeddings, the classified centroids and a Gaussian KDE of the distribution of states are shown. Large circles indicate the latent embeddings of maps shown in panel b. b) Three intermediate states of the ribosome during the translation elongation cycle. c) Visualization of a representative M. pneumoniae tomogram with ribosomes colored by their class assignments from panel a. d) Sampled maps displaying density for neighboring ribosomes in polysomes.

References

    1. Nakane T. et al. Single-particle cryo-em at atomic resolution. Nature 587, 152–156 (2020).
    1. Yip K. M., Fischer N., Paknia E., Chari A. & Stark H. Atomic-resolution protein structure determination by cryo-em. Nature 587, 157–161 (2020).
    1. Frank J. & Ourmazd A. Continuous changes in structure mapped by manifold embedding of single-particle data in cryo-em. Methods 100, 61–67 (2016).
    1. Punjani A. & Fleet D. J. 3d variability analysis: resolving continuous flexibility and discrete heterogeneity from single particle cryo-em. J. Struct. Biol. 213, 107702 (2021).
    1. Zhong E. D., Bepler T., Berger B. & Davis J. H. Cryodrgn: Reconstruction of heterogeneous cryo-em structures using neural networks. Nat. Methods 18, 176–185 (2021).

Methods-Only References

    1. Vulović M. et al. Image formation modeling in cryo-electron microscopy. J. Struct. Biol. 183, 19–32 (2013).
    1. Tancik M. et al. Fourier features let networks learn high frequency functions in low dimensional domains. In Advances in Neural Information Processing Systems (NeurIPS, 2020).
    1. He K., Zhang X., Ren S. & Sun J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 770–778 (CVPR, 2016).
    1. Paszke A. et al. Pytorch: an imperative style, high-performance deep learning library. In Advances in Neural Information Processing Systems (NeurIPS, 2019).
    1. Kingma D. P. & Ba J. Adam: A method for stochastic optimization. Preprint at https://arxiv.org/abs/1412.6980 (2014).
