SIAM J Imaging Sci. 2018;11(2):1441-1492.
doi: 10.1137/17M1153509. Epub 2018 May 31.

Structural Variability from Noisy Tomographic Projections

Joakim Andén et al. SIAM J Imaging Sci. 2018.

Abstract

In cryo-electron microscopy, the three-dimensional (3D) electric potentials of an ensemble of molecules are projected along arbitrary viewing directions to yield noisy two-dimensional images. The volume maps representing these potentials typically exhibit a great deal of structural variability, which is described by their 3D covariance matrix. Typically, this covariance matrix is approximately low rank and can be used to cluster the volumes or estimate the intrinsic geometry of the conformation space. We formulate the estimation of this covariance matrix as a linear inverse problem, yielding a consistent least-squares estimator. For n images of size N-by-N pixels, we propose an algorithm for calculating this covariance estimator with computational complexity $O(nN^4 + \sqrt{\kappa}\,N^6 \log N)$, where the condition number κ is empirically in the range 10–200. Its efficiency relies on the observation that the normal equations are equivalent to a deconvolution problem in six dimensions. This is then solved by the conjugate gradient method with an appropriate circulant preconditioner. The result is the first computationally efficient algorithm for consistent estimation of the 3D covariance from noisy projections. It also compares favorably in runtime with respect to previously proposed nonconsistent estimators. Motivated by the recent success of eigenvalue shrinkage procedures for high-dimensional covariance matrix estimation, we incorporate a shrinkage procedure that improves accuracy at lower signal-to-noise ratios. We evaluate our methods on simulated datasets and achieve classification results comparable to state-of-the-art methods in shorter running time. We also present results on clustering volumes in an experimental dataset, illustrating the power of the proposed algorithm for practical determination of structural variability.
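
The abstract's central computational idea, solving a deconvolution-type system by conjugate gradient (CG) with a circulant preconditioner, can be illustrated on a one-dimensional toy problem. The sketch below is not the paper's six-dimensional algorithm: it assumes a small symmetric positive definite Toeplitz matrix with entries 0.9^|i-j|, uses T. Chan's circulant approximation as one possible preconditioner (the abstract only says "appropriate"), and all function names are hypothetical.

```python
import numpy as np

def toeplitz_matvec(t, x):
    """Multiply the symmetric Toeplitz matrix with first column t by x using FFTs."""
    n = len(t)
    # Embed the n-by-n Toeplitz matrix in a 2n-by-2n circulant matrix.
    c = np.concatenate([t, [0.0], t[:0:-1]])
    y = np.fft.ifft(np.fft.fft(c) * np.fft.fft(x, 2 * n))
    return y[:n].real

def chan_circulant_eigs(t):
    """Eigenvalues of T. Chan's circulant approximation (an assumed choice)."""
    n = len(t)
    k = np.arange(n)
    c = ((n - k) * t + k * t[(n - k) % n]) / n  # first column of the circulant
    return np.fft.fft(c).real

def pcg(matvec, b, precond=lambda r: r, tol=1e-8, maxiter=500):
    """Preconditioned conjugate gradient; returns the solution and iteration count."""
    x = np.zeros_like(b)
    r = b.copy()
    z = precond(r)
    p = z.copy()
    rz = r @ z
    b_norm = np.linalg.norm(b)
    for it in range(1, maxiter + 1):
        Ap = matvec(p)
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) <= tol * b_norm:
            return x, it
        z = precond(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, maxiter

n = 64
t = 0.9 ** np.arange(n)                 # first column of the toy Toeplitz matrix
b = np.random.default_rng(0).standard_normal(n)
eigs = chan_circulant_eigs(t)

def apply_A(v):
    return toeplitz_matvec(t, v)

def apply_precond(r):
    # Inverting a circulant matrix is a pointwise division in the Fourier domain.
    return np.fft.ifft(np.fft.fft(r) / eigs).real

x_plain, it_plain = pcg(apply_A, b)
x_pre, it_pre = pcg(apply_A, b, apply_precond)
print("CG iterations without / with preconditioner:", it_plain, it_pre)
```

Both the matrix-vector product and the preconditioner cost O(n log n) per application here, which is the one-dimensional analogue of the log N factor in the complexity quoted above. The number of CG iterations is what the condition number controls.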

Keywords: 44A12; 62G05; 62H30; 62J07; 62J10; 65R32; 68U10; 92C55; Toeplitz matrices; conjugate gradient; cryo-electron microscopy; deconvolution; heterogeneity; principal component analysis; shift invariance; single-particle reconstruction.

Figures

Figure 1
Two sample cryo-EM images from a 10000-image dataset depicting the 70S ribosome complex in E. coli [46]. Each image measures 130-by-130 pixels with a pixel size of 2.82 Å. The images depict two similar molecular structures projected in approximately the same viewing direction, but the high noise level makes it difficult to distinguish the difference in structure.
Figure 2
The eigenvalue distribution of the sample covariance matrix for (a) p = 256, n = 512 and (b) p = 128, n = 1024 with σ = 1 and ℓ = 3 in both regimes. For (a), we have γ = 1/2 and the spiked covariance model predicts a maximum noise eigenvalue at $(1+\sqrt{1/2})^2 \approx 2.91$ and a signal eigenvalue at λ(3, 1/2) ≈ 4.67, while for (b), γ = 1/8 gives $(1+\sqrt{1/8})^2 \approx 1.83$ and λ(3, 1/8) ≈ 4.17.
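
The numbers quoted in the Figure 2 caption can be checked with a few lines of arithmetic. The sketch below assumes the standard spiked covariance formulas, which this excerpt does not spell out: a noise bulk edge at σ²(1 + √γ)² and, for a spike ℓ above the detection threshold σ²√γ, a sample eigenvalue at λ(ℓ, γ) = (ℓ + σ²)(1 + γσ²/ℓ), with γ = p/n. The function names are hypothetical.

```python
import math

def noise_edge(gamma, sigma2=1.0):
    """Upper edge of the noise eigenvalue bulk (assumed Marchenko-Pastur edge)."""
    return sigma2 * (1.0 + math.sqrt(gamma)) ** 2

def spiked_eigenvalue(ell, gamma, sigma2=1.0):
    """Predicted sample eigenvalue for a population spike ell (assumed formula)."""
    if ell <= sigma2 * math.sqrt(gamma):
        raise ValueError("spike is below the detection threshold")
    return (ell + sigma2) * (1.0 + gamma * sigma2 / ell)

# The two regimes from the caption, with sigma = 1 and ell = 3.
for p, n in [(256, 512), (128, 1024)]:
    gamma = p / n
    print(p, n, round(noise_edge(gamma), 2), round(spiked_eigenvalue(3.0, gamma), 2))
# prints:
# 256 512 2.91 4.67
# 128 1024 1.83 4.17
```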
Figure 3
Effect of shrinkage on the relative error of (a) $B_n$ and (b) $\Sigma_n$ for simulated data with N = 16, C = 2, and 7 distinct CTFs. The signal-to-noise ratio (see (91)) is 0.001.
Figure 4
The squared magnitudes $|\mathcal{F}h(\omega)|^2$ of two sample CTFs. Their sum forms a bandpass filter, worsening the conditioning of the least-squares estimators.
Figure 5
The simulation ground truth at N = 130 (top) and N = 16 (bottom). (a), (b) Two conformations of the 70S ribosome. (c) Their mean volume (red) and difference map (positive in blue, negative in green).
Figure 6
Covariance estimation results for different simulations with N = 16, C = 2, seven distinct CTFs, and unless otherwise noted, n = 4096 and uniform distribution of orientations. (a) The relative error in $\Sigma_n$ as a function of $\mathrm{SNR}_h$ for n = 1024, n = 4096, and n = 16384. (b) The correlation of the top eigenvector of $\Sigma_n$ with that of Cov[x]. (c) Top eigenvector correlations for $\Sigma_n$ and $\Sigma_n^{(s)}$. (d) Top eigenvector correlations for $\Sigma_n$ with orientations estimated using the ASPIRE toolbox. (e) Top eigenvector correlations for $\Sigma_n$ with different orientation distributions over SO(3) described by (92). (f) The cosine of the maximum principal angle between the top three eigenvectors of $\Sigma_n$ and those of Cov[x] for a simulation with C = 4 classes.
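
Panel (f) of Figure 6 reports the cosine of the maximum principal angle between two sets of eigenvectors. As a minimal sketch of how such a quantity is commonly computed (not necessarily how the authors implemented it): the cosines of the principal angles between two subspaces are the singular values of the product of their orthonormal bases, so the cosine of the largest angle is the smallest singular value. The function name is hypothetical.

```python
import numpy as np

def cos_max_principal_angle(U, V):
    """Cosine of the largest principal angle between span(U) and span(V).

    U and V are arrays whose columns span the two subspaces (same number of
    columns); they need not be orthonormal.
    """
    Qu, _ = np.linalg.qr(U)  # orthonormal basis for span(U)
    Qv, _ = np.linalg.qr(V)  # orthonormal basis for span(V)
    s = np.linalg.svd(Qu.T @ Qv, compute_uv=False)
    # Singular values are the cosines of the principal angles; clip round-off.
    return float(np.clip(s.min(), 0.0, 1.0))

# Two planes in R^3 sharing the first axis and tilted by 0.3 rad about it.
theta = 0.3
U = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
V = np.array([[1.0, 0.0], [0.0, np.cos(theta)], [0.0, np.sin(theta)]])
print(cos_max_principal_angle(U, V), np.cos(theta))
```

A value of 1 means the estimated eigenvectors span exactly the same subspace as the population eigenvectors, and a value of 0 means at least one direction is missed entirely.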
Figure 7
Clustering results for discrete variability with C = 2 classes imaged using n = 4096 images with resolution N = 16 for uniform distribution of viewing angles and seven distinct CTFs. (a) The top 32 eigenvalues of $\Sigma_n$ obtained at $\mathrm{SNR}_h$ = 0.01. (b) A histogram of the coordinates $\hat{\alpha}_s^{(1)}$ corresponding to the images $y_s$ for s = 1, …, n subject to the same $\mathrm{SNR}_h$. (c) The fraction of images classified correctly as a function of $\mathrm{SNR}_h$. (d) The normalized root mean squared error (NRMSE) of the reconstructed volumes.
Figure 8
Manifold learning results for continuous variability. (a) Volumes are generated by independently rotating two parts (green and blue) by angles $\theta_1$ and $\theta_2$ while keeping the remainder (red) fixed. (b) The cosine of the maximum principal angle between the top four population eigenvectors and those of $\Sigma_n$. (c) 3D diffusion map embedding coordinates of the volume coordinates $\{\bar{\alpha}_1, \ldots, \bar{\alpha}_n\}$, colored according to the first and second rotation angles. (d) The NRMSE of each volume estimate as a function of its diffusion map coordinate. (e) The NRMSE of the reconstructed volumes as a function of $\mathrm{SNR}_h$.
Figure 9
(a) The relative residuals for each iteration of CG applied to $A_n \mu_n = b_n$, denoted by $\mu_n^{(t)}$, with no CTF and uniform distribution of viewing angles. (b)–(d) The relative residuals for each CG iterate $\Sigma_n^{(t)}$ of $L_n(\Sigma_n) = B_n$ with (b) no CTF and uniform distribution of viewing angles, (c) three distinct CTFs and uniform distribution of viewing angles, and (d) three distinct CTFs and nonuniform distribution of viewing angles. For all plots, the residuals of the standard (nonpreconditioned) CG method are compared with those obtained using a circulant preconditioner. All methods were applied to n = 16384 images with size N = 16 and $\sigma^2 = 1$.
Figure 10
Running times for the whole covariance estimation algorithm applied to a dataset of size n = 16384 with varying image size N. Three scenarios are considered: no CTF with uniform distribution of viewing angles, three distinct CTFs with uniform distribution of viewing angles, and three CTFs with nonuniform distribution of viewing angles.
Figure 11
Covariance estimation on the 70S ribosome dataset. (a) Largest eigenvalues of the estimated covariance matrix $\Sigma_n$. (b) The estimated mean volume (red), together with the positive (blue) and negative (green) components of the top eigenvector. (c) Histogram of coordinates $\{\bar{\alpha}_1, \ldots, \bar{\alpha}_n\}$ from the Wiener filter estimator. (d) 2D histogram of coordinates $\{(\hat{\alpha}_1^{(1)}, \hat{\alpha}_1^{(2)}), \ldots, (\hat{\alpha}_n^{(1)}, \hat{\alpha}_n^{(2)})\}$ from the Wiener filter estimator. (e), (f) Full-resolution reconstructions obtained using RELION applied to the clusters identified in (c).

References

    1. Ammar GS, Gragg WB. The generalized Schur algorithm for the superfast solution of Toeplitz systems. In: Gilewicz J, Pindor M, Siemaszko W, editors. Rational Approximation and Its Applications in Mathematics and Physics. Lecture Notes in Math. 1237. Springer; New York: 1987. pp. 315–330. https://doi.org/10.1007/BFb0072474
    2. Amunts A, Brown A, Bai X-c, Llácer JL, Hussain T, Emsley P, Long F, Murshudov G, Scheres SHW, Ramakrishnan V. Structure of the yeast mitochondrial large ribosomal subunit. Science. 2014;343:1485–1489. https://doi.org/10.1126/science.1249410
    3. Andén J, Katsevich E, Singer A. Covariance estimation using conjugate gradient for 3D classification in cryo-EM. Proceedings of ISBI. 2015:200–204. https://doi.org/10.1109/ISBI.2015.7163849
    4. Andén J, Singer A. Factor analysis for spectral estimation. Proceedings of SampTA. 2017:169–173. https://doi.org/10.1109/SAMPTA.2017.8024447
    5. Axelsson O. Iterative Solution Methods. Cambridge University Press; Cambridge, UK: 1996.
