Sci Adv. 2025 Sep 26;11(39):eadw8122.
doi: 10.1126/sciadv.adw8122. Epub 2025 Sep 26.

Scalable deep learning reconstruction for accelerated multidimensional nuclear magnetic resonance spectroscopy of proteins

Yihui Huang et al. Sci Adv.

Abstract

High-dimensional nuclear magnetic resonance (NMR) spectroscopy can assist in determining protein structure, but it requires time-consuming acquisition. Deep learning enables ultrafast reconstruction but is limited to spectra of up to three dimensions and cannot provide faithful reconstruction under unseen acceleration factors. Extending deep learning to handle higher-dimensional spectra and varying acceleration factors is desirable. However, scalability requires complex networks and more data, seriously hindering applications. To address this, we designed a network to learn data in one dimension (1D). First, time-domain signals were modeled as the outer product of 1D exponentials. Then, each 1D exponential was approximated with a rank-one Hankel matrix. Last, reconstruction error was corrected with a neural network. Here, we demonstrate robust 3D NMR reconstruction across acceleration factors (2 to 33) using one trained network. In addition, we find that reconstruction of 4D NMR is possible with artificial intelligence. This work opens an avenue for accelerating arbitrarily high-dimensional NMR.
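The two modeling steps named in the abstract can be illustrated with a toy NumPy sketch. It is not the authors' implementation and contains no neural network correction; the helper names `exponential` and `hankel`, and the frequencies, decay rates, and signal lengths, are hypothetical values chosen only for illustration. The sketch builds one 2D peak as the outer product of two 1D damped exponentials and checks that the Hankel matrix of a single exponential has only one significant singular value.

```python
import numpy as np


def exponential(n_points, freq, decay):
    """Sampled 1D damped complex exponential exp((2*pi*i*freq - decay) * t)."""
    t = np.arange(n_points)
    return np.exp((2j * np.pi * freq - decay) * t)


def hankel(x, n_rows):
    """Hankel matrix with entries H[i, j] = x[i + j] built from a 1D signal."""
    n_cols = len(x) - n_rows + 1
    idx = np.arange(n_rows)[:, None] + np.arange(n_cols)[None, :]
    return x[idx]


# Hypothetical toy parameters (not taken from the paper).
a = exponential(32, freq=0.12, decay=0.05)
b = exponential(24, freq=0.31, decay=0.08)

# One 2D time-domain peak modeled as the outer product of two 1D exponentials.
fid_2d = np.outer(a, b)

# For x[k] = z**k, H[i, j] = z**i * z**j, so the Hankel matrix is rank one:
# every singular value after the first is at the level of rounding error.
s = np.linalg.svd(hankel(a, 16), compute_uv=False)
print(fid_2d.shape, s[1] / s[0])
```

A spectrum with several peaks would be a sum of such outer products, one per peak, which is why the low-rank structure can be exploited one dimension at a time.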

Figures

Fig. 1. Reconstructed 3D HNCACB spectra of GB1-HttNTQ7 protein.
(A) Fully sampled spectrum; (B to E and G to J) reconstructions by SMILE, CS, HLRF, WPR-Net, DLNMR, MoDern, JTF-Net, and ROAD from 5% of the data; (L to O) reconstructions by DLNMR, MoDern, JTF-Net, and ROAD from 20% of the data. (F and K) Sampling patterns at NUS rates of 5 and 20%. Note: The first and second rows are the spectrum projections on the 1H-15N and 1H-13C planes. The original FID is fully sampled, with a size of 90 × 44 × 1024, and NUS is performed on the 90 × 44 indirect plane. JTF-Net is trained on a 2D NUS dataset with sampling rates from 6 to 8%, and WPR-Net is trained on a 1D NUS dataset at a sampling rate of 20%. Model weights were shared by the original authors. The other DL methods are trained on a 2D NUS dataset at a sampling rate of 5%. Values in pink and blue denote the best evaluation metric under each sampling rate. Arrows in pink and blue mark missing peaks and false peaks, respectively. ppm, parts per million.
Fig. 2. Reconstructed 3D HNCO spectra of Azurin protein.
(A) Fully sampled spectrum; (B to E and G to J) reconstructions by SMILE, CS, HLRF, WPR-Net, DLNMR, MoDern, JTF-Net, and ROAD from 3% of the data; (L to O) reconstructions by DLNMR, MoDern, JTF-Net, and ROAD from 50% of the data. (F and K) Sampling patterns at NUS rates of 3 and 50%. Note: The first and second rows are the spectrum projections on the 1H-15N and 1H-13C planes. The original FID is fully sampled, with a size of 60 × 60 × 1024, and NUS is performed on the 60 × 60 indirect plane. JTF-Net is trained on a 2D NUS dataset with sampling rates from 6 to 8%, and WPR-Net is trained on a 1D NUS dataset at a sampling rate of 20%. Model weights were shared by the original authors. The other DL methods are trained on a 2D NUS dataset at a sampling rate of 5%. The 2D planes of spectra marked with a red box are shown in section S11.4. Values in pink and blue denote the best metrics under sampling rates of 3 and 50%.
Fig. 3. Mean of quantitative metrics of reconstructed spectra.
(A to D) FPR, MPR, error of the center frequency of peaks, and SPCC of peak intensities of 3D HNCACB spectra of GB1-HttNTQ7 protein. (E to H) FPR, MPR, error of the center frequency of peaks, and SPCC of peak intensities of 3D HNCO spectra of Azurin protein. Note: Each data point represents the quantitative metric of one NUS experiment. Lower values in (A) to (C) and (E) to (G) and higher values in (D) and (H) indicate better reconstruction.
Fig. 4. Reconstructed 4D 13C-13C-SF HMQC NOESY spectrum of human MALT1 protein.
(A and B) Reconstructions performed by CS and ROAD from 9% of the data. (C to F) The assignments of diagonal and cross peaks from (A) and (B), respectively. Note: The first, second, and third columns are projections of the reconstructed spectrum on the 1H-1H, 13C-1H, and 13C-13C planes. The fourth and fifth columns show the 2D planes at 13C/1H positions of 24.702/0.817 ppm and 20.401/0.139 ppm. The original FID is acquired with NUS, with a size of 32 × 40 × 40 × 1024, and NUS is performed on the first three dimensions.
Fig. 5. Reconstructed 4D methyl HMQC-NOESY-HMQC spectrum of the isoleucine, leucine, and valine methyl-labeled m04 protein of cytomegalovirus.
(A and B) Reconstructions performed by SMILE and ROAD from 1.56% of the data. (C to F) The 2D planes extracted at 13C/1H positions of 15.066/0.717 ppm and 23.474/0.558 ppm, respectively. Note: The first, second, and third columns are projections of the reconstructed spectrum on the 1H-1H, 13C-1H, and 13C-13C planes. The original FID is acquired with NUS, with a size of 56 × 80 × 80 × 1024, and NUS is performed on the first three dimensions.
Fig. 6. Assignment of peaks in 2D planes of the reconstructed 4D 13C-13C-SF HMQC NOESY spectrum of human MALT1 protein.
(A and B) Reconstructions performed by CS and ROAD from 9% of the data. The original FID is acquired with NUS, with a size of 32 × 40 × 40 × 1024, and the 2D plane is extracted at the 13C/1H position of 20.453/0.329 ppm.
Fig. 7. Reconstructed 4D 13C-13C-SF HMQC NOESY spectrum of human MALT1 protein.
(A and B) Reconstructions performed by CS and ROAD from 9% of the data. The Ph0 in one of the 13C dimensions is not adjusted to zero before reconstruction. The first, second, and third columns are projections of the reconstructed spectrum on the 1H-1H, 13C-1H, and 13C-13C planes.
Fig. 8. Multidimensional modeling of NMR with outer product of exponentials.
(A) Spectrum. (B) FID.
Fig. 9. Algorithm pipeline of ROAD.
(A) The iteration block for 2D NUS reconstruction. (B) Peak retrieval module. (C) Factor matrix correction module (FMCM). (D) ROA module. (E) The iteration block for 3D NUS reconstruction. In (A), the blue lines with arrows at the top carry the undersampled time-domain signal (FID) Y˙ and the undersampling operator 𝒰 from the input of ROAD to the modules; neither is changed. The blue lines with arrows in the middle and at the bottom represent the intermediate time-domain variables, which change as they flow from one module to the next.
Fig. 10. Algorithm interpretation with a toy 2D example.
(A) A NUS spectrum before reconstruction; (B) one peak of (A); (C and D) the reconstruction of (B) in the third iteration block without and with data consistency (without neural network correction); (E) fully sampled spectrum; (F) one peak of (E); (H and G) the reconstructed peaks in the 3rd and 10th iteration blocks with neural network correction.
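
The captions above quote NUS rates on the indirect plane (for example 5% of a 90 × 44 plane), while the abstract quotes acceleration factors. A minimal sketch of how the two relate follows, assuming the acceleration factor is the ratio of total to acquired indirect points. The helper `random_nus_mask` is hypothetical, and uniform random selection is used purely for illustration; the sampling schedules actually used for the figures are not specified here.

```python
import numpy as np


def random_nus_mask(shape, rate, seed=0):
    """Boolean mask that keeps a fraction `rate` of the indirect-plane points.

    Uniform random selection is an assumption made for this illustration.
    """
    rng = np.random.default_rng(seed)
    n_total = int(np.prod(shape))
    n_keep = max(1, round(rate * n_total))
    mask = np.zeros(n_total, dtype=bool)
    mask[rng.choice(n_total, size=n_keep, replace=False)] = True
    return mask.reshape(shape)


# Indirect plane size and NUS rate quoted in the Fig. 1 caption.
mask = random_nus_mask((90, 44), rate=0.05)

# Acceleration factor, assumed here to be total points / acquired points.
acceleration = mask.size / mask.sum()
print(int(mask.sum()), acceleration)
```

Under this assumption, the 5% schedule keeps 198 of 3960 indirect points, an acceleration factor of 20, and the 3 and 50% rates in Fig. 2 correspond to factors of roughly 33 and 2.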
