This is a preprint.
SPA-STOCSY: An Automated Tool for Identification of Annotated and Non-Annotated Metabolites in High-Throughput NMR Spectra
- PMID: 36865102
- PMCID: PMC9980041
- DOI: 10.1101/2023.02.22.529564
Update in
- SPA-STOCSY: an automated tool for identifying annotated and non-annotated metabolites in high-throughput NMR spectra. Bioinformatics. 2023 Oct 3;39(10):btad593. doi: 10.1093/bioinformatics/btad593. PMID: 37792497.
Abstract
Nuclear Magnetic Resonance (NMR) spectroscopy is widely used to analyze metabolites in biological samples, but the analysis can be cumbersome and inaccurate. Here, we present a powerful automated tool, SPA-STOCSY (Spatial Clustering Algorithm - Statistical Total Correlation Spectroscopy), which overcomes these challenges by identifying metabolites in each sample with high accuracy. As a data-driven method, SPA-STOCSY estimates all parameters from the input dataset: it first investigates the covariance pattern and then calculates the optimal threshold with which to cluster data points belonging to the same structural unit, i.e., a metabolite. The generated clusters are then automatically linked to a compound library to identify candidates. To assess SPA-STOCSY’s efficiency and accuracy, we applied it to synthesized NMR data and to real NMR data obtained from Drosophila melanogaster brains and human embryonic stem cells. In the synthesized spectra, SPA outperforms Statistical Recoupling of Variables, an existing method for clustering spectral peaks, by capturing a higher percentage of both the signal regions and the close-to-zero noise regions. In the real spectra, SPA-STOCSY performs comparably to operator-based Chenomx analysis but avoids operator bias and completes the analyses in less than seven minutes of total computation time. Overall, SPA-STOCSY is a fast, accurate, and unbiased tool for untargeted analysis of metabolites in NMR spectra. As such, it might accelerate the utilization of NMR for scientific discoveries, medical diagnostics, and patient-specific decision making.
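The two-stage idea described in the abstract (cluster neighbouring spectral points that co-vary, then correlate the clusters with one another) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `adjacent_clusters` and `cluster_correlations`, the synthetic Gaussian peaks, and the fixed threshold of 0.8 are all assumptions made for illustration, whereas SPA-STOCSY is described as estimating its threshold from the data.

```python
import numpy as np

def adjacent_clusters(X, threshold):
    """Group neighbouring spectral variables whose Pearson correlation
    across samples exceeds `threshold` (hypothetical, simplified step)."""
    Xc = X - X.mean(axis=0)
    norm = np.sqrt((Xc ** 2).sum(axis=0))
    norm[norm == 0] = np.inf            # constant columns get correlation 0
    Z = Xc / norm
    r = (Z[:, :-1] * Z[:, 1:]).sum(axis=0)   # r[i] = corr(variable i, i+1)
    clusters, start = [], None
    for i, linked in enumerate(r > threshold):
        if linked and start is None:
            start = i                   # open a new cluster at variable i
        elif not linked and start is not None:
            clusters.append((start, i)) # inclusive range start..i
            start = None
    if start is not None:
        clusters.append((start, X.shape[1] - 1))
    return clusters

def cluster_correlations(X, clusters):
    """STOCSY-style step: correlate the mean intensity of each cluster
    across samples to see which clusters vary together."""
    reps = np.column_stack([X[:, a:b + 1].mean(axis=1) for a, b in clusters])
    return np.corrcoef(reps, rowvar=False)

# Synthetic data (assumption): two "metabolites", each with two peaks.
rng = np.random.default_rng(0)
n_samples, n_points = 200, 200
grid = np.arange(n_points)

def peak(center, width=2.0):
    return np.exp(-0.5 * ((grid - center) / width) ** 2)

conc_a = rng.uniform(1.0, 3.0, n_samples)   # concentration of metabolite A
conc_b = rng.uniform(1.0, 3.0, n_samples)   # concentration of metabolite B
X = (np.outer(conc_a, peak(30) + peak(120))
     + np.outer(conc_b, peak(70) + peak(160))
     + rng.normal(0.0, 0.01, (n_samples, n_points)))

clusters = adjacent_clusters(X, threshold=0.8)
C = cluster_correlations(X, clusters)
print(clusters)
print(np.round(C, 2))
```

In this toy setting, clusters that belong to the same simulated metabolite should correlate strongly with each other and only weakly with clusters of the other metabolite. Matching the linked clusters against a compound library, as the abstract describes, is outside the scope of this sketch.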
Conflict of interest statement
Competing interests
The authors declare no competing interests.