Metabolites. 2025 Feb 14;15(2):132.
doi: 10.3390/metabo15020132.

Deep Learning-Based Molecular Fingerprint Prediction for Metabolite Annotation

Hoi Yan Katharine Chau et al. Metabolites. 2025.

Abstract

Background/Objectives: Liquid chromatography coupled with mass spectrometry (LC-MS) is a commonly used platform in metabolomics studies. However, metabolite annotation remains a major bottleneck, in part because publicly available spectral libraries are limited: they contain tandem mass spectrometry (MS/MS) data acquired from only a fraction of known compounds. Deep learning methods are increasingly reported as an alternative to spectral matching because of their ability to map complex relationships between molecular fingerprints and mass spectrometric measurements. The objectives of this study are to investigate deep learning methods for predicting molecular fingerprints from MS/MS spectra and to rank putative metabolite IDs according to the similarity between their known and predicted molecular fingerprints.

Methods: We trained three types of deep learning methods to model the relationships between molecular fingerprints and MS/MS spectra. Prior to training, various data processing steps, including scaling, binning, and filtering, were performed on MS/MS spectra obtained from the National Institute of Standards and Technology (NIST), MassBank of North America (MoNA), and the Human Metabolome Database (HMDB). Furthermore, the most relevant m/z bins and molecular fingerprints were selected. The trained deep learning models were evaluated on ranking putative metabolite IDs obtained from a compound database for the challenges in the Critical Assessment of Small Molecule Identification (CASMI) 2016, CASMI 2017, and CASMI 2022 benchmark datasets.

Results: Feature selection methods effectively reduced redundant molecular and spectral features prior to model training. Deep learning methods trained with the truncated features performed comparably to CSI:FingerID in ranking putative metabolite IDs.

Conclusion: The results demonstrate the promising potential of deep learning methods for metabolite annotation.
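The candidate-ranking step described in the abstract can be illustrated with a minimal sketch. The abstract does not name the similarity measure, so the Tanimoto coefficient used here is an assumption, as are the 0.5 binarization threshold, the function names, and the toy five-bit fingerprints; none of these are taken from the paper.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two equal-length binary fingerprints."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have the same length")
    both = sum(1 for x, y in zip(a, b) if x and y)    # bits set in both
    either = sum(1 for x, y in zip(a, b) if x or y)   # bits set in at least one
    return both / either if either else 0.0


def rank_candidates(predicted_probs, candidates, threshold=0.5):
    """Rank candidate compounds by similarity to a predicted fingerprint.

    predicted_probs: per-bit probabilities from a model (hypothetical output).
    candidates: dict mapping candidate name -> known binary fingerprint.
    threshold: assumed cutoff for turning probabilities into bits.
    """
    predicted_bits = [1 if p >= threshold else 0 for p in predicted_probs]
    scored = [(name, tanimoto(predicted_bits, fp)) for name, fp in candidates.items()]
    # Highest similarity first
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy data, invented for illustration only
predicted = [0.9, 0.1, 0.8, 0.7, 0.2]
candidates = {
    "candidate_A": [1, 0, 1, 1, 0],
    "candidate_B": [1, 1, 0, 0, 0],
    "candidate_C": [0, 0, 1, 1, 1],
}
ranking = rank_candidates(predicted, candidates)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

In practice the candidate fingerprints would come from a compound database query (for example, by predicted molecular formula), and a measure that weights bits by prediction confidence could replace the plain Tanimoto coefficient.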

Keywords: LC-MS/MS; deep learning; metabolite identification; molecular fingerprint prediction.

Conflict of interest statement

The authors declare no conflicts of interest.

Figures

Figure 1. Workflow of a deep learning-based metabolite annotation that includes MS/MS data processing, feature selection, model training, molecular fingerprint prediction, molecular formula prediction, candidate retrieval, and candidate ranking.

Figure 2. Architecture of a deep learning model for predicting molecular fingerprints based on MS/MS spectra transformed into vectors.
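As a rough illustration of how an MS/MS spectrum might be transformed into a fixed-length vector through scaling, binning, and filtering, consider the sketch below. The abstract does not give the parameters, so the m/z range (0 to 1000), the 1.0 bin width, base-peak scaling, the 1% relative-intensity filter, and keeping the maximum intensity per bin are all assumptions chosen for illustration, not the authors' settings.

```python
import math


def bin_spectrum(peaks, mz_min=0.0, mz_max=1000.0, bin_width=1.0,
                 min_rel_intensity=0.01):
    """Convert a list of (m/z, intensity) peaks into a fixed-length vector.

    All default parameter values are illustrative assumptions.
    """
    n_bins = int(math.ceil((mz_max - mz_min) / bin_width))
    vector = [0.0] * n_bins
    if not peaks:
        return vector

    base_peak = max(intensity for _, intensity in peaks)
    if base_peak <= 0:
        return vector

    for mz, intensity in peaks:
        rel = intensity / base_peak            # scaling: relative to the base peak
        if rel < min_rel_intensity:            # filtering: drop low-intensity noise
            continue
        if not (mz_min <= mz < mz_max):        # filtering: drop out-of-range peaks
            continue
        idx = min(int((mz - mz_min) / bin_width), n_bins - 1)  # binning
        vector[idx] = max(vector[idx], rel)    # keep the strongest peak per bin
    return vector


# Toy spectrum, invented for illustration only
peaks = [(85.03, 200.0), (85.6, 50.0), (127.04, 1000.0), (300.2, 5.0)]
vector = bin_spectrum(peaks)
nonzero = {i: v for i, v in enumerate(vector) if v > 0}
print(nonzero)
```

Summing intensities within a bin instead of taking the maximum is an equally plausible choice; the maximum is used here only because it keeps every entry in the range 0 to 1.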
