Review
2020 Oct 1;18:2818-2825.
doi: 10.1016/j.csbj.2020.09.033. eCollection 2020.

Deep metabolome: Applications of deep learning in metabolomics

Yotsawat Pomyen et al. Comput Struct Biotechnol J.

Abstract

In the past few years, deep learning has been successfully applied to various omics data. However, applications of deep learning in metabolomics are still relatively few compared to other omics fields. Currently, data pre-processing using convolutional neural network architectures appears to benefit the most from deep learning. Compound/structure identification and quantification using artificial neural networks/deep learning performed relatively better than traditional machine learning techniques, whereas only marginally better results are observed in biological interpretation. Before deep learning can be effectively applied to metabolomics, several challenges should be addressed, including metabolome-specific deep learning architectures, dimensionality problems, and model evaluation regimes.

Keywords: Artificial neural network; Deep learning; Mass spectrometry; Metabolomics; NMR.

Abbreviations: AI, Artificial Intelligence; ANN, Artificial Neural Network; AUC, Area Under the receiver-operating characteristic Curve; CCS value, Collision Cross Section value; CFM-EI, Competitive Fragmentation Modeling-Electron Ionization; CNN, Convolutional Neural Network; DL, Deep Learning; DNN, Deep Neural Network; ECFP, Extended Circular Fingerprint; ER, Estrogen Receptor; FID, Free Induction Decay; FP score, Fingerprint correlation score; FTIR, Fourier Transform Infrared; GC–MS, Gas Chromatography-Mass Spectrometry; HDLSS data, High Dimensional Low Sample Size data; IST, Iterative Soft Thresholding; istHMS, Implementation of IST at Harvard Medical School; LC-MS, Liquid Chromatography-Mass Spectrometry; LSTM, Long Short-Term Memory; ML, Machine Learning; MLP, Multi-layered Perceptron; MS, Mass Spectrometry; m/z, mass/charge ratio; NEIMS, Neural Electron-Ionization Mass Spectrometry; NMR, Nuclear Magnetic Resonance; NUS, Non-Uniform Sampling; PARAFAC2, Parallel Factor Analysis 2; ReLU, Rectified Linear Unit; RF, Random Forest; RNN, Recurrent Neural Network; SMARTS, SMILES arbitrary target specification; SMILE, Sparse Multidimensional Iterative Lineshape-enhanced; SMILES, Simplified Molecular-Input Line-Entry System; SRA, Sequence Read Archive; VAE, Variational Autoencoder.

Figures

Graphical abstract
Fig. 1
A) Number of publications with the keyword “deep learning” extracted from the PubMed database from 2015 to April 2020 in genomics, transcriptomics, proteomics, and metabolomics. B) Three categories of metabolomics applications to which deep learning has been applied. C) Bar plot of the number of parameters for different neural network architectures and applications. RNN, recurrent neural network; CNN, convolutional neural network; ANN, shallow artificial neural network.
Fig. 2
A) Combining data augmentation with weight sharing across different studies can alleviate the dimensionality problem in metabolomics. B) Biological data interpretation could benefit from converting non-image data to images to leverage the power of CNN architectures. C) Model evaluation should employ nested cross-validation instead of conventional k-fold cross-validation.
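
The nested cross-validation recommended in Fig. 2C can be illustrated with a minimal sketch. This is not the authors' code: the toy two-class data, the k-nearest-neighbour classifier, and all function names (`k_fold_indices`, `nested_cv`, `make_toy_data`) are assumptions chosen only to keep the example self-contained. The point it shows is that the hyperparameter is selected in an inner loop that sees only the outer training data, so each outer test fold stays untouched until the final score is computed.

```python
import random
from collections import Counter

def k_fold_indices(n, k, rng):
    """Shuffle indices 0..n-1 and deal them into k nearly equal folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]

def knn_predict(train, x, k):
    """Majority vote among the k training points closest to x."""
    nearest = sorted(
        train, key=lambda p: sum((a - b) ** 2 for a, b in zip(p[0], x))
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def accuracy(train, test, k):
    """Fraction of test points whose label is predicted correctly."""
    return sum(knn_predict(train, x, k) == y for x, y in test) / len(test)

def nested_cv(data, candidate_ks, outer_k=5, inner_k=3, seed=0):
    """Return one accuracy per outer fold; k is tuned inside each outer fold."""
    rng = random.Random(seed)
    outer_folds = k_fold_indices(len(data), outer_k, rng)
    outer_scores = []
    for i, test_idx in enumerate(outer_folds):
        test = [data[j] for j in test_idx]
        train = [data[j] for f, fold in enumerate(outer_folds)
                 if f != i for j in fold]

        # Inner loop: split the outer training set again to choose k.
        inner_folds = k_fold_indices(len(train), inner_k, rng)

        def inner_score(k):
            scores = []
            for m, val_idx in enumerate(inner_folds):
                val = [train[j] for j in val_idx]
                fit = [train[j] for f, fold in enumerate(inner_folds)
                       if f != m for j in fold]
                scores.append(accuracy(fit, val, k))
            return sum(scores) / len(scores)

        best_k = max(candidate_ks, key=inner_score)
        # The outer test fold is used only here, after k has been fixed.
        outer_scores.append(accuracy(train, test, best_k))
    return outer_scores

def make_toy_data(n=60, seed=1):
    """Hypothetical two-class, two-feature data; stands in for real samples."""
    rng = random.Random(seed)
    data = []
    for i in range(n):
        label = i % 2
        centre = 2.0 * label
        data.append(((rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)), label))
    return data

if __name__ == "__main__":
    scores = nested_cv(make_toy_data(), candidate_ks=[1, 3, 5])
    print("per-fold accuracy:", scores)
    print("mean accuracy:", sum(scores) / len(scores))
```

In conventional k-fold cross-validation the same folds would be used both to pick `k` and to report performance, which tends to give an optimistic estimate, particularly with few samples and many features.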
