Applications of machine learning in metabolomics: Disease modeling and classification
- PMID: 36506316
- PMCID: PMC9730048
- DOI: 10.3389/fgene.2022.1017340
Abstract
Metabolomics research has recently gained popularity because it enables the study of biological traits at the biochemical level and can therefore directly reveal what occurs in a cell or tissue in health or disease, complementing other omics such as genomics and transcriptomics. Like other high-throughput biological experiments, metabolomics produces vast volumes of complex data. The application of machine learning (ML) to analyze data, recognize patterns, and build models is expanding across multiple fields, and ML methods are likewise used for the classification, regression, and clustering of highly complex metabolomic data. This review discusses how disease modeling and diagnosis can be enhanced via deep and comprehensive metabolomic profiling using ML. We discuss the general layout of a metabolomics workflow and the fundamental ML techniques used to analyze metabolomic data, including support vector machines (SVM), decision trees, random forests (RF), neural networks (NN), and deep learning (DL). Finally, we present the advantages and disadvantages of various ML methods and provide suggestions for different metabolomic data analysis scenarios.
Keywords: biomarkers; deep learning; machine learning; metabolic disorders; metabolomics.
Copyright © 2022 Galal, Talal and Moustafa.
Conflict of interest statement
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
