Data processing solutions to render metabolomics more quantitative: case studies in food and clinical metabolomics using Metabox 2.0
- PMID: 38488666
- PMCID: PMC10941642
- DOI: 10.1093/gigascience/giae005
Abstract
In classic semiquantitative metabolomics, metabolite intensities are affected by biological factors and by other, unwanted sources of variation. A systematic evaluation of data processing methods is crucial for identifying adequate processing procedures for a given experimental setup. Current comparative studies focus mostly on peak area data rather than on absolute concentrations. In this study, we evaluated data processing methods by how closely their outputs resembled the corresponding absolute quantified data. We examined the data distribution characteristics, fold difference patterns between 2 metabolites, and sample variance. We used 2 metabolomic datasets, from a retail milk study and a lupus nephritis cohort, as test cases. When studying the impact of data normalization, transformation, scaling, and combinations of these methods, we found that the cross-contribution compensating multiple standard normalization (ccmn) method, followed by square root data transformation, was most appropriate for a well-controlled study such as the milk study dataset. For the lupus nephritis cohort, a noisier dataset, only ccmn normalization slightly improved data quality. Because the assessment was based on the resemblance between processed data and the corresponding absolute quantified data, our results provide a useful guideline for processing metabolomic datasets in similar contexts (food and clinical metabolomics). Finally, we introduce Metabox 2.0, which enables thorough analysis of metabolomic data, including data processing, biomarker analysis, integrative analysis, and data interpretation. It was successfully used to process and analyze the data in this study. An online web version is available at http://metsysbio.com/metabox.
Keywords: R package; data processing; metabolomics; normalization; quantitative analysis; scaling; semiquantitative analysis; transformation.
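As a rough illustration of two of the criteria named in the abstract (fold difference between 2 metabolites and sample variance), the following minimal Python sketch applies a square root transformation to a small, made-up peak area table. It is not the Metabox 2.0 API and does not implement ccmn normalization; the function names and the example values are hypothetical and chosen only for illustration.

```python
import math
from statistics import mean, stdev


def sqrt_transform(table):
    """Square-root transform a samples x metabolites table of non-negative values."""
    return [[math.sqrt(value) for value in row] for row in table]


def fold_difference(table, i, j):
    """Ratio of the mean level of metabolite column i to that of column j."""
    mean_i = mean(row[i] for row in table)
    mean_j = mean(row[j] for row in table)
    return mean_i / mean_j


def relative_sd(table, col):
    """Sample standard deviation of one metabolite column divided by its mean."""
    values = [row[col] for row in table]
    return stdev(values) / mean(values)


# Hypothetical peak areas: 3 samples (rows) x 2 metabolites (columns).
peak_areas = [
    [400.0, 100.0],
    [900.0, 225.0],
    [1600.0, 400.0],
]

transformed = sqrt_transform(peak_areas)

# Fold difference between the two metabolites before and after transformation.
raw_fold = fold_difference(peak_areas, 0, 1)
sqrt_fold = fold_difference(transformed, 0, 1)

# Relative spread of the first metabolite before and after transformation.
raw_rsd = relative_sd(peak_areas, 0)
sqrt_rsd = relative_sd(transformed, 0)
```

In this toy table, the transformation compresses both the fold difference between the two metabolites and the relative spread across samples. In an evaluation like the one the abstract describes, such quantities would be compared against the same quantities computed from absolute concentrations, which are not available in this sketch.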
© The Author(s) 2024. Published by Oxford University Press GigaScience.
Conflict of interest statement
The authors declare that they have no competing interests.