Development of biomarkers based on diet-dependent metabolic serotypes: practical issues in development of expert system-based classification models in metabolomic studies
- PMID: 15669713
- DOI: 10.1089/omi.2004.8.197
Abstract
Dietary restriction (DR)-induced changes in the serum metabolome may be biomarkers for physiological status (e.g., relative risk of developing age-related diseases such as cancer). Megavariate analysis (unsupervised hierarchical cluster analysis [HCA]; principal components analysis [PCA]) of serum metabolites reproducibly distinguishes DR from ad libitum fed rats. Component-based approaches (i.e., PCA) consistently perform as well as or better than distance-based metrics (i.e., HCA). We therefore tested the following: (A) Do identified subsets of serum metabolites contain sufficient information to construct mathematical models of class membership (i.e., expert systems)? (B) Do component-based metrics outperform distance-based metrics? Testing was conducted using KNN (k-nearest neighbors, supervised HCA) and SIMCA (soft independent modeling of class analogy, supervised PCA). Models were built with single cohorts, combined cohorts, or mixed samples from previously studied cohorts as training sets. Both algorithms over-fit models based on single-cohort training sets. KNN models had >85% accuracy within training/test sets but were unstable (i.e., values of k could not be accurately set in advance). SIMCA models had 100% accuracy within all training sets and 89% accuracy in test sets, did not appear to over-fit mixed-cohort training sets, and did not require post-hoc modeling adjustments. These data indicate that (i) previously defined metabolites are robust enough to construct classification models (expert systems) with SIMCA that can predict unknowns by dietary category; (ii) component-based analyses outperformed distance-based metrics; (iii) use of over-fitting controls is essential; and (iv) subtle inter-cohort variability may be a critical issue for high data density biomarker studies that lack state markers.
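To make the contrast between the two classifier families concrete, the following is a minimal sketch in Python with NumPy. It is not the authors' implementation or software. It assumes a simplified SIMCA (one PCA model per class, assignment by the smallest scaled residual distance) and a plain majority-vote KNN. All function names, the six-feature layout, the noise levels, and the synthetic "AL"/"DR" data are hypothetical and chosen only for illustration; nothing here reproduces the study's metabolite data or its reported accuracies.

```python
import numpy as np


def fit_class_model(X, n_components):
    """Fit a PCA model to the samples of ONE class (simplified SIMCA step)."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # Rows of Vt are the principal axes of the centered class data.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:n_components].T                      # loadings, (features, components)
    resid = Xc - Xc @ P @ P.T                    # part not explained by the model
    n, m = X.shape
    dof = max((n - n_components - 1) * (m - n_components), 1)
    s0 = np.sqrt((resid ** 2).sum() / dof)       # pooled residual scale of the class
    return mean, P, s0


def residual_distance(x, model):
    """Residual of x against a class model, scaled by that class's s0."""
    mean, P, s0 = model
    xc = x - mean
    e = xc - P @ (P.T @ xc)
    s = np.sqrt((e ** 2).sum() / (x.size - P.shape[1]))
    return s / s0


def simca_predict(x, models):
    """Assign x to the class whose PCA model leaves the smallest scaled residual."""
    return min(models, key=lambda label: residual_distance(x, models[label]))


def knn_predict(x, X_train, y_train, k):
    """Majority vote among the k nearest training samples (Euclidean distance)."""
    dist = np.linalg.norm(X_train - x, axis=1)
    nearest = np.argsort(dist)[:k]
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]


def make_cohort(rng, n_per_class):
    """Synthetic cohort (assumption): shared within-class variation plus a class shift."""
    v = np.array([1.0, 1.0, 1.0, 0.0, 0.0, 0.0]) / np.sqrt(3)
    offset = np.array([0.0, 0.0, 0.0, 5.0, 5.0, 5.0])
    X, y = [], []
    for label, shift in (("AL", 0.0), ("DR", 1.0)):
        scores = rng.normal(0.0, 3.0, size=(n_per_class, 1))
        noise = rng.normal(0.0, 0.5, size=(n_per_class, 6))
        X.append(scores * v + shift * offset + noise)
        y += [label] * n_per_class
    return np.vstack(X), np.array(y)


rng = np.random.default_rng(0)
X_train, y_train = make_cohort(rng, 20)   # stand-in for a training cohort
X_test, y_test = make_cohort(rng, 20)     # stand-in for an independent test cohort

# One PCA model per dietary class, each with a single component.
models = {
    label: fit_class_model(X_train[y_train == label], n_components=1)
    for label in ("AL", "DR")
}

simca_acc = np.mean([simca_predict(x, models) == t for x, t in zip(X_test, y_test)])
# KNN accuracy has to be checked per k, since k is a free parameter.
knn_acc = {
    k: np.mean([knn_predict(x, X_train, y_train, k) == t
                for x, t in zip(X_test, y_test)])
    for k in (1, 3, 5)
}
print("SIMCA test accuracy:", simca_acc)
print("KNN test accuracy by k:", knn_acc)
```

Because the synthetic classes are well separated by construction, both classifiers should do well here; the sketch shows the mechanics only. It does not demonstrate the over-fitting or the instability in k that the abstract reports for real cohorts. Evaluating on a separately generated cohort, rather than on the training samples, mirrors the training/test distinction the abstract draws.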