Performance evaluation of different chemometrics formalisms used for prostate cancer diagnosis by NMR-based metabolomics
- PMID: 38127222
- DOI: 10.1007/s11306-023-02067-x
Abstract
Introduction: In general, two characteristics are ever present in NMR-based metabolomics studies: (1) they are assays that aim to classify samples into different groups, and (2) the number of samples is smaller than the number of features (chemical shifts). Imbalanced datasets are also common, owing to the sampling method and/or inclusion criteria. These conditions can cause overfitting. However, appropriate feature selection and classification methods can help to address this issue.
Objectives: To investigate the performance of metabolomics models built by combining feature selectors (or no feature selection) with classification algorithms, and to use the best-performing model as an NMR-based metabolomic method for prostate cancer diagnosis.
Methods: We evaluated the performance of NMR-based metabolomics models for prostate cancer diagnosis using seven feature selectors and five classification formalisms. We also obtained metabolomics models without feature selection. In this study, thirty-eight volunteers with a positive diagnosis of prostate cancer and twenty-three healthy volunteers were enrolled.
Results: The thirty-eight models obtained were evaluated using AUROC, accuracy, sensitivity, specificity, and kappa index values. The best result was obtained by combining a Genetic Algorithm with Linear Discriminant Analysis, which gave 0.92 sensitivity, 0.83 specificity, and 0.88 accuracy.
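As an illustration of how the threshold-based metrics named above relate to one another, the following minimal sketch computes sensitivity, specificity, accuracy, and Cohen's kappa from a binary confusion matrix. The counts are hypothetical, chosen only so that sensitivity and specificity round to the reported 0.92 and 0.83; they are not the study's confusion matrix, and the abstract does not state which kappa variant was used (Cohen's is assumed here). AUROC is omitted because it requires continuous classifier scores rather than counts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Binary confusion-matrix counts (positive class = prostate cancer)."""
    tp: int  # cancer samples classified as cancer
    fn: int  # cancer samples classified as healthy
    tn: int  # healthy samples classified as healthy
    fp: int  # healthy samples classified as cancer


def diagnostic_metrics(c: Confusion) -> dict:
    n = c.tp + c.fn + c.tn + c.fp
    sensitivity = c.tp / (c.tp + c.fn)
    specificity = c.tn / (c.tn + c.fp)
    accuracy = (c.tp + c.tn) / n

    # Agreement expected by chance, from the row and column marginals.
    p_pos = ((c.tp + c.fn) / n) * ((c.tp + c.fp) / n)
    p_neg = ((c.tn + c.fp) / n) * ((c.tn + c.fn) / n)
    p_chance = p_pos + p_neg

    # Cohen's kappa: observed agreement corrected for chance agreement.
    kappa = (accuracy - p_chance) / (1 - p_chance)

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "kappa": kappa,
    }


# Hypothetical counts for illustration only (not taken from the paper).
example = Confusion(tp=11, fn=1, tn=5, fp=1)
metrics = diagnostic_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because kappa discounts the agreement expected by chance, it is a useful companion to accuracy when the two classes are not equally represented.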
Conclusion: The results show that the choice of a proper feature selection method, classification model, and resampling method can avoid overfitting in a small metabolomic dataset. Furthermore, this approach would decrease the number of biopsies and optimize patient follow-up. 1H NMR-based metabolomics promises to be a non-invasive tool in prostate cancer diagnosis.
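The abstract does not say which resampling method was used. As one possible illustration, the sketch below builds stratified k-fold splits for a cohort with the class sizes reported in Methods (38 cancer, 23 healthy), so that every test fold keeps roughly the same class ratio as the full dataset. The function name `stratified_folds`, the choice of five folds, and the seed are assumptions made for this example.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k, seed=0):
    """Split sample indices into k test folds, preserving class proportions."""
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Shuffle within each class, then deal indices round-robin across folds.
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


# Class sizes from the abstract: 1 = prostate cancer, 0 = healthy.
labels = [1] * 38 + [0] * 23
folds = stratified_folds(labels, k=5)

for fold_number, test_idx in enumerate(folds):
    n_cancer = sum(labels[i] for i in test_idx)
    n_healthy = len(test_idx) - n_cancer
    print(f"fold {fold_number}: cancer={n_cancer}, healthy={n_healthy}")
```

In any scheme of this kind, feature selection has to be refitted on the training portion of each split; selecting features on the full dataset first would leak test information and mask the overfitting the resampling is meant to expose.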
Keywords: Biomarkers; Feature selection; Metabonomics; Overfitting; Prostatic neoplasms; Proton magnetic resonance spectroscopy.
© 2023. The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.