Identification of optimal biomarkers associated with distant metastasis in breast cancer using Boruta and Lasso machine learning algorithms
- PMID: 40804615
- PMCID: PMC12345036
- DOI: 10.1186/s12885-025-14664-1
Abstract
Objective: The aim of this study was to identify optimal biomarkers associated with distant metastasis in patients with breast cancer from among nutritional and inflammatory indicators using the Boruta and Least Absolute Shrinkage and Selection Operator (LASSO) machine learning algorithms, thereby improving the ability to identify distant metastasis.
Methods: A total of 348 patients newly diagnosed with breast cancer were included, comprising 185 patients with nonmetastatic breast cancer and 163 patients with distant metastatic breast cancer. The variables were initially screened using the Boruta algorithm, followed by further optimization through LASSO regression. The selected key indicators were evaluated for their association with distant metastasis risk using multivariate logistic regression analysis and restricted cubic spline functions. Discriminative performance was assessed through ROC curve analysis.
Results: Boruta and LASSO analyses identified five important indicators: the advanced lung cancer inflammation index (ALI), systemic inflammation response index (SIRI), monocyte-to-lymphocyte ratio (MLR), albumin-to-globulin ratio (AGR), and geriatric nutritional risk index (GNRI). Multivariate logistic regression analysis revealed that an elevated SIRI and MLR were associated with an increased risk of distant metastasis in patients with breast cancer, whereas a higher ALI, AGR, and GNRI were associated with a reduced risk. ROC analysis indicated moderate predictive performance for these indicators, with AUC values of approximately 0.65.
Conclusion: The ALI, SIRI, MLR, AGR, and GNRI are effective biomarkers for identifying the risk of distant metastasis in patients with breast cancer. These indicators could be incorporated into clinical practice to improve risk stratification, guide personalized treatment, and enhance patient outcomes.
Keywords: Biomarker; Boruta; Breast cancer; Distant metastasis; LASSO.
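The abstract names the five selected indices but does not state how they are computed or how an AUC near 0.65 should be read. The sketch below uses the definitions most commonly given in the literature for these indices; whether the study used exactly these formulas, units, and ideal-weight convention is an assumption, and the `Labs` container, its field names, and the example values in the usage note are hypothetical. The `auc` helper is the standard rank-based (Mann-Whitney) estimate of the area under the ROC curve, shown only to make the metric concrete, not as the authors' analysis code.

```python
from dataclasses import dataclass


@dataclass
class Labs:
    """Hypothetical per-patient inputs; field names are illustrative only."""
    neutrophils: float      # 10^9/L
    lymphocytes: float      # 10^9/L
    monocytes: float        # 10^9/L
    albumin_g_l: float      # serum albumin, g/L
    globulin_g_l: float     # serum globulin, g/L
    weight_kg: float
    height_m: float
    ideal_weight_kg: float  # supplied by the caller; ideal-weight formulas vary


def indices(p: Labs) -> dict:
    """Compute the five indices using commonly cited definitions (assumed)."""
    bmi = p.weight_kg / p.height_m ** 2
    nlr = p.neutrophils / p.lymphocytes  # neutrophil-to-lymphocyte ratio
    return {
        # ALI = BMI x albumin (g/dL) / NLR; albumin is converted from g/L.
        "ALI": bmi * (p.albumin_g_l / 10.0) / nlr,
        # SIRI = neutrophils x monocytes / lymphocytes
        "SIRI": p.neutrophils * p.monocytes / p.lymphocytes,
        # MLR = monocytes / lymphocytes
        "MLR": p.monocytes / p.lymphocytes,
        # AGR = albumin / globulin
        "AGR": p.albumin_g_l / p.globulin_g_l,
        # GNRI = 1.489 x albumin (g/L) + 41.7 x (weight / ideal weight).
        # Some variants cap the weight ratio at 1; no cap is applied here.
        "GNRI": 1.489 * p.albumin_g_l
                + 41.7 * (p.weight_kg / p.ideal_weight_kg),
    }


def auc(scores_pos, scores_neg) -> float:
    """Rank-based AUC: P(pos > neg) + 0.5 * P(pos == neg) over all pairs."""
    wins = 0.0
    for sp in scores_pos:
        for sn in scores_neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))
```

As a usage example with made-up values, `indices(Labs(4.0, 2.0, 0.5, 40.0, 25.0, 60.0, 1.6, 60.0))` yields one value per index for a single patient. Under this definition of `auc`, a value of 0.5 means the marker ranks metastatic and nonmetastatic patients no better than chance and 1.0 means perfect separation. For markers where higher values are reported as protective (ALI, AGR, GNRI), the scores would need to be negated before calling `auc` so that larger scores indicate higher risk.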
© 2025. The Author(s).
Conflict of interest statement
Declarations. Ethics approval and consent to participate: This study was conducted in accordance with the ethical standards of the institutional and/or national research committee and with the principles of the Declaration of Helsinki ( https://www.wma.net/policies-post/wma-declaration-of-helsinki/ ). Ethical approval was obtained from the Medical Ethics Committee of Guangxi Medical University Cancer Hospital (Reference Number: KY2023868). Given the retrospective nature of the study, the requirement for informed consent was waived by the Medical Ethics Committee of Guangxi Medical University Cancer Hospital. Consent for publication: Not applicable. Competing interests: The authors declare no competing interests.