Machine learning approach for differentiating iron deficiency anemia and thalassemia using random forest and gradient boosting algorithms
- PMID: 40374805
- PMCID: PMC12081706
- DOI: 10.1038/s41598-025-01458-5
Abstract
Formulas based on red blood cell indices have been used to differentiate iron deficiency anemia (IDA) from thalassemia (Thal), but their efficiency varies. In this study, we aimed to develop a tool for discriminating between IDA and Thal using the random forest (RF) and gradient boosting (GB) algorithms. Complete blood count data were collected from 1143 patients with anemia and low mean corpuscular volume (382 with IDA, 635 with Thal, and 126 with both IDA and Thal). The data were randomly divided into training and testing datasets in an 80:20 ratio. The RF and GB models showed good diagnostic performance for predicting IDA and Thal in both the training and testing datasets. In the testing dataset for predicting binary outcomes, GB and RF each achieved an accuracy of 90.7% and an area under the receiver operating characteristic curve (AUC-ROC) of 0.953. Diagnostic performance was lower when patients with both IDA and Thal were included: GB and RF achieved accuracies of 80.4% and 82.2%, respectively, and AUC-ROC values of 0.910 and 0.899, respectively. In conclusion, we developed a machine learning approach using the GB algorithm. This tool is potentially useful in regions where Thal and IDA are endemic.
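The evaluation steps named in the abstract (a random 80:20 split, accuracy, and AUC-ROC) can be illustrated with a minimal, standard-library-only sketch. It is not the authors' pipeline: the function names, the fixed seed, the 0.5 decision threshold, and the six toy scores are all assumptions made for illustration, and no RF or GB model is trained here.

```python
import random


def split_80_20(records, seed=0):
    """Shuffle a copy of the records and split them 80:20 (train:test).

    The seed is a hypothetical choice; the abstract does not state one.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.8 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def accuracy(y_true, y_pred):
    """Fraction of predicted labels that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def auc_roc(y_true, scores):
    """AUC-ROC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Placeholder IDs standing in for the 1143 patient records.
train_ids, test_ids = split_80_20(range(1143))

# Toy labels and model scores (illustrative only, not study data).
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

toy_accuracy = accuracy(y_true, y_pred)
toy_auc = auc_roc(y_true, scores)
```

Accuracy depends on the chosen decision threshold, whereas AUC-ROC summarizes ranking quality across all thresholds, which is why the two metrics can order models differently, as they do for GB and RF in the three-group setting reported above.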
Keywords: Gradient boosting; Iron deficiency anemia; Machine learning; Random forest; Thalassemia.
© 2025. The Author(s).
Conflict of interest statement
Declarations. Competing interest: The authors declare no competing interests.
Similar articles
- A local equation for differential diagnosis of β-thalassemia trait and iron deficiency anemia by logistic regression analysis in Southeast Iran. Hemoglobin. 2014;38(5):355-8. doi: 10.3109/03630269.2014.948187. Epub 2014 Aug 26. PMID: 25155260
- ThalPred: a web-based prediction tool for discriminating thalassemia trait and iron deficiency anemia. BMC Med Inform Decis Mak. 2019 Nov 7;19(1):212. doi: 10.1186/s12911-019-0929-2. PMID: 31699079 Free PMC article.
- MultiThal-classifier, a machine learning-based multi-class model for thalassemia diagnosis and classification. Clin Chim Acta. 2025 Feb 1;567:120025. doi: 10.1016/j.cca.2024.120025. Epub 2024 Nov 7. PMID: 39521397
- TT@MHA: A machine learning-based webpage tool for discriminating thalassemia trait from microcytic hypochromic anemia patients. Clin Chim Acta. 2023 May 1;545:117368. doi: 10.1016/j.cca.2023.117368. Epub 2023 Apr 29. PMID: 37127232
- Ensemble Methods for APS In-Flight Particle Temperature and Velocity Prediction Considering Torch Electrodes Ageing. J Therm Spray Technol. 2023;32(1):175-187. doi: 10.1007/s11666-022-01472-3. Epub 2023 Jan 6. PMID: 37521320 Free PMC article. Review.