Decision tree-based learning and laboratory data mining: an efficient approach to amebiasis testing
- PMID: 39881359
- PMCID: PMC11780931
- DOI: 10.1186/s13071-024-06618-6
Abstract
Background: Amebiasis represents a significant global health concern. This is especially evident in developing countries, where infections are more common. The primary diagnostic method in laboratories involves the microscopy of stool samples. However, this approach can sometimes result in the misinterpretation of amebiasis as other gastroenteritis (GE) conditions. The goal of the work is to produce a machine learning (ML) model that uses laboratory findings and demographic information to automatically predict amebiasis.
Method: Data extracted from Jordanian electronic medical records (EMR) between 2020 and 2022 comprised 763 amebic cases and 314 nonamebic cases. Patient demographics, clinical signs, microscopic diagnoses, and leukocyte counts were used to train eight decision tree algorithms and compare their accuracy of predictions. Feature ranking and correlation methods were implemented to enhance the accuracy of classifying amebiasis from other conditions.
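The case counts above imply an imbalanced dataset, which matters when reading accuracy figures. As a rough illustration (not part of the authors' analysis), the following sketch computes the class balance and the accuracy a trivial classifier would reach by labeling every record as amebic. Only the two case counts come from the abstract; the variable names are illustrative.

```python
# Case counts reported in the Method paragraph.
amebic_cases = 763
nonamebic_cases = 314

total_cases = amebic_cases + nonamebic_cases
amebic_fraction = amebic_cases / total_cases

# Accuracy of a trivial classifier that assigns every record to the
# larger class (here, "amebic"). Not a figure reported by the authors.
majority_baseline_accuracy = max(amebic_cases, nonamebic_cases) / total_cases

print(f"total records: {total_cases}")                                      # 1077
print(f"amebic fraction: {amebic_fraction:.1%}")                            # 70.8%
print(f"majority-class baseline accuracy: {majority_baseline_accuracy:.1%}")  # 70.8%
```

Any reported accuracy is best judged against this roughly 70.8% baseline rather than against 50%.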
Results: The main predictors distinguishing amebiasis were the percentage of neutrophils, mucus presence, and the counts of red blood cells (RBCs) and white blood cells (WBCs) in stool samples. Prediction accuracy and precision ranged from 92% to 94.6% when employing decision tree classifiers including decision tree (DT), random forest (RF), XGBoost, AdaBoost, and gradient boosting (GB). The optimized RF model achieved an area under the curve (AUC) of 98% for detecting amebiasis from laboratory data, using 300 estimators with a max depth of 20. This study highlights that amebiasis is a significant health concern in Jordan, responsible for 17.22% of all gastroenteritis episodes in this study. Male sex and age were associated with higher incidence of amebiasis (P = 0.014), with over 25% of cases occurring in infants and toddlers.
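A minimal sketch of this kind of random forest configuration, assuming scikit-learn is available. The data here are synthetic and generated only for illustration, so the resulting AUC and feature ranking say nothing about the study's findings. The feature names are hypothetical labels borrowed from the predictors named above, and the train/test split and random seeds are assumptions; only the 300 estimators, the max depth of 20, and the 763/314 class sizes come from the abstract.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Hypothetical column labels; the synthetic columns carry no clinical meaning.
feature_names = ["neutrophil_pct", "mucus_present", "stool_rbc", "stool_wbc"]

# Synthetic stand-in for the EMR data: 1077 records, with class 0
# (nonamebic) making up about 314/1077 of the samples.
X, y = make_classification(
    n_samples=1077,
    n_features=4,
    n_informative=4,
    n_redundant=0,
    weights=[314 / 1077],
    random_state=0,
)

# Stratified hold-out split (an assumption; the abstract does not
# describe the validation scheme).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Hyperparameters taken from the abstract: 300 trees, max depth 20.
model = RandomForestClassifier(n_estimators=300, max_depth=20, random_state=0)
model.fit(X_train, y_train)

# AUC is computed from the predicted probability of the positive class.
positive_proba = model.predict_proba(X_test)[:, 1]
auc = roc_auc_score(y_test, positive_proba)
print(f"AUC on synthetic hold-out data: {auc:.3f}")

# Impurity-based feature ranking, one common way to rank predictors.
ranked = sorted(
    zip(feature_names, model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, importance in ranked:
    print(f"{name}: {importance:.3f}")
```

Impurity-based importances are only one ranking method; the abstract does not specify which feature ranking and correlation methods the authors used.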
Conclusions: The application of ML to EMR can accurately predict amebiasis. This finding significantly contributes to the emerging use of ML as a decision support system in parasitic disease diagnosis.
Keywords: E. histolytica; Amebiasis; Decision tree; Electronic medical records (EMR); Feature selection; Jordan; Leukocytosis; Machine learning; Microscopic diagnosis; Stool RBCs.
© 2024. The Author(s).
Conflict of interest statement
Declarations. Ethics approval and consent to participate: This study has been approved by the scientific and administration committee of scientific research at Al-Balqa Applied University and Jordan’s Al-Hussein/Salt Hospital. The material is the authors’ original work that has not been previously published elsewhere. Consent for publication: Not applicable. Competing interests: The authors declare no competing interests.
Similar articles
- Prediction and feature selection of low birth weight using machine learning algorithms. J Health Popul Nutr. 2024 Oct 12;43(1):157. doi: 10.1186/s41043-024-00647-8. PMID: 39396025. Free PMC article.
- A retrospective study using machine learning to develop predictive model to identify rotavirus-associated acute gastroenteritis in children. PeerJ. 2025 Apr 14;13:e19025. doi: 10.7717/peerj.19025. eCollection 2025. PMID: 40247842. Free PMC article.
- Exploratory Data Mining Techniques (Decision Tree Models) for Examining the Impact of Internet-Based Cognitive Behavioral Therapy for Tinnitus: Machine Learning Approach. J Med Internet Res. 2021 Nov 2;23(11):e28999. doi: 10.2196/28999. PMID: 34726612. Free PMC article.
- Prediction of myopia development among Chinese school-aged children using refraction data from electronic medical records: A retrospective, multicentre machine learning study. PLoS Med. 2018 Nov 6;15(11):e1002674. doi: 10.1371/journal.pmed.1002674. eCollection 2018 Nov. PMID: 30399150. Free PMC article.
- Predicting the risk of emergency admission with machine learning: Development and validation using linked electronic health records. PLoS Med. 2018 Nov 20;15(11):e1002695. doi: 10.1371/journal.pmed.1002695. eCollection 2018 Nov. PMID: 30458006. Free PMC article.