Diagnostics (Basel). 2024 Jun 26;14(13):1352. doi: 10.3390/diagnostics14131352.

Can Machine Learning Assist in Diagnosis of Primary Immune Thrombocytopenia? A Feasibility Study

Haroon Miah¹ ², Dimitrios Kollias³, Giacinto Luca Pedone², Drew Provan¹ ², Frederick Chen¹ ²

Affiliations

¹ Centre of Immunobiology, Blizard Institute, Queen Mary University of London, London E1 2AT, UK.
² Haematology Department, Barts Health NHS Trust, London E1 1BB, UK.
³ School of Electronic Engineering & Computer Science, Queen Mary University of London, London E1 4NS, UK.

PMID: 39001244
PMCID: PMC11240714
DOI: 10.3390/diagnostics14131352

Haroon Miah et al. Diagnostics (Basel). 2024.

Abstract

Primary Immune Thrombocytopenia (ITP) is a rare autoimmune disease characterised by immune-mediated destruction of peripheral blood platelets, leading to low platelet counts and bleeding. The diagnosis and effective management of ITP are challenging because there is no established test to confirm the disease and no biomarker that predicts the response to treatment or the outcome. In this work, we conduct a feasibility study to assess whether machine learning (ML) can be applied effectively to the diagnosis of ITP using routine blood tests and demographic data in a non-acute outpatient setting. Several ML models, including Logistic Regression, Support Vector Machine, k-Nearest Neighbours, Decision Tree and Random Forest, were applied to data from the UK Adult ITP Registry and a general haematology clinic. Two approaches were investigated: a demographic-unaware one and a demographic-aware one. We conducted extensive experiments to evaluate the predictive performance of these models and approaches, as well as their bias. The results showed that the Decision Tree and Random Forest models outperformed the other models and were also fair, achieving nearly perfect predictive and fairness scores, with platelet count identified as the most significant variable. Models not provided with demographic information achieved higher predictive accuracy but lower fairness scores, illustrating a trade-off between predictive performance and fairness.

Keywords: UK adult ITP registry; artificial intelligence; blood tests; diagnosis; explainability; feature importance; healthcare; machine learning; model fairness; primary immune thrombocytopenia.
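The abstract reports fairness scores alongside predictive scores but does not say how fairness was measured. As a minimal sketch of one possible approach, the snippet below computes accuracy per demographic group and the ratio of the worst to the best group. The function names, the choice of metric and the toy labels are all assumptions made for illustration; they are not taken from the study.

```python
from collections import defaultdict


def group_accuracy(y_true, y_pred, groups):
    """Return the accuracy within each demographic group."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    return {group: correct[group] / total[group] for group in total}


def fairness_ratio(per_group):
    """Worst-group score divided by best-group score (1.0 means equal)."""
    worst = min(per_group.values())
    best = max(per_group.values())
    return worst / best if best > 0 else 0.0


# Hypothetical toy data: 1 = ITP, 0 = non-ITP; groups are an assumed sex label.
y_true = [1, 1, 0, 0, 1, 0, 1, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0]
groups = ["F", "F", "F", "F", "M", "M", "M", "M"]

per_group = group_accuracy(y_true, y_pred, groups)
ratio = fairness_ratio(per_group)
print(per_group, ratio)
```

A ratio close to 1.0 would indicate that the classifier performs similarly across groups. Comparing this value between a model trained with demographic features and one trained without them is one way to look at the trade-off the abstract describes.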

Conflict of interest statement

The authors declare no conflicts of interest.

Figures

**Figure 1**
The boxplots of the patients’ ages (**right-hand side**) and year of disease diagnosis (**left-hand side**), across the ITP patients and non-ITP patients.

**Figure 2**
The boxplots of the patients’ blood platelet count (**right-hand side**) and blood ALT level (**left-hand side**), across the ITP patients and non-ITP patients.

**Figure 3**
The boxplots of the blood neutrophil level (**right-hand side**) and blood haemoglobin level (**left-hand side**) across the ITP patients and non-ITP patients.

**Figure 4**
The boxplots of the white blood cell count (**right-hand side**) and red blood cell count (**left-hand side**) across the ITP patients and non-ITP patients.

**Figure 5**
The gender distributions in the case of ITP (**right side**) and non-ITP (**left side**) patients.

**Figure 6**
Permutation feature importance on the training set for the RF model in the case of the demographic-aware (on the **left side**) and demographic-unaware (on the **right side**) approach.

**Figure 7**
Permutation feature importance on the test set for the RF model in the case of the demographic-aware (on the **left side**) and demographic-unaware (on the **right side**) approach.

**Figure 8**
Permutation feature importance on the training set for the DT model in the case of the demographic-aware (on the **left side**) and demographic-unaware (on the **right side**) approach.

**Figure 9**
Permutation feature importance on the test set for the DT model in the case of the demographic-aware (on the **left side**) and demographic-unaware (on the **right side**) approach.

**Figure 10**
Permutation feature importance on the training set for the LogR model in the case of the demographic-aware (on the **left side**) and demographic-unaware (on the **right side**) approach.

**Figure 11**
Permutation feature importance on the test set for the LogR model in the case of the demographic-aware (on the **left side**) and demographic-unaware (on the **right side**) approach.

**Figure 12**
Permutation feature importance on the training set for the SVM-LN model in the case of the demographic-aware (on the **left side**) and demographic-unaware (on the **right side**) approach.

**Figure 13**
Permutation feature importance on the test set for the SVM-LN model in the case of the demographic-aware (on the **left side**) and demographic-unaware (on the **right side**) approach.

**Figure 14**
Permutation feature importance on the training set for the SVM-RBF model in the case of the demographic-aware (on the **left side**) and demographic-unaware (on the **right side**) approach.

**Figure 15**
Permutation feature importance on the test set for the SVM-RBF model in the case of the demographic-aware (on the **left side**) and demographic-unaware (on the **right side**) approach.

**Figure 16**
Permutation feature importance on the training set for the 2-NN model in the case of the demographic-aware (on the **left side**) and demographic-unaware (on the **right side**) approach.

**Figure 17**
Permutation feature importance on the test set for the 2-NN model in the case of the demographic-aware (on the **left side**) and demographic-unaware (on the **right side**) approach.

**Figure 18**
Permutation feature importance on the training set for the 12-NN model in the case of the demographic-aware (on the **left side**) and demographic-unaware (on the **right side**) approach.

**Figure 19**
Permutation feature importance on the test set for the 12-NN model in the case of the demographic-aware (on the **left side**) and demographic-unaware (on the **right side**) approach.
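Figures 6 to 19 report permutation feature importance, which measures how much a model's score drops when the values of one feature are shuffled across patients. The sketch below illustrates the general technique in plain Python. It is not the authors' implementation: the threshold rule standing in for a trained model, the feature ranges and the cutoff of 100 are hypothetical values chosen only for the example.

```python
import random


def accuracy(model, X, y):
    """Fraction of rows for which the model predicts the true label."""
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance(model, X, y, n_repeats=10, seed=0):
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild the rows with only feature j replaced by shuffled values.
            X_perm = [row[:j] + [value] + row[j + 1:]
                      for row, value in zip(X, column)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances


# Synthetic data (assumed ranges): features are [platelet count, haemoglobin].
data_rng = random.Random(42)
X, y = [], []
for _ in range(20):
    X.append([data_rng.uniform(10, 90), data_rng.uniform(110, 160)])
    y.append(1)  # ITP
    X.append([data_rng.uniform(150, 400), data_rng.uniform(110, 160)])
    y.append(0)  # non-ITP


def toy_model(row):
    """Hypothetical stand-in for a trained classifier: low platelets -> ITP."""
    return 1 if row[0] < 100 else 0


importances = permutation_importance(toy_model, X, y)
print(importances)
```

Because the toy rule only looks at the platelet count, shuffling haemoglobin leaves accuracy unchanged and its importance is exactly zero, while shuffling the platelet count produces a large drop. Evaluating on the training set and on the test set separately, as the figures do, helps show whether a model relies on a feature in a way that generalises.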
