2024 Jun 26;14(13):1352.
doi: 10.3390/diagnostics14131352.

Can Machine Learning Assist in Diagnosis of Primary Immune Thrombocytopenia? A Feasibility Study

Haroon Miah et al. Diagnostics (Basel).

Abstract

Primary Immune Thrombocytopenia (ITP) is a rare autoimmune disease characterised by the immune-mediated destruction of peripheral blood platelets, leading to low platelet counts and bleeding. The diagnosis and effective management of ITP are challenging because there is no established test to confirm the disease and no biomarker with which one can predict the response to treatment and outcome. In this work, we conduct a feasibility study to assess whether machine learning (ML) can be applied effectively to the diagnosis of ITP using routine blood tests and demographic data in a non-acute outpatient setting. Various ML models, including Logistic Regression, Support Vector Machine, k-Nearest Neighbor, Decision Tree and Random Forest, were applied to data from the UK Adult ITP Registry and a general haematology clinic. Two approaches were investigated: a demographic-unaware one and a demographic-aware one. We conducted extensive experiments to evaluate the predictive performance of these models and approaches, as well as their bias. The results revealed that the Decision Tree and Random Forest models were both superior and fair, achieving nearly perfect predictive and fairness scores, with platelet count identified as the most significant variable. Models not provided with demographic information performed better in terms of predictive accuracy but showed lower fairness scores, illustrating a trade-off between predictive performance and fairness.

Keywords: UK adult ITP registry; artificial intelligence; blood tests; diagnosis; explainability; feature importance; healthcare; machine learning; model fairness; primary immune thrombocytopenia.
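
The trade-off between predictive performance and fairness described in the abstract can be illustrated with a small sketch. The abstract does not say which fairness metric the authors used, so the score below (one minus the largest accuracy gap between demographic groups) is an assumption chosen for simplicity. The labels, predictions, group codes and function names are all invented for illustration and are not taken from the study.

```python
from collections import defaultdict

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def per_group_accuracy(y_true, y_pred, groups):
    """Accuracy computed separately for each demographic group."""
    buckets = defaultdict(lambda: ([], []))
    for t, p, g in zip(y_true, y_pred, groups):
        buckets[g][0].append(t)
        buckets[g][1].append(p)
    return {g: accuracy(ts, ps) for g, (ts, ps) in buckets.items()}

def fairness_score(y_true, y_pred, groups):
    """Assumed fairness score: 1 minus the largest accuracy gap between groups."""
    accs = per_group_accuracy(y_true, y_pred, groups).values()
    return 1.0 - (max(accs) - min(accs))

# Toy data (hypothetical): 1 = ITP, 0 = non-ITP, grouped by gender.
y_true  = [1, 1, 0, 0, 1, 1, 0, 0]
groups  = ["F", "F", "F", "F", "M", "M", "M", "M"]
model_a = [1, 1, 0, 0, 1, 1, 0, 1]  # one error, in group M only
model_b = [1, 0, 0, 0, 1, 1, 0, 1]  # one error in each group

for name, y_pred in [("model_a", model_a), ("model_b", model_b)]:
    print(name, accuracy(y_true, y_pred), fairness_score(y_true, y_pred, groups))
```

In this toy case the more accurate set of predictions has the lower fairness score, because its only error falls in a single group. Other fairness definitions (for example, comparing true-positive rates across groups) could rank the two differently.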

Conflict of interest statement

The authors declare no conflicts of interest.

Figures

Figure 1. The boxplots of the patients’ ages (right-hand side) and year of disease diagnosis (left-hand side), across the ITP patients and non-ITP patients.
Figure 2. The boxplots of the patients’ blood platelet count (right-hand side) and blood ALT level (left-hand side), across the ITP patients and non-ITP patients.
Figure 3. The boxplots of the blood neutrophil level (right-hand side) and blood haemoglobin level (left-hand side) across the ITP patients and non-ITP patients.
Figure 4. The boxplots of the white blood cell count (right-hand side) and red blood cell count (left-hand side) across the ITP patients and non-ITP patients.
Figure 5. The gender distributions in the case of ITP (right side) and non-ITP (left side) patients.
Figure 6. Permutation feature importance on the training set for the RF model in the case of the demographic-aware (on the left side) and demographic-unaware (on the right side) approach.
Figure 7. Permutation feature importance on the test set for the RF model in the case of the demographic-aware (on the left side) and demographic-unaware (on the right side) approach.
Figure 8. Permutation feature importance on the training set for the DT model in the case of the demographic-aware (on the left side) and demographic-unaware (on the right side) approach.
Figure 9. Permutation feature importance on the test set for the DT model in the case of the demographic-aware (on the left side) and demographic-unaware (on the right side) approach.
Figure 10. Permutation feature importance on the training set for the LogR model in the case of the demographic-aware (on the left side) and demographic-unaware (on the right side) approach.
Figure 11. Permutation feature importance on the test set for the LogR model in the case of the demographic-aware (on the left side) and demographic-unaware (on the right side) approach.
Figure 12. Permutation feature importance on the training set for the SVM-LN model in the case of the demographic-aware (on the left side) and demographic-unaware (on the right side) approach.
Figure 13. Permutation feature importance on the test set for the SVM-LN model in the case of the demographic-aware (on the left side) and demographic-unaware (on the right side) approach.
Figure 14. Permutation feature importance on the training set for the SVM-RBF model in the case of the demographic-aware (on the left side) and demographic-unaware (on the right side) approach.
Figure 15. Permutation feature importance on the test set for the SVM-RBF model in the case of the demographic-aware (on the left side) and demographic-unaware (on the right side) approach.
Figure 16. Permutation feature importance on the training set for the 2-NN model in the case of the demographic-aware (on the left side) and demographic-unaware (on the right side) approach.
Figure 17. Permutation feature importance on the test set for the 2-NN model in the case of the demographic-aware (on the left side) and demographic-unaware (on the right side) approach.
Figure 18. Permutation feature importance on the training set for the 12-NN model in the case of the demographic-aware (on the left side) and demographic-unaware (on the right side) approach.
Figure 19. Permutation feature importance on the test set for the 12-NN model in the case of the demographic-aware (on the left side) and demographic-unaware (on the right side) approach.
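
The permutation feature importance reported in Figures 6 to 19 can be sketched in a few lines: shuffle one feature column at a time and record how much the model's accuracy drops. The sketch below is a minimal illustration, not the authors' pipeline. The synthetic data, the two feature names, the 100-unit platelet threshold and the hand-written decision rule are all assumptions made so the example is self-contained.

```python
import random

def stump_predict(row):
    """Hypothetical depth-one rule: flag ITP (1) when platelet count is below 100."""
    return 1 if row[0] < 100 else 0

def accuracy(rows, labels, predict):
    """Fraction of rows whose prediction matches the label."""
    return sum(predict(r) == y for r, y in zip(rows, labels)) / len(rows)

def permutation_importance(rows, labels, predict, n_features, n_repeats=20, seed=0):
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(rows, labels, predict)
    importances = []
    for j in range(n_features):
        drops = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)  # break the link between feature j and the label
            permuted = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            drops.append(baseline - accuracy(permuted, labels, predict))
        importances.append(sum(drops) / n_repeats)
    return importances

# Synthetic data (invented): columns are [platelet_count, age].
data_rng = random.Random(42)
rows = [[data_rng.uniform(10, 400), data_rng.uniform(18, 90)] for _ in range(200)]
labels = [1 if r[0] < 100 else 0 for r in rows]

importances = permutation_importance(rows, labels, stump_predict, n_features=2)
for name, value in zip(["platelet_count", "age"], importances):
    print(f"{name}: {value:.3f}")
```

Because the toy rule depends only on platelet count, shuffling age leaves accuracy unchanged and its importance is exactly zero, while shuffling platelet count produces a clear drop. With a fitted scikit-learn model, `sklearn.inspection.permutation_importance` computes the same kind of quantity on either the training set or the test set.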
