Comput Biol Med. 2021 May:132:104335.
doi: 10.1016/j.compbiomed.2021.104335. Epub 2021 Mar 16.

Explaining machine learning based diagnosis of COVID-19 from routine blood tests with decision trees and criteria graphs

Marcos Antonio Alves et al. Comput Biol Med. 2021 May.

Abstract

The sudden outbreak of coronavirus disease 2019 (COVID-19) revealed the need for fast and reliable automatic tools to help health teams. This paper presents understandable solutions based on Machine Learning (ML) techniques for COVID-19 screening from routine blood tests. We tested different ML classifiers on a public dataset from the Hospital Albert Einstein, São Paulo, Brazil. After cleaning and pre-processing, the dataset contains 608 patients, of which 84 are positive for COVID-19 as confirmed by RT-PCR. To understand the model decisions, we introduce (i) a Decision Tree Explainer (DTX) for local explanation and (ii) a Criteria Graph to aggregate these explanations and portray a global picture of the results. The Random Forest (RF) classifier achieved the best results (accuracy 0.88, F1-score 0.76, sensitivity 0.66, specificity 0.91, and AUROC 0.86). Applying DTX and the Criteria Graph to the cases confirmed by the RF revealed patterns among the individuals that can help clinicians understand the interconnection among the blood parameters, either globally or on a case-by-case basis. The results are in accordance with the literature, and the proposed methodology may be embedded in an electronic health record system.
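
The idea of a local tree explanation can be illustrated with a minimal sketch. This is not the authors' DTX implementation: it assumes a toy rule standing in for the black-box classifier, Gaussian noise sampled around the instance, and a depth-one tree (a single split) as the surrogate. The function names, the feature names `WBC` and `PLT` as inputs, the noise scale, and all thresholds are hypothetical.

```python
import random

def black_box(x):
    # Hypothetical stand-in for a trained classifier; NOT the paper's Random Forest.
    # x = (wbc, plt) in made-up standardized units. Returns 1 (positive) or 0.
    wbc, plt = x
    return 1 if (wbc < -0.3 and plt < 0.2) else 0

def sample_neighbourhood(x, n=300, sigma=0.5, seed=0):
    # Gaussian noise around the instance to be explained (sigma is an assumption).
    rng = random.Random(seed)
    return [tuple(v + rng.gauss(0.0, sigma) for v in x) for _ in range(n)]

def fit_stump(points, labels):
    # Exhaustive search for the single split that disagrees least with the
    # black-box labels. Returns (feature, threshold, left_label, right_label, errors).
    best = None
    for j in range(len(points[0])):
        values = sorted({p[j] for p in points})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [y for p, y in zip(points, labels) if p[j] <= t]
            right = [y for p, y in zip(points, labels) if p[j] > t]
            l_pos, r_pos = sum(left), sum(right)
            # Each leaf predicts its majority label; the minority count is its error.
            err = min(l_pos, len(left) - l_pos) + min(r_pos, len(right) - r_pos)
            if best is None or err < best[4]:
                best = (j, t, int(2 * l_pos > len(left)), int(2 * r_pos > len(right)), err)
    return best

def explain(x, names=("WBC", "PLT")):
    # Label the noisy neighbourhood with the black box, then fit the surrogate.
    noise = sample_neighbourhood(x)
    labels = [black_box(p) for p in noise]
    j, t, left, right, err = fit_stump(noise, labels)
    fidelity = 1.0 - err / len(noise)  # agreement between surrogate and black box
    side = "<=" if x[j] <= t else ">"
    predicted = left if x[j] <= t else right
    return f"{names[j]} {side} {t:.2f} -> class {predicted}", fidelity

if __name__ == "__main__":
    rule, fidelity = explain((-0.5, 0.0))
    print(rule, f"(local fidelity {fidelity:.2f})")
```

The returned rule is the path the instance follows in the surrogate, and the fidelity indicates how closely the surrogate mimics the black box in that neighbourhood. A deeper tree would yield multi-condition rules at the cost of a longer explanation.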

Keywords: COVID-19; Criteria graph; Decision tree; Explainable artificial intelligence; Machine learning.

Conflict of interest statement

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Figures

Fig. 1
On the left, a noise set η generated by DTX surrounds the instance to be explained, x. The decision boundary is based on the DTX output. On the right, a tree structure represents the rules that explain the black-box prediction.
Fig. 2
Criteria graph.
Fig. 3
Diagram of the proposed method of generating ensemble classifiers with local explainability.
Fig. 4
The nested cross validation method.
Fig. 5
Example of synthetic sample generated by SMOTE.
Fig. 6
AUROC for each algorithm.
Fig. 7
Explanations provided by SHAP and LIME.
Fig. 8
Kernel density estimation of WBC and PLT.
Fig. 9
Marginal effect of blood features on the target variable.
Fig. 10
Criteria Graph for the decision tree explanations. Only factors and interactions that appeared in more than one third of the patients are depicted.
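
The filtering described in the Fig. 10 caption can be sketched as a simple counting step. This is an illustrative sketch under assumptions, not the authors' code: each patient's local explanation is represented as a set of criterion strings, nodes count how many patients a criterion appears in, edges count how many patients two criteria co-occur in, and only counts strictly above one third of the patients are kept. The function name `criteria_graph` and the example criteria (including the placeholders `F3` and `F4`) are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

def criteria_graph(explanations, min_fraction=Fraction(1, 3)):
    """Aggregate per-patient criteria into node and edge counts.

    explanations: list of sets of criterion strings, one set per patient.
    Returns (nodes, edges), keeping only criteria and criterion pairs that
    appear in MORE than min_fraction of the patients.
    """
    n = len(explanations)
    node_counts, edge_counts = Counter(), Counter()
    for criteria in explanations:
        unique = sorted(set(criteria))           # count each criterion once per patient
        node_counts.update(unique)
        edge_counts.update(combinations(unique, 2))  # co-occurring pairs, sorted order
    cutoff = min_fraction * n                    # exact arithmetic via Fraction
    nodes = {c: k for c, k in node_counts.items() if k > cutoff}
    edges = {pair: k for pair, k in edge_counts.items() if k > cutoff}
    return nodes, edges

if __name__ == "__main__":
    # Six hypothetical patients; the criteria are placeholders, not study results.
    patients = [
        {"WBC low", "PLT low"},
        {"WBC low", "PLT low", "F3 low"},
        {"WBC low", "F3 low"},
        {"WBC low", "PLT low"},
        {"PLT low"},
        {"F4 low"},
    ]
    nodes, edges = criteria_graph(patients)
    print(nodes)
    print(edges)
```

With six patients the cutoff is two, so a criterion seen in exactly two patients is dropped. Using `Fraction` keeps the "more than one third" comparison exact instead of relying on floating-point rounding.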
