NPJ Digit Med. 2021 Jan 4;4(1):3.
doi: 10.1038/s41746-020-00372-6.

Machine learning-based prediction of COVID-19 diagnosis based on symptoms

Yazeed Zoabi et al. NPJ Digit Med. 2021.

Abstract

Effective screening of SARS-CoV-2 enables quick and efficient diagnosis of COVID-19 and can mitigate the burden on healthcare systems. Prediction models that combine several features to estimate the risk of infection have been developed. These aim to assist medical staff worldwide in triaging patients, especially in the context of limited healthcare resources. We established a machine-learning approach that trained on records from 51,831 tested individuals (of whom 4769 were confirmed to have COVID-19). The test set contained data from the subsequent week (47,401 tested individuals of whom 3624 were confirmed to have COVID-19). Our model predicted COVID-19 test results with high accuracy using only eight binary features: sex, age ≥60 years, known contact with an infected individual, and the appearance of five initial clinical symptoms. Overall, based on the nationwide data publicly reported by the Israeli Ministry of Health, we developed a model that detects COVID-19 cases by simple features accessed by asking basic questions. Our framework can be used, among other considerations, to prioritize testing for COVID-19 when testing resources are limited.
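As a rough illustration of what "eight binary features accessed by asking basic questions" could look like in practice, the sketch below turns a set of questionnaire answers into a 0/1 feature vector. It is not the authors' code. The symptom names, the field names in `answers`, and the feature order are all assumptions: the abstract only says "five initial clinical symptoms" without listing them.

```python
from typing import List, Mapping

# Assumed symptom names; the abstract does not list the five symptoms.
SYMPTOMS = ("cough", "fever", "sore_throat", "shortness_of_breath", "headache")

# Assumed feature order: sex, age >= 60, known contact, then the five symptoms.
FEATURE_ORDER = ("sex_male", "age_60_plus", "known_contact") + SYMPTOMS


def encode_answers(answers: Mapping[str, object]) -> List[int]:
    """Map questionnaire answers to the eight binary features (0 or 1)."""
    features = {
        "sex_male": int(answers["sex"] == "male"),
        "age_60_plus": int(answers["age"] >= 60),
        "known_contact": int(bool(answers["known_contact"])),
    }
    # A symptom that was not reported is treated as absent in this sketch.
    for symptom in SYMPTOMS:
        features[symptom] = int(bool(answers.get(symptom, False)))
    return [features[name] for name in FEATURE_ORDER]


# Hypothetical respondent: a 67-year-old woman with a known contact and fever.
example = {"sex": "female", "age": 67, "known_contact": True, "fever": True}
vector = encode_answers(example)
print(dict(zip(FEATURE_ORDER, vector)))
```

A vector like this is what a classifier would consume; how unreported symptoms should be handled (absent versus missing) is a design choice that this sketch settles arbitrarily.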

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. Model performance.
a ROC curves of the predictive model on the prospective test set. The light band around the curve represents pointwise 95% confidence intervals derived by bootstrapping. b A plot of the precision (positive predictive value, PPV) against the recall (sensitivity) of the predictor for different thresholds. The light band around the curve represents pointwise 95% confidence intervals derived by bootstrapping.
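The bootstrapped confidence bands mentioned in this caption can be approximated with a percentile bootstrap. The sketch below is a simplification under stated assumptions: it resamples a single summary number (the area under the ROC curve) rather than computing pointwise bands along the curve, and the labels and scores are made-up toy values, not data from the study.

```python
import random
from typing import List, Sequence, Tuple


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve by pairwise comparison (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(
    labels: Sequence[int],
    scores: Sequence[float],
    n_boot: int = 1000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile bootstrap interval for the AUC (95% when alpha = 0.05)."""
    auc(labels, scores)  # fails early if only one class is present
    rng = random.Random(seed)
    n = len(labels)
    stats: List[float] = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain a single class
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy data, invented for illustration only.
toy_labels = [0, 0, 1, 0, 1, 1, 0, 1]
toy_scores = [0.1, 0.3, 0.35, 0.4, 0.6, 0.7, 0.2, 0.9]
print(auc(toy_labels, toy_scores))  # 0.9375 (15 of 16 positive/negative pairs ranked correctly)
print(bootstrap_auc_ci(toy_labels, toy_scores))
```

Pointwise bands for a ROC or precision-recall curve follow the same pattern: recompute the curve on each resample and take percentiles at each grid point.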
Fig. 2. Important features.
SHapley Additive exPlanations (SHAP) beeswarm plot for predicting COVID-19 diagnosis, showing SHAP values for the most important features of the model. Features in the summary plot (y-axis) are ordered by their mean absolute SHAP values. Each point corresponds to an individual person in the study. The position of each point on the x-axis shows the impact that the feature has on the classifier's prediction for that individual. The value of each feature (e.g., fever) is represented by the point's color.
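The ordering rule in this caption (features sorted by mean absolute SHAP value) is simple to state in code. The sketch below assumes the per-person SHAP values have already been computed elsewhere; the numbers in `shap_like` are invented for illustration and are not values from the paper, and the feature names other than fever are assumed labels.

```python
from typing import List, Sequence, Tuple


def rank_by_mean_abs(
    values: Sequence[Sequence[float]], names: Sequence[str]
) -> List[Tuple[str, float]]:
    """Return (feature, mean |value|) pairs, most important feature first."""
    n_rows = len(values)
    means = {
        name: sum(abs(row[j]) for row in values) / n_rows
        for j, name in enumerate(names)
    }
    return sorted(means.items(), key=lambda item: item[1], reverse=True)


# One row per person, one column per feature (made-up attribution values).
names = ["fever", "known_contact", "age_60_plus"]
shap_like = [
    [0.8, -0.1, 0.05],
    [-0.2, 1.5, -0.05],
    [0.5, -0.4, 0.10],
]

for feature, importance in rank_by_mean_abs(shap_like, names):
    print(f"{feature}: {importance:.3f}")
```

Taking the absolute value before averaging matters: a feature that pushes some predictions up and others down would otherwise look unimportant because the signed contributions cancel.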
Fig. 3. Performance using only balanced features.
a ROC curve and b SHAP beeswarm plot for the prospective test set, for a model trained using only balanced features.
Fig. 4. Performance on simulated test sets.
ROC curves showing the performance of the model on simulated test sets, in which we randomly selected negative reports for all five symptoms at a time and substituted them with blank values. The ROC curve for the original test set is shown in blue. The orange and green curves are ROC curves for randomly substituting 10% and 20%, respectively, of the negative values for all five symptoms.
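One way to read the substitution procedure in this caption is sketched below. It assumes that a record is eligible when all five symptoms are reported as negative, and that a chosen fraction of eligible records has all five symptom values replaced by a blank (`None`). That reading, the symptom names, and the record layout are assumptions rather than details taken from the paper.

```python
import random
from typing import Dict, List, Optional

# Assumed symptom names; the caption refers only to "all five symptoms".
SYMPTOMS = ("cough", "fever", "sore_throat", "shortness_of_breath", "headache")

Record = Dict[str, Optional[int]]


def blank_negative_reports(
    records: List[Record], fraction: float, seed: int = 0
) -> List[Record]:
    """Copy the records, blanking all five symptoms in a random fraction of
    the records that report every symptom as negative (0)."""
    rng = random.Random(seed)
    eligible = [
        i for i, rec in enumerate(records)
        if all(rec[s] == 0 for s in SYMPTOMS)
    ]
    chosen = set(rng.sample(eligible, round(fraction * len(eligible))))
    simulated = []
    for i, rec in enumerate(records):
        copy = dict(rec)  # leave the original test set untouched
        if i in chosen:
            for s in SYMPTOMS:
                copy[s] = None  # blank value, i.e. symptom status unknown
        simulated.append(copy)
    return simulated


# Toy test set: ten all-negative records and two records with fever.
all_negative = {s: 0 for s in SYMPTOMS}
with_fever = dict(all_negative, fever=1)
test_set = [dict(all_negative) for _ in range(10)] + [dict(with_fever) for _ in range(2)]

simulated_20 = blank_negative_reports(test_set, fraction=0.20)
n_blanked = sum(all(rec[s] is None for s in SYMPTOMS) for rec in simulated_20)
print(n_blanked)  # 2 of the 10 all-negative records are blanked
```

Scoring the model on the original and on the simulated sets then gives a comparison like the blue, orange, and green curves described above, provided the model accepts missing values.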
