Sci Rep. 2021 Mar 30;11(1):7166.
doi: 10.1038/s41598-021-86735-9.

Machine learning is the key to diagnose COVID-19: a proof-of-concept study

Cedric Gangloff et al. Sci Rep. 2021.

Erratum in

Abstract

The reverse transcription-polymerase chain reaction (RT-PCR) assay is the accepted standard for coronavirus disease 2019 (COVID-19) diagnosis. Like any test, RT-PCR produces false negative results, which clinicians can rectify by cross-checking clinical, biological and imaging data. Combining RT-PCR with chest-CT could improve diagnostic performance, but its rapid use in all patients with suspected COVID-19 would require considerable resources. The potential contribution of machine learning in this situation has not been fully evaluated. The objective of this study was to develop and evaluate machine learning models that use routine clinical and laboratory data to improve the performance of RT-PCR and chest-CT for COVID-19 diagnosis among post-emergency hospitalized patients. All adults admitted to the emergency department (ED) for suspected COVID-19 and then hospitalized at Rennes academic hospital, France, between March 20, 2020 and May 5, 2020 were included in the study. Three model types were created: logistic regression, random forest, and neural network. Each model was trained to diagnose COVID-19 using different sets of variables. The area under the receiver operating characteristic curve (AUC) was the primary outcome used to evaluate model performance. A total of 536 patients were included in the study: 106 in the COVID group and 430 in the NOT-COVID group. With the contribution of machine learning, the AUC values of chest-CT and RT-PCR increased from 0.778 to 0.892 and from 0.852 to 0.930, respectively. After generalization, machine learning models will make it possible to increase the performance of chest-CT and RT-PCR for COVID-19 diagnosis.
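Since AUC is the primary outcome, a small illustration of how it can be computed may help. The sketch below uses the rank (Mann-Whitney) formulation in plain Python. It is not the authors' code: the function name `auc`, the toy labels and both score lists are hypothetical and chosen only to show why a score that combines a binary test with other variables can reach a higher AUC than the test alone.

```python
def auc(labels, scores):
    """Area under the ROC curve via the rank (Mann-Whitney) formulation.

    Equals the probability that a randomly chosen positive case receives a
    higher score than a randomly chosen negative case; ties count as 0.5.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data (not from the study): 1 = COVID, 0 = NOT-COVID.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
# A binary test used alone (1 = positive result) produces many tied scores.
test_only = [1, 1, 0, 0, 0, 0, 0, 1]
# A hypothetical model output that combines the test with other variables.
model_score = [0.9, 0.8, 0.6, 0.1, 0.2, 0.3, 0.4, 0.7]

auc_test = auc(labels, test_only)     # 11/15 on this toy data
auc_model = auc(labels, model_score)  # 14/15 on this toy data
```

The pairwise loop is quadratic in the number of patients, which is acceptable for a cohort of a few hundred; a sort-based version would be preferable for larger data.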

Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1
Data pre-processing. The first step corresponds to the raw data as initially stored in the database, where each ID spans multiple rows. In the second step, data were listed in chronological order, with a single row per ID (blue arrow). In the third step, data were simplified. For numeric variables, only the first value was selected (red arrow). For binary variables, the value “true” was retained when it was present at least once in the list (yellow arrow).
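The three steps in this caption can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the authors' pipeline: the field names `id`, `time`, `crp` and `cough`, the function `collapse_rows` and the toy rows are hypothetical, and "first value" is interpreted here as the first non-missing value in chronological order.

```python
from collections import defaultdict


def collapse_rows(rows, numeric_vars, binary_vars):
    """Collapse several time-stamped rows per patient ID into one record."""
    # Step 1: raw data, several rows per ID, grouped by ID.
    by_id = defaultdict(list)
    for row in rows:
        by_id[row["id"]].append(row)

    records = {}
    for pid, patient_rows in by_id.items():
        # Step 2: chronological order within each ID.
        patient_rows.sort(key=lambda r: r["time"])
        record = {}
        # Step 3a: numeric variables keep the first recorded value
        # (assumption: missing values, stored as None, are skipped).
        for var in numeric_vars:
            values = [r[var] for r in patient_rows if r.get(var) is not None]
            record[var] = values[0] if values else None
        # Step 3b: binary variables are True if True appears at least once.
        for var in binary_vars:
            record[var] = any(r.get(var) is True for r in patient_rows)
        records[pid] = record
    return records


# Hypothetical toy rows; "time" is an integer timestamp for simplicity.
rows = [
    {"id": "A", "time": 2, "crp": 80.0, "cough": False},
    {"id": "A", "time": 1, "crp": 55.0, "cough": True},
    {"id": "B", "time": 1, "crp": None, "cough": False},
    {"id": "B", "time": 3, "crp": 12.0, "cough": False},
]
records = collapse_rows(rows, numeric_vars=["crp"], binary_vars=["cough"])
```

Patient A keeps the earliest CRP value and is marked as coughing because one of its rows says so; patient B falls back to the later CRP value only because the earlier one is missing.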
Figure 2
Flow chart of patient selection. Patients with suspected COVID-19 had at least one of the following symptoms: cough, dyspnea, hyperthermia, myalgia, asthenia, diarrhea, confusion or anosmia. Both chest-CT and RT-PCR were performed in all patients with suspected COVID-19 who were hospitalized.
Figure 3
Correlation coefficients for all pairs of variables of interest. Pearson and Spearman coefficients were calculated for continuous and binary variables, respectively. Correlations not significantly different from 0 are in white cells. Positive correlations are in blue cells, and negative correlations in red cells. Leukocyte count and neutrophil count were identified as highly correlated, and leukocyte count was removed from model building.
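To make the screening step in this caption concrete, the sketch below computes Pearson and Spearman coefficients in plain Python and lists variable pairs whose absolute Pearson correlation exceeds a cutoff. The caption does not state the cutoff used, so the value 0.9 is an assumption, as are the function names and the toy measurements.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """Average 1-based ranks, so tied values share the same rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman coefficient: Pearson correlation of the rank vectors."""
    return pearson(ranks(x), ranks(y))


def highly_correlated_pairs(columns, threshold):
    """List (name_a, name_b, r) for pairs with |Pearson r| >= threshold."""
    names = sorted(columns)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(columns[a], columns[b])
            if abs(r) >= threshold:
                pairs.append((a, b, r))
    return pairs


# Hypothetical toy measurements (not study data).
columns = {
    "leukocytes": [6.0, 9.5, 12.0, 7.2, 15.1],
    "neutrophils": [3.9, 7.1, 9.8, 4.6, 12.5],
    "crp": [40.0, 5.0, 60.0, 90.0, 20.0],
}
pairs = highly_correlated_pairs(columns, threshold=0.9)  # assumed cutoff
```

On this toy data only the leukocyte and neutrophil pair is flagged, which mirrors the situation described in the caption, where one variable of the pair was then dropped before model building.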
Figure 4
ROC curves for the 3 logistic regression models based on common clinico-biological variables alone, clinico-biological variables with chest-CT and common clinico-biological variables with RT-PCR. The “Binary logistic regression with clinico-biological variables and RT-PCR” was the best performing model in this study.
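For readers unfamiliar with how a ROC curve like the ones in this figure is drawn, here is a minimal plain-Python sketch. It lowers the decision threshold over the distinct predicted scores and records the false positive rate and true positive rate at each step. The function names, labels and scores are hypothetical and do not come from the study's models.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) pairs for a ROC curve."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    # Lower the decision threshold step by step over the distinct scores.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def trapezoid_area(points):
    """Area under a piecewise-linear curve given as ordered (x, y) points."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Hypothetical predicted probabilities from some fitted model.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.7, 0.6, 0.4, 0.3, 0.1]

points = roc_points(labels, scores)
area = trapezoid_area(points)  # 8/9 on this toy data
```

Plotting `points` for each candidate model on the same axes gives a figure of this kind; the curve that lies closest to the top-left corner has the largest area.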
