Multicenter Study
J Med Internet Res. 2020 Dec 2;22(12):e24048. doi: 10.2196/24048.

Development and External Validation of a Machine Learning Tool to Rule Out COVID-19 Among Adults in the Emergency Department Using Routine Blood Tests: A Large, Multicenter, Real-World Study

Timothy B Plante et al.

Abstract

Background: Conventional diagnosis of COVID-19 with reverse transcription polymerase chain reaction (RT-PCR) testing (hereafter, PCR) is associated with prolonged time to diagnosis and significant per-test costs. The SARS-CoV-2 virus might lead to characteristic patterns in the results of widely available, routine blood tests that could be identified with machine learning methodologies. Machine learning models integrating these common laboratory test results might accelerate ruling out COVID-19 in emergency department patients.

Objective: We sought to develop (ie, train and internally validate with cross-validation techniques) and externally validate a machine learning model to rule out COVID-19 using only routine blood tests among adults in emergency departments.

Methods: Using clinical data from emergency departments (EDs) at 66 US hospitals before the pandemic (through the end of December 2019) or during the pandemic (March-July 2020), we included patients aged ≥20 years in the study time frame. We excluded those with missing laboratory results. Model training used 2183 PCR-confirmed cases from 43 hospitals during the pandemic; negative controls were 10,000 prepandemic patients from the same hospitals. External validation used 23 hospitals with 1020 PCR-confirmed cases and 171,734 prepandemic negative controls. The main outcome was COVID-19 status predicted using same-day routine laboratory results. Model performance was assessed with area under the receiver operating characteristic (AUROC) curve as well as sensitivity, specificity, and negative predictive value (NPV).
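
The abstract does not name the model family or the specific laboratory features, and the code below is not the authors' pipeline; it is a minimal Python sketch, under assumed feature names, synthetic data, and a stand-in gradient-boosting classifier, of the general workflow the Methods describe: 5-fold cross-validated training followed by AUROC and sensitivity/specificity evaluation at a low rule-out cutoff.

    # Minimal sketch (synthetic data, stand-in model); not the authors' actual pipeline.
    import numpy as np
    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.metrics import roc_auc_score
    from sklearn.model_selection import cross_val_predict

    rng = np.random.default_rng(0)

    # Hypothetical routine-lab feature matrix (e.g., WBC count, lymphocyte %, platelets).
    n_cases, n_controls, n_features = 2183, 10_000, 15
    X = rng.normal(size=(n_cases + n_controls, n_features))
    y = np.r_[np.ones(n_cases), np.zeros(n_controls)]  # 1 = PCR-confirmed COVID-19
    X[y == 1] += 0.4  # inject a weak class signal so the demo discriminates

    # Out-of-fold predicted probabilities via 5-fold cross-validation,
    # mirroring how the study's training ROC curve was obtained.
    clf = GradientBoostingClassifier(random_state=0)
    proba = cross_val_predict(clf, X, y, cv=5, method="predict_proba")[:, 1]
    print(f"cross-validated AUROC: {roc_auc_score(y, proba):.2f}")

    # Scale probabilities to a 0-100 risk score and apply a rule-out cutoff.
    score = 100 * proba
    cutoff = 2.0
    pred_pos = score >= cutoff
    sensitivity = (pred_pos & (y == 1)).sum() / (y == 1).sum()
    specificity = (~pred_pos & (y == 0)).sum() / (y == 0).sum()
    print(f"cutoff {cutoff}: sensitivity {sensitivity:.1%}, specificity {specificity:.1%}")

Patients scoring below the cutoff would be "ruled out"; lowering the cutoff trades specificity for the high sensitivity a rule-out test requires.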

Results: Of 192,779 patients included in the training, external validation, and sensitivity data sets (median age decile 50 [IQR 30-60] years, 40.5% male [78,249/192,779]), AUROC for training and external validation was 0.91 (95% CI 0.90-0.92). Using a risk score cutoff of 1.0 (out of 100) in the external validation data set, the model achieved sensitivity of 95.9% and specificity of 41.7%; with a cutoff of 2.0, sensitivity was 92.6% and specificity was 59.9%. At the cutoff of 2.0, the NPVs at a prevalence of 1%, 10%, and 20% were 99.9%, 98.6%, and 97.0%, respectively.
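
The reported NPVs follow directly from the cutoff-2.0 sensitivity and specificity via Bayes' rule, NPV = Sp(1 - p) / (Sp(1 - p) + (1 - Se)p) for prevalence p; a quick check in Python reproduces the abstract's figures:

    # NPV from sensitivity/specificity at the 2.0 risk-score cutoff (values from the abstract).
    se, sp = 0.926, 0.599  # sensitivity, specificity

    for p in (0.01, 0.10, 0.20):  # the three prevalences considered in the abstract
        npv = sp * (1 - p) / (sp * (1 - p) + (1 - se) * p)
        print(f"prevalence {p:.0%}: NPV {npv:.1%}")
    # -> 99.9%, 98.6%, 97.0%, matching the reported values.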

Conclusions: A machine learning model developed with multicenter clinical data integrating commonly collected ED laboratory data demonstrated high rule-out accuracy for COVID-19 status, and might inform selective use of PCR-based testing.

Keywords: COVID-19; SARS-CoV-2; artificial intelligence; development; electronic medical records; emergency department; laboratory results; machine learning; model; testing; validation.


Conflict of interest statement

Conflicts of Interest: TBP, AB, ASW, and ICJ have no disclosures. VFT received research funding from the National Institutes of Health (paid to the institution) for COVID-19–related clinical trials. ANB is a paid intern at Biocogniv. TSK and ABA have ownership of Biocogniv.

Figures

Figure 1
Discrimination as assessed by ROC curves for the training, external validation, and sensitivity analysis data sets: training (blue), external validation (orange), and sensitivity analysis (green). The training curve was obtained through 5-fold cross-validation, where positive controls are PCR-confirmed cases during the pandemic (N=2183) and negative controls are prepandemic patients (N=10,000) from 43 hospitals in the PHD; the training AUROC was 0.91 (95% CI 0.90-0.92). The external validation curve was computed on the external validation data set after the model was trained on the training data set. External validation positives are PCR-confirmed cases from Cedars-Sinai Medical Center (N=68) and from the PHD holdout set (N=952) comprising 21 hospitals. External validation negatives are prepandemic (2019) patients from the same 21 PHD hospitals matching the top 20 primary non–COVID-19 diagnoses of 2020 (N=154,341), plus all eligible prepandemic (2008-2019) Beth Israel Deaconess Medical Center patients (N=17,393). The AUROC in the external validation data set was 0.91 (95% CI 0.90-0.92). The sensitivity analysis curve demonstrates the effect of using prepandemic patients as negative controls rather than PCR-negative patients from 2020: both positives (N=952) and negatives (N=6890) were PCR-tested patients from the PHD holdout set (21 hospitals), and no prepandemic data were included. The AUROC in the sensitivity analysis set was 0.89 (95% CI 0.88-0.90). AUROC: area under the receiver operating characteristic curve; PCR: polymerase chain reaction; PHD: Premier Healthcare Database; ROC: receiver operating characteristic.
Figure 2
Discrimination as assessed by AUROC in age, sex, race, and ED disposition subgroups in the external validation data set. Non-ICU patients were admitted to the hospital but not to an ICU. AUROCs are shown per demographic subgroup and per patient disposition type (ED discharge, non-ICU, and ICU). Top numbers are AUROC values; bottom numbers in parentheses are the number of patients. AUROC: area under the receiver operating characteristic curve; ED: emergency department; ICU: intensive care unit.

