J R Soc Interface. 2021 Aug;18(181):20210284.
doi: 10.1098/rsif.2021.0284. Epub 2021 Aug 4.

A multi-layer model for the early detection of COVID-19

Erez Shmueli et al. J R Soc Interface. 2021 Aug.

Abstract

Current COVID-19 screening efforts mainly rely on reported symptoms and potential exposure to infected individuals. Here, we developed a machine-learning model for COVID-19 detection that uses four layers of information: (i) sociodemographic characteristics of the individual, (ii) spatio-temporal patterns of the disease, (iii) medical condition and general health consumption of the individual and (iv) information reported by the individual during the testing episode. We evaluated our model on 140 682 members of Maccabi Health Services who were tested for COVID-19 at least once between February and October 2020. These individuals underwent, in total, 264 516 COVID-19 PCR tests, of which 16 512 were positive. Our multi-layer model obtained an area under the curve (AUC) of 81.6% when evaluated over all the individuals in the dataset, and an AUC of 72.8% when only individuals who did not report any symptom were included. Furthermore, when considering only information collected before the testing episode (i.e. before the individual had the chance to report any symptom), our model still reached an AUC of 79.5%. The ability to predict the outcomes of COVID-19 tests early is pivotal for breaking transmission chains and can support a more efficient testing policy.

Keywords: COVID-19; early detection; electronic medical records; machine learning.

Figures

Figure 1.
Layers 1 and 2: sociodemographic information of the tested individual and the spatio-temporal dynamics of the disease. (a) Percentage of positive tests stratified by gender, ethnicity, and socio-economic level. The percentages of positive tests are linked with gender, ethnicity, and socio-economic level. Error bars represent the 95% confidence interval. (b) Percentage of positive tests over time for three clinics located in different cities and for the entire country. The value for each day is calculated as the percentage of positive tests over the 14 days preceding this day.
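
The trailing 14-day positivity described for panel (b) can be sketched as follows. This is a hedged illustration rather than the authors' code: the function name, the `daily_counts` layout (date mapped to a pair of positive and total test counts) and the choice to exclude the day itself from the window are all assumptions made for the example.

```python
from datetime import date, timedelta


def trailing_positivity(daily_counts, day, window_days=14):
    """Percentage of positive tests over the `window_days` days before `day`.

    daily_counts maps a date to (positive_tests, total_tests).
    Assumption: the window covers the days strictly before `day`.
    Returns None when no tests were performed in the window.
    """
    positives = 0
    total = 0
    for offset in range(1, window_days + 1):
        pos, n = daily_counts.get(day - timedelta(days=offset), (0, 0))
        positives += pos
        total += n
    if total == 0:
        return None
    return 100.0 * positives / total


# Hypothetical counts for one clinic: 10 tests a day, 1 of them positive.
start = date(2020, 9, 1)
counts = {start + timedelta(days=i): (1, 10) for i in range(14)}
print(trailing_positivity(counts, date(2020, 9, 15)))
```

Pooling the counts over the window, instead of averaging daily percentages, keeps days with very few tests from dominating the value.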
Figure 2.
Layer 4: information collected during the testing episode. (a) Percentages of positive tests stratified by symptoms and age group. Several symptoms that are known to be caused by COVID-19 (e.g. loss of taste or smell) were more strongly associated with a positive outcome. (b) Percentages of positive tests based on exposure to individuals with a laboratory-confirmed COVID-19 test and on the test’s location. Individuals who were exposed to infected individuals and those who were tested at home had an elevated risk of being found COVID-19 positive.
Figure 3.
Predictive models’ performance. (a) Mean AUC of models based on layers 1–4 (sociodemographic information of the tested individual, spatio-temporal patterns of the disease, medical condition and general health consumption of the tested individual, and information collected during the testing episode) and the full model that combines all four layers. The full model yielded considerably better classification between COVID-19-positive and COVID-19-negative tests, with a mean AUC of 81.6%. Error bars represent the standard deviation across the 10 executions of the model. (b) Receiver operating characteristic curves for the full model and the model considering layers 1–3. The full model’s classification ability is only slightly better than that of the model considering the first three layers (i.e. excluding layer 4: information collected during the testing episode).
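
The quantities reported in this caption, an AUC per execution and its mean and standard deviation across executions, can be illustrated with a small standard-library sketch. It is not the authors' pipeline: `roc_auc` and `summarize_runs` are hypothetical helpers, the AUC is computed by direct pairwise comparison (fine for small examples, too slow for hundreds of thousands of tests), and the use of the sample standard deviation is an assumption, since the caption does not say which form was used.

```python
from statistics import mean, stdev


def roc_auc(labels, scores):
    """AUC as the probability that a randomly chosen positive test is
    scored above a randomly chosen negative test; ties count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def summarize_runs(aucs):
    """Mean and sample standard deviation of per-execution AUC values."""
    return mean(aucs), stdev(aucs)


# Toy example: 1 = positive test, 0 = negative test, scores are made up.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

In practice, each of the 10 executions would contribute one `roc_auc` value on its own held-out data, and `summarize_runs` would give the bar height and error bar for that model.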
