Observational Study

Lancet Digit Health. 2021 Sep;3(9):e587-e598.

doi: 10.1016/S2589-7500(21)00131-X. Epub 2021 Jul 29.

Early detection of COVID-19 in the UK using self-reported symptoms: a large-scale, prospective, epidemiological surveillance study

Liane S Canas¹, Carole H Sudre², Joan Capdevila Pujol³, Lorenzo Polidori³, Benjamin Murray⁴, Erika Molteni⁴, Mark S Graham⁴, Kerstin Klaser⁴, Michela Antonelli⁴, Sarah Berry⁵, Richard Davies³, Long H Nguyen⁶, David A Drew⁶, Jonathan Wolf³, Andrew T Chan⁶, Tim Spector⁵, Claire J Steves⁵, Sebastien Ourselin⁴, Marc Modat⁴

Affiliations

¹ School of Biomedical Engineering and Imaging Sciences, King's College London, London, UK. Electronic address: liane.dos_santos_canas@kcl.ac.uk.
² School of Biomedical Engineering and Imaging Sciences, King's College London, London, UK; Medical Research Council Unit for Lifelong Health and Ageing, Department of Population Science and Experimental Medicine, University College London, London, UK; Centre for Medical Image Computing, Department of Computer Science, University College London, London, UK.
³ ZOE, London, UK.
⁴ School of Biomedical Engineering and Imaging Sciences, King's College London, London, UK.
⁵ Department of Twin Research and Genetic Epidemiology, King's College London, London, UK.
⁶ Clinical and Translational Epidemiology Unit, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA.

PMID: 34334333
PMCID: PMC8321433
DOI: 10.1016/S2589-7500(21)00131-X

Liane S Canas et al. Lancet Digit Health. 2021 Sep.

. 2021 Sep;3(9):e587-e598.

doi: 10.1016/S2589-7500(21)00131-X. Epub 2021 Jul 29.

Authors

Affiliations

¹ School of Biomedical Engineering and Imaging Sciences, King's College London, London, UK. Electronic address: liane.dos_santos_canas@kcl.ac.uk.
² School of Biomedical Engineering and Imaging Sciences, King's College London, London, UK; Medical Research Council Unit for Lifelong Health and Ageing, Department of Population Science and Experimental Medicine, University College London, London, UK; Centre for Medical Image Computing, Department of Computer Science, University College London, London, UK.
³ ZOE, London, UK.
⁴ School of Biomedical Engineering and Imaging Sciences, King's College London, London, UK.
⁵ Department of Twin Research and Genetic Epidemiology, King's College London, London, UK.
⁶ Clinical and Translational Epidemiology Unit, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA.

PMID: 34334333
PMCID: PMC8321433
DOI: 10.1016/S2589-7500(21)00131-X

Abstract

Background: Self-reported symptoms during the COVID-19 pandemic have been used to train artificial intelligence models to identify possible infection foci. To date, these models have only considered the culmination or peak of symptoms, which is not suitable for the early detection of infection. We aimed to estimate the probability of an individual being infected with SARS-CoV-2 on the basis of early self-reported symptoms to enable timely self-isolation and urgent testing.

Methods: In this large-scale, prospective, epidemiological surveillance study, we used prospective, observational, longitudinal, self-reported data from participants in the UK on 19 symptoms over the 3 days after symptom onset, together with COVID-19 PCR test results, all extracted from the COVID-19 Symptom Study mobile phone app. We divided the study population into a training set (those who reported symptoms between April 29, 2020, and Oct 15, 2020) and a test set (those who reported symptoms between Oct 16, 2020, and Nov 30, 2020), and used three models to analyse the self-reported symptoms: the UK's National Health Service (NHS) algorithm, logistic regression, and the hierarchical Gaussian process model we designed to account for several important variables (eg, specific COVID-19 symptoms, comorbidities, and clinical information). Model performance in predicting COVID-19 positivity was compared in terms of sensitivity, specificity, and area under the receiver operating characteristic curve (AUC) in the test set. For the hierarchical Gaussian process model, we also evaluated the relevance of symptoms in the early detection of COVID-19 in population subgroups stratified according to occupation, sex, age, and body-mass index.
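The date-based split and the three evaluation metrics described in the Methods can be sketched as follows. This is a minimal illustration only, not the study's code: the record layout `(first_symptom_date, model_score, pcr_positive)`, the 0.5 decision threshold, the function names, and the toy data are all assumptions introduced here. Only the date windows come from the abstract.

```python
from datetime import date

# Date windows taken from the abstract; everything else below is hypothetical.
TRAIN_START, TRAIN_END = date(2020, 4, 29), date(2020, 10, 15)
TEST_START, TEST_END = date(2020, 10, 16), date(2020, 11, 30)


def temporal_split(records):
    """Split (first_symptom_date, score, pcr_positive) records by reporting date."""
    train = [r for r in records if TRAIN_START <= r[0] <= TRAIN_END]
    test = [r for r in records if TEST_START <= r[0] <= TEST_END]
    return train, test


def sensitivity_specificity(scores, labels, threshold=0.5):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = sum(1 for s, y in zip(scores, labels) if y and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if not y and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if not y and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def auc(scores, labels):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy records, invented purely for illustration.
records = [
    (date(2020, 5, 1), 0.9, True),
    (date(2020, 9, 30), 0.2, False),
    (date(2020, 10, 20), 0.8, True),
    (date(2020, 10, 25), 0.6, False),
    (date(2020, 11, 5), 0.4, True),
    (date(2020, 11, 28), 0.1, False),
]
train, test = temporal_split(records)
test_scores = [r[1] for r in test]
test_labels = [r[2] for r in test]
sens, spec = sensitivity_specificity(test_scores, test_labels)
test_auc = auc(test_scores, test_labels)
```

Splitting by calendar date rather than at random means the test set comes from a later period of the pandemic than the training set, which is what lets the comparison speak to performance on unseen, later data.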

Findings: The training set comprised 182 991 participants and the test set comprised 15 049 participants. When trained on 3 days of self-reported symptoms, the hierarchical Gaussian process model had a higher prediction AUC (0·80 [95% CI 0·80-0·81]) than did the logistic regression model (0·74 [0·74-0·75]) and the NHS algorithm (0·67 [0·67-0·67]). AUCs for all models increased with the number of days of self-reported symptoms, but were still high for the hierarchical Gaussian process model at day 1 (0·73 [95% CI 0·73-0·74]) and day 2 (0·79 [0·78-0·79]). At day 3, the hierarchical Gaussian process model also had a significantly higher sensitivity, but a non-significantly lower specificity, than did the two other models. The hierarchical Gaussian process model also identified different sets of relevant features to detect COVID-19 between younger and older subgroups, and between health-care workers and non-health-care workers. When used during different pandemic periods, the model was robust to changes in populations.
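The AUCs above are reported with 95% CIs. The abstract does not say how those intervals were obtained; a percentile bootstrap is one common choice and is assumed here purely for illustration. The function names, the resample count, the fixed seed, and the toy scores are all hypothetical.

```python
import random


def auc(scores, labels):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(scores, labels, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed method, not the paper's)."""
    rng = random.Random(seed)
    n = len(scores)
    estimates = []
    while len(estimates) < n_resamples:
        idx = [rng.randrange(n) for _ in range(n)]
        sample_labels = [labels[i] for i in idx]
        # AUC is undefined when a resample contains only one class, so redraw.
        if all(sample_labels) or not any(sample_labels):
            continue
        estimates.append(auc([scores[i] for i in idx], sample_labels))
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_resamples)]
    upper = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Toy scores and PCR labels, invented purely for illustration.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2, 0.15, 0.1]
labels = [True, True, False, True, True, False, True, False, False, False]
point_estimate = auc(scores, labels)
ci_lower, ci_upper = bootstrap_auc_ci(scores, labels)
```

With test sets as large as the one reported here, intervals of this kind become narrow, which is consistent with the tight CIs quoted in the Findings.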

Interpretation: Early detection of SARS-CoV-2 infection is feasible with our model. Such early detection is crucial to contain the spread of COVID-19 and efficiently allocate medical resources.

Funding: ZOE, the UK Government Department of Health and Social Care, the Wellcome Trust, the UK Engineering and Physical Sciences Research Council, the UK National Institute for Health Research, the UK Medical Research Council, the British Heart Foundation, the Alzheimer's Society, the Chronic Disease Research Foundation, and the Massachusetts Consortium on Pathogen Readiness.

Conflict of interest statement

Declaration of interests: ATC reports personal fees from Pfizer, Bayer Pharma, and Boehringer Ingelheim, outside the submitted work. CJS reports grants from the Chronic Disease Research Foundation, during the conduct of the study. JCP, LP, JW, and TS report other (work) and consultancy from ZOE, during the conduct of the study. CHS reports grants from the Alzheimer's Society, during the conduct of the study. DAD reports grants from the National Institutes of Health during the conduct of the study and has previously served as a co-investigator on an unrelated trial supported by ZOE. RD reports grants from the Department of Health and Social Care, during the conduct of the study, and personal fees from ZOE, outside the submitted work. SO reports grants from the Wellcome Trust, Innovate UK Research and Innovation, and the Chronic Disease Research Foundation, during the conduct of the study. All other authors declare no competing interests.

Figures

**Figure 1**
Feature relevance by occupation. Symptoms are grouped according to their clinical manifestations: gastrointestinal symptoms and other symptoms (yellow sector), flu-like symptoms (green sector), neurological symptoms (purple sector), and cardiac and respiratory symptoms (white sector). The grey line represents overall symptom relevance without stratification. Points further from the centre correspond to a higher relevance. Relevance is normalised for direct interpretation.

**Figure 2**
Feature relevance by sex. Symptoms are grouped according to their clinical manifestations: gastrointestinal symptoms and other symptoms (yellow sector), flu-like symptoms (green sector), neurological symptoms (purple sector), and cardiac and respiratory symptoms (white sector). The grey line represents overall symptom relevance without stratification. Points further from the centre correspond to a higher relevance. Relevance is normalised for direct interpretation.

**Figure 3**
Feature relevance by age group. Symptoms are grouped according to their clinical manifestations: gastrointestinal symptoms and other symptoms (yellow sector), flu-like symptoms (green sector), neurological symptoms (purple sector), and cardiac and respiratory symptoms (white sector). The grey line represents overall symptom relevance without stratification. Points further from the centre correspond to a higher relevance. Relevance is normalised for direct interpretation.

**Figure 4**
Feature relevance by BMI category. Symptoms are grouped according to their clinical manifestations: gastrointestinal symptoms and other symptoms (yellow sector), flu-like symptoms (green sector), neurological symptoms (purple sector), and cardiac and respiratory symptoms (white sector). The grey line represents overall symptom relevance without stratification. Points further from the centre correspond to a higher relevance. Relevance is normalised for direct interpretation. BMI=body-mass index.
