Using Machine Learning to Estimate Unobserved COVID-19 Infections in North America
- PMID: 32618918
- PMCID: PMC7396213
- DOI: 10.2106/JBJS.20.00715
Abstract
Background: The detection of coronavirus disease 2019 (COVID-19) cases remains a huge challenge. As of April 22, 2020, the COVID-19 pandemic continues to take its toll, with >2.6 million confirmed infections and >183,000 deaths. Dire projections are surfacing almost every day, and policymakers worldwide are using projections for critical decisions. Given this background, we modeled unobserved infections to examine the extent to which we might be grossly underestimating COVID-19 infections in North America.
Methods: We developed a machine-learning model to uncover hidden patterns based on reported cases and to predict potential infections. First, our model relied on dimensionality reduction to identify parameters that were key to uncovering hidden patterns. Next, our predictive analysis used an unbiased hierarchical Bayesian estimator approach to infer past infections from current fatalities.
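The idea of inferring past infections from current fatalities can be illustrated with a simple back-calculation. The sketch below is not the authors' hierarchical Bayesian estimator; it is a minimal deterministic approximation, and the function names and the parameters `fatality_rate` and `daily_growth` are hypothetical inputs that the abstract does not specify.

```python
def back_calculate_infections(cumulative_deaths, fatality_rate, lag_days, daily_growth):
    """Illustrative back-calculation (an assumption, not the paper's model).

    Deaths observed today are treated as the outcome of infections that
    occurred `lag_days` ago, so infections at that time are estimated as
    deaths / fatality_rate. That count is then projected forward to today
    under an assumed constant daily growth rate.
    """
    if not 0 < fatality_rate <= 1:
        raise ValueError("fatality_rate must be in (0, 1]")
    if lag_days < 0:
        raise ValueError("lag_days must be non-negative")
    infections_at_lag = cumulative_deaths / fatality_rate
    return infections_at_lag * (1 + daily_growth) ** lag_days


def undetected_infections(estimated_total, confirmed_cases):
    """Estimated infections not captured by confirmed case counts."""
    return max(estimated_total - confirmed_cases, 0)
```

A longer assumed lag gives the projected count more time to grow, which is consistent with the abstract reporting a larger estimate for a 23-day lag than for a 13-day lag.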
Results: Our analysis indicates that, when we assumed a 13-day lag time from infection to death, the United States, as of April 22, 2020, likely had at least 1.3 million undetected infections. With a longer lag time (for example, 23 days), there could have been at least 1.7 million undetected infections. Given these assumptions, the number of undetected infections in Canada could have ranged from 60,000 to 80,000. Duarte's elegant unbiased estimator approach suggested that, as of April 22, 2020, the United States may have had more than 1.6 million undetected infections and Canada had at least 60,000 to 86,000 undetected infections. However, the Johns Hopkins University Center for Systems Science and Engineering data feed on April 22, 2020, reported only 840,476 and 41,650 confirmed cases for the United States and Canada, respectively.
Conclusions: We have identified 2 key findings: (1) as of April 22, 2020, the United States may have had 1.5 to 2.029 times the number of reported infections and Canada may have had 1.44 to 2.06 times the number of reported infections and (2) even if we assume that the fatality and growth rates in the unobservable population (undetected infections) are similar to those in the observable population (confirmed infections), the number of undetected infections may be within ranges similar to those described above. In summary, 2 different approaches indicated similar ranges of undetected infections in North America.
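One way to read the multiples in the first finding is as the ratio of estimated undetected infections to confirmed cases. The short sketch below checks that reading using only figures quoted in the abstract; the interpretation itself and the name `undetected_multiples` are assumptions made for illustration.

```python
# Figures quoted in the abstract (as of April 22, 2020).
CONFIRMED = {"United States": 840_476, "Canada": 41_650}
UNDETECTED_RANGE = {
    "United States": (1_300_000, 1_700_000),  # 13-day and 23-day lag estimates
    "Canada": (60_000, 86_000),               # lowest and highest quoted values
}


def undetected_multiples(confirmed, undetected_range):
    """Return (low, high) undetected infections as a multiple of confirmed cases."""
    low, high = undetected_range
    return low / confirmed, high / confirmed


for country, confirmed in CONFIRMED.items():
    low, high = undetected_multiples(confirmed, UNDETECTED_RANGE[country])
    print(f"{country}: {low:.2f} to {high:.2f} times confirmed cases")
```

Under this reading, Canada works out to roughly 1.44 to 2.06 and the United States to roughly 1.55 to 2.02, close to the ranges stated in the conclusions.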
Level of evidence: Prognostic Level V. See Instructions for Authors for a complete description of levels of evidence.
