Proc Natl Acad Sci U S A. 2021 Dec 21;118(51):e2111452118.
doi: 10.1073/pnas.2111452118.

An open repository of real-time COVID-19 indicators

Alex Reinhart, Logan Brooks, Maria Jahja, Aaron Rumack, Jingjing Tang, Sumit Agrawal, Wael Al Saeed, Taylor Arnold, Amartya Basu, Jacob Bien, Ángel A Cabrera, Andrew Chin, Eu Jing Chua, Brian Clark, Sarah Colquhoun, Nat DeFries, David C Farrow, Jodi Forlizzi, Jed Grabman, Samuel Gratzl, Alden Green, George Haff, Robin Han, Kate Harwood, Addison J Hu, Raphael Hyde, Sangwon Hyun, Ananya Joshi, Jimi Kim, Andrew Kuznetsov, Wichada La Motte-Kerr, Yeon Jin Lee, Kenneth Lee, Zachary C Lipton, Michael X Liu, Lester Mackey, Kathryn Mazaitis, Daniel J McDonald, Phillip McGuinness, Balasubramanian Narasimhan, Michael P O'Brien, Natalia L Oliveira, Pratik Patil, Adam Perer, Collin A Politsch, Samyak Rajanala, Dawn Rucker, Chris Scott, Nigam H Shah, Vishnu Shankar, James Sharpnack, Dmitry Shemetov, Noah Simon, Benjamin Y Smith, Vishakha Srivastava, Shuyi Tan, Robert Tibshirani, Elena Tuzhilina, Ana Karina Van Nortwick, Valérie Ventura, Larry Wasserman, Benjamin Weaver, Jeremy C Weiss, Spencer Whitman, Kristin Williams, Roni Rosenfeld, Ryan J Tibshirani

Alex Reinhart et al. Proc Natl Acad Sci U S A. 2021.

Abstract

The COVID-19 pandemic presented enormous data challenges in the United States. Policy makers, epidemiological modelers, and health researchers all require up-to-date data on the pandemic and relevant public behavior, ideally at fine spatial and temporal resolution. The COVIDcast API is our attempt to fill this need: Operational since April 2020, it provides open access to both traditional public health surveillance signals (cases, deaths, and hospitalizations) and many auxiliary indicators of COVID-19 activity, such as signals extracted from deidentified medical claims data, massive online surveys, cell phone mobility data, and internet search trends. These are available at a fine geographic resolution (mostly at the county level) and are updated daily. The COVIDcast API also tracks all revisions to historical data, allowing modelers to account for the frequent revisions and backfill that are common for many public health data sources. All of the data are available in a common format through the API and accompanying R and Python software packages. This paper describes the data sources and signals, and provides examples demonstrating that the auxiliary signals in the COVIDcast API present information relevant to tracking COVID activity, augmenting traditional public health reporting and empowering research and decision-making.

Keywords: digital surveillance; internet surveys; medical insurance claims; open data.

Conflict of interest statement

The authors declare no competing interest.

Figures

Fig. 1.
National trends, from April 2020 to April 2021, of four signals in the COVIDcast API. The auxiliary signals, based on medical claims data and massive surveys, track changes in officially reported cases quite well. (They have all been placed on the same range as reported cases per 100,000 people.)
Fig. 2.
Geo-wise correlations with case rates, from April 15, 2020 to April 15, 2021, calculated over all counties for which all signals were available and which had at least 500 cumulative cases by the end of this period.
Fig. 3.
Time-wise correlations with case rates, from April 15, 2020 to April 15, 2021, calculated over all counties for which all signals were available and which had at least 500 cumulative cases by the end of this period.
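The two correlation views in Figs. 2 and 3 can be sketched on toy data. This is a minimal illustration, not the paper's method: it assumes that "geo-wise" means one correlation per date computed across counties, and "time-wise" means one correlation per county computed across dates. The captions do not name the coefficient, so Pearson correlation is used here as an assumption, and all county names and values are hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical toy data, indexed as values[county][day_index].
signal = {"A": [1.0, 2.0, 3.0], "B": [2.0, 4.0, 6.0], "C": [3.0, 3.5, 5.0]}
cases = {"A": [10.0, 20.0, 30.0], "B": [15.0, 35.0, 50.0], "C": [40.0, 38.0, 60.0]}

def geo_wise(signal, cases):
    """One correlation per day, computed across counties (assumed reading)."""
    counties = sorted(signal)
    n_days = len(signal[counties[0]])
    return [
        pearson([signal[c][t] for c in counties],
                [cases[c][t] for c in counties])
        for t in range(n_days)
    ]

def time_wise(signal, cases):
    """One correlation per county, computed across days (assumed reading)."""
    return {c: pearson(signal[c], cases[c]) for c in signal}

geo = geo_wise(signal, cases)    # list with one value per day
tw = time_wise(signal, cases)    # dict with one value per county
```

Under this reading, a geo-wise view yields a time series of correlations and a time-wise view yields one value per county.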
Fig. 4.
(Left) Reported cases per day in Bexar County, Texas, during the summer of 2020. On July 16, 4,810 backlogged cases were reported, although they actually occurred over the preceding 2 wk (this shows up as a prolonged spike due to the 7-d trailing averaging applied to the case counts). (Right) Daily CTIS estimates of CLI-in-community showed more stable underlying trends.
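The "prolonged spike" described in the Fig. 4 caption follows from the arithmetic of a trailing average, which the sketch below illustrates. The backlog size of 4,810 comes from the caption; the flat baseline of 100 cases per day and the 21-day window are hypothetical values chosen only to make the effect visible.

```python
def trailing_average(values, window=7):
    """Mean of each value and up to (window - 1) preceding values."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

# Hypothetical baseline: 100 reported cases on each of 21 days.
daily = [100] * 21
# A backlog of 4,810 cases (figure from the caption) reported on day 10.
daily[10] += 4810

smoothed = trailing_average(daily, window=7)

# Days on which the smoothed series sits above the baseline.
elevated = [i for i, v in enumerate(smoothed) if v > 100]
```

A single-day dump raises the 7-day trailing average on the dump day and the six days after it, each by 4810 / 7 (about 687) cases, so one reporting artifact appears as a week-long plateau.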
Fig. 5.
Estimated percentage of outpatient visits due to COVID-like illness (DV-CLI), displayed across multiple issue dates; later issue dates add new data and revise values from earlier issue dates.
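The idea of issue dates can be sketched as an "as of" lookup over versioned records. This is a minimal illustration under stated assumptions, not the API's implementation: the record layout (reference date, issue date, value), the function name `as_of`, and all values are hypothetical.

```python
from datetime import date

# Hypothetical versioned records: (time_value, issue_date, value).
# The same time_value can appear under several issue dates as it is revised.
records = [
    (date(2020, 6, 1), date(2020, 6, 5), 2.0),
    (date(2020, 6, 1), date(2020, 6, 12), 2.6),
    (date(2020, 6, 2), date(2020, 6, 6), 2.1),
    (date(2020, 6, 2), date(2020, 6, 12), 2.4),
    (date(2020, 6, 8), date(2020, 6, 12), 3.0),
]

def as_of(records, cutoff):
    """Return {time_value: value} using the latest issue on or before cutoff."""
    latest = {}
    for time_value, issue, value in records:
        if issue > cutoff:
            continue  # this version was not yet published on the cutoff date
        prev = latest.get(time_value)
        if prev is None or issue > prev[0]:
            latest[time_value] = (issue, value)
    return {t: v for t, (_, v) in sorted(latest.items())}

early_view = as_of(records, date(2020, 6, 6))
late_view = as_of(records, date(2020, 6, 12))
```

Here the early view holds only the first-reported values for June 1 and June 2, while the late view holds their revised values plus a newly added date, which is the pattern the caption describes.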
Fig. 6.
The 95th percentiles of relative error of early reported values of key signals compared to final values reported much later. For each date between October 15, 2020 and April 15, 2021, the values for each state reported between 10 and 90 d later are compared to “final” versions recorded as of August 13, 2021. Even officially reported case and death data can have large revisions 30 d to 60 d or more after initial reporting; much of this is driven by individual large revisions affecting specific states and dates, rather than by systematic changes affecting all states and dates.
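The summary statistic in Fig. 6 can be sketched as follows. The caption does not give the formula, so this assumes relative error is |early − final| / |final| and uses a linear-interpolation percentile; both choices, along with the toy values, are assumptions for illustration.

```python
def relative_error(early, final):
    """Absolute difference between early and final value, relative to final."""
    return abs(early - final) / abs(final)

def percentile(values, q):
    """Percentile with linear interpolation between order statistics, q in [0, 100]."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)

# Hypothetical (early, final) pairs for one signal at one reporting lag,
# one pair per state and date.
pairs = [(98, 100), (100, 100), (95, 100), (101, 100), (60, 100)]

errors = [relative_error(e, f) for e, f in pairs]
p95 = percentile(errors, 95)
```

Because the 95th percentile sits near the top of the distribution, a few large revisions (such as the hypothetical 60 versus 100 pair) dominate it, which is consistent with the caption's remark that individual large revisions drive much of the error.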
Fig. 7.
CTIS estimates of the percentage of people willing to get vaccinated as of January 20, 2021, compared to CDC reporting of the percentage of people vaccinated as of July 20, 2021. Each point is a county (with at least 250 survey responses between January 14 and January 20, 2021), colored by its parent United States Census region.
