PLoS One. 2022 Nov 3;17(11):e0251470.
doi: 10.1371/journal.pone.0251470. eCollection 2022.

Reliability of COVID-19 data: An evaluation and reflection

April R Miller et al. PLoS One.

Abstract

Importance: The rapid proliferation of COVID-19 has left governments scrambling, and several data aggregators are now assisting in the reporting of county cases and deaths. The different variables affecting reporting (e.g., time delays in reporting) necessitate a well-documented reliability study that examines the data methods and discusses possible causes of differences between aggregators.

Objective: To statistically evaluate the reliability of COVID-19 data across aggregators using case fatality rate (CFR) estimates and reliability statistics.

Design, setting, and participants: Cases and deaths were collected daily by volunteers from state and local health departments as primary sources and from newspaper reports as secondary sources. To enable a statistical reliability comparison, BroadStreet also collected data from other COVID-19 aggregator sources, including USAFacts, Johns Hopkins University, the New York Times, and The COVID Tracking Project.

Main outcomes and measures: COVID-19 cases and death counts at the county and state levels.

Results: Lower levels of inter-rater agreement across aggregators were observed for the number of deaths, and this was reflected in state-level Bayesian estimates of COVID-19 fatality rates.
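As a rough illustration of how inter-rater agreement between two aggregators might be quantified, the sketch below computes Cohen's kappa on binned daily counts. The abstract does not state which kappa variant or binning the authors used, so the bin edges, the function names, and the example counts are all hypothetical assumptions.

```python
from collections import Counter


def bin_count(count, edges=(0, 10, 100)):
    """Map a raw count to a category index (bin edges are an assumption)."""
    return sum(count > edge for edge in edges)


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two equal-length sequences of categorical ratings."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed agreement: share of days on which both sources give the same category.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the two sources' marginal category frequencies.
    freq_a = Counter(ratings_a)
    freq_b = Counter(ratings_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both sources always report the same single category
    return (observed - expected) / (1 - expected)


# Hypothetical daily death counts for one state from two aggregators.
source_a = [0, 5, 12, 150, 8, 0]
source_b = [0, 5, 9, 150, 8, 2]

kappa = cohen_kappa(
    [bin_count(c) for c in source_a],
    [bin_count(c) for c in source_b],
)
print(round(kappa, 2))  # 0.52
```

A kappa of 1.00 indicates perfect agreement and 0.00 indicates agreement no better than chance, which matches the color scale described in the figure captions.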

Conclusions and relevance: A national, publicly available data set is needed for current and future disease outbreaks and improved reliability in reporting.

Conflict of interest statement

The specific roles of these authors are articulated in the ‘author contributions’ section. This does not alter our adherence to PLOS ONE policies on sharing data and materials.

Figures

Fig 1. Inter-rater reliability between each aggregator pair, expressed as kappa values for daily state counts.
Darker (1.00) to lighter (0.00) color indicates more to less agreement, respectively.
Fig 2. Reliability within each county for the number of COVID-19 cases, using the reliability categories proposed by Cicchetti and Sparrow [39].
Darker (1.00) to lighter (0.00) color indicates more to less agreement. For kappa, smaller sample size differences won’t penalize smaller value differences.
Fig 3. Reliability within each county for the number of COVID-19 deaths, using the reliability categories proposed by Cicchetti and Sparrow.
Darker (1.00) to lighter (0.00) color indicates more to less agreement.
Fig 4. Time progression of case fatality rate across states and sources.
The horizontal bars represent the 90% equal-tail credible intervals, with the vertical black bars indicating the posterior means.
Fig 5. Aggregated (June 30, 2020) case fatality rate across states and sources.
The horizontal bars represent the 90% equal-tail credible intervals, with the vertical black bars indicating the posterior means.
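To make the posterior mean and 90% equal-tail credible interval concrete, here is a minimal sketch that assumes a conjugate Beta prior on the case fatality rate with a binomial likelihood. The abstract does not describe the authors' Bayesian model, so the uniform Beta(1, 1) prior, the Monte Carlo approximation of the interval, the function name, and the example counts are all assumptions for illustration.

```python
import random


def cfr_posterior(deaths, cases, prior_a=1.0, prior_b=1.0, draws=20000, seed=0):
    """Posterior mean and approximate 90% equal-tail interval for the CFR.

    Assumes deaths ~ Binomial(cases, cfr) and cfr ~ Beta(prior_a, prior_b),
    so the posterior is Beta(prior_a + deaths, prior_b + cases - deaths).
    """
    if not 0 <= deaths <= cases:
        raise ValueError("deaths must be between 0 and cases")
    post_a = prior_a + deaths
    post_b = prior_b + (cases - deaths)
    mean = post_a / (post_a + post_b)  # exact posterior mean of a Beta
    # Approximate the 5th and 95th percentiles from sorted posterior draws.
    rng = random.Random(seed)
    samples = sorted(rng.betavariate(post_a, post_b) for _ in range(draws))
    lower = samples[int(0.05 * draws)]
    upper = samples[int(0.95 * draws) - 1]
    return mean, lower, upper


# Hypothetical cumulative counts for one state from one aggregator.
mean, lower, upper = cfr_posterior(deaths=50, cases=1000)
print(f"posterior mean {mean:.4f}, 90% interval ({lower:.4f}, {upper:.4f})")
```

Running the same function on each aggregator's counts for a state would give one interval per source, which is the kind of side-by-side comparison the figures describe.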

References

    1. Whitelaw S, Mamas MA, Topol E, Van Spall HGC. Applications of digital technology in COVID-19 pandemic planning and response. Lancet Digit Health. 2020 Aug;2(8):e435–40. doi: 10.1016/S2589-7500(20)30142-4
    2. Callaghan S. COVID-19 Is a Data Science Issue. Patterns N Y N. 2020 May 8;1(2):100022. doi: 10.1016/j.patter.2020.100022
    3. Data Collection and Reporting | NNDSS [Internet]. [cited 2021 Mar 12]. https://wwwn.cdc.gov/nndss/data-collection.html.
    4. Killeen BD, Wu JY, Shah K, Zapaishchykova A, Nikutta P, Tamhane A, et al. A County-level Dataset for Informing the United States’ Response to COVID-19. ArXiv200400756 Phys Q-Bio [Internet]. 2020 Sep 10 [cited 2021 Mar 12]; http://arxiv.org/abs/2004.00756.
    5. Shiode N, Shiode S, Rod-Thatcher E, Rana S, Vinten-Johansen P. The mortality rates and the space-time patterns of John Snow’s cholera epidemic map. Int J Health Geogr. 2015 Jun 17;14:21. doi: 10.1186/s12942-015-0011-y
