BMJ Open. 2021 Dec 6;11(12):e047623.
doi: 10.1136/bmjopen-2020-047623.

COVID-19 surveillance data quality issues: a national consecutive case series

Cristina Costa-Santos et al. BMJ Open.

Abstract

Objectives: High-quality data are crucial for guiding decision-making and practising evidence-based healthcare, especially if previous knowledge is lacking. Nevertheless, data quality frailties have been exposed worldwide during the current COVID-19 pandemic. Focusing on a major Portuguese epidemiological surveillance dataset, our study aims to assess COVID-19 data quality issues and suggest possible solutions.

Settings: On 27 April 2020, the Portuguese Directorate-General of Health (DGS) made available a dataset (DGSApril) for researchers, upon request. On 4 August, an updated dataset (DGSAugust) was also obtained.

Participants: All confirmed COVID-19 cases notified through the medical component of the National System for Epidemiological Surveillance until the end of June.

Primary and secondary outcome measures: Data completeness and consistency.

Results: DGSAugust did not follow the same data format and variables as DGSApril, and a significant amount of missing data and many inconsistencies were found (eg, 4075 cases from DGSApril were apparently not included in DGSAugust). Several variables also showed a low degree of completeness and/or changed their values from one dataset to the other (eg, for the variable 'underlying conditions', more than half of the cases showed different information between datasets). There were also significant inconsistencies between the numbers of COVID-19 cases and deaths shown in DGSAugust and those in the daily public DGS reports.
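The two outcome measures, completeness and consistency between dataset versions, can be illustrated with a minimal sketch. This is not the authors' method or code: the record layout, the field names (`case_id`, `underlying_conditions`), the encodings treated as missing, and the toy records are all hypothetical and chosen only for illustration.

```python
from typing import Any, Dict, List

Record = Dict[str, Any]  # hypothetical layout: one dict per notified case

MISSING = (None, "", "NA")  # assumed encodings of a missing value


def completeness(records: List[Record], variable: str) -> float:
    """Fraction of records with a non-missing value for `variable`."""
    if not records:
        return 0.0
    filled = sum(1 for r in records if r.get(variable) not in MISSING)
    return filled / len(records)


def compare_versions(
    old: List[Record], new: List[Record], variable: str, id_key: str = "case_id"
) -> Dict[str, List[Any]]:
    """List case IDs dropped from `new` and IDs whose `variable` value changed."""
    old_by_id = {r[id_key]: r for r in old}
    new_by_id = {r[id_key]: r for r in new}
    # Cases present in the earlier version but absent from the later one.
    dropped = sorted(set(old_by_id) - set(new_by_id))
    # Cases present in both versions whose recorded value differs.
    shared = set(old_by_id) & set(new_by_id)
    changed = sorted(
        i for i in shared
        if old_by_id[i].get(variable) != new_by_id[i].get(variable)
    )
    return {"dropped_ids": dropped, "changed_ids": changed}


# Toy records, invented for this example only (not taken from DGS data).
april = [
    {"case_id": 1, "underlying_conditions": "yes"},
    {"case_id": 2, "underlying_conditions": ""},
    {"case_id": 3, "underlying_conditions": "no"},
]
august = [
    {"case_id": 1, "underlying_conditions": "no"},
    {"case_id": 3, "underlying_conditions": "no"},
    {"case_id": 4, "underlying_conditions": None},
]

report = compare_versions(april, august, "underlying_conditions")
print(report)
print(completeness(august, "underlying_conditions"))
```

In this toy example, case 2 is reported as dropped and case 1 as changed, which mirrors in miniature the kinds of discrepancy described above. A real audit would also need to handle duplicated identifiers and differing variable names between dataset versions.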

Conclusions: Important quality issues of the Portuguese COVID-19 surveillance datasets were described. These issues can limit the usability of surveillance data for informing good decisions and performing useful research. Major improvements in surveillance datasets are therefore urgently needed (for example, simplification of data entry processes, constant monitoring of data, and increased training and awareness of healthcare providers), as low data quality may lead to deficient pandemic control.

Keywords: COVID-19; epidemiology; health informatics; information management; public health; statistics & research methods.

Conflict of interest statement

Competing interests: None declared.

Figures

Figure 1
Example of one possible information flow from the moment the data are entered until the dataset is made available to researchers. The ⊗ symbol means that data are not sent and are therefore not present in the research database (DB). The dashed line represents a cumbersome manual process that is often performed by public health professionals and is very susceptible to errors. DGS, Directorate-General of Health.

Figure 2
Number of unique case identifiers present in the datasets of COVID-19 cases diagnosed from the start of the pandemic until 27 April (the date when the first database was made available) and after 27 April. DGS, Directorate-General of Health.

References

    1. Morgan O. How decision makers can use quantitative approaches to guide outbreak responses. Philos Trans R Soc Lond B Biol Sci 2019;374:20180365. 10.1098/rstb.2018.0365 - DOI - PMC - PubMed
    1. Xu B, Kraemer MUG, Gutierrez B, Open COVID-19 Data Curation Group . Open access epidemiological data from the COVID-19 outbreak. Lancet Infect Dis 2020;20:534. 10.1016/S1473-3099(20)30119-5 - DOI - PMC - PubMed
    1. Yozwiak NL, Schaffner SF, Sabeti PC. Data sharing: make outbreak research open access. Nature 2015;518:477–9. 10.1038/518477a - DOI - PubMed
    1. German RR, Lee LM, Horan JM, et al. . Updated guidelines for evaluating public health surveillance systems: recommendations from the guidelines Working group. MMWR Recomm Rep 2001;50:1-35; quiz CE1-7. - PubMed
    1. Alonso V, Santos JV, Pinto M, et al. . Health records as the basis of clinical coding: is the quality adequate? A qualitative study of medical coders' perceptions. Health Inf Manag 2020;49:28-37. 10.1177/1833358319826351 - DOI - PubMed
