Drug Saf. 2013 Oct:36 Suppl 1:S49-58.
doi: 10.1007/s40264-013-0098-7.

Managing data quality for a drug safety surveillance system

Abraham G Hartzema et al. Drug Saf. 2013 Oct.

Abstract

Objective: The objective of this study is to present a data quality assurance program for disparate data sources loaded into a Common Data Model, and to highlight the data quality issues identified and the resolutions implemented.

Background: The Observational Medical Outcomes Partnership is conducting methodological research to develop a system to monitor drug safety. Standard processes and tools are needed to ensure continuous data quality across a network of disparate databases, and to ensure that extract-transform-load (ETL) procedures maintain data integrity. Currently, there is no consensus or standard approach for evaluating the quality of the source data or of the ETL procedures.

Methods: We propose a framework for a comprehensive process to ensure data quality throughout the steps used to process and analyze the data. The approach used to manage data anomalies includes: (1) characterization of data sources; (2) detection of data anomalies; (3) determination of the cause of data anomalies; and (4) remediation.

Findings: Data anomalies included incomplete raw data (no race or year of birth recorded) and implausible data (year of birth exceeding the current year, observation period end date preceding the start date, and suspicious data frequencies and proportions outside the normal range). Errors found in the ETL process included zip codes incorrectly loaded, drug quantities rounded, drug exposure length incorrectly calculated, and condition length incorrectly programmed.
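
As an illustration only, the record-level anomalies listed in the findings (missing race or year of birth, year of birth exceeding the current year, observation period end date preceding the start date) could be detected with rule checks along the following lines. This is a minimal sketch and not the tooling used in the study: the `PersonRecord` class, its field names, and the `find_anomalies` function are hypothetical and do not reflect the actual Common Data Model schema.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class PersonRecord:
    # Hypothetical fields chosen for illustration; not the real schema.
    person_id: int
    year_of_birth: Optional[int]
    race: Optional[str]
    observation_start: date
    observation_end: date


def find_anomalies(rec: PersonRecord, current_year: int) -> List[str]:
    """Return a description of each anomaly found in one record."""
    issues: List[str] = []
    # Incomplete data: required demographic fields are absent.
    if rec.year_of_birth is None:
        issues.append("missing year of birth")
    # Implausible data: birth year lies in the future.
    elif rec.year_of_birth > current_year:
        issues.append("year of birth exceeds current year")
    if not rec.race:
        issues.append("missing race")
    # Implausible data: observation period ends before it starts.
    if rec.observation_end < rec.observation_start:
        issues.append("observation period end precedes start")
    return issues


# Example use on a deliberately faulty, made-up record.
example = PersonRecord(
    person_id=1,
    year_of_birth=2050,
    race=None,
    observation_start=date(2006, 1, 1),
    observation_end=date(2005, 1, 1),
)
for issue in find_anomalies(example, current_year=2013):
    print(example.person_id, issue)
```

Checks of this kind cover only single-record rules; the frequency and proportion anomalies mentioned in the findings would need aggregate comparisons across the whole data source.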

Conclusions: Complete and reliable observational data are difficult to obtain. Because the data are regularly updated, data quality assurance needs to be continuous; consequently, processes to assess data quality should be ongoing and transparent.
