Managing data quality for a drug safety surveillance system
- PMID: 24166223
- DOI: 10.1007/s40264-013-0098-7
Abstract
Objective: The objective of this study is to present a data quality assurance program for disparate data sources loaded into a Common Data Model, and to highlight the data quality issues identified and the resolutions implemented.
Background: The Observational Medical Outcomes Partnership is conducting methodological research to develop a system to monitor drug safety. Standard processes and tools are needed to ensure continuous data quality across a network of disparate databases, and to ensure that the extract-transform-load (ETL) procedures maintain data integrity. Currently, there is no consensus or standard approach for evaluating the quality of the source data or of the ETL procedures.
Methods: We propose a framework for a comprehensive process to ensure data quality throughout the steps used to process and analyze the data. The approach used to manage data anomalies includes: (1) characterization of data sources; (2) detection of data anomalies; (3) determination of the cause of data anomalies; and (4) remediation.
Findings: Data anomalies included incomplete raw datasets (no race or year of birth recorded) and implausible data (year of birth exceeding the current year, observation period end date preceding the start date, and suspicious data frequencies and proportions outside the normal range). Errors found in the ETL process included incorrectly loaded zip codes, rounded drug quantities, incorrectly calculated drug exposure length, and incorrectly programmed condition length.
Conclusions: Complete and reliable observational data are difficult to obtain. Because the data are regularly updated, data quality assurance needs to be continuous; consequently, processes to assess data quality should be ongoing and transparent.
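The record-level anomalies listed in the Findings (missing race or year of birth, year of birth exceeding the current year, observation period end date preceding the start date) lend themselves to simple rule-based checks. The following is only a minimal sketch of such checks, not the program described in the study: the `PersonRecord` class, its field names, the `find_anomalies` function, and the sample values are all hypothetical and do not reflect the actual Common Data Model schema or the authors' tools.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class PersonRecord:
    # Hypothetical, simplified record; field names are illustrative only
    # and are not taken from the Common Data Model specification.
    person_id: int
    year_of_birth: Optional[int]
    race: Optional[str]
    observation_start: date
    observation_end: date


def find_anomalies(record: PersonRecord, current_year: int) -> List[str]:
    """Return a label for each anomaly detected in a single record."""
    anomalies = []
    # Incomplete data: year of birth or race not recorded.
    if record.year_of_birth is None:
        anomalies.append("missing year of birth")
    # Implausible data: year of birth later than the current year.
    elif record.year_of_birth > current_year:
        anomalies.append("year of birth exceeds current year")
    if record.race is None:
        anomalies.append("missing race")
    # Implausible data: observation period ends before it starts.
    if record.observation_end < record.observation_start:
        anomalies.append("observation period end precedes start")
    return anomalies


# Made-up sample records, for illustration only.
records = [
    PersonRecord(1, 1970, "category_a", date(2008, 1, 1), date(2009, 12, 31)),
    PersonRecord(2, 2190, None, date(2009, 6, 1), date(2009, 1, 1)),
]

# Map each person_id to the list of anomalies found for that record.
report = {r.person_id: find_anomalies(r, current_year=2013) for r in records}
```

Passing `current_year` explicitly, rather than reading the system clock, keeps the check reproducible when the same data load is re-assessed later. Checks on frequencies and proportions would need aggregate statistics across the whole dataset and are not covered by this per-record sketch.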