Core concepts in pharmacoepidemiology: Validation of health outcomes of interest within real-world healthcare databases
- PMID: 36057777
- PMCID: PMC9772105
- DOI: 10.1002/pds.5537
Abstract
Real-world healthcare data, including administrative and electronic medical record databases, provide a rich source of data for the conduct of pharmacoepidemiologic studies but carry the potential for misclassification of health outcomes of interest (HOIs). Validation studies quantify the degree of error associated with case-identifying algorithms for HOIs and are crucial for interpreting study findings within real-world data. This review provides a rationale, framework, and step-by-step approach to validating case-identifying algorithms for HOIs within healthcare databases. Key steps in validating a case-identifying algorithm within a healthcare database include: (1) selecting the appropriate health outcome; (2) determining the reference standard against which to validate the algorithm; (3) developing the algorithm using diagnosis codes, diagnostic tests or their results, procedures, drug therapies, patient-reported symptoms or diagnoses, or some combination of these parameters; (4) selecting patients and sample sizes for validation; (5) collecting data to confirm the HOI; (6) confirming the HOI; and (7) assessing the algorithm's performance. Additional strategies for algorithm refinement and methods to correct for bias due to misclassification of outcomes are discussed. The review concludes by discussing factors affecting the transportability of case-identifying algorithms and the need for ongoing validation as data elements within healthcare databases, such as diagnosis codes, change over time or as new variables, such as patient-generated health data, are included in these data sources.
Keywords: algorithm; database; electronic health records; methods; misclassification; validation.
© 2022 John Wiley & Sons Ltd.
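As an illustration of step (7), assessing the algorithm's performance, the sketch below computes the measures most commonly reported in validation studies from a 2x2 table of algorithm result versus reference standard. The abstract does not name specific measures, so the choice of sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) is an assumption here, as are the class name `ValidationCounts`, the function `performance`, and the example counts, which are invented for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValidationCounts:
    """Hypothetical 2x2 table: algorithm result vs. reference standard."""
    tp: int  # algorithm positive, reference standard confirms the HOI
    fp: int  # algorithm positive, reference standard does not confirm
    fn: int  # algorithm negative, reference standard confirms the HOI
    tn: int  # algorithm negative, reference standard does not confirm


def performance(c: ValidationCounts) -> dict[str, float]:
    """Return standard accuracy measures; NaN when a denominator is zero."""
    def ratio(num: int, den: int) -> float:
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(c.tp, c.tp + c.fn),
        "specificity": ratio(c.tn, c.tn + c.fp),
        "ppv": ratio(c.tp, c.tp + c.fp),
        "npv": ratio(c.tn, c.tn + c.fn),
    }


# Illustrative counts only; not taken from the article.
counts = ValidationCounts(tp=90, fp=10, fn=30, tn=870)
metrics = performance(counts)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

When a validation study reviews only algorithm-positive patients, the `fn` and `tn` cells are unknown and only the PPV can be estimated, which is one reason the sampling choice in step (4) matters.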
Comment in
- Letter to the Editor re Davis et al., 2023: BELpREG, the first of its kind real-world data source on medication use in pregnancy in Belgium. Pharmacoepidemiol Drug Saf. 2024 Feb;33(2):e5751. doi: 10.1002/pds.5751. PMID: 38362651. No abstract available.
Similar articles
- Identifying ventricular arrhythmia and sudden cardiac arrest in clinical notes of an electronic health record database. Future Cardiol. 2025 Jun;21(8):593-598. doi: 10.1080/14796678.2025.2506956. Epub 2025 May 18. PMID: 40383962. Free PMC article.
- Folic acid supplementation and malaria susceptibility and severity among people taking antifolate antimalarial drugs in endemic areas. Cochrane Database Syst Rev. 2022 Feb 1;2(2022):CD014217. doi: 10.1002/14651858.CD014217. PMID: 36321557. Free PMC article.
- Intra-database validation of case-identifying algorithms using reconstituted electronic health records from healthcare claims data. BMC Med Res Methodol. 2021 May 1;21(1):95. doi: 10.1186/s12874-021-01285-y. PMID: 33933001. Free PMC article.
- Identifying health outcomes in healthcare databases. Pharmacoepidemiol Drug Saf. 2015 Oct;24(10):1009-16. doi: 10.1002/pds.3856. Epub 2015 Aug 18. PMID: 26282185. Review.
- A Computable Phenotype Algorithm for Postvaccination Myocarditis/Pericarditis Detection Using Real-World Data: Validation Study. J Med Internet Res. 2024 Nov 25;26:e54597. doi: 10.2196/54597. PMID: 39586081. Free PMC article.
Cited by
- Validation of an Algorithm to Identify Venous Thromboembolism in Health Insurance Claims Data Among Patients with Rheumatoid Arthritis. Clin Epidemiol. 2023 Jun 1;15:671-682. doi: 10.2147/CLEP.S402360. eCollection 2023. PMID: 37284517. Free PMC article.
- Usefulness and caveats of real-world data for research on hypertension and its association with cardiovascular or renal disease in Japan. Hypertens Res. 2024 Nov;47(11):3099-3113. doi: 10.1038/s41440-024-01875-5. Epub 2024 Sep 11. PMID: 39261703. Free PMC article. Review.
- Methods for identifying health status from routinely collected health data: An overview. Integr Med Res. 2025 Mar;14(1):101100. doi: 10.1016/j.imr.2024.101100. Epub 2024 Nov 15. PMID: 39897572. Free PMC article.
- Erlotinib or Gefitinib for Treating Advanced Epidermal Growth Factor Receptor Mutation-Positive Lung Cancer in Aotearoa New Zealand: Protocol for a National Whole-of-Patient-Population Retrospective Cohort Study and Results of a Validation Substudy. JMIR Res Protoc. 2024 Jul 2;13:e51381. doi: 10.2196/51381. PMID: 38954434. Free PMC article.
- Guidance of development, validation, and evaluation of algorithms for populating health status in observational studies of routinely collected data (DEVELOP-RCD). Mil Med Res. 2024 Aug 6;11(1):52. doi: 10.1186/s40779-024-00559-y. PMID: 39107834. Free PMC article.