Pooling of primary care electronic health record (EHR) data on Huntington's disease (HD) and cancer: establishing comparability of two large UK databases
- PMID: 38355188
- PMCID: PMC10868307
- DOI: 10.1136/bmjopen-2022-070258
Erratum in
- Correction: Pooling of primary care electronic health record (EHR) data on Huntington's disease (HD) and cancer: establishing comparability of two large UK databases. BMJ Open. 2025 Jun 16;15(6):e070258corr1. doi: 10.1136/bmjopen-2022-070258corr1. PMID: 40523796.
Abstract
Objectives: To explore whether UK primary care databases arising from two different software systems can be feasibly combined, by comparing rates of Huntington's disease (HD, which is rare) and 14 common cancers in the two databases, as well as characteristics of people with these conditions.
Design: Descriptive study.
Setting: Primary care electronic health records from Clinical Practice Research Datalink (CPRD) GOLD and CPRD Aurum databases, with linked hospital admission and death registration data.
Participants: 4986 patients with HD and 1 294 819 with an incident cancer between 1990 and 2019.
Primary and secondary outcome measures: Incidence and prevalence of HD by calendar period, age group and region, and annual age-standardised incidence of 14 common cancers in each database, and in a subset of 'overlapping' practices which contributed to both databases. Characteristics of patients with HD or incident cancer: medical history, recent prescribing, healthcare contacts and database follow-up.
Results: Incidence and prevalence of HD were slightly higher in CPRD GOLD than CPRD Aurum, but with similar trends over time. Cancer incidence in the two databases differed between 1990 and 2000, but converged and was very similar thereafter. Participants in each database were most similar in terms of medical history (median standardised difference, MSD 0.03 (IQR 0.01-0.03)), recent prescribing (MSD 0.06 (0.03-0.10)) and demographics and general health variables (MSD 0.05 (0.01-0.09)). Larger differences were seen for healthcare contacts (MSD 0.27 (0.10-0.41)), and database follow-up (MSD 0.39 (0.19-0.56)).
Conclusions: Differences in cancer incidence trends between 1990 and 2000 may relate to use of a practice-level data quality filter (the 'up-to-standard' date) in CPRD GOLD only. As well as the impact of data curation methods, differences in underlying data models can make it more challenging to define exactly equivalent clinical concepts in each database. Researchers should be aware of these potential sources of variability when planning combined database studies and interpreting results.
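The median standardised difference (MSD) reported in the Results can be illustrated with a minimal sketch. It assumes binary characteristics summarised as proportions in each database and uses the common pooled-variance definition of the standardised difference; the abstract does not state the exact formula the authors used, and the function names and example proportions below are hypothetical.

```python
import math
import statistics


def std_diff_binary(p1: float, p2: float) -> float:
    """Absolute standardised difference between two proportions.

    Assumed definition: |p1 - p2| / sqrt((p1(1-p1) + p2(1-p2)) / 2).
    """
    pooled_var = (p1 * (1 - p1) + p2 * (1 - p2)) / 2
    if pooled_var == 0:
        # Both proportions are exactly 0 or 1: no variance to scale by.
        return 0.0 if p1 == p2 else math.inf
    return abs(p1 - p2) / math.sqrt(pooled_var)


def median_std_diff(pairs: list[tuple[float, float]]) -> float:
    """Median of the standardised differences across a group of variables."""
    return statistics.median(std_diff_binary(a, b) for a, b in pairs)


# Hypothetical proportions (database A, database B) for three variables
# in one variable group; these are NOT values from the study.
example_group = [(0.20, 0.21), (0.05, 0.05), (0.50, 0.40)]

if __name__ == "__main__":
    print(f"MSD for example group: {median_std_diff(example_group):.3f}")
```

Under this definition, a value near 0 means the two databases have almost the same proportion for a characteristic, so a group-level MSD summarises how closely a whole set of variables agrees.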
Keywords: Epidemiology; Health informatics; Primary care.
© Author(s) (or their employer(s)) 2024. Re-use permitted under CC BY. Published by BMJ.
Conflict of interest statement
Competing interests: DD and RW are full-time employees of the Medicines and Healthcare Products Regulatory Agency.