BMJ Open. 2024 Feb 14;14(2):e070258.
doi: 10.1136/bmjopen-2022-070258.

Pooling of primary care electronic health record (EHR) data on Huntington's disease (HD) and cancer: establishing comparability of two large UK databases

Daniel Dedman et al. BMJ Open. 2024.

Abstract

Objectives: To explore whether UK primary care databases arising from two different software systems can be feasibly combined, by comparing rates of Huntington's disease (HD, which is rare) and 14 common cancers in the two databases, as well as characteristics of people with these conditions.

Design: Descriptive study.

Setting: Primary care electronic health records from Clinical Practice Research Datalink (CPRD) GOLD and CPRD Aurum databases, with linked hospital admission and death registration data.

Participants: 4986 patients with HD and 1 294 819 with an incident cancer between 1990 and 2019.

Primary and secondary outcome measures: Incidence and prevalence of HD by calendar period, age group and region, and annual age-standardised incidence of 14 common cancers in each database, and in a subset of 'overlapping' practices which contributed to both databases. Characteristics of patients with HD or incident cancer: medical history, recent prescribing, healthcare contacts and database follow-up.
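As an illustration of what an age-standardised incidence rate involves, the following is a minimal sketch of direct standardisation. The abstract does not state which standard population or method the authors used, so the function name, the weights and the example numbers are all hypothetical.

```python
def age_standardised_rate(cases, person_years, std_weights, per=100_000):
    """Directly age-standardised incidence rate per `per` person-years.

    cases, person_years and std_weights each hold one value per age group.
    std_weights are the standard population sizes (or shares) per age group;
    they are normalised here, so they need not sum to 1.
    """
    if not (len(cases) == len(person_years) == len(std_weights)):
        raise ValueError("all inputs must have one entry per age group")
    total_weight = sum(std_weights)
    if total_weight <= 0:
        raise ValueError("standard weights must sum to a positive value")
    rate = 0.0
    for d, py, w in zip(cases, person_years, std_weights):
        if py <= 0:
            raise ValueError("person-years must be positive in every age group")
        # Weighted sum of age-specific rates
        rate += (w / total_weight) * (d / py)
    return rate * per


# Hypothetical three-band example; these numbers are not from the study.
example_cases = [10, 50, 200]
example_person_years = [100_000, 100_000, 100_000]
example_weights = [0.5, 0.3, 0.2]
example_asr = age_standardised_rate(example_cases, example_person_years, example_weights)
```

Applying the same weights to both databases removes differences in age structure from the comparison, which is why standardised rather than crude rates are compared.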

Results: Incidence and prevalence of HD were slightly higher in CPRD GOLD than CPRD Aurum, but with similar trends over time. Cancer incidence in the two databases differed between 1990 and 2000, but converged and was very similar thereafter. Participants in each database were most similar in terms of medical history (median standardised difference, MSD 0.03 (IQR 0.01-0.03)), recent prescribing (MSD 0.06 (0.03-0.10)) and demographics and general health variables (MSD 0.05 (0.01-0.09)). Larger differences were seen for healthcare contacts (MSD 0.27 (0.10-0.41)), and database follow-up (MSD 0.39 (0.19-0.56)).
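The standardised differences reported above can be sketched as follows. The abstract does not give the exact formula, so this assumes the commonly used definition (difference in means or proportions divided by the square root of the average of the two group variances), reported as an absolute value. All function names are hypothetical.

```python
import math
import statistics


def std_diff_proportion(p_a, p_b):
    """Absolute standardised difference between two proportions."""
    pooled_var = (p_a * (1 - p_a) + p_b * (1 - p_b)) / 2
    if pooled_var == 0:
        # Both proportions are exactly 0 or 1
        return 0.0 if p_a == p_b else math.inf
    return abs(p_a - p_b) / math.sqrt(pooled_var)


def std_diff_mean(mean_a, sd_a, mean_b, sd_b):
    """Absolute standardised difference between two means."""
    pooled_var = (sd_a ** 2 + sd_b ** 2) / 2
    if pooled_var == 0:
        return 0.0 if mean_a == mean_b else math.inf
    return abs(mean_a - mean_b) / math.sqrt(pooled_var)


def median_and_iqr(values):
    """Median and (lower quartile, upper quartile) of a list of differences."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return statistics.median(values), (q1, q3)
```

Under this assumed definition, a value near 0 means the two databases are closely matched on that characteristic. A summary such as "MSD 0.27 (IQR 0.10-0.41)" would then correspond to calling `median_and_iqr` on the per-characteristic differences within one group of variables.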

Conclusions: Differences in cancer incidence trends between 1990 and 2000 may relate to use of a practice-level data quality filter (the 'up-to-standard' date) in CPRD GOLD only. As well as the impact of data curation methods, differences in underlying data models can make it more challenging to define exactly equivalent clinical concepts in each database. Researchers should be aware of these potential sources of variability when planning combined database studies and interpreting results.
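The effect of a practice-level filter of this kind can be sketched as below. The Figure 3 caption describes the filter as excluding data prior to each practice's UTS date; the record layout, the function name and the handling of practices without a UTS date are assumptions made for illustration, not CPRD's actual implementation.

```python
from datetime import date


def apply_uts_filter(events, uts_dates, use_uts=True):
    """Keep events recorded on or after the practice's up-to-standard date.

    events:    list of (practice_id, event_date) tuples (hypothetical layout)
    uts_dates: dict mapping practice_id to its UTS date
    use_uts:   False reproduces the 'no UTS' analysis and keeps every event
    """
    if not use_uts:
        return list(events)
    kept = []
    for practice_id, event_date in events:
        uts = uts_dates.get(practice_id)
        # Assumption: a practice with no recorded UTS date contributes no data
        if uts is not None and event_date >= uts:
            kept.append((practice_id, event_date))
    return kept


# Hypothetical records; not taken from either database.
example_events = [
    ("p1", date(1992, 5, 1)),
    ("p1", date(1999, 3, 15)),
    ("p2", date(1995, 7, 9)),
]
example_uts = {"p1": date(1996, 1, 1), "p2": date(1994, 1, 1)}
filtered_events = apply_uts_filter(example_events, example_uts)
```

In an incidence calculation the same cut-off would presumably also apply to person-time in the denominator. If only one database applies the filter, early-period rates can differ for the same practices, which is the pattern the conclusions point to.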

Keywords: Epidemiology; Health informatics; Primary care.

Conflict of interest statement

Competing interests: DD and RW are full-time employees of the Medicines and Healthcare Products Regulatory Agency.

Figures

Figure 1. Huntington’s disease (HD) incidence and prevalence, 1990–2019: CPRD GOLD and CPRD Aurum. CPRD, Clinical Practice Research Datalink.
Figure 2. Age-standardised incidence of 14 common cancers, 1990–2019: CPRD GOLD and CPRD Aurum primary care data only (columns A and C), and with linked hospital admission (HES APC) and death registrations (ONS) data (columns B and D). Reference rates (red solid and dashed lines) are National Cancer Registration statistics for England. CPRD, Clinical Practice Research Datalink; HES APC, Hospital Episode Statistics Admitted Patient Care; ICD, International Classification of Diseases 10th Revision; ONS, Office for National Statistics.
Figure 3. Impact of the practice up-to-standard (UTS) date on age-standardised and sex-standardised incidence of the four most common cancers in a subset of overlapping practices, 1990–2019: CPRD GOLD (yellow symbols) and CPRD Aurum (blue symbols). Incidence was calculated in two ways: applying the UTS filter to exclude data prior to the UTS date (‘+UTS’; filled symbols), or ignoring the UTS filter and including data prior to the UTS date (‘no UTS’; unfilled symbols). CPRD, Clinical Practice Research Datalink; ICD-10, International Classification of Diseases 10th Revision.
Figure 4. Standardised differences for baseline characteristics of incident Huntington’s disease (HD) and cancer patients in CPRD GOLD versus CPRD Aurum. Each row is a patient characteristic. Each point is a comparison between GOLD and Aurum for that characteristic in patients with an incident outcome. Symbol colour indicates the outcome (ie, condition/cancer site). Symbol shape indicates the direction of the difference: a circle means the mean/proportion is higher in CPRD GOLD; a triangle means it is higher in CPRD Aurum. BMI, body mass index; BNF, British National Formulary; BP, blood pressure; CPRD, Clinical Practice Research Datalink; CVD, cardiovascular disease; GP, general practice.

