Clin Epidemiol. 2024 Apr 4;16:235-247.
doi: 10.2147/CLEP.S437937. eCollection 2024.

A Harmonised Approach to Curating Research-Ready Datasets for Asthma, Chronic Obstructive Pulmonary Disease (COPD) and Interstitial Lung Disease (ILD) in England, Wales and Scotland Using Clinical Practice Research Datalink (CPRD), Secure Anonymised Information Linkage (SAIL) Databank and DataLoch

Sara Hatam et al. Clin Epidemiol. 2024.

Abstract

Background: Electronic healthcare records (EHRs) are an important resource for health research that can be used to improve patient outcomes in chronic respiratory diseases. However, consistent approaches to the analysis of these datasets are needed for coherent messaging and for comparative studies across different populations.

Methods and results: We developed a harmonised curation approach to generate comparable patient cohorts for asthma, chronic obstructive pulmonary disease (COPD) and interstitial lung disease (ILD) using datasets from within Clinical Practice Research Datalink (CPRD; for England), Secure Anonymised Information Linkage (SAIL; for Wales) and DataLoch (for Scotland) by defining commonly derived variables consistently between the datasets. By working in parallel on the curation methodology used for CPRD, SAIL and DataLoch for asthma, COPD and ILD, we were able to highlight key differences in coding and recording between the databases and identify solutions to enable valid comparisons.

Conclusion: Codelists and metadata generated have been made available to help re-create the asthma, COPD and ILD cohorts in CPRD, SAIL and DataLoch for different time periods, and provide a starting point for the curation of respiratory datasets in other EHR databases, expediting further comparable respiratory research.

Keywords: COPD; EHR; ILD; asthma; data curation; harmonisation.

Conflict of interest statement

AS has been supported by institutional research grants from the Industrial Strategy Challenge Fund, the Medical Research Council and Health Data Research, and from the UK and Scottish Governments for the Usher Data Driven Innovation Hub, which manages DataLoch. JKQ has been supported by institutional research grants from the Industrial Strategy Challenge Fund, the Medical Research Council, Health Data Research, GSK, BI, asthma+lung UK and AZ, and has received personal fees for advisory board participation, consultancy or speaking from GlaxoSmithKline, Evidera, Chiesi, AstraZeneca and Insmed. SS reports grants from the Industrial Strategy Challenge Fund, the Medical Research Council and Health Data Research UK, both during the conduct of the study and outside the submitted work. CO reports grants from the Medical Research Council during the conduct of the study. The authors report no other conflicts of interest in this work.

Figures

Figure 1
(a) CPRD asthma cohort inclusion and exclusion criteria with counts, (b) CPRD COPD cohort inclusion and exclusion criteria with counts, (c) CPRD ILD cohort inclusion and exclusion criteria with counts.
Figure 2
(a) SAIL asthma cohort inclusion and exclusion criteria with counts, (b) SAIL COPD cohort inclusion and exclusion criteria with counts, (c) SAIL ILD cohort inclusion and exclusion criteria with counts.
Figure 3
(a) DataLoch asthma cohort inclusion and exclusion criteria with counts, (b) DataLoch COPD cohort inclusion and exclusion criteria with counts, (c) DataLoch ILD cohort inclusion and exclusion criteria with counts.
Figure 4
Proportion (%) of asthma, COPD and ILD populations by age at first mention of condition (i.e. the earliest valid record with a diagnosis), stratified by sex, per data source: CPRD, SAIL and DataLoch. Age at first mention of condition is limited to <=93 years across all EHR databases to minimise small numbers. The minimum age thresholds for asthma, COPD and ILD are 1, 35 and 40 years, respectively.
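
The age window described in the Figure 4 caption can be illustrated with a minimal Python sketch. It is not the authors' code. The record layout `(source, condition, sex, age)`, the function names, the inclusive treatment of both age bounds, and the choice of each source/condition/sex stratum as the denominator are all assumptions made for illustration; only the thresholds (1, 35 and 40 years) and the upper limit of 93 years come from the caption.

```python
from collections import Counter, defaultdict

# Thresholds taken from the Figure 4 caption.
MIN_AGE = {"asthma": 1, "COPD": 35, "ILD": 40}
MAX_AGE = 93  # upper limit applied across all databases

def in_age_window(condition, age):
    """True if age at first mention falls in the window (bounds assumed inclusive)."""
    return MIN_AGE[condition] <= age <= MAX_AGE

def age_proportions(records):
    """Percentage of patients at each age within every (source, condition, sex) stratum.

    `records` is a hypothetical iterable of (source, condition, sex, age) tuples,
    where age is the age at the earliest valid diagnosis record.
    """
    counts = defaultdict(Counter)
    for source, condition, sex, age in records:
        if in_age_window(condition, age):
            counts[(source, condition, sex)][age] += 1

    result = {}
    for stratum, by_age in counts.items():
        total = sum(by_age.values())  # assumed denominator: the stratum itself
        result[stratum] = {age: 100.0 * n / total for age, n in sorted(by_age.items())}
    return result

# Made-up example records, not data from the study.
example = [
    ("CPRD", "COPD", "F", 34),  # below the COPD minimum, excluded
    ("CPRD", "COPD", "F", 35),
    ("CPRD", "COPD", "F", 60),
    ("CPRD", "COPD", "F", 60),
    ("CPRD", "COPD", "F", 94),  # above the upper limit, excluded
]
proportions = age_proportions(example)
```

Keeping the thresholds in one shared table, rather than in per-database scripts, is one way to keep the same rule applied identically to CPRD, SAIL and DataLoch extracts.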
