J Am Med Inform Assoc. 2019 Dec 1;26(12):1545-1559.
doi: 10.1093/jamia/ocz105.

UK phenomics platform for developing and validating electronic health record phenotypes: CALIBER

Spiros Denaxas et al. J Am Med Inform Assoc.

Abstract

Objective: Electronic health records (EHRs) are a rich source of information on human diseases, but the information is variably structured, fragmented, curated using different coding systems, and collected for purposes other than medical research. We describe an approach for developing, validating, and sharing reproducible phenotypes from national structured EHRs in the United Kingdom, with applications for translational research.

Materials and methods: We implemented a rule-based phenotyping framework with up to 6 approaches to validation. We applied the framework to a sample of 15 million individuals in a national EHR data source (population-based primary care, all ages) linked to hospitalization and death records in England. Data comprised continuous measurements (for example, blood pressure), medication information, and coded diagnoses, symptoms, procedures, and referrals, recorded using 5 controlled clinical terminologies: (1) Read (primary care, subset of SNOMED-CT [Systematized Nomenclature of Medicine Clinical Terms]), (2) International Classification of Diseases-Ninth Revision and Tenth Revision (secondary care diagnoses and cause of mortality), (3) Office of Population Censuses and Surveys Classification of Surgical Operations and Procedures, Fourth Revision (hospital surgical procedures), and (4) DM+D prescription codes.
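As a rough illustration of what a rule-based phenotype over linked sources can look like, the following Python sketch matches coded events against a per-source code list and returns each patient's earliest matching event. It is a minimal sketch under stated assumptions, not the CALIBER implementation: the `Event` record, the `CODE_LISTS` contents, the source names, and the function names are all hypothetical, and the two codes shown are placeholders rather than a published code list.

```python
from dataclasses import dataclass
from datetime import date

# Illustrative code lists only (assumption); NOT the published CALIBER lists.
# Each source is matched by code prefix.
CODE_LISTS = {
    "primary_care": {"G58"},  # Read-style code prefix, placeholder
    "hospital": {"I50"},      # ICD-10 category prefix, placeholder
    "death": {"I50"},
}


@dataclass(frozen=True)
class Event:
    """One coded record from a linked data source (hypothetical schema)."""
    patient_id: str
    source: str
    code: str
    event_date: date


def matches(event: Event) -> bool:
    """Rule: the event's code starts with a prefix listed for its source."""
    prefixes = CODE_LISTS.get(event.source, set())
    return any(event.code.startswith(p) for p in prefixes)


def first_presentation(events: list[Event]) -> dict[str, Event]:
    """Return the earliest matching event per patient across all sources."""
    first: dict[str, Event] = {}
    for ev in events:
        if not matches(ev):
            continue
        current = first.get(ev.patient_id)
        if current is None or ev.event_date < current.event_date:
            first[ev.patient_id] = ev
    return first


events = [
    Event("p1", "hospital", "I50.0", date(2010, 5, 1)),
    Event("p1", "primary_care", "G58..", date(2009, 3, 2)),
    Event("p2", "primary_care", "H33..", date(2011, 1, 1)),  # not in the list
    Event("p3", "death", "I50.9", date(2012, 7, 9)),
]
cases = first_presentation(events)
```

In this toy input, `p1` is identified in both primary care and hospital data and is dated by the earlier primary care record, while `p2` has no matching code and is not a case.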

Results: Using the CALIBER phenotyping framework, we created algorithms for 51 diseases, syndromes, biomarkers, and lifestyle risk factors and provide up to 6 validation approaches. The EHR phenotypes are curated in the open-access CALIBER Portal (https://www.caliberresearch.org/portal) and have been used by 40 national and international research groups in 60 peer-reviewed publications.

Conclusions: We describe a UK EHR phenomics approach within the CALIBER EHR data platform with initial evidence of validity and use, as an important step toward international use of UK EHR data for health research.

Keywords: electronic health records; medical informatics; personalized medicine; phenotyping.

Figures

Figure 1.
The CALIBER platform (https://www.caliberresearch.org) links national structured electronic health records (EHRs) across primary care, secondary care, and mortality for research. EHR-derived phenotypes are created using an iterative methodology and 6 independent approaches of evidence are generated to assess algorithm accuracy. More than 50 phenotypes are published in an open-access resource, the CALIBER Portal (https://www.caliberresearch.org/portal), and are used in >60 publications.
Figure 2.
CALIBER Portal entry for the heart failure phenotype (available at https://www.caliberresearch.org/portal/phenotypes/heartfailure). Each entry in the Portal contains implementation details on the logic and the terms from controlled clinical terminologies associated with the phenotyping algorithm. Additionally, the 6 approaches of validation evidence are presented and the research output that has used the phenotype is provided.
Figure 3.
Assessing the recording and concordance of 3 electronic health record (EHR)–derived phenotypes (heart failure, nonfatal acute myocardial infarction [AMI], and bleeding) across 3 EHR data sources: primary care (Clinical Practice Research Datalink [CPRD]), hospital care (Hospital Episode Statistics [HES]), and mortality (Office for National Statistics [ONS]) or disease registry data (Myocardial Ischaemia National Audit Project [MINAP]). Only a minority of cases (9% for heart failure, 31% for AMI, and <1% for bleeding) are identified concurrently by all 3 data sources. ICD-10: International Classification of Diseases–Tenth Revision.
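The concordance summarized in this caption can be computed by counting, for each case, which sources identified it. The Python sketch below is one way to do that under simple assumptions: each source is reduced to a set of patient identifiers, and the function name `concordance` and the toy identifiers are hypothetical rather than taken from the paper.

```python
from collections import Counter


def concordance(cases_by_source: dict[str, set]) -> tuple[Counter, float]:
    """Count cases per combination of sources and the share found in all of them.

    Returns a Counter keyed by frozenset of source names, plus the fraction
    of all distinct cases that every source identified.
    """
    sources = sorted(cases_by_source)
    all_cases = set().union(*cases_by_source.values())
    pattern_counts = Counter(
        frozenset(s for s in sources if pid in cases_by_source[s])
        for pid in all_cases
    )
    n_in_all = pattern_counts[frozenset(sources)]
    share_in_all = n_in_all / len(all_cases) if all_cases else 0.0
    return pattern_counts, share_in_all


# Toy patient identifiers (assumption), not study data.
toy = {
    "CPRD": {1, 2, 3},
    "HES": {2, 3, 4},
    "ONS": {3, 5},
}
patterns, share = concordance(toy)
```

Each key of `patterns` corresponds to one region of a three-set Venn diagram, and `share` is the proportion of cases in the central overlap.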
Figure 4.
Risk factors for initial presentation of heart failure (HF) phenotype: hazard ratio (HR) and 95% confidence interval for smoking status, type 2 diabetes mellitus (T2DM), systolic blood pressure (BP), and heart rate, based on previously published CALIBER studies, compared with estimates obtained from investigator-led studies derived using manually curated research data. All individual analyses have been adjusted for age, sex, and other covariates.
