BMC Med. 2024 Jul 10;22(1):288.
doi: 10.1186/s12916-024-03499-5.

Consistency, completeness and external validity of ethnicity recording in NHS primary care records: a cohort study in 25 million patients' records at source using OpenSAFELY

OpenSAFELY Collaborative et al. BMC Med. 2024.

Abstract

Background: Ethnicity is known to be an important correlate of health outcomes, particularly during the COVID-19 pandemic, where some ethnic groups were shown to be at higher risk of infection and adverse outcomes. The recording of patients' ethnic groups in primary care can support research and efforts to achieve equity in service provision and outcomes; however, the coding of ethnicity is known to present complex challenges. We therefore set out to describe ethnicity coding in detail with a view to supporting the use of this data in a wide range of settings, as part of wider efforts to robustly describe and define methods of using administrative data.

Methods: We describe the completeness and consistency of primary care ethnicity recording in the OpenSAFELY-TPP database, containing linked primary care and hospital records in > 25 million patients in England. We also compared the ethnic breakdown in OpenSAFELY-TPP with that of the 2021 UK census.

Results: 78.2% of patients registered in OpenSAFELY-TPP on 1 January 2022 had their ethnicity recorded in primary care records, rising to 92.5% when supplemented with hospital data. The completeness of ethnicity recording was higher for women than for men. The rate of primary care ethnicity recording ranged from 77% in the South East of England to 82.2% in the West Midlands. Ethnicity recording rates were higher in patients with chronic or other serious health conditions. For each of the five broad ethnicity groups, primary care recorded ethnicity was within 2.9 percentage points of the population rate as recorded in the 2021 Census for England as a whole. For patients with multiple ethnicity records, 98.7% of the latest recorded ethnicities matched the most frequently coded ethnicity. Patients whose latest recorded ethnicity was categorised as Other were most likely to have a discordant ethnicity recording (32.2%).
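The completeness and concordance measures reported above can be illustrated with a minimal sketch. This is not the study's code: the toy records, the function names and the tie-breaking rule (the most recently recorded group wins when two groups are equally frequent) are all assumptions made for illustration only.

```python
from collections import Counter
from datetime import date

# Hypothetical toy data: (patient_id, date_recorded, ethnicity_group).
RECORDS = [
    (1, date(2015, 3, 1), "White"),
    (1, date(2019, 6, 1), "White"),
    (1, date(2021, 2, 1), "Other"),
    (2, date(2018, 1, 1), "Asian"),
    (2, date(2020, 5, 1), "Asian"),
    (3, date(2020, 7, 1), "Black"),
]
REGISTERED = [1, 2, 3, 4]  # patient 4 has no ethnicity record


def completeness(records, registered):
    """Share of registered patients with at least one ethnicity record."""
    recorded = {pid for pid, _, _ in records}
    return len(recorded & set(registered)) / len(registered)


def latest_and_modal(records):
    """Map patient_id -> (latest group, most frequent group, record count)."""
    by_patient = {}
    for pid, when, group in records:
        by_patient.setdefault(pid, []).append((when, group))
    summary = {}
    for pid, entries in by_patient.items():
        entries.sort()  # chronological order
        latest = entries[-1][1]
        counts = Counter(group for _, group in entries)
        top = max(counts.values())
        # Assumed tie-break: among equally frequent groups, take the most recent.
        modal = next(g for _, g in reversed(entries) if counts[g] == top)
        summary[pid] = (latest, modal, len(entries))
    return summary


def concordance(records):
    """Among patients with multiple records, share whose latest group
    equals their most frequently recorded group."""
    multi = [v for v in latest_and_modal(records).values() if v[2] >= 2]
    return sum(latest == modal for latest, modal, _ in multi) / len(multi)


print(completeness(RECORDS, REGISTERED))  # 3 of 4 registered patients
print(concordance(RECORDS))               # patients 1 and 2 have multiple records
```

In this toy data, patient 1 is discordant (latest record Other, most frequent White), which mirrors the pattern the abstract reports for patients whose latest recorded ethnicity was Other.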

Conclusions: Primary care ethnicity data in OpenSAFELY is present for over three quarters of all patients, and combined with data from other sources can achieve a high level of completeness. The overall distribution of ethnicities across all English OpenSAFELY-TPP practices was similar to the 2021 Census, with some regional variation. This report identifies the best available codelist for use in OpenSAFELY and similar electronic health record data.

Keywords: Data curation; Electronic health records; Ethnicity; Primary care health sciences.

Conflict of interest statement

All authors declare the following: BG has received research funding from the Bennett Foundation, the Laura and John Arnold Foundation, the NHS National Institute for Health Research (NIHR), the NIHR School of Primary Care Research, NHS England, the NIHR Oxford Biomedical Research Centre, the Mohn-Westlake Foundation, NIHR Applied Research Collaboration Oxford and Thames Valley, the Wellcome Trust, the Good Thinking Foundation, Health Data Research UK, the Health Foundation, the World Health Organisation, UKRI MRC, Asthma UK, the British Lung Foundation, and the Longitudinal Health and Wellbeing strand of the National Core Studies programme; he is a Non-Executive Director at NHS Digital; he also receives personal income from speaking and writing for lay audiences on the misuse of science. BMK is also employed by NHS England, working on medicines policy, and is clinical lead for primary care medicines data.

Figures

Fig. 1
Bar plot showing proportion of registered TPP population with a recorded ethnicity by clinical and demographic subgroups, based on primary care records (solid bars) and when supplemented with secondary care data (pale bars)
Fig. 2
Bar plot showing proportion of registered TPP population with a recorded ethnicity by clinical and demographic subgroups, based on primary care records (solid bars) and when supplemented with secondary care data (pale bars)
Fig. 3
Boxplot showing the 5th, 25th, 50th, 75th and 95th percentiles of completeness of ethnicity recording across practices with at least 1000 registered patients
Fig. 4
Sankey plot comparing the categorisation of ethnicity in primary care and secondary care
Fig. 5
Bar plot showing the proportion of 2021 Census and primary care populations per ethnicity grouped into 5 groups (excluding those without a recorded ethnicity (21.8% SNOMED:2020 and 7.5% supplemented with ethnicity data from secondary care)). Data labels indicate the percentage point difference between 2021 Census and TPP populations
Fig. 6
Bar plot showing the proportion of 2021 Census and TPP populations in each ethnicity group by region (excluding those without a recorded ethnicity (21.8% in primary care and 7.5% supplemented with ethnicity data from secondary care)). Data labels indicate percentage point difference between 2021 Census and TPP populations

